CN103310210B - Character recognition device, recognition dictionary generation device and normalization method - Google Patents

Character recognition device, recognition dictionary generation device and normalization method

Info

Publication number
CN103310210B
CN103310210B (application CN201310027353.6A)
Authority
CN
China
Prior art keywords
image
mesh point
value
profile
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310027353.6A
Other languages
Chinese (zh)
Other versions
CN103310210A (en)
Inventor
三好利升 (Toshinori Miyoshi)
永崎健 (Takeshi Nagasaki)
新庄广 (Hiroshi Shinjo)
堤庸昂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitachi Information and Telecommunication Engineering Ltd
Original Assignee
Hitachi Information and Telecommunication Engineering Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi Information and Telecommunication Engineering Ltd filed Critical Hitachi Information and Telecommunication Engineering Ltd
Publication of CN103310210A
Application granted
Publication of CN103310210B
Active legal-status Current
Anticipated expiration legal-status

Links

Landscapes

  • Character Discrimination (AREA)
  • Character Input (AREA)

Abstract

The present invention provides a character recognition device that, in order to improve the accuracy of character recognition, normalizes input images of various sizes so that shape deviations between instances of the same character become small. Preprocessing for reducing interference factors is performed on the input image; the contour of the preprocessed image is extracted; the preprocessed image and the extracted contour image are composited; a mapping from the composite image to a normalized image of a given size is generated such that the centroid of the composite image approaches the center of the given-size range and the range over which the composite image's pixels extend approaches the given-size range; the preprocessed image is normalized according to the generated mapping; the normalized image is transformed into a vector value in a vector space; based on a recognition dictionary stored in a storage device, the character that the vector value represents is determined; and the determination result is output.

Description

Character recognition device, recognition dictionary generation device and normalization method
Technical field
The present invention relates to a recognition dictionary generation device and a character recognition device for character recognition, and particularly to a normalization method for character images.
Background art
A character recognition device uses a recognition dictionary to determine the character class written in an input image and outputs the determination result. In digit recognition, for example, the character classes are the ten digits 0 through 9. The recognition dictionary is produced by a recognition dictionary generation device.
A character recognition device performs four processing steps in its flow from receiving an input image to outputting a determination result: preprocessing, normalization, feature extraction, and recognition.
Fig. 2 is a flowchart of the processing performed by a conventional character recognition device.
In the character image input unit 201, an image is input by a user or by a program executed by the arithmetic unit.
The preprocessing unit 202 performs steps such as denoising, blur processing, and smoothing of the input image, removing as far as possible the interference factors that hinder character recognition.
Next, the normalization unit 203 receives the preprocessed images of various sizes as input and makes their sizes uniform, which allows the subsequent processing to be unified.
Next, the feature extraction unit 209 receives the normalized image as input and transforms it into a vector value in a vector space. This vector space is called the feature space, and the vector value is called the feature vector. Widely known feature extraction methods extract pixel features, contour features, gradient features, Gabor features, and the like (non-patent literature 1). If required, a dimension reduction method such as principal component analysis or linear discriminant analysis is used to compress the feature space and reduce its dimensionality (non-patent literature 2).
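For illustration only, and not as the patent's own implementation, the following minimal Python sketch shows pixel-feature extraction followed by principal component analysis; the variable `train_images`, the image size, and the 64-dimensional target are assumptions introduced here:

```python
import numpy as np
from sklearn.decomposition import PCA

def pixel_features(images):
    """Flatten each normalized image into a pixel-feature vector."""
    return np.stack([img.ravel().astype(float) for img in images])

# train_images: assumed list of L x L numpy arrays of normalized characters.
train_vectors = pixel_features(train_images)
pca = PCA(n_components=64)  # illustrative compressed dimension
train_compressed = pca.fit_transform(train_vectors)
```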
Through the processing up to this point, the input image is represented as a vector value (a feature vector) in the feature space.
Next, the recognition unit 210 performs the following processing: it determines the character class to which the feature vector belongs, using the recognition dictionary 214. The recognition dictionary 214 stores information for determining which character class each point in the feature space belongs to. Details of determination using the recognition dictionary 214 are described in non-patent literature 1 and non-patent literature 2.
The output unit 211 outputs the determination result to a display device such as a monitor, or to a file.
For high-accuracy character recognition, each of the above processes (the preprocessing unit 202, the normalization unit 203, the feature extraction unit 209, and the recognition unit 210) plays an important role. It is therefore important to carry out processing suited to character recognition in each step.
Characters in an input image vary with the writing instrument, the writer, the font, the scanning environment, and the preservation state of the paper, so even characters of the same class vary in size, shape, and degree of degradation. Normalization aims not only to make input images the same size but also to suppress these shape deviations between images of the same character class, which improves the recognition rate of the character recognition device.
Existing normalization methods for character images include the linear normalization method, the nonlinear normalization method, the moment normalization method, the bi-moment normalization method, the CBA method (centroid-boundary alignment method), the MCBA method (modified centroid-boundary alignment method), and the LDPF method (line density projection fitting method). Among these, published benchmark results (non-patent literature 3) show that the moment normalization method and the bi-moment normalization method achieve high recognition rates.
Fig. 5 is an explanatory diagram of examples of images generated by the moment normalization method and the bi-moment normalization method. Specifically, Fig. 5 shows an input image 501 to be normalized, a normalized image 502 generated from the input image 501 by the moment normalization method, and a normalized image 503 generated from the input image 501 by the bi-moment normalization method.
As described above, normalization methods such as the moment method and the bi-moment method are known to have high discriminative ability. However, because these methods compute moments directly from the pixel values of the original image, they are easily affected by the thickness of the character strokes. The moment values therefore differ greatly with stroke thickness, and as a result the position of the character in the normalized image varies with stroke thickness.
Fig. 6 is an explanatory diagram of different fonts of the same character; specifically, it shows images 601 of the character "T" in different fonts. As Fig. 6 shows, stroke thickness is not essential for judging a character. The deviations in the normalized images within the same character class (in character position, size, and so on) caused by differences in stroke thickness are therefore disadvantageous for recognition.
The contour feature moment normalization method (patent literature 1, non-patent literature 4, non-patent literature 5) extracts the character contour and performs normalization based on the moments of the contour. This method is effective in reducing deviations due to character stroke length and thickness, and achieves a high recognition rate in printed-character recognition.
Fig. 10 is an explanatory diagram of images normalized by the moment normalization method and by the contour feature moment normalization method.
The original images 1001 shown in Fig. 10 are character images of "T" whose horizontal strokes have different thicknesses. The images 1002 are these original images 1001 normalized by the moment normalization method. As can be seen, in the normalized images 1002, the thicker the horizontal stroke, the more the image center shifts toward the top of the T, and the overall position of the character drops downward. Moreover, in the normalized images 1002, deviations arise even in the thickness of the vertical stroke of the T, which is identical across the original images 1001. Such deviations appear as deviations of the vector points in the feature space after feature extraction and cause the recognition rate to drop. In contrast, in the images 1003, obtained by normalizing the original images 1001 with the contour feature moment normalization method, these deviations are reduced.
Prior art literature
Patent documentation
Patent documentation 1: Japanese Unexamined Patent Publication 2010-108113 publication
Non-patent literature
Non-patent literature 1: Mohammed Cheriet, Nawwaf Kharma, Cheng-Lin Liu, and Ching Suen, "Character Recognition Systems: A Guide for Students and Practitioners", Wiley-Interscience, 2007.
Non-patent literature 2: Ken'ichiro Ishii, Naonori Ueda, Eisaku Maeda, and Hiroshi Murase, "Pattern Recognition" (わかりやすいパターン認識), Ohmsha, August 1998.
Non-patent literature 3: Cheng-Lin Liu, Kazuki Nakashima, Hiroshi Sako, and Hiromichi Fujisawa, "Handwritten digit recognition: investigation of normalization and feature extraction techniques", Pattern Recognition, Vol. 37, No. 2, pp. 265-279, 2004.
Non-patent literature 4: Toshinori Miyoshi, Takeshi Nagasaki, and Hiroshi Shinjo, "Character Normalization Methods using Moments of Gradient Features and Normalization Cooperated Feature Extraction", Proceedings of the 2009 Chinese Conference on Pattern Recognition and the First CJK Joint Workshop on Pattern Recognition, pp. 934-938, 2009.
Non-patent literature 5: Toshinori Miyoshi, Takeshi Nagasaki, and Hiroshi Shinjo, "Character normalization method using moments of gradient features", IEICE Technical Report, PRMU, Pattern Recognition and Media Understanding 108 (432), pp. 187-192, 2009.
The contour feature moment normalization method extracts the character contour and performs normalization based on the moment values of the extracted contour portion. The method is effective in suppressing deviations in character stroke thickness and length, and is particularly effective for recognizing printed characters. However, in handwritten characters and in some printed characters, part of the character contour is sometimes lost.
Fig. 13 is an explanatory diagram of examples of character images in which part of the contour is lost. Character images 1301 and 1302 shown in Fig. 13 are both handwritten images of the same character. Character images 1303 and 1304 are the contour images extracted from character images 1301 and 1302, respectively. In character image 1302, part of the contour has disappeared due to abrasion of the character. In such cases, the contour feature moments become unstable.
Summary of the invention
The following is a representative example of the present invention: a character recognition device comprising an arithmetic unit containing a processor and a storage device, an input device connected to the arithmetic unit, and an output device connected to the arithmetic unit. The arithmetic unit performs the following steps: a first step of performing, on an input image input through the input device or stored in the storage device, preprocessing for reducing the interference factors that hinder character recognition; a second step of normalizing the preprocessed image; a third step of transforming the normalized image into a vector value in a vector space; a fourth step of determining, based on a recognition dictionary stored in the storage device, which character the vector value represents; and a fifth step of outputting the determination result through the output device. The second step contains: a sixth step of extracting the contour of the preprocessed image; a seventh step of compositing the preprocessed image with the extracted contour image; an eighth step of generating a mapping from the composited image to a normalized image of a given size such that the centroid of the composited image approaches the center of the given-size range and the range over which the composited image's pixels extend approaches the given-size range; and a ninth step of normalizing the preprocessed image according to the generated mapping.
Effects of the invention
According to an embodiment of the present invention, normalizing based on a composite of the character contour image and the original image reduces the instability of normalization when the character contour is lost, and can improve the recognition rate for both printed and handwritten characters.
Brief description of the drawings
Fig. 1 is a block diagram showing an example of the hardware configuration of a character recognition device according to an embodiment of the present invention.
Fig. 2 is a flowchart of the processing performed by a conventional character recognition device.
Fig. 3 is a flowchart outlining the character recognition processing performed by the arithmetic unit according to an embodiment of the present invention.
Fig. 4 is an explanatory diagram of the recognition processing performed by the arithmetic unit according to an embodiment of the present invention.
Fig. 5 is an explanatory diagram of examples of images generated by the moment normalization method and the bi-moment normalization method.
Fig. 6 is an explanatory diagram of different fonts of the same character.
Fig. 7 is an explanatory diagram of the centroid and boundary of a character image determined by the moment normalization method.
Fig. 8 is an explanatory diagram of the first example of the character contour extraction method used by the arithmetic unit according to an embodiment of the present invention.
Fig. 9 is an explanatory diagram of the pixels referenced to extract the contour of a character image in an embodiment of the present invention.
Fig. 10 is an explanatory diagram of images normalized by the moment normalization method and the contour feature moment normalization method.
Fig. 11 is an explanatory diagram of the filters used to extract the contour of a character image in an embodiment of the present invention.
Fig. 12 is an explanatory diagram of examples of contour images extracted by the arithmetic unit according to an embodiment of the present invention.
Fig. 13 is an explanatory diagram of examples of characters in which part of the contour is lost.
Symbol description
101 input device
102 arithmetic unit
103, 214 recognition dictionary
104 display device
105 graphics DB
201 character image input unit
202 preprocessing unit
203, 301 normalization unit
204, 302 character contour extraction unit
206, 304 moment calculation unit
207 normalized-mapping generation unit
208 normalized-image generation unit
209 feature extraction unit
210 recognition unit
211 output unit
212 character image DB
213 recognition dictionary learning unit
303 composite-image generation unit
Detailed description of the invention
Fig. 1 is a block diagram showing an example of the hardware configuration of a character recognition device according to an embodiment of the present invention.
The character recognition device of the present invention has an input device 101, an arithmetic unit 102, a recognition dictionary 103, a display device 104, and a graphics database (DB) 105.
The input device 101 includes devices such as a keyboard or mouse for entering commands and a scanner for inputting images.
The arithmetic unit 102 reads the input image and determines the character in the input image. The arithmetic unit 102 has a CPU (Central Processing Unit), memory, a storage device, and so on.
The recognition dictionary 103 is a dictionary database holding the recognition dictionary.
The display device 104 outputs the processing content of the arithmetic unit 102 and is, for example, a monitor. The display device 104 may be omitted when the processing content need not be displayed, or may be replaced with an output device other than a display as required.
The graphics DB 105 stores the images input through the input device 101.
The recognition dictionary 103 and the graphics DB 105 may also be stored in the storage device within the arithmetic unit 102.
The arithmetic unit 102 of the embodiment of the present invention has a character recognition unit. Specifically, the character recognition unit is realized, for example, by the CPU in the arithmetic unit 102 executing a program stored in the memory or the storage device.
Next, the processing flow in the embodiment of the present invention is described.
Fig. 3 is a flowchart outlining the character recognition processing performed by the arithmetic unit 102 according to an embodiment of the present invention.
The character image input unit 201, preprocessing unit 202, normalization unit 301, feature extraction unit 209, recognition unit 210, output unit 211, and recognition dictionary learning unit 213 shown in Fig. 3 are functions realized by the arithmetic unit 102 (that is, by the CPU executing programs stored in the memory or the like); in other words, each corresponds to a processing step performed by the arithmetic unit 102. The same holds for the character contour extraction unit 302, composite-image generation unit 303, moment calculation unit 304, normalized-mapping generation unit 207, and normalized-image generation unit 208 contained in the normalization unit 301.
The character recognition device reads the input image, determines the character in the input image, and outputs the determination result. As already described, Fig. 2 is the flowchart of conventional character recognition processing using the contour feature moment normalization method. Within the character recognition processing performed by the character recognition device of this embodiment, the processing of the character contour extraction unit 302 and the composite-image generation unit 303 in the normalization unit 301 differs from the conventional processing.
In the character image input unit 201, the image to be recognized is input by a user or by a program executed by the arithmetic unit 102. For example, a scanner in the input device 101 reads a document, and the arithmetic unit 102 stores the resulting character image data in the memory or the storage device. When character image data is already stored in the storage device or the like, it may also be used as the recognition target.
The preprocessing unit 202 applies denoising, blur processing, and the like to the input image, reducing interference factors such as noise or blur that hinder determination of the character in the image. For example, isolated points no larger than a certain threshold size are removed by denoising. The preprocessed input image may be stored temporarily in the storage device.
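As one possible realization of the isolated-point removal mentioned above (a sketch under the assumption that black pixels have value 1, not the patent's specified algorithm):

```python
import numpy as np
from scipy import ndimage

def remove_small_components(binary_image, min_size=5):
    """Erase isolated blobs of black (value 1) pixels smaller than min_size."""
    labeled, num = ndimage.label(binary_image)
    sizes = ndimage.sum(binary_image, labeled, range(1, num + 1))
    cleaned = binary_image.copy()
    for label_id, size in enumerate(sizes, start=1):
        if size < min_size:
            cleaned[labeled == label_id] = 0
    return cleaned
```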
The normalization unit 301 transforms each preprocessed input image into an image of a prespecified fixed size, called the normalized image. One main purpose of normalization is to unify the subsequent processing by transforming input images of various sizes into images of a fixed size. Another main purpose is to transform variously shaped input images into fixed-size images so that the shape deviations between images of the same character become small. This reduces the deviations between character images of the same character class and contributes to improved recognition accuracy. Details are given later. The normalized image generated by the normalization unit 301 may be stored temporarily in the storage device.
The feature extraction unit 209 receives the normalized image generated by the normalization unit 301 as input and transforms it into a vector value in a vector space. The target vector space is called the feature space, and the transformed vector value is called the feature vector. The dimensionality of the feature space is sometimes reduced by dimension compression. In that case, components that contribute little to recognition are removed from the feature space as far as possible, and the feature vector is represented as a feature vector in a lower-dimensional feature space.
The recognition unit 210 determines, using the recognition dictionary 214, the character class to which the feature vector belongs. The recognition dictionary 214 holds information for dividing the feature space into the regions occupied by the respective character classes. The character class corresponding to the region containing the feature vector is thereby returned as the determination result.
Fig. 4 is an explanatory diagram of the recognition processing performed by the arithmetic unit 102 according to an embodiment of the present invention.
As an example, Fig. 4 shows the regions 402A, 402B, and 402C occupied by class A, class B, and class C in the feature space 401. Each class corresponds to one character. In this example, the unknown input 403 (the feature vector of the input normalized image) is not contained in the region of any class. In this case, the recognition unit 210 may judge that the unknown input belongs to class A, which corresponds to the region 402A closest to the unknown input 403. Alternatively, the recognition unit 210 may judge that the unknown input 403 belongs to no class and reject it. The recognition unit 210 outputs the result of the judgment (for example, "class A" or "reject").
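A minimal sketch of the kind of judgment Fig. 4 illustrates, assuming a nearest-mean rule with a rejection distance; the patent leaves the dictionary's actual decision rule to non-patent literature 1 and 2:

```python
import numpy as np

def classify(feature_vector, class_means, reject_distance):
    """Return the label of the nearest class mean, or None ("reject")
    when the unknown input is too far from every class region."""
    best_label, best_dist = None, float("inf")
    for label, mean in class_means.items():
        dist = np.linalg.norm(feature_vector - mean)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= reject_distance else None
```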
Referring again to Fig. 3, the output unit 211 outputs the determination result of the recognition unit 210 to the display device 104, such as a monitor, or to the storage device or the like.
Next, before moving on to the processing of the normalization unit 301 of the present invention, the processing of the normalization unit 203 of the contour feature moment normalization method is described.
Let the original image f(x, y) that has passed through the preprocessing unit 202 and is input to the character contour extraction unit 204 have width W0 and height H0. Here 0 ≤ x < W0 and 0 ≤ y < H0, the x and y of each mesh point take integer values, and the pixel value of the mesh point in the k1-th column from the left and the k2-th row from the bottom is written f(k1-1, k2-1). The example of normalizing this original image to an image of width L and height L is described below.
When the contour feature moment normalization method is used, the character contour extraction unit 204 first extracts the contour image fc(x, y) of the character in the original image f(x, y). Two examples of contour extraction methods are given below.
The first example of the character contour extraction method is as follows. First, the horizontal contour component fx(x, y) and the vertical contour component fy(x, y) are extracted from the character image f(x, y).
Fig. 8 is an explanatory diagram of the first example of the character contour extraction method used by the arithmetic unit according to an embodiment of the present invention.
Fig. 8 shows, as an example, an input image 801, a contour image 802, a horizontal contour image 803, and a vertical contour image 804. Here, the input image 801 is an image of the character "B", and the contour image 802, the horizontal contour image 803, and the vertical contour image 804 are all examples of contour images extracted from the input image 801. The input image 801 corresponds to f(x, y), the horizontal contour image 803 to fx(x, y), and the vertical contour image 804 to fy(x, y).
First, the character contour extraction unit 204 sets fx(x, y) = 0 and fy(x, y) = 0. Next, the character contour extraction unit 204 selects the mesh points of the input image f(x, y) in order and extracts the contour-direction feature at each mesh point. Directions are counted in the two directions, horizontal and vertical. When the pixel at the mesh point of interest is a black pixel, that is, when p = f(x, y) = 1, the character contour extraction unit 204 extracts the feature with formulas (1) to (3) from the information of the pixels arranged around the pixel p as shown in Fig. 9.
[formula 1]
[formula 2]
[formula 3]
Fig. 9 is an explanatory diagram of the pixels referenced to extract the contour of a character image in an embodiment of the present invention.
Specifically, Fig. 9 shows the positional relationship 901 between the pixel p of a given mesh point and the pixels d1 to d7 of the adjoining mesh points around it. For example, when the coordinates of the mesh point of pixel p are (x, y), the mesh point coordinates of pixels d1, d2, d3, d4, d5, d6, and d7 are (x+1, y+1), (x, y+1), (x-1, y+1), (x-1, y), (x-1, y-1), (x, y-1), and (x+1, y-1), respectively.
The horizontal contour component fx(x, y) and the vertical contour component fy(x, y) are generated in this way. The contour image fc(x, y) is obtained by computing fc(x, y) = fx(x, y) + fy(x, y) (here "=" denotes substitution).
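Formulas (1) to (3) are not reproduced in this text, so the following sketch substitutes one common contour definition (a black pixel with at least one white pixel among its eight neighbors); this is an assumption standing in for the patent's exact rule. Arrays are indexed [y, x], whereas the patent writes f(x, y):

```python
import numpy as np

def contour_simple(f):
    """Stand-in contour extraction: mark black pixels (value 1) that
    touch at least one white pixel in their 8-neighborhood."""
    h, w = f.shape
    fc = np.zeros_like(f)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # The 3x3 block sums to 9 only when all neighbors are black.
            if f[y, x] == 1 and f[y - 1:y + 2, x - 1:x + 2].sum() < 9:
                fc[y, x] = 1
    return fc
```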
The second example of the character contour extraction method is as follows. First, the character contour extraction unit 204 sets fx(x, y) = 0 and fy(x, y) = 0. Next, the character contour extraction unit 204 selects the mesh points of the input image f(x, y) in order and extracts the feature at each mesh point. Directions are counted in the two directions, horizontal and vertical. The character contour extraction unit 204 extracts the feature at mesh point (x, y) with formula (4).
[formula 4]
fx(x, y) = f(x+1, y-1) + 2f(x+1, y) + f(x+1, y+1) - f(x-1, y-1) - 2f(x-1, y) - f(x-1, y+1),
fy(x, y) = f(x-1, y+1) + 2f(x, y+1) + f(x+1, y+1) - f(x-1, y-1) - 2f(x, y-1) - f(x+1, y-1) ... (4)
Fig. 11 is an explanatory diagram of the filters used to extract the contour of a character image in an embodiment of the present invention. The filter 1101 of Fig. 11 corresponds to the computation of fy(x, y) in formula (4), and the filter 1102 corresponds to the computation of fx(x, y).
The contour image fc(x, y) is obtained by computing fc(x, y) = fx(x, y) + fy(x, y) (here "=" denotes substitution).
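A sketch of the second extraction example, implementing formula (4) directly (the filters 1101 and 1102 of Fig. 11); arrays are indexed [y, x], whereas the patent writes f(x, y):

```python
import numpy as np

def contour_sobel(f):
    """Contour image fc = fx + fy per formula (4); f is a 2-D array."""
    h, w = f.shape
    fx = np.zeros((h, w), dtype=float)
    fy = np.zeros((h, w), dtype=float)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            fx[y, x] = (f[y - 1, x + 1] + 2 * f[y, x + 1] + f[y + 1, x + 1]
                        - f[y - 1, x - 1] - 2 * f[y, x - 1] - f[y + 1, x - 1])
            fy[y, x] = (f[y + 1, x - 1] + 2 * f[y + 1, x] + f[y + 1, x + 1]
                        - f[y - 1, x - 1] - 2 * f[y - 1, x] - f[y - 1, x + 1])
    return fx + fy
```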
Fig. 12 is an explanatory diagram of examples of contour images extracted by the arithmetic unit 102 according to an embodiment of the present invention. For example, the contour images 1202 are extracted from the original images 1201 of handwritten characters.
Next, the moment calculation unit 206 calculates the moment values of the contour image fc(x, y). Here, the centroid (xc, yc) shown in formula (6) is computed with formula (5), and the values δx and δy shown in formula (8) are computed with formula (7). δx and δy are parameters representing the range over which the pixels of the original image extend, and are used to determine the boundary of the original image described later.
[formula 5]
mpq = Σx Σy x^p y^q fc(x, y) ... (5)
[formula 6]
xc = m10/m00, yc = m01/m00 ... (6)
[formula 7]
μ20 = Σx Σy (x - xc)^2 fc(x, y), μ02 = Σx Σy (y - yc)^2 fc(x, y) ... (7)
[formula 8]
δx = α √(μ20/m00), δy = α √(μ02/m00) ... (8)
Next, the normalized-mapping generation unit 207 generates the mapping for drawing the original image into the normalization plane [0, L] × [0, L]. In the contour feature moment normalization method, the normalized image is generated by expanding or shrinking to size L × L the region of horizontal width δx and vertical width δy centered on the centroid (xc, yc) calculated by the moment calculation unit 206. That is, the portion [xc - δx/2, xc + δx/2] × [yc - δy/2, yc + δy/2] of the original image is mapped to the normalization plane [0, L] × [0, L]. The mapping used is given by formula (9).
[formula 9]
u(x) = L(x - xc)/δx + L/2, v(y) = L(y - yc)/δy + L/2 ... (9)
Next, the normalized-image generation unit 208 generates the normalized image f′(x′, y′) using the relation of formula (10). In this example, as described above, the normalized image is obtained by expanding or shrinking the portion [xc - δx/2, xc + δx/2] × [yc - δy/2, yc + δy/2] of the original image to the size L × L.
[formula 10]
f′(x′, y′) = f(x, y), where x′ = u(x), y′ = v(y) ... (10)
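Putting formulas (5) through (10) together, a minimal sketch of contour feature moment normalization; the value of α from formula (8) and the nearest-neighbor forward mapping (which can leave unfilled pixels) are simplifying assumptions made here:

```python
import numpy as np

def moment_normalize(f, fc, L, alpha=2.0):
    """Map original image f to an L x L plane using the moments of the
    contour image fc (formulas (5)-(10)); arrays are indexed [y, x]."""
    ys, xs = np.mgrid[0:f.shape[0], 0:f.shape[1]]
    m00 = fc.sum()
    xc = (xs * fc).sum() / m00                 # centroid, formula (6)
    yc = (ys * fc).sum() / m00
    mu20 = ((xs - xc) ** 2 * fc).sum()         # second moments, formula (7)
    mu02 = ((ys - yc) ** 2 * fc).sum()
    dx = alpha * np.sqrt(mu20 / m00)           # pixel range, formula (8)
    dy = alpha * np.sqrt(mu02 / m00)
    out = np.zeros((L, L), dtype=f.dtype)
    for y in range(f.shape[0]):
        for x in range(f.shape[1]):
            u = int(round(L * (x - xc) / dx + L / 2))   # formula (9)
            v = int(round(L * (y - yc) / dy + L / 2))
            if 0 <= u < L and 0 <= v < L:
                out[v, u] = f[y, x]                     # formula (10)
    return out
```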
As described above, the contour feature moment normalization method extracts the contour image fc(x, y) from the original image f(x, y) and determines the centroid and boundary of the character image from the moments of the contour image fc(x, y).
When the moment normalization method is used as in the past, that is, a normalization method based on the moment values of the original image itself, the mapping from the original image to the normalized image is generated so that the centroid of the original image's pixels approaches the center of the normalized image's range and the range over which the original image's pixels extend approaches the range of the normalized image.
Fig. 7 is an explanatory diagram of the centroid and boundary of a character image determined by the moment normalization method.
Specifically, Fig. 7 shows preprocessed images (the original images in the description above) 701 and character images 702, where each character image 702 includes a display of the centroid and boundary determined from the preprocessed image. For example, the centroid 703A and the boundary 704A are determined from the original image 701A corresponding to the character class "0". Here, the boundary 704A is the boundary, within the original image 701A, between the region showing the character of class "0" and the region outside it; in other words, it corresponds to the range over which the pixels of the character of class "0" extend. When the moment normalization method is used, the second moment values δx and δy calculated with formula (8) are used as parameters representing the range of the character's pixels, and the boundary 704A is defined by the region of horizontal width δx and vertical width δy centered on the centroid 703A.
By normalizing with a mapping generated in this way, even if the input character images vary in size and shape, suppression of the deviation of the feature quantities of the normalized character images can be expected as long as they are images of the same character class.
However, when the moment normalization method described above is used, deviations of the normalized image such as those in the normalized images 1002 of Fig. 10 readily arise from variation in the stroke thickness of the input character image. This is because the position of the centroid of the original image's pixels varies with, among other factors, the stroke thickness of the character, so the moment values become unstable and the generated mapping changes accordingly.
In contrast, when the contour feature moment normalization method is used (that is, normalization based on the moment values of the contour image), the mapping from the original image to the normalized image is generated so that the centroid of the pixels of the original image's contour approaches the center of the normalized image's range and the range over which those contour pixels extend approaches the range of the normalized image. In this case, since the pixels of the non-contour parts of the original image are removed, the position of the centroid of the contour pixels is not easily affected by the stroke thickness of the character. The moment values and the generated mapping are therefore stable regardless of stroke thickness, and as the normalized images 1003 of Fig. 10 show, deviations of the normalized image are hard to produce.
However, as in the examples of the character images 1302 and 1304 of Fig. 13, when the structure of the character contour is lost, part of the contour cannot be extracted. Because the loss of part of the contour shifts the centroid of the contour pixels, the calculated moment values become unstable, and the deviation of the generated normalized images grows within the same character class. Such deviations appear as deviations of the vector points in the feature space after feature extraction and cause the recognition rate to drop.
Next, the normalization performed by the normalization unit 301 of the embodiment of the present invention is described.
The character contour extraction unit 302 may extract the character contour image fc(x, y) with the same methods as the character contour extraction unit 204 (for example, the first or second example above), or with other methods. Here, a third and a fourth example are described as other methods of extracting the character contour.
The third example is described first. First, the character contour extraction unit 302 sets g0(p) = g1(p) = ... = g7(p) = 0 for every white pixel p = (x, y). Next, for every black pixel p = (x, y), the character contour extraction unit 302 computes g0(p), g1(p), ..., g7(p) with formula (11).
[formula 11]
gk+1(p) = 1 if f(dk) = 0 and f(dk+1) = 1, and 0 otherwise;
g(k+2)%8(p) = 1 if f(dk) = f(dk+1) = 0 and f(d(k+2)%8) = 1, and 0 otherwise ... (11)
As shown in Fig. 9, d0, d1, ..., d7 are the pixels neighboring the pixel p. The character contour extraction unit 302 generates the contour image fc(x, y) as fc(x, y) = Σ gk(x, y), where the sum is computed over the range k = 0, 1, ..., 7.
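A sketch of the third example per formula (11); the neighbor order d0 to d7 follows Fig. 9, with d0 taken here as (x+1, y), which is an assumption since Fig. 9 is not reproduced:

```python
import numpy as np

# Neighbor offsets (dx, dy) for d0..d7, counter-clockwise from the right.
OFFSETS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]

def contour_gk(f):
    """Contour image fc(x, y) = sum over k of g_k(x, y), formula (11)."""
    h, w = f.shape
    g = np.zeros((8, h, w), dtype=int)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if f[y, x] != 1:        # g_k stays 0 for white pixels
                continue
            d = [f[y + oy, x + ox] for ox, oy in OFFSETS]
            for k in range(8):
                if d[k] == 0 and d[(k + 1) % 8] == 1:
                    g[(k + 1) % 8, y, x] = 1
                if d[k] == 0 and d[(k + 1) % 8] == 0 and d[(k + 2) % 8] == 1:
                    g[(k + 2) % 8, y, x] = 1
    return g.sum(axis=0)
```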
The fourth example is described next. First, the character contour extraction unit 302 computes gx(p) and gy(p) with formula (12) for every pixel p = (x, y). Next, the character contour extraction unit 302 generates the contour image fc(x, y) with formula (13). Here, as shown in Fig. 9, d0, d1, ..., d7 are the pixels neighboring the pixel p.
[formula 12]
gx(p) = [f(d1) + 2f(d0) + f(d7) - f(d3) - 2f(d4) - f(d5)] / 8,
gy(p) = [f(d1) + 2f(d2) + f(d3) - f(d5) - 2f(d6) - f(d7)] / 8 ... (12)
[formula 13]
fc(x, y) = √(gx^2(x, y) + gy^2(x, y)) ... (13)
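A sketch of the fourth example per formulas (12) and (13), with the same neighbor layout assumed as above:

```python
import numpy as np

def contour_gradient(f):
    """Contour image fc = sqrt(gx^2 + gy^2), formulas (12) and (13)."""
    h, w = f.shape
    fc = np.zeros((h, w), dtype=float)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (f[y + 1, x + 1] + 2 * f[y, x + 1] + f[y - 1, x + 1]
                  - f[y + 1, x - 1] - 2 * f[y, x - 1] - f[y - 1, x - 1]) / 8.0
            gy = (f[y + 1, x + 1] + 2 * f[y + 1, x] + f[y + 1, x - 1]
                  - f[y - 1, x - 1] - 2 * f[y - 1, x] - f[y - 1, x + 1]) / 8.0
            fc[y, x] = np.hypot(gx, gy)
    return fc
```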
The first through fourth examples above are examples of methods of extracting the contour of a character image, and the character contour extraction unit 302 may also extract the contour of a character image by methods other than those illustrated. As described above, the contour of a character image can be extracted by, for example, increasing the pixel value of the contour image at a mesh point of the original image when the pixel values of the surrounding mesh points satisfy a prescribed condition (corresponding to the first and third examples above), or by computing the pixel value of the contour image at a mesh point as the total of the pixel values of the surrounding mesh points multiplied by prescribed coefficients (corresponding to the second and fourth examples above).
This concludes the description of the character contour extraction unit 302; next, the processing from the composite-image generation unit 303 onward is described in detail. The composite-image generation unit 303 uses formula (14) to generate the composite image fs(x, y) at each mesh point from the character contour image fc(x, y) at each mesh point generated by the character contour extraction unit 302 and the original image f(x, y) at each mesh point output from the preprocessing unit 202.
[formula 14]
fs(x, y) = γ1 f(x, y) + γ2 fc(x, y) ... (14)
Here, γ1 and γ2 are positive numbers satisfying γ1 + γ2 = 1. The composite image is an image in which the contour portion of the original image is emphasized; in other words, it corresponds to an image weighted so that the pixel values of the contour portion of the original image are larger than the pixel values of the other parts.
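A sketch of formula (14); the weights below (γ1 = 0.4, γ2 = 0.6) are only an illustration of the γ1 < γ2 choice discussed at the end of the description:

```python
def composite_image(f, fc, gamma1=0.4, gamma2=0.6):
    """fs = gamma1 * f + gamma2 * fc (formula (14)); gamma1 + gamma2 = 1,
    so the contour portion of the character is weighted more heavily."""
    assert abs(gamma1 + gamma2 - 1.0) < 1e-9
    return gamma1 * f + gamma2 * fc
```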
The moment calculation unit 304 calculates the moment values with fs(x, y) in place of fc(x, y). That is, the moment calculation unit 304 uses formula (15) in place of formula (5) to compute the centroid (xc, yc) shown in formula (6) and the values δx and δy shown in formula (8).
[formula 15]
mpq = Σx Σy x^p y^q fs(x, y) ... (15)
Next, the normalized-mapping generation unit 207 of the present invention generates the normalized mapping based on the moment values calculated with formula (15) and the like, and the normalized-image generation unit 208 of the present invention generates the normalized image with the generated normalized mapping (formula (10)).
In the embodiment of the present invention described above, the range over which the pixels of the composite image extend (that is, the boundary) is determined based on the second moment values δx and δy of the composite image. This range does not necessarily coincide with the outline of the composite image's pixels. However, the method of determining the range based on the above moment values is only one example; in the present invention, the range over which the pixels of a character image extend may also be determined by other methods. For example, instead of calculating moment values in the moment calculation unit 304, the arithmetic unit 102 may determine the rectangular range circumscribing the outline of the composite image's pixels as the range over which the composite image's pixels extend.
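A sketch of the circumscribed-rectangle alternative mentioned above, assuming that the extent of the composite image's nonzero pixels defines the range:

```python
import numpy as np

def bounding_range(fs):
    """Circumscribed rectangle of the composite image's nonzero pixels,
    returned as (x_min, x_max, y_min, y_max); fs must not be all zero."""
    ys, xs = np.nonzero(fs)
    return xs.min(), xs.max(), ys.min(), ys.max()
```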
The description so far concerns a character recognition device, but the character recognition device may also be used as a recognition dictionary generation device. In that case, the storage device of the arithmetic unit 102 holds the character image DB 212 (Fig. 3), and the preprocessing unit 202 preprocesses the character images stored in the character image DB 212. The processing of the normalization unit 301 and the feature extraction unit 209 is the same as in the character recognition device described above. The recognition dictionary learning unit 213 learns the recognition dictionary based on the feature quantities extracted by the feature extraction unit 209 and stores the result in the recognition dictionary 214 (corresponding to the recognition dictionary 103 of Fig. 1). Like the normalization unit 301 and the other units, the recognition dictionary learning unit 213 is a function realized by the arithmetic unit 102.
As described above, the embodiment of the present invention performs normalization based on the moment values of the composite of the original image and the contour image. That is, the moment values of the composite image are calculated, and the mapping from the original image to the normalized image is generated based on them. Through compositing, the pixel values of the contour portion of the character image become larger than the pixel values of the other parts. As a result, compared with normalization based on the moment values of the original image itself, the added weight on the contour pixels mitigates the influence of stroke thickness; and compared with normalization based on the moment values of the contour image alone, the use of the pixels of the non-contour parts as well mitigates the influence of contour loss. Thus, according to this embodiment, stable normalization is achieved against both stroke-thickness variation and contour loss, which can improve the recognition rate for printed and handwritten characters.
To maximize the above effects, it is desirable to optimize the coefficients γ1 and γ2. The optimal values of γ1 and γ2 may depend on various conditions such as the contour extraction method, but the values must be chosen so that the pixel values of the contour portion of the character in the composited image are larger than the pixel values of the other parts. For example, the composite-image generation unit 303 of this embodiment may use γ1 and γ2 satisfying γ1 < γ2.

Claims (12)

1. A character recognition device, characterised by having:
an arithmetic unit containing a processor and a storage device;
an input device connected to the above-mentioned arithmetic unit; and
an output device connected to the above-mentioned arithmetic unit,
the above-mentioned arithmetic unit performing the following steps:
a first step of performing, on an input image input through the above-mentioned input device or an input image stored in the above-mentioned storage device, preprocessing for reducing interference factors that hinder character recognition;
a second step of normalizing the image on which the above-mentioned preprocessing has been performed;
a third step of transforming the image after the above-mentioned normalization into a vector value in a vector space;
a fourth step of determining, based on a recognition dictionary stored in the above-mentioned storage device, which character the above-mentioned vector value is; and
a fifth step of outputting the result of the above-mentioned determination through the above-mentioned output device,
the above-mentioned second step containing:
a sixth step of extracting the contour of the image on which the above-mentioned preprocessing has been performed;
a seventh step of compositing the image on which the above-mentioned preprocessing has been performed with the image of the above-mentioned extracted contour;
an eighth step of generating a mapping from the above-mentioned composited image to a normalized image of a given size such that the centroid of the above-mentioned composited image approaches the center of the range of the given size and the range over which the pixels of the above-mentioned composited image extend approaches the range of the above-mentioned given size; and
a ninth step of normalizing, according to the above-mentioned generated mapping, the image on which the above-mentioned preprocessing has been performed,
the above-mentioned seventh step having a step of calculating the pixel value of the above-mentioned composited image at each mesh point by adding the pixel value, at each mesh point, of the image on which the above-mentioned preprocessing has been performed multiplied by a first coefficient to the pixel value of the image of the above-mentioned contour at each mesh point multiplied by a second coefficient.
2. The character recognition device as claimed in claim 1, characterised in that
the above-mentioned second step further contains a step of calculating the moment values of the above-mentioned composited image as parameters representing the range over which the pixels of the above-mentioned composited image extend, and
the above-mentioned eighth step contains a step of generating a mapping that expands or shrinks the above-mentioned composited image according to the above-mentioned moment values.
3. The character recognition device as claimed in claim 1, characterised in that
the above-mentioned sixth step contains a step of increasing the pixel value of the contour image at each mesh point when the pixel values of the mesh points around that mesh point of the image on which the above-mentioned preprocessing has been performed satisfy a prescribed condition.
4. The character recognition device as claimed in claim 1, characterised in that
the above-mentioned sixth step contains a step of calculating the pixel value of the contour image at each mesh point by totaling the values obtained by multiplying the pixel values of the mesh points around each mesh point of the image on which the above-mentioned preprocessing has been performed by prescribed coefficients.
5. A recognition dictionary generation device, characterised by having an arithmetic unit, this arithmetic unit containing:
a processor; and
a storage device connected to the above-mentioned processor and storing character images,
the above-mentioned arithmetic unit performing the following steps:
a first step of performing, on a character image stored in the above-mentioned storage device, preprocessing for reducing interference factors that hinder character recognition;
a second step of normalizing the image on which the above-mentioned preprocessing has been performed;
a third step of transforming the image after the above-mentioned normalization into a vector value in a vector space;
a fourth step of learning a recognition dictionary used in character recognition based on the above-mentioned vector value; and
a fifth step of storing the result of the above-mentioned learning in the above-mentioned storage device,
the above-mentioned second step containing:
a sixth step of extracting the contour of the image on which the above-mentioned preprocessing has been performed;
a seventh step of compositing the image on which the above-mentioned preprocessing has been performed with the image of the above-mentioned extracted contour;
an eighth step of generating a mapping from the above-mentioned composited image to a normalized image of a given size such that the centroid of the above-mentioned composited image approaches the center of the range of the given size and the range over which the pixels of the above-mentioned composited image extend approaches the range of the above-mentioned given size; and
a ninth step of normalizing, according to the above-mentioned generated mapping, the image on which the above-mentioned preprocessing has been performed,
the above-mentioned seventh step containing a step of calculating the pixel value of the above-mentioned composited image at each mesh point by adding the pixel value, at each mesh point, of the image on which the above-mentioned preprocessing has been performed multiplied by a first coefficient to the pixel value of the image of the above-mentioned contour at each mesh point multiplied by a second coefficient.
6. The recognition dictionary generation device as claimed in claim 5, characterised in that
the above-mentioned second step further contains a step of calculating the moment values of the above-mentioned composited image as parameters representing the range over which the pixels of the above-mentioned composited image extend, and
the above-mentioned eighth step contains a step of generating a mapping that expands or shrinks the above-mentioned composited image according to the above-mentioned moment values.
7. The recognition dictionary generation device as claimed in claim 5, characterised in that
the above-mentioned sixth step contains a step of increasing the pixel value of the contour image at each mesh point when the pixel values of the mesh points around that mesh point of the image on which the above-mentioned preprocessing has been performed satisfy a prescribed condition.
8. The recognition dictionary generation device as claimed in claim 5, characterised in that
the above-mentioned sixth step contains a step of calculating the pixel value of the contour image at each mesh point by totaling the values obtained by multiplying the pixel values of the mesh points around each mesh point of the image on which the above-mentioned preprocessing has been performed by prescribed coefficients.
9. A normalization method performed by an arithmetic unit containing a processor and a storage device connected to the above-mentioned processor,
this normalization method characterised by having:
a first step in which the above-mentioned arithmetic unit extracts the contour of an original image stored in the above-mentioned storage device;
a second step in which the above-mentioned arithmetic unit composites the above-mentioned original image with the image of the above-mentioned extracted contour;
a third step in which the above-mentioned arithmetic unit generates a mapping from the above-mentioned composited image to a normalized image of a given size such that the centroid of the above-mentioned composited image approaches the center of the range of the given size and the range over which the pixels of the above-mentioned composited image extend approaches the range of the above-mentioned given size; and
a fourth step in which the above-mentioned arithmetic unit normalizes the above-mentioned original image according to the above-mentioned generated mapping and stores the result in the above-mentioned storage device,
the above-mentioned second step containing a step of calculating the pixel value of the above-mentioned composited image at each mesh point by adding the pixel value of the above-mentioned original image at each mesh point multiplied by a first coefficient to the pixel value of the image of the above-mentioned contour at each mesh point multiplied by a second coefficient.
10. The normalization method as claimed in claim 9, characterised in that
the above-mentioned normalization method further has a step of calculating the moment values of the above-mentioned composited image as parameters representing the range over which the pixels of the above-mentioned composited image extend, and
the above-mentioned third step contains a step of generating a mapping that expands or shrinks the above-mentioned composited image according to the above-mentioned moment values.
11. The normalization method as claimed in claim 9, characterised in that
the above-mentioned first step contains a step of increasing the pixel value of the contour image at each mesh point when the pixel values of the mesh points around that mesh point of the above-mentioned original image satisfy a prescribed condition.
12. The normalization method as claimed in claim 9, characterised in that
the above-mentioned first step contains a step of calculating the pixel value of the contour image at each mesh point by totaling the values obtained by multiplying the pixel values of the mesh points around each mesh point of the above-mentioned original image by prescribed coefficients.
CN201310027353.6A 2012-03-13 2013-01-24 Character recognition device, recognition dictionary generation device and normalization method Active CN103310210B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2012055638A JP5769029B2 (en) 2012-03-13 2012-03-13 Character recognition device, recognition dictionary generation device, and normalization method
JP2012-055638 2012-03-13

Publications (2)

Publication Number Publication Date
CN103310210A CN103310210A (en) 2013-09-18
CN103310210B true CN103310210B (en) 2016-06-29

Family

ID=49135406

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310027353.6A Active CN103310210B (en) 2012-03-13 2013-01-24 Character recognition device, recognition dictionary generation device and normalization method

Country Status (2)

Country Link
JP (1) JP5769029B2 (en)
CN (1) CN103310210B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6170860B2 (en) * 2014-03-25 2017-07-26 株式会社日立情報通信エンジニアリング Character recognition device and identification function generation method
CN107274345A (en) * 2017-06-07 2017-10-20 众安信息技术服务有限公司 Method and device for synthesizing printed Chinese character images
CN107194378B (en) * 2017-06-28 2020-11-17 深圳大学 Face recognition method and device based on mixed dictionary learning
CN113569859B (en) * 2021-07-27 2023-07-04 北京奇艺世纪科技有限公司 Image processing method and device, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101286202A (en) * 2008-05-23 2008-10-15 中南民族大学 Multi-font, multi-size printed character recognition method based on the 'Yi' character set
US7965293B2 (en) * 2000-09-04 2011-06-21 Minolta Co., Ltd. Image processing device, image processing method, and image processing program for reconstructing data
CN102169542A (en) * 2010-02-25 2011-08-31 汉王科技股份有限公司 Method and device for segmenting touching characters in character recognition

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3052464B2 (en) * 1991-07-31 2000-06-12 日本ビクター株式会社 Contour point extraction method using multi-valued data
JP3301467B2 (en) * 1993-12-02 2002-07-15 日本電信電話株式会社 Image pattern identification and recognition method
DE69427677T2 (en) * 1993-12-02 2002-05-16 Nippon Telegraph & Telephone Image pattern identification / recognition method
JPH07160815A (en) * 1993-12-02 1995-06-23 Hitachi Eng Co Ltd Method and device for image binarization processing by contour emphasis
JP2002230481A (en) * 2001-01-30 2002-08-16 Oki Electric Ind Co Ltd Optical character reader
JP5268563B2 (en) * 2008-10-29 2013-08-21 日立コンピュータ機器株式会社 Character recognition device and recognition dictionary generation device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7965293B2 (en) * 2000-09-04 2011-06-21 Minolta Co., Ltd. Image processing device, image processing method, and image processing program for reconstructing data
CN101286202A (en) * 2008-05-23 2008-10-15 中南民族大学 Multi-font, multi-size printed character recognition method based on the 'Yi' character set
CN102169542A (en) * 2010-02-25 2011-08-31 汉王科技股份有限公司 Method and device for segmenting touching characters in character recognition

Also Published As

Publication number Publication date
CN103310210A (en) 2013-09-18
JP5769029B2 (en) 2015-08-26
JP2013190911A (en) 2013-09-26

Similar Documents

Publication Publication Date Title
US11227147B2 (en) Face image processing methods and apparatuses, and electronic devices
Mustafa et al. Binarization of document images: A comprehensive review
US20100329562A1 (en) Statistical Online Character Recognition
CN103310210B (en) Character recognition device, recognition dictionary generation device and normalization method
CN111242196B (en) Differential privacy protection method for interpretable deep learning
CN109740506A (en) House type image recognition method and device
CN107784288A (en) Iterative localization face detection method based on deep neural network
US20050152604A1 (en) Template matching method and target image area extraction apparatus
CN110751195B (en) Fine-grained image classification method based on improved YOLOv3
US20130027419A1 (en) Image processing device and method
EP3220315A1 (en) System, method, and program for predicting information
US20110158519A1 (en) Methods for Image Characterization and Image Search
Prasad et al. Polygonal representation of digital curves
US20200202514A1 (en) Image analyzing method and electrical device
CN110060260A (en) Image processing method and system
CN113392854A (en) Image texture feature extraction and classification method
CN115984662A (en) Multi-mode data pre-training and recognition method, device, equipment and medium
CN115131803A (en) Document word size identification method and device, computer equipment and storage medium
CN112801092B (en) Method for detecting character elements in natural scene image
CN114511862B (en) Form identification method and device and electronic equipment
JP5268563B2 (en) Character recognition device and recognition dictionary generation device
Sun et al. Contextual models for automatic building extraction in high resolution remote sensing image using object-based boosting method
Ukasha Arabic letters compression using new algorithm of trapezoid method
Ma et al. The study of binarization algorithm about digital rubbings image based on threshold segmentation and morphology
CN110472601B (en) Remote sensing image target object identification method, device and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
ASS Succession or assignment of patent right

Owner name: HITACHI INFORMATION COMMUNICATION ENGINEERING CO.,

Free format text: FORMER OWNER: HITACHI COMP PERIPHERALS CO.

Effective date: 20130924

C10 Entry into substantive examination
C41 Transfer of patent application or patent right or utility model
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20130924

Address after: Kanagawa

Applicant after: HITACHI INFORMATION AND TELECOMMUNICATION ENGINEERING, LTD.

Address before: Kanagawa

Applicant before: Hitachi Comp Peripherals Co.

C14 Grant of patent or utility model
GR01 Patent grant