Embodiment
In order to improve the efficiency and accuracy of character recognition, embodiments of the present invention provide a character recognition method and device.
The present invention is described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of a character recognition process provided by the present invention. The process includes the following steps:
S101: Binarize the picture containing the character to be recognized; identify the boundary of the character to be recognized in the binarized picture; according to the determined boundary, crop the region of the character to be recognized from the picture containing the character to be recognized; and binarize and normalize the cropped character region.
S102: In the normalized region of the character to be recognized, identify transitions of the pixel value; according to the position of the white pixel at each transition, assign 255 to the corresponding pixel position in a character edge information map, and assign another value to the positions of the other pixels, where the character edge information map is the same size as the normalized character region.
S103: Examine the pixel value of each pixel in the character edge information map; when a pixel whose value is 255 is found, calculate the gradient value of that pixel and determine the direction to which it belongs; assign that direction value to the corresponding position in an edge gradient array, and assign -1 to the other positions of the array.
S104: Match the edge gradient array against each saved template of each character, determine the matching distances, and take the character corresponding to the minimum matching distance as the recognition result.
Before character recognition, multiple templates are created for each character. Creating each template of each character likewise requires normalization, determination of the character edge information map, and determination of each value in the edge gradient array; that is, the template creation process follows the same steps as the character recognition process. After the values at the relevant positions of the edge gradient array corresponding to a template have been determined, the same method is used to determine the values at the relevant positions of the edge gradient array corresponding to the character to be recognized. The matching distance is determined by matching the edge gradient array of the character to be recognized against the templates, and the character is recognized according to the matching distance.
Because the present invention uses the gradient direction of each pixel in the character as the corresponding value in the edge gradient array, and the gradient direction is strongly resistant to interference, the character recognition method is robust. Moreover, during recognition the array is matched against every template of every character, and the character corresponding to the minimum matching distance is taken as the recognition result. This avoids the poor robustness of matching against a single template per character and broadens the applicability of the matching method.
The character recognition process of the present invention is described in detail below by way of specific embodiments.
To improve the accuracy and efficiency of character recognition, multiple templates are saved for each character; each template should be representative, and the templates should differ substantially from one another. When a character template is created and saved, the character region is normalized, and features are extracted from the normalized character region.
Fig. 2 shows the normalization process in character template creation provided by the present invention. The process includes the following steps:
S201: Binarize the sample picture containing character information.
The sample picture containing character information is generally a color picture. Before it is binarized, the color picture must first be converted into a grayscale picture, which is then binarized with a suitable binarization algorithm. In the present invention, the Otsu binarization algorithm may be used to binarize the grayscale picture.
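The grayscale conversion and Otsu binarization of step S201 can be sketched as follows. This is a minimal illustration rather than the implementation of the invention: the function names, the luminance weights, and the convention that pixels above the threshold become white (1) are assumptions, and the polarity would need to be inverted for dark characters on a light background.

```python
def to_gray(rgb):
    """Convert rows of (r, g, b) tuples to rows of integer gray values 0..255."""
    # The integer luminance weights are a common convention, assumed here.
    return [[(299 * r + 587 * g + 114 * b) // 1000 for (r, g, b) in row]
            for row in rgb]


def otsu_threshold(gray):
    """Return the threshold t that maximizes the between-class variance."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(v * n for v, n in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with gray value <= t
    sum0 = 0  # sum of the gray values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(gray):
    """Map pixels above the Otsu threshold to 1 (white), the rest to 0 (black)."""
    t = otsu_threshold(gray)
    return [[1 if v > t else 0 for v in row] for row in gray]
```

For example, `binarize(to_gray(picture))` would yield the 0/1 image that the boundary search of step S202 operates on, where `picture` is a hypothetical list of rows of RGB tuples.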
S202: Starting from the four borders of the binarized picture, search toward the interior of the picture; when a white pixel is found, determine that the white pixel lies on the boundary of the character; determine the boundary of the character from the positions of the white pixels found from each border direction.
In the present invention, to detect the region where the character lies in the binarized picture, scanning proceeds from each of the four borders of the binarized picture toward its interior, that is, from the top, bottom, left, and right. Specifically, the rows in which a white pixel is first encountered when scanning from the top and from the bottom are taken as the rows of the upper and lower boundaries of the character, and the columns in which a white pixel is first encountered when scanning from the left and from the right are taken as the columns of the left and right boundaries of the character.
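The four-direction scan described above can be sketched as follows, assuming the binarized picture is a list of rows in which 1 marks a white pixel and 0 a black pixel. The function names `character_bounds` and `crop` are illustrative and do not come from the source.

```python
def character_bounds(binary):
    """Return (top, bottom, left, right) of the character, or None if no white pixel exists."""
    height = len(binary)
    width = len(binary[0]) if height else 0

    def row_has_white(i):
        return any(binary[i][j] for j in range(width))

    def col_has_white(j):
        return any(binary[i][j] for i in range(height))

    # Scan downward from the top border.
    top = next((i for i in range(height) if row_has_white(i)), None)
    if top is None:
        return None
    # Scan upward from the bottom border, then inward from the left and right.
    bottom = next(i for i in range(height - 1, -1, -1) if row_has_white(i))
    left = next(j for j in range(width) if col_has_white(j))
    right = next(j for j in range(width - 1, -1, -1) if col_has_white(j))
    return top, bottom, left, right


def crop(image, bounds):
    """Cut the rectangle given by inclusive bounds out of any row-major image."""
    top, bottom, left, right = bounds
    return [row[left:right + 1] for row in image[top:bottom + 1]]
```

Because `crop` only slices rows and columns, the same bounds can be applied to the original color picture, as step S203 requires.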
S203: According to the determined boundary of the character, crop the character region from the sample picture containing character information.
After the rows and columns that form the character boundary have been determined, the character region can be cropped from the sample picture containing character information.
S204: Binarize the cropped character region, and normalize the binarized picture to a set size.
Because the character region is cropped from the color picture, the picture of the character region is still a color picture. The color picture of the character region is converted into a grayscale picture, and the grayscale picture is binarized with the Otsu binarization algorithm. The binarized picture is then normalized to the set size, which in the present invention may be, for example, 24 wide and 48 high. The converted grayscale picture is normalized to the same set size. A normalized binary image and a normalized grayscale image are thereby obtained.
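The source does not say how the character region is resampled to the set size. The sketch below assumes nearest-neighbor sampling, which keeps a binary image binary; `resize_nearest` is a hypothetical helper name, and the defaults of 24 and 48 follow the example size given above.

```python
def resize_nearest(image, width=24, height=48):
    """Resample a row-major image to width x height by nearest-neighbor lookup."""
    src_h = len(image)
    src_w = len(image[0])
    return [
        # Map each target pixel back to a source pixel by integer scaling.
        [image[i * src_h // height][j * src_w // width] for j in range(width)]
        for i in range(height)
    ]
```

Applying the same function to both the binarized region and the grayscale region gives two images of identical size, so that a position in one corresponds to the same position in the other.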
Fig. 3 shows the feature extraction process in character template creation provided by the present invention. The process includes the following steps:
S301: In the normalized character region, identify transitions of the pixel value.
The normalized character region is a binary character image with width W and height H. Transitions of the pixel value are identified in this binary character image, that is, cases where the pixel values of two adjacent pixels change from 1 to 0 or from 0 to 1.
S302: According to the position of the white pixel at each pixel value transition, assign 255 to the pixel at the corresponding position in the character edge information map, and assign another value elsewhere, where the character edge information map is the same size as the normalized character region.
The character edge information map is the same size as the normalized character region, that is, the same size as the binary character image: the number of rows, the number of columns, and the number of pixels are equal. To determine the pixel value of each pixel in the character edge information map, transitions of the pixel value in the binary character image are identified. According to the position of the white pixel at each transition, the pixel at the corresponding position in the character edge information map is assigned 255, and the other pixel positions in the character edge information map are assigned another value, for example 0.
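Steps S301 and S302 can be sketched as follows. The source does not specify which neighbors count as adjacent or how pixels on the image border are handled; this sketch assumes the four horizontal and vertical neighbors and ignores neighbors that fall outside the image. The name `edge_map` is illustrative.

```python
def edge_map(binary):
    """Build the character edge information map from a 0/1 binary character image.

    A white pixel (1) with at least one black (0) neighbor inside the image
    marks a transition and is assigned 255; every other position is assigned 0.
    """
    height = len(binary)
    width = len(binary[0])
    edges = [[0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            if binary[i][j] != 1:
                continue
            # Assumed adjacency: up, down, left, right.
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < height and 0 <= nj < width and binary[ni][nj] == 0:
                    edges[i][j] = 255
                    break
    return edges
```

In a filled 3 x 3 white square surrounded by black pixels, for instance, the eight outer white pixels receive 255 and the central pixel, which has no black neighbor, stays 0.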
S303: Examine the pixel value of each pixel in the character edge information map; when a pixel whose value is 255 is found, calculate the gradient value of that pixel and determine the direction to which it belongs.
S304: Assign to the position in the template corresponding to this pixel the value of the direction to which its gradient direction angle belongs, and assign -1 to the other positions.
In the present invention, a template is created with the same height and width as the character edge information map; the template can also be regarded as a two-dimensional array with the same height and width as the character edge information map. To determine the value at each position of the template, the character edge information map is scanned. When the scanned pixel has the other value, for example 0, the position in the template corresponding to that pixel is assigned -1. When the scanned pixel has the value 255, that is, when a white point is scanned, the gradient value of the pixel is calculated at the corresponding position in the normalized grayscale image according to the following formula:
Gradient=dy/dx
where dy = g(i, j+1) - g(i, j-1), dx = g(i+1, j) - g(i-1, j), g(i, j) is the gray value at the position corresponding to this pixel in the normalized grayscale image, i denotes the row of the pixel, j denotes the column of the pixel, and Gradient is the calculated gradient value of the pixel.
The angular range from 0 to 360 degrees is divided into 8 equal parts, each corresponding to one gradient direction, and the parts are labeled 1 to 8 respectively. The gradient direction angle of the pixel is calculated from its gradient value, and the direction to which the gradient direction angle belongs is determined from that angle.
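The gradient calculation and the 8-way quantization can be sketched as follows. Several details are assumptions rather than statements of the source: the angle is taken as atan2(dy, dx) so that it covers the full 0 to 360 degree range (the quotient dy/dx alone cannot distinguish opposite directions), direction 1 is taken to start at 0 degrees, neighbor indices are clamped at the image border, and a zero gradient falls into direction 1.

```python
import math


def gradient_direction(gray, i, j):
    """Return the direction label 1..8 for pixel (i, j) of the grayscale image."""
    height, width = len(gray), len(gray[0])

    def g(r, c):
        # Clamp indices at the border (assumption; the source is silent on this).
        r = min(max(r, 0), height - 1)
        c = min(max(c, 0), width - 1)
        return gray[r][c]

    dy = g(i, j + 1) - g(i, j - 1)
    dx = g(i + 1, j) - g(i - 1, j)
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    # Eight sectors of 45 degrees each, labeled 1 to 8.
    return int(angle // 45) % 8 + 1


def edge_gradient_array(edges, gray):
    """Direction label where the edge map holds 255, and -1 everywhere else."""
    return [
        [gradient_direction(gray, i, j) if edges[i][j] == 255 else -1
         for j in range(len(edges[0]))]
        for i in range(len(edges))
    ]
```

The array returned by `edge_gradient_array` has the same height and width as the character edge information map, which matches the description of a template as a two-dimensional array of that size.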
After multiple templates have been created for each character, each template is saved in the template library at the position corresponding to its character; that is, the template library holds multiple templates for each character.
When a character is to be recognized, after the picture containing the character to be recognized has been obtained, the picture is converted into a grayscale image following the template creation process described above, and the converted picture is binarized with the same binarization algorithm as is used in template creation.
In the binarized picture, scanning proceeds from each of the four directions toward the interior of the picture, and the position of the first white pixel in each direction is identified. The boundary of the character to be recognized is determined from the positions of the white pixels identified in each direction, and the region of the character to be recognized is cropped from the color picture of the character according to the determined boundary.
The cropped region of the character to be recognized is converted into a grayscale image, and the grayscale image is binarized with the corresponding binarization algorithm. The grayscale image and the binarized character region are both normalized to a set size, which is the same as the size set in template creation, for example 24 wide and 48 high. The binarization algorithm used here is also the same as that used in the normalization process of template creation.
After the binarized region of the character to be recognized has been normalized, transitions of the pixel value in the normalized region are identified. When the pixel value is found to change from 0 to 1 or from 1 to 0, the corresponding pixel position in the character edge information map is assigned 255 according to the position of the white pixel at the transition, and the remaining pixel positions are assigned 0.
The pixel value of each pixel in the assigned character edge information map is then examined. When a pixel with value 255 is found, its gradient value is calculated from the gray values of its neighboring pixels in the normalized grayscale image, and the gradient direction of the pixel is determined from the calculated gradient value.
According to the determined gradient direction of the pixel, the direction to which it belongs is determined among the 8 equal directions that divide the range from 0 to 360 degrees. That direction is used as the value at the position corresponding to the pixel in the edge gradient array of the character to be recognized, and the other positions of the array are assigned -1.
The edge gradient array is matched against each saved template of each character, and the matching distance to each template is determined. Specifically, the matching distance is determined according to the following formula:
where c(i, j) is the value in row i, column j of the edge gradient array of the character to be recognized; t(i, j) is the value in row i, column j of the template; H is the height of the normalized template; W is the width of the normalized template; and the value of S is determined from the number of positions in the edge gradient array of the character to be recognized whose value is not equal to -1, that is, the number of positions of the template whose value is not equal to -1.
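The selection rule of step S104, taking the character whose template gives the minimum matching distance, can be sketched as follows. The matching distance formula is not shown in this text, so the sketch takes the distance as a caller-supplied function. `example_distance` is a hypothetical stand-in that exists only to make the example runnable and should not be read as the formula of the invention; the dictionary layout of the template library is likewise an assumption.

```python
def recognize(array, template_library, distance):
    """Return (character, distance) for the template closest to the array.

    template_library maps each character to a list of templates, each a
    two-dimensional array of the same size as `array`.
    """
    best_char, best_dist = None, float("inf")
    for char, templates in template_library.items():
        for template in templates:
            d = distance(array, template)
            if d < best_dist:
                best_char, best_dist = char, d
    return best_char, best_dist


def example_distance(c, t):
    """Hypothetical distance, NOT the formula of the source.

    Counts the share of mismatching positions among the positions where
    either array holds a direction value (anything other than -1).
    """
    mismatches = 0
    s = 0
    for row_c, row_t in zip(c, t):
        for vc, vt in zip(row_c, row_t):
            if vc != -1 or vt != -1:
                s += 1
                if vc != vt:
                    mismatches += 1
    return mismatches / s if s else 0.0
```

Because every template of every character is compared, a character with several dissimilar templates is recognized as soon as any one of them lies closest to the input.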
Fig. 4 is a schematic structural diagram of the character recognition device provided by the present invention. The device includes:
a normalization module 41, configured to binarize the picture containing the character to be recognized, identify the boundary of the character to be recognized in the binarized picture, crop the region of the character to be recognized from the picture containing the character to be recognized according to the determined boundary, and binarize and normalize the cropped character region;
an edge information determination module 42, configured to identify transitions of the pixel value in the normalized region of the character to be recognized and, according to the position of the white pixel at each transition, assign 255 to the corresponding pixel position in a character edge information map and assign another value to the positions of the other pixels, where the character edge information map is the same size as the normalized character region;
a gradient direction determination module 43, configured to examine the pixel value of each pixel in the character edge information map and, when a pixel whose value is 255 is found, calculate the gradient value of that pixel, determine the direction to which it belongs, and assign that direction value to the corresponding position in an edge gradient array, the other positions of the array being assigned -1;
a matching and recognition module 44, configured to match the edge gradient array against each saved template of each character, determine the matching distances, and take the character corresponding to the minimum matching distance as the recognition result.
The normalization module 41 is further configured to, when a template is created, binarize the sample picture containing character information, identify the boundary of the character in the binarized picture, crop the character region from the sample picture containing character information according to the determined character boundary, and binarize and normalize the cropped character region;
the edge information determination module 42 is further configured to, when a template is created, identify transitions of the pixel value in the normalized character region and, according to the position of the white pixel at each transition, assign 255 to the corresponding pixel position in the character edge information map and assign another value to the positions of the other pixels, where the character edge information map is the same size as the normalized character region;
the gradient direction determination module 43 is further configured to, when a template is created, examine the pixel value of each pixel in the character edge information map and, when a pixel whose value is 255 is found, calculate the gradient value of that pixel, determine the direction to which it belongs, and assign the direction value to the position in the template corresponding to that pixel, the other positions being assigned -1.
The normalization module 41 is specifically configured to search from the four borders of the binarized picture toward the interior of the picture; when a white pixel is found, determine that the white pixel lies on the boundary of the character; and determine the boundary of the character from the positions of the white pixels found from each border direction.
The normalization module 41 is further configured to convert the cropped character region into a grayscale image and normalize it;
the gradient direction determination module 43 is configured to calculate the gradient value of the pixel at the corresponding position in the normalized grayscale image according to the following formula:
Gradient=dy/dx
where dy = g(i, j+1) - g(i, j-1), dx = g(i+1, j) - g(i-1, j), g(i, j) is the gray value at the position corresponding to this pixel in the normalized grayscale image, i denotes the row of the pixel, j denotes the column of the pixel, and Gradient is the calculated gradient value of the pixel;
calculate the gradient direction angle from the calculated gradient value;
and determine the direction to which the gradient direction angle belongs according to the gradient direction angle and the 8 directions into which the range from 0 to 360 degrees is divided.
The matching and recognition module 44 is specifically configured to determine the matching distance according to the following formula:
where c(i, j) is the value in row i, column j of the edge gradient array of the character to be recognized; t(i, j) is the value in row i, column j of the template; H is the height of the normalized template; W is the width of the normalized template; and the value of S is determined from the number of positions in the edge gradient array of the character to be recognized whose value is not equal to -1, that is, the number of positions of the template whose value is not equal to -1.
The present invention provides a character recognition method and device. In both recognition of the character to be recognized and template creation, the method performs normalization, determines the character edge information map, and determines each value in the edge gradient array. After the values at the relevant positions of the edge gradient array have been determined, the matching distance is determined by matching the edge gradient array of the character to be recognized against each template of each character, and the character is recognized according to the matching distance. Because the present invention uses the gradient direction of each pixel in the character as the corresponding value in the edge gradient array, and the gradient direction is strongly resistant to interference, the character recognition method is robust. Moreover, during recognition the array is matched against every template of every character, and the character corresponding to the minimum matching distance is taken as the recognition result. This avoids the poor robustness of matching against a single template per character and broadens the applicability of the matching method.
The above description shows and describes a preferred embodiment of the present invention. As stated above, however, it should be understood that the present invention is not limited to the form disclosed herein, which should not be regarded as excluding other embodiments. The invention may be used in various other combinations, modifications, and environments, and may be altered within the scope of the inventive concept described herein through the above teachings or the skill or knowledge of the related art. Changes and variations made by those skilled in the art that do not depart from the spirit and scope of the present invention shall all fall within the protection scope of the appended claims of the present invention.