CN111967474B - Text line character segmentation method and device based on projection - Google Patents
- Publication number
- CN111967474B CN202010931307.9A
- Authority
- CN
- China
- Prior art keywords
- text line
- image
- projection
- character
- segmented
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/146—Aligning or centring of the image pick-up or image-field
- G06V30/1475—Inclination or skew detection or correction of characters or of image to be recognised
- G06V30/1478—Inclination or skew detection or correction of characters or of image to be recognised of characters or characters lines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Character Input (AREA)
Abstract
The application belongs to the technical field of image recognition, and in particular relates to a projection-based text line character segmentation method and device. Existing optical character recognition technology urgently needs a better recognition rate and accuracy. The application provides a projection-based text line character segmentation method and device. The method determines the actual width and actual height of a single character through horizontal projection and vertical projection, so that the upper and lower boundaries of the character are judged more accurately and robustly; inclined-font correction extends the range of text to which character segmentation applies. The character projection data is a weighted sum of the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve, which improves the accuracy and reliability of character boundary judgment, supports accurate and fast segmentation, and remains effective for slightly adhered characters and adhered special characters.
Description
Technical Field
The application relates to the technical field of image recognition, in particular to a text line character segmentation method and device based on projection.
Background
In the technical field of image recognition, and particularly in optical character recognition, characters do not all have the same width and height, so a text line cannot simply be cut into cells of equal width and height. Each character must be segmented accurately before it can be recognized efficiently and accurately.
How to find the upper, lower, left and right boundaries of a single character accurately and efficiently, while avoiding over-segmentation and under-segmentation, has long been a challenge in optical character recognition. Current character segmentation techniques mainly include recognition-based segmentation, the horizontal projection method and connected-domain analysis, but their recognition rate and accuracy still need to be improved.
Disclosure of Invention
The application provides a text line character segmentation method and device based on projection, which are used for solving the problem that the recognition rate and accuracy of the current character segmentation method need to be improved.
The technical scheme adopted by the application is as follows:
In a first aspect of the present application, there is provided a projection-based text line character segmentation method, comprising the steps of:
acquiring a text line image to be segmented;
judging, from the text line image to be segmented, whether the font in the image is an inclined font; if so, performing inclined-font correction and then calculating character projection data; if not, calculating the character projection data directly;
carrying out normalization processing on the character projection data to obtain normalized character segmentation data;
and carrying out character segmentation on the text line image to be segmented according to the normalized character segmentation data.
Optionally, after the step of acquiring the image of the text line to be segmented, the method includes:
And preprocessing the text line image to be segmented, wherein the preprocessing is to rotationally correct the text line image to be segmented to obtain a preprocessed text image in the horizontal direction.
Optionally, the step of performing tilt font correction includes:
rotationally deforming the text line image to be segmented;
performing vertical projection on the text line image to be segmented to obtain each group of vertical projection curves;
calculating the horizontal gap G(θ) of the characters in each group of vertical projection curves;
calculating the pixel point cumulative average value M(θ) of each group of vertical projection curves;
calculating the font inclination angle θ = max(G(θ) × M(θ)) according to the horizontal gap G(θ) of the characters in each group of vertical projection curves and the pixel point cumulative average value M(θ) of each group of vertical projection curves;
and carrying out rotational deformation by the angle θ on the text line image to be segmented to obtain a corrected text line image to be segmented.
Optionally, the step of correcting the inclined font includes a coarse positioning correction process and a fine positioning correction process which are sequentially performed;
the coarse positioning correction process comprises the following steps:
inputting an angle search range;
calculating the product G(θ) × M(θ) at each angle;
selecting the angle corresponding to the maximum value of G(θ) × M(θ);
determining the coarse positioning font inclination angle;
the accurate positioning correction process comprises the following steps:
calculating an accurate positioning search range;
calculating the product G(θ) × M(θ) at each angle;
selecting the angle corresponding to the maximum value of G(θ) × M(θ);
determining the font inclination angle;
and carrying out rotational deformation by the angle θ on the text line image to be segmented to obtain a corrected text line image to be segmented.
Optionally, the step of calculating character projection data includes:
gray processing is carried out on the text line image to be segmented to obtain a text line gray image, and vertical projection is carried out on the text line gray image to obtain a gray image vertical projection curve;
performing binarization processing on the text line image to be segmented to obtain a text line binarization image, and performing vertical projection on the text line binarization image to obtain a binary image vertical projection curve;
Performing edge intensity difference variance processing on the text line image to be segmented to obtain a text line edge intensity difference variance image, and performing edge intensity difference variance projection on the text line edge intensity difference variance image to obtain an edge intensity difference variance projection curve;
and carrying out weighted summation on the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve to obtain character projection data.
Optionally, before the step of performing weighted summation on the grayscale image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve to obtain character projection data, the method further includes:
and performing expansion processing on the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve.
Optionally, the character projection data includes maximum data of the character, actual width data of the character and actual height data of the character.
In a second aspect of the present application, there is provided a projection-based text line character segmentation apparatus, the apparatus comprising:
The text line image acquisition module is used for acquiring text line images to be segmented;
The character projection data calculation module is used for judging whether the fonts in the text line images to be segmented are inclined fonts according to the text line images to be segmented, if so, correcting the inclined fonts, then calculating character projection data, and if not, directly calculating the character projection data;
The data normalization module is used for carrying out normalization processing on the character projection data to obtain normalized character segmentation data;
And the character segmentation module is used for carrying out character segmentation on the text line image to be segmented according to the normalized character segmentation data.
Optionally, the text line image obtaining module to be segmented further includes a preprocessing sub-module, where the preprocessing sub-module is configured to perform rotation correction on the text line image to be segmented to obtain a preprocessed text image in a horizontal direction.
Optionally, the character projection data calculation module further comprises an inclined font correction sub-module and a character projection curve sub-module;
the inclined font correction sub-module is used for executing the following steps:
rotationally deforming the text line image to be segmented;
performing vertical projection on the text line image to be segmented to obtain each group of vertical projection curves;
calculating the horizontal gap G(θ) of the characters in each group of vertical projection curves;
calculating the pixel point cumulative average value M(θ) of each group of vertical projection curves;
calculating the font inclination angle θ = max(G(θ) × M(θ)) according to the horizontal gap G(θ) of the characters in each group of vertical projection curves and the pixel point cumulative average value M(θ) of each group of vertical projection curves;
carrying out rotational deformation by the angle θ on the text line image to be segmented to obtain a corrected text line image to be segmented;
The character projection curve sub-module is used for executing the following steps:
gray processing is carried out on the text line image to be segmented to obtain a text line gray image, and vertical projection is carried out on the text line gray image to obtain a gray image vertical projection curve;
performing binarization processing on the text line image to be segmented to obtain a text line binarization image, and performing vertical projection on the text line binarization image to obtain a binary image vertical projection curve;
Performing edge intensity difference variance processing on the text line image to be segmented to obtain a text line edge intensity difference variance image, and performing edge intensity difference variance projection on the text line edge intensity difference variance image to obtain an edge intensity difference variance projection curve;
and carrying out weighted summation on the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve to obtain character projection data.
The technical scheme of the application has the following beneficial effects:
according to the projection-based text line character segmentation method, the actual width and the actual height of a single character are determined through horizontal projection and vertical projection, the judgment of the upper boundary and the lower boundary of the character is more accurate, the robustness is high, and the application range of character segmentation is expanded through inclined font correction. The method has relatively low algorithm complexity, and quick and accurate character segmentation, and is beneficial to improving the accuracy of character recognition. The character projection data adopts the weighted summation of the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve, thereby improving the accuracy and reliability of character boundary judgment, being beneficial to accurate and rapid segmentation of characters and being equally effective for segmentation of slightly adhered characters and adhered special characters.
Drawings
In order to more clearly illustrate the technical solution of the present application, the drawings that are needed in the embodiments will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a block flow diagram of one embodiment provided by the first aspect of the present application;
FIG. 2 is a schematic diagram of a coarse positioning process and a fine positioning process according to an embodiment of the present application;
FIG. 3 is a schematic diagram of G (θ) and M (θ) in the present application.
In fig. 3, G(θ) is the horizontal gap between characters in the vertical projection curve. For ease of understanding, fig. 3 shows a text line image to be segmented. Because the text in this image is not inclined, the length between the two vertical dashed lines is G(θ). When the text is inclined, G(θ) becomes smaller or even negative: G(θ) is the horizontal gap between two characters in the vertical projection curve, and when the vertical projections of two characters overlap, the gap between them is negative.
For ease of understanding, M(θ) is also labeled in fig. 3. M(θ) is the pixel point cumulative average value of the vertical projection curve and represents the height information of the text in the text line image to be segmented. When characters are inclined, the accumulation in the vertical direction is spread over neighboring positions, so the maximum of the projection values changes. M(θ) is therefore an average of the pixel accumulation values of the vertical projection curve. Specifically, the average of the extrema of single characters within a certain percentage range may be used, for example 5%-20%, and preferably 10%. That is, among all characters, the 10% of characters with the largest pixel accumulation extrema are selected and their extrema are averaged (the extremum of any of the remaining 90% of characters is smaller than the extremum of any character in the selected 10%).
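The averaging described for M(θ) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `peak_mean`, the input list of per-character peak values, and the rule of keeping at least one peak are assumptions introduced here.

```python
def peak_mean(char_peaks, top_fraction=0.10):
    """Average the largest `top_fraction` of per-character projection peaks."""
    if not char_peaks:
        raise ValueError("need at least one character peak")
    # Keep at least one peak so that short text lines still yield a value (assumption).
    k = max(1, round(len(char_peaks) * top_fraction))
    top = sorted(char_peaks, reverse=True)[:k]
    return sum(top) / k


# Hypothetical peak of the vertical projection curve for each of ten characters.
peaks = [12, 30, 28, 9, 31, 27, 25, 29, 26, 24]
m_theta = peak_mean(peaks)  # 10% of ten peaks is one peak, the largest
```

Averaging only the largest peaks makes the value sensitive to how sharply the tallest strokes accumulate, which is the property the text relies on when comparing deformation angles.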
In fig. 3, a character screening threshold is used to screen, or distinguish, the single characters. The threshold is drawn in fig. 3 as a horizontal line that cuts the vertical projection curve. Cutting the projection curve of a single character produces two endpoints, and conversely the span between such a pair of endpoints is a single character, which is how the threshold screens single characters. Once the threshold has screened the single characters, the horizontal gaps between them are determined, and so is the horizontal gap G(θ) of the characters in the vertical projection curve.
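The thresholding of the vertical projection curve can be sketched as below. The function names and the sample curve are hypothetical. The text does not state how the individual gaps are combined into a single G(θ), so the sketch returns the list of gaps and leaves the aggregation open. Note also that thresholding a single curve can only produce non-negative gaps; overlapping characters show up as one merged run rather than as a negative value.

```python
def character_runs(curve, threshold):
    """Return (start, end) column pairs where the curve exceeds the threshold.

    `end` is exclusive, so end - start is the run width in columns.
    """
    runs, start = [], None
    for x, value in enumerate(curve):
        if value > threshold and start is None:
            start = x                      # left endpoint of a character
        elif value <= threshold and start is not None:
            runs.append((start, x))        # right endpoint of a character
            start = None
    if start is not None:                  # a run that touches the right border
        runs.append((start, len(curve)))
    return runs


def horizontal_gaps(runs):
    """Width of the gap between each pair of neighbouring runs."""
    return [nxt[0] - cur[1] for cur, nxt in zip(runs, runs[1:])]


curve = [0, 5, 9, 6, 0, 0, 0, 7, 8, 0, 0, 4, 9, 3, 0]  # hypothetical projection
runs = character_runs(curve, threshold=2)
gaps = horizontal_gaps(runs)
```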
Detailed Description
Reference will now be made in detail to the embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The embodiments described in the examples below do not represent all embodiments consistent with the application; they are merely examples of systems and methods consistent with aspects of the application as set forth in the claims.
Referring to fig. 1, a flow diagram of one embodiment of the first aspect of the present application is provided.
In a first aspect of the present application, there is provided a projection-based text line character segmentation method, comprising the steps of:
acquiring a text line image to be segmented;
judging, from the text line image to be segmented, whether the font in the image is an inclined font; if so, performing inclined-font correction and then calculating character projection data; if not, calculating the character projection data directly;
carrying out normalization processing on the character projection data to obtain normalized character segmentation data;
and carrying out character segmentation on the text line image to be segmented according to the normalized character segmentation data.
In this embodiment, the text line image to be segmented is first checked for inclined fonts, which prevents the segmentation deviations and errors that italics would otherwise cause and facilitates accurate segmentation. The character projection data is then calculated, providing reliable data support for accurate segmentation.
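The last two steps, normalization and segmentation, can be sketched as follows. The text does not specify the normalization formula or the cutting rule, so min-max scaling, the `valley` level of 0.1 and the choice of the valley midpoint as the cut column are all assumptions made for illustration.

```python
def normalize(curve):
    """Min-max scale a projection curve to the range [0, 1] (assumed scheme)."""
    lo, hi = min(curve), max(curve)
    if hi == lo:
        return [0.0] * len(curve)
    return [(v - lo) / (hi - lo) for v in curve]


def cut_columns(norm_curve, valley=0.1):
    """Return the midpoint column of every interior run of values <= valley."""
    cuts, start = [], None
    for x, v in enumerate(norm_curve):
        if v <= valley and start is None:
            start = x
        elif v > valley and start is not None:
            if start > 0:                      # ignore the left margin
                cuts.append((start + x - 1) // 2)
            start = None
    return cuts                                # a trailing margin never closes


projection = [0, 40, 80, 40, 0, 0, 0, 60, 20, 0]  # hypothetical projection data
cuts = cut_columns(normalize(projection))
```

Cutting the image at each returned column separates neighbouring characters; here the single cut falls in the middle of the three empty columns.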
Optionally, after the step of acquiring the image of the text line to be segmented, the method includes:
And preprocessing the text line image to be segmented, wherein the preprocessing is to rotationally correct the text line image to be segmented to obtain a preprocessed text image in the horizontal direction.
In this embodiment, the preprocessing can effectively reduce the probability of error recognition, and the preprocessing text image in the horizontal direction is obtained through the rotation correction preprocessing, so that the accuracy and the efficiency of character segmentation can be greatly improved.
Optionally, the step of performing tilt font correction includes:
rotationally deforming the text line image to be segmented;
performing vertical projection on the text line image to be segmented to obtain each group of vertical projection curves;
calculating the horizontal gap G(θ) of the characters in each group of vertical projection curves;
calculating the pixel point cumulative average value M(θ) of each group of vertical projection curves;
calculating the font inclination angle θ = max(G(θ) × M(θ)) according to the horizontal gap G(θ) of the characters in each group of vertical projection curves and the pixel point cumulative average value M(θ) of each group of vertical projection curves;
and carrying out rotational deformation by the angle θ on the text line image to be segmented to obtain a corrected text line image to be segmented.
In this embodiment, inclined-font correction requires the horizontal gap G(θ) of the characters in the vertical projection curve and the pixel point cumulative average value M(θ) of the vertical projection curve, both of which are marked by way of example in fig. 3. From the font inclination angle θ = max(G(θ) × M(θ)), the rotation angle required for inclined-font correction can be calculated quickly and accurately, yielding a corrected text line image to be segmented that provides a good image basis for the next operation.
In the present application, "rotational deformation" and "rotation" have the same meaning. They differ from the conventional rotation or translation about a point or an axis and instead mean a shear transformation, also known as a shear mapping. Each single character has a circumscribed parallelogram; when the character is not inclined, this parallelogram, which is the outline of the character, is a rectangle. Rotational deformation keeps the lengths of the upper and lower sides of the parallelogram unchanged. For ease of understanding, the lower side can be regarded as fixed while the upper side translates horizontally, carrying the other two sides with it. The distance between the upper and lower sides stays constant throughout the rotational deformation.
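A shear of this kind can be sketched on a small row-major image as below. The function name `shear_rows`, the nearest-pixel rounding and the dropping of pixels that leave the frame are simplifications assumed here; a practical version would pad the image and interpolate.

```python
import math


def shear_rows(image, theta_deg, background=0):
    """Horizontally shear a row-major image while keeping the bottom row fixed."""
    height, width = len(image), len(image[0])
    t = math.tan(math.radians(theta_deg))
    out = []
    for y, row in enumerate(image):
        # Rows farther above the baseline shift more; no row moves vertically,
        # so the distance between the top and bottom edges is preserved.
        shift = round((height - 1 - y) * t)
        new_row = [background] * width
        for x, value in enumerate(row):
            nx = x + shift
            if 0 <= nx < width:            # pixels pushed outside the frame are dropped
                new_row[nx] = value
        out.append(new_row)
    return out


# A stroke leaning to the left; a 45 degree shear makes it vertical.
slanted = [
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
]
upright = shear_rows(slanted, 45)
```

With this sign convention a positive angle pushes the top of the image to the right, so a right-leaning italic would be corrected with a negative angle.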
In the present embodiment, the "=" in the font inclination angle θ = max(G(θ) × M(θ)) does not mean "equal to". The expression means: the rotational deformation angle θ at which G(θ) × M(θ) reaches its maximum during the rotational deformation process is the font inclination angle. This is because the circumscribed parallelogram of a single character is a rectangle only when G(θ) × M(θ) is at its maximum, and the corresponding θ is the font inclination angle.
Optionally, the step of correcting the inclined font includes a coarse positioning correction process and a fine positioning correction process which are sequentially performed;
the coarse positioning correction process comprises the following steps:
inputting an angle search range;
calculating the product G(θ) × M(θ) at each angle;
selecting the angle corresponding to the maximum value of G(θ) × M(θ);
determining the coarse positioning font inclination angle;
the accurate positioning correction process comprises the following steps:
calculating an accurate positioning search range;
calculating the product G(θ) × M(θ) at each angle;
selecting the angle corresponding to the maximum value of G(θ) × M(θ);
determining the font inclination angle;
and carrying out rotational deformation by the angle θ on the text line image to be segmented to obtain a corrected text line image to be segmented.
Referring to fig. 2, the coarse positioning correction process effectively reduces the amount of computation and improves the efficiency of inclined-font correction. The coarse process quickly narrows the font inclination angle to a relatively small range. The accurate positioning search range is then calculated within that range, the angle corresponding to the maximum value of G(θ) × M(θ) is selected and determined as the font inclination angle, and the inclined-font correction is completed.
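The two-stage search can be sketched as follows. The score function stands in for G(θ) × M(θ) and is passed in as a callable; the search range, the two step sizes and the rule of searching one coarse step on either side of the coarse result are assumptions, since the text gives no numeric values.

```python
def best_angle(score, lo, hi, step):
    """Return the sampled angle in [lo, hi] with the highest score."""
    n = int(round((hi - lo) / step))
    angles = [lo + i * step for i in range(n + 1)]
    return max(angles, key=score)


def coarse_to_fine(score, lo=-30.0, hi=30.0, coarse_step=5.0, fine_step=0.5):
    """Coarse positioning followed by accurate positioning around the coarse result."""
    coarse = best_angle(score, lo, hi, coarse_step)
    # Accurate search range: one coarse step on either side, clipped to [lo, hi].
    fine_lo = max(lo, coarse - coarse_step)
    fine_hi = min(hi, coarse + coarse_step)
    return best_angle(score, fine_lo, fine_hi, fine_step)


def toy_score(theta):
    """Hypothetical stand-in for G(theta) * M(theta), peaking at 12.5 degrees."""
    return -(theta - 12.5) ** 2


angle = coarse_to_fine(toy_score)
```

With these settings the search evaluates 13 coarse and 21 fine angles, compared with 121 evaluations for a single pass at the fine step over the whole range.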
Optionally, the step of calculating character projection data includes:
gray processing is carried out on the text line image to be segmented to obtain a text line gray image, and vertical projection is carried out on the text line gray image to obtain a gray image vertical projection curve;
performing binarization processing on the text line image to be segmented to obtain a text line binarization image, and performing vertical projection on the text line binarization image to obtain a binary image vertical projection curve;
Performing edge intensity difference variance processing on the text line image to be segmented to obtain a text line edge intensity difference variance image, and performing edge intensity difference variance projection on the text line edge intensity difference variance image to obtain an edge intensity difference variance projection curve;
and carrying out weighted summation on the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve to obtain character projection data.
In this embodiment, in order to further reduce the error rate of character recognition and segmentation, three projection curve weighted summation modes are selected to obtain character projection data, and the character projection data can more comprehensively and accurately reflect the character characteristics, so that the accuracy of character recognition and segmentation is improved.
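The weighted summation of the three curves can be sketched as below. Several points are assumptions rather than details from the text: dark text on a light background, the fixed binarization threshold of 128, the weights, the min-max scaling of each curve before summation, and above all the definition of the edge curve, which is taken here as the per-column variance of horizontal intensity differences because the text does not define the edge intensity difference variance.

```python
def column_sums(image):
    """Vertical projection: sum each column of a row-major image."""
    return [sum(col) for col in zip(*image)]


def minmax(curve):
    lo, hi = min(curve), max(curve)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in curve]


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def character_projection(gray, ink_threshold=128, weights=(0.4, 0.4, 0.2)):
    # Gray curve: accumulated darkness per column.
    gray_curve = column_sums([[255 - p for p in row] for row in gray])
    # Binary curve: number of ink pixels per column.
    binary_curve = column_sums(
        [[1 if p < ink_threshold else 0 for p in row] for row in gray])
    # Edge curve (assumed definition): variance of horizontal differences per column.
    diffs = [[abs(row[x + 1] - row[x]) for x in range(len(row) - 1)] + [0]
             for row in gray]
    edge_curve = [variance(col) for col in zip(*diffs)]
    curves = [minmax(gray_curve), minmax(binary_curve), minmax(edge_curve)]
    return [sum(w * c[x] for w, c in zip(weights, curves))
            for x in range(len(gray_curve))]


gray = [  # hypothetical 3 x 6 image: a solid stroke and a broken stroke
    [255, 0, 255, 255, 0, 255],
    [255, 0, 255, 255, 255, 255],
    [255, 0, 255, 255, 0, 255],
]
profile = character_projection(gray)
```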
Optionally, before the step of performing weighted summation on the grayscale image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve to obtain character projection data, the method further includes:
and performing expansion processing on the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve.
The expansion processing in this embodiment is the morphological dilation operation, a basic algorithm in the field of image processing. Here the dilation connects slightly broken curves, which avoids over-segmentation of characters.
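Applied to a one-dimensional projection curve, the expansion (dilation) step can be sketched as a sliding maximum. The window radius and the sample curve are hypothetical; the text does not state the size of the structuring element.

```python
def dilate_curve(curve, radius=1):
    """Grayscale dilation with a flat window: each sample becomes its local maximum."""
    n = len(curve)
    return [max(curve[max(0, i - radius): min(n, i + radius + 1)])
            for i in range(n)]


# One character whose curve is broken at index 3, then a real gap, then another character.
broken = [0, 6, 7, 0, 8, 5, 0, 0, 0, 4, 9, 0]
joined = dilate_curve(broken)
```

The one-column break at index 3 is filled, so the first character is no longer split in two, while the three-column gap between the characters still contains a zero at index 7 and remains available as a cut position.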
Optionally, the character projection data includes maximum data of the character, actual width data of the character and actual height data of the character.
In a second aspect of the present application, there is provided a projection-based text line character segmentation apparatus, the apparatus comprising:
The text line image acquisition module is used for acquiring text line images to be segmented;
The character projection data calculation module is used for judging whether the fonts in the text line images to be segmented are inclined fonts according to the text line images to be segmented, if so, correcting the inclined fonts, then calculating character projection data, and if not, directly calculating the character projection data;
The data normalization module is used for carrying out normalization processing on the character projection data to obtain normalized character segmentation data;
And the character segmentation module is used for carrying out character segmentation on the text line image to be segmented according to the normalized character segmentation data.
Optionally, the text line image obtaining module to be segmented further includes a preprocessing sub-module, where the preprocessing sub-module is configured to perform rotation correction on the text line image to be segmented to obtain a preprocessed text image in a horizontal direction.
Optionally, the character projection data calculation module further comprises an inclined font correction sub-module and a character projection curve sub-module;
the inclined font correction sub-module is used for executing the following steps:
rotationally deforming the text line image to be segmented;
performing vertical projection on the text line image to be segmented to obtain each group of vertical projection curves;
calculating the horizontal gap G(θ) of the characters in each group of vertical projection curves;
calculating the pixel point cumulative average value M(θ) of each group of vertical projection curves;
calculating the font inclination angle θ = max(G(θ) × M(θ)) according to the horizontal gap G(θ) of the characters in each group of vertical projection curves and the pixel point cumulative average value M(θ) of each group of vertical projection curves;
carrying out rotational deformation by the angle θ on the text line image to be segmented to obtain a corrected text line image to be segmented;
The character projection curve sub-module is used for executing the following steps:
gray processing is carried out on the text line image to be segmented to obtain a text line gray image, and vertical projection is carried out on the text line gray image to obtain a gray image vertical projection curve;
performing binarization processing on the text line image to be segmented to obtain a text line binarization image, and performing vertical projection on the text line binarization image to obtain a binary image vertical projection curve;
Performing edge intensity difference variance processing on the text line image to be segmented to obtain a text line edge intensity difference variance image, and performing edge intensity difference variance projection on the text line edge intensity difference variance image to obtain an edge intensity difference variance projection curve;
and carrying out weighted summation on the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve to obtain character projection data.
According to the projection-based text line character segmentation method, the actual width and the actual height of a single character are determined through horizontal projection and vertical projection, the judgment of the upper boundary and the lower boundary of the character is more accurate, the robustness is high, and the application range of character segmentation is expanded through inclined font correction. The method has relatively low algorithm complexity, and quick and accurate character segmentation, and is beneficial to improving the accuracy of character recognition. The character projection data adopts the weighted summation of the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve, thereby improving the accuracy and reliability of character boundary judgment, being beneficial to accurate and rapid segmentation of characters and being equally effective for segmentation of slightly adhered characters and adhered special characters.
When projection is used for text line character segmentation, the quality of the projection curve depends on how well the characteristics of the data are controlled at the earlier stage and on the extraction of key features. The projection curve may be generated by first fusing multiple feature images and then projecting the fused feature image, or by transforming the projection curves of the individual key feature images, including but not limited to the gray projection curve, the binary projection curve and the edge feature difference variance projection curve, and taking their weighted sum. Therefore, any calculation based on gray features, or on features that express the difference between characters and background, together with the method of generating a weighted-sum projection curve from them, falls within the technical scope of the application; implementations that follow the same execution strategy likewise fall within the protection scope of the application.
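The two generation orders mentioned above can be compared with a short Python sketch. It assumes that the projection is a plain column sum and that the fusion is a pixel-wise weighted sum; under those assumptions the two orders give the same curve because both operations are linear. A non-linear projection, such as a variance-based one, would not in general commute in this way. The feature images and weights are made-up values.

```python
def vertical_projection(image):
    """Sum each column of a 2-D list of numbers."""
    return [sum(col) for col in zip(*image)]

def fuse_images(images, weights):
    """Pixel-wise weighted sum of equally sized feature images."""
    height, width = len(images[0]), len(images[0][0])
    return [
        [sum(w * img[y][x] for w, img in zip(weights, images)) for x in range(width)]
        for y in range(height)
    ]

def fuse_curves(curves, weights):
    """Point-wise weighted sum of equally long projection curves."""
    return [
        sum(w * c[x] for w, c in zip(weights, curves))
        for x in range(len(curves[0]))
    ]

# Two hypothetical 2x3 feature images; integer weights avoid rounding effects.
feature_a = [[0, 2, 4], [1, 3, 5]]
feature_b = [[1, 0, 1], [0, 1, 0]]
weights = (2, 3)

# Order 1: fuse the feature images, then project the fused image.
image_first = vertical_projection(fuse_images([feature_a, feature_b], weights))
# Order 2: project each feature image, then fuse the curves.
curve_first = fuse_curves(
    [vertical_projection(feature_a), vertical_projection(feature_b)], weights
)
```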
The detailed description provided above presents only a few examples under the general inventive concept and does not limit the protection scope of the present application. For a person skilled in the art, any other embodiment derived from the solution of the application without inventive effort falls within the protection scope of the application.
Claims (8)
1. A projection-based text line character segmentation method, characterized by comprising the following steps:
acquiring a text line image to be segmented;
judging, according to the text line image to be segmented, whether the font in the text line image to be segmented is an inclined font; if so, performing inclined font correction and then calculating character projection data; if not, calculating the character projection data directly;
carrying out normalization processing on the character projection data to obtain normalized character segmentation data;
according to the normalized character segmentation data, character segmentation is carried out on the text line image to be segmented;
wherein the step of performing the inclined font correction includes:
rotationally deforming the text line image to be segmented;
performing vertical projection on the text line image to be segmented to obtain each group of vertical projection curves;
calculating a horizontal gap G(θ) of the characters of each group of vertical projection curves;
calculating the pixel point cumulative average value M(θ) of each group of vertical projection curves;
calculating a font inclination angle θ = max(G(θ) × M(θ)) according to the horizontal gap G(θ) of the characters of each group of vertical projection curves and the pixel point cumulative average value M(θ) of each group of vertical projection curves; the meaning of the font inclination angle θ = max(G(θ) × M(θ)) is: when the product G(θ) × M(θ) reaches its maximum value during the rotational deformation, the rotational deformation angle θ at that moment is the font inclination angle;
and performing rotational deformation by the angle θ on the text line image to be segmented to obtain a corrected text line image to be segmented.
2. The projection-based text line character segmentation method according to claim 1, further comprising, after the step of acquiring a text line image to be segmented:
preprocessing the text line image to be segmented, wherein the preprocessing rotationally corrects the text line image to be segmented to obtain a preprocessed text image in the horizontal direction.
3. The projection-based text line character segmentation method according to claim 1, wherein the step of performing the inclined font correction includes a coarse positioning correction process and a fine positioning correction process performed sequentially;
the coarse positioning correction process comprises the following steps:
inputting an angle search range;
calculating the product of G(θ) and M(θ) at each angle;
selecting the angle corresponding to the maximum value of the product of G(θ) and M(θ);
determining the coarse positioning font inclination angle;
the fine positioning correction process comprises the following steps:
calculating a fine positioning search range;
calculating the product of G(θ) and M(θ) at each angle;
selecting the angle corresponding to the maximum value of the product of G(θ) and M(θ);
determining the font inclination angle;
and performing rotational deformation by the angle θ on the text line image to be segmented to obtain a corrected text line image to be segmented.
4. A method of projection-based text line character segmentation as claimed in any one of claims 1 to 3, wherein the step of calculating character projection data includes:
performing gray processing on the text line image to be segmented to obtain a text line gray image, and performing vertical projection on the text line gray image to obtain a gray image vertical projection curve;
performing binarization processing on the text line image to be segmented to obtain a text line binarization image, and performing vertical projection on the text line binarization image to obtain a binary image vertical projection curve;
performing edge intensity difference variance processing on the text line image to be segmented to obtain a text line edge intensity difference variance image, and performing edge intensity difference variance projection on the text line edge intensity difference variance image to obtain an edge intensity difference variance projection curve;
and carrying out weighted summation on the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve to obtain character projection data.
5. The projection-based text line character segmentation method as set forth in claim 4, further comprising, prior to the step of carrying out weighted summation on the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve to obtain character projection data:
and performing expansion processing on the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve.
6. The method for character segmentation of text lines based on projection according to any one of claims 1 to 3 or 5, wherein the character projection data includes maximum data of characters, actual width data of characters, and actual height data of characters.
7. A projection-based text line character segmentation apparatus, the apparatus comprising:
The text line image acquisition module is used for acquiring text line images to be segmented;
The character projection data calculation module is used for judging whether the fonts in the text line images to be segmented are inclined fonts according to the text line images to be segmented, if so, correcting the inclined fonts, then calculating character projection data, and if not, directly calculating the character projection data;
The data normalization module is used for carrying out normalization processing on the character projection data to obtain normalized character segmentation data;
The character segmentation module is used for carrying out character segmentation on the text line image to be segmented according to the normalized character segmentation data;
the character projection data calculation module further comprises an inclined character correction sub-module and a character projection curve sub-module;
the inclined character correction sub-module is used for executing the following steps:
rotationally deforming the text line image to be segmented;
performing vertical projection on the text line image to be segmented to obtain each group of vertical projection curves;
calculating a horizontal gap G(θ) of the characters of each group of vertical projection curves;
calculating the pixel point cumulative average value M(θ) of each group of vertical projection curves;
calculating a font inclination angle θ = max(G(θ) × M(θ)) according to the horizontal gap G(θ) of the characters of each group of vertical projection curves and the pixel point cumulative average value M(θ) of each group of vertical projection curves; the meaning of the font inclination angle θ = max(G(θ) × M(θ)) is: when the product G(θ) × M(θ) reaches its maximum value during the rotational deformation, the rotational deformation angle θ at that moment is the font inclination angle;
performing rotational deformation by the angle θ on the text line image to be segmented to obtain a corrected text line image to be segmented;
The character projection curve sub-module is used for executing the following steps:
performing gray processing on the text line image to be segmented to obtain a text line gray image, and performing vertical projection on the text line gray image to obtain a gray image vertical projection curve;
performing binarization processing on the text line image to be segmented to obtain a text line binarization image, and performing vertical projection on the text line binarization image to obtain a binary image vertical projection curve;
performing edge intensity difference variance processing on the text line image to be segmented to obtain a text line edge intensity difference variance image, and performing edge intensity difference variance projection on the text line edge intensity difference variance image to obtain an edge intensity difference variance projection curve;
and carrying out weighted summation on the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve to obtain character projection data.
8. The device for text line character segmentation based on projection of claim 7, wherein the text line image acquisition module further comprises a preprocessing sub-module, and the preprocessing sub-module is used for performing rotation correction on the text line image to be segmented to obtain a preprocessed text image in a horizontal direction.
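The inclined font correction recited in claims 1 and 3 can be sketched in Python as a coarse search followed by a fine search over the product G(θ) × M(θ). Several choices here are assumptions, because the claims do not define them: the rotational deformation is modelled as a horizontal shear of a binary image, G(θ) is taken as the number of empty columns lying between the first and last inked columns, M(θ) is taken as the mean ink count over the inked columns, and the search uses whole degrees. The function names are hypothetical.

```python
import math

def shear(binary, angle_deg):
    """Shift each row sideways in proportion to its height above the bottom row.
    A positive angle moves the upper rows to the right."""
    height, width = len(binary), len(binary[0])
    t = math.tan(math.radians(angle_deg))
    pad = math.ceil(abs(t) * (height - 1))      # room for the largest shift
    out = [[0] * (width + 2 * pad) for _ in range(height)]
    for y, row in enumerate(binary):
        shift = round(t * (height - 1 - y))
        for x, px in enumerate(row):
            if px:
                out[y][x + pad + shift] = 1
    return out

def score(binary, angle_deg):
    """G(theta) * M(theta) under the assumed definitions of G and M."""
    projection = [sum(col) for col in zip(*shear(binary, angle_deg))]
    inked = [x for x, v in enumerate(projection) if v > 0]
    if not inked:
        return 0.0
    span = inked[-1] - inked[0] + 1
    gap = span - len(inked)                     # G: empty columns between strokes
    mean = sum(projection) / len(inked)         # M: mean ink per inked column
    return gap * mean

def best_angle(binary, angles):
    """Angle with the largest score; the first one wins when several tie."""
    return max(angles, key=lambda a: score(binary, a))

def find_tilt(binary, search=40, coarse_step=10, fine_step=1):
    """Coarse positioning over [-search, search], then fine positioning
    within one coarse step on either side of the coarse result."""
    coarse = best_angle(binary, range(-search, search + 1, coarse_step))
    fine_range = range(coarse - coarse_step, coarse + coarse_step + 1, fine_step)
    return best_angle(binary, fine_range)

# Example: two 5-pixel-high strokes whose upper rows lean to the right.
slanted = [[0] * 10 for _ in range(5)]
for y, lean in enumerate([2, 2, 1, 1, 0]):
    slanted[y][1 + lean] = 1
    slanted[y][6 + lean] = 1
theta = find_tilt(slanted)
corrected = shear(slanted, theta)               # strokes become vertical
```

With these definitions the score rises when the strokes line up in few columns with wide gaps between them, which is the situation an upright font produces. On tiny images several neighbouring angles give the same shear after rounding, so the returned angle is one of a small tied range rather than a unique value.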
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010931307.9A CN111967474B (en) | 2020-09-07 | 2020-09-07 | Text line character segmentation method and device based on projection |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111967474A (en) | 2020-11-20 |
CN111967474B (en) | 2024-04-26 |
Family
ID=73392538
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010931307.9A Active CN111967474B (en) | 2020-09-07 | 2020-09-07 | Text line character segmentation method and device based on projection |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111967474B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112906347B (en) * | 2021-03-22 | 2021-10-15 | 掌阅科技股份有限公司 | Character typesetting method, electronic equipment and storage medium |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102184399A (en) * | 2011-03-31 | 2011-09-14 | 上海名图信息技术有限公司 | Character segmenting method based on horizontal projection and connected domain analysis |
JP2014127161A (en) * | 2012-12-27 | 2014-07-07 | Nidec Sankyo Corp | Character segmentation device, character recognition device, character segmentation method, and program |
CN106529534A (en) * | 2016-11-07 | 2017-03-22 | 湖南源信光电科技有限公司 | Variable-length license plate character segmentation method based on hybrid tilt correction and projection method |
CN107832762A (en) * | 2017-11-06 | 2018-03-23 | 广西科技大学 | A kind of License Plate based on multi-feature fusion and recognition methods |
CN107992869A (en) * | 2016-10-26 | 2018-05-04 | 深圳超多维科技有限公司 | For tilting the method, apparatus and electronic equipment of word correction |
CN108932516A (en) * | 2018-07-11 | 2018-12-04 | 凌云光技术集团有限责任公司 | It is a kind of rotate text image bearing calibration and device |
CN110705488A (en) * | 2019-10-09 | 2020-01-17 | 广州医药信息科技有限公司 | Image character recognition method |
CN111046872A (en) * | 2019-12-12 | 2020-04-21 | 深圳市杰恩世智能科技有限公司 | Optical character recognition method |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106446896B (en) * | 2015-08-04 | 2020-02-18 | 阿里巴巴集团控股有限公司 | Character segmentation method and device and electronic equipment |
Non-Patent Citations (2)
Title |
---|
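A practical license plate recognition method based on multi-feature extraction; Ma Shuang; Fan Yangyu; Lei Tao; Wu Peng; Application Research of Computers (No. 11); full text * |
License plate character segmentation method based on vertical projection and template matching; Cheng Guangtao; Chen Xue; Zhang Wenzhi; Journal of North China Institute of Aerospace Engineering (No. 01); full text * |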
Also Published As
Publication number | Publication date |
---|---|
CN111967474A (en) | 2020-11-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5410611A (en) | Method for identifying word bounding boxes in text | |
US7636483B2 (en) | Code type determining method and code boundary detecting method | |
US6674919B1 (en) | Method for determining the skew angle of a two-dimensional barcode | |
US20020051575A1 (en) | Method and apparatus for recognizing text in an image sequence of scene imagery | |
CN111046872B (en) | Optical character recognition method | |
CN108133216B (en) | Nixie tube reading identification method capable of realizing decimal point reading based on machine vision | |
CN110533036B (en) | Rapid inclination correction method and system for bill scanned image | |
US9317767B2 (en) | System and method for selecting segmentation parameters for optical character recognition | |
CN115082934B (en) | Method for dividing and identifying handwritten Chinese characters in financial bill | |
US6771842B1 (en) | Document image skew detection method | |
CN112101351B (en) | Text line rotation correction method and device based on projection | |
CN111967474B (en) | Text line character segmentation method and device based on projection | |
CN113392669A (en) | Image information detection method, detection device and storage medium | |
JP3411472B2 (en) | Pattern extraction device | |
CN112419207A (en) | Image correction method, device and system | |
CN113139535A (en) | OCR document recognition method | |
US20150015603A1 (en) | Method for cutting out character, character recognition apparatus using this method, and program | |
JP3099771B2 (en) | Character recognition method and apparatus, and recording medium storing character recognition program | |
CN109858484A (en) | A kind of multi-class transformation VLP correction algorithm based on deflection evaluation | |
CN113537184A (en) | OCR (optical character recognition) model training method and device, computer equipment and storage medium | |
JP2006155126A (en) | Vehicle number recognition device | |
CN108647713B (en) | Embryo boundary identification and laser track fitting method | |
CN114120320A (en) | Image multi-target information identification method, system and medium | |
CN112183574B (en) | File authentication and fake comparison method and device, terminal and storage medium | |
CN114529570A (en) | Image segmentation method, image identification method, user certificate subsidizing method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
CB02 | Change of applicant information | Address after: Room 701, Floor 7, Building 7, Yard 13, Cuihu South Ring Road, Haidian District, Beijing 100094. Applicant after: Lingyunguang Technology Co.,Ltd. Address before: Room 701, Floor 7, Building 7, Yard 13, Cuihu South Ring Road, Haidian District, Beijing 100094. Applicant before: Beijing lingyunguang Technology Group Co.,Ltd. |
GR01 | Patent grant | |