CN111967474B - Text line character segmentation method and device based on projection - Google Patents

Info

Publication number
CN111967474B
Authority
CN
China
Prior art keywords
text line
image
projection
character
segmented
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010931307.9A
Other languages
Chinese (zh)
Other versions
CN111967474A (en
Inventor
王玉娇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Luster LightTech Co Ltd
Original Assignee
Luster LightTech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Luster LightTech Co Ltd
Priority to CN202010931307.9A
Publication of CN111967474A
Application granted
Publication of CN111967474B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/146 Aligning or centring of the image pick-up or image-field
    • G06V30/1475 Inclination or skew detection or correction of characters or of image to be recognised
    • G06V30/1478 Inclination or skew detection or correction of characters or of image to be recognised of characters or characters lines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Input (AREA)

Abstract

The application belongs to the technical field of image recognition, and particularly relates to a projection-based text line character segmentation method and device. In image recognition, existing optical character recognition technology urgently needs a higher recognition rate and accuracy. The application provides a projection-based text line character segmentation method and device. The method determines the actual width and actual height of a single character through horizontal projection and vertical projection, which makes the judgment of the upper and lower character boundaries more accurate and more robust, and it extends the application range of character segmentation through inclined font correction. The character projection data is the weighted sum of a gray image vertical projection curve, a binary image vertical projection curve and an edge intensity difference variance projection curve. This improves the accuracy and reliability of character boundary judgment, supports accurate and rapid character segmentation, and is equally effective for slightly adhered characters and adhered special characters.

Description

Text line character segmentation method and device based on projection
Technical Field
The application relates to the technical field of image recognition, in particular to a text line character segmentation method and device based on projection.
Background
In the technical field of image recognition, particularly in optical character recognition, characters do not all have the same width and height, so a text line cannot simply be cut into cells of equal width and height. Each character must be segmented accurately before it can be recognized efficiently and accurately.
Dividing the upper, lower, left and right boundaries of a single character accurately and efficiently, while avoiding over-segmentation or under-segmentation, has long been a challenge in optical character recognition. Current character segmentation techniques mainly include algorithm recognition segmentation, the horizontal projection method and connected domain analysis, but their recognition rate and accuracy still need to be improved.
Disclosure of Invention
The application provides a text line character segmentation method and device based on projection, which are used for solving the problem that the recognition rate and accuracy of the current character segmentation method need to be improved.
The technical scheme adopted by the application is as follows:
In a first aspect of the present application, a projection-based text line character segmentation method is provided, comprising the following steps:
acquiring a text line image to be segmented;
judging, from the text line image to be segmented, whether the font in the image is an inclined font; if so, performing inclined font correction and then calculating character projection data; if not, calculating the character projection data directly;
normalizing the character projection data to obtain normalized character segmentation data; and
performing character segmentation on the text line image to be segmented according to the normalized character segmentation data.
Optionally, after the step of acquiring the text line image to be segmented, the method further includes:
preprocessing the text line image to be segmented, wherein the preprocessing applies a rotation correction to the text line image to be segmented to obtain a preprocessed text image in the horizontal direction.
Optionally, the step of performing inclined font correction includes:
applying rotational deformation to the text line image to be segmented at each candidate angle θ;
performing vertical projection on each deformed text line image to obtain a group of vertical projection curves;
calculating the horizontal character gap G(θ) of each vertical projection curve;
calculating the pixel cumulative average M(θ) of each vertical projection curve;
calculating the font inclination angle θ = max(G(θ) × M(θ)) from G(θ) and M(θ), that is, taking as θ the angle at which the product G(θ) × M(θ) is largest; and
applying rotational deformation by the angle θ to the text line image to be segmented to obtain a corrected text line image to be segmented.
Optionally, the step of inclined font correction includes a coarse positioning correction process and an accurate positioning correction process, performed in sequence;
the coarse positioning correction process comprises the following steps:
inputting an angle search range;
calculating the product G(θ) × M(θ) at each angle;
selecting the angle corresponding to the maximum of G(θ) × M(θ);
determining the coarse positioning font inclination angle;
the accurate positioning correction process comprises the following steps:
calculating an accurate positioning search range;
calculating the product G(θ) × M(θ) at each angle;
selecting the angle corresponding to the maximum of G(θ) × M(θ);
determining the font inclination angle; and
applying rotational deformation by the angle θ to the text line image to be segmented to obtain a corrected text line image to be segmented.
Optionally, the step of calculating character projection data includes:
performing gray processing on the text line image to be segmented to obtain a text line gray image, and performing vertical projection on the text line gray image to obtain a gray image vertical projection curve;
performing binarization processing on the text line image to be segmented to obtain a text line binarized image, and performing vertical projection on the text line binarized image to obtain a binary image vertical projection curve;
performing edge intensity difference variance processing on the text line image to be segmented to obtain a text line edge intensity difference variance image, and performing edge intensity difference variance projection on that image to obtain an edge intensity difference variance projection curve; and
performing weighted summation of the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve to obtain the character projection data.
Optionally, before the step of performing weighted summation of the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve to obtain the character projection data, the method further includes:
performing expansion (morphological dilation) processing on the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve.
Optionally, the character projection data includes maximum data of the character, actual width data of the character and actual height data of the character.
In a second aspect of the present application, a projection-based text line character segmentation apparatus is provided, the apparatus comprising:
a text line image acquisition module, used for acquiring a text line image to be segmented;
a character projection data calculation module, used for judging, from the text line image to be segmented, whether the font in the image is an inclined font; if so, performing inclined font correction and then calculating character projection data; if not, calculating the character projection data directly;
a data normalization module, used for normalizing the character projection data to obtain normalized character segmentation data; and
a character segmentation module, used for performing character segmentation on the text line image to be segmented according to the normalized character segmentation data.
Optionally, the text line image acquisition module further includes a preprocessing sub-module, which is configured to apply a rotation correction to the text line image to be segmented to obtain a preprocessed text image in the horizontal direction.
Optionally, the character projection data calculation module further comprises an inclined font correction sub-module and a character projection curve sub-module;
the inclined font correction sub-module is used for executing the following steps:
applying rotational deformation to the text line image to be segmented at each candidate angle θ;
performing vertical projection on each deformed text line image to obtain a group of vertical projection curves;
calculating the horizontal character gap G(θ) of each vertical projection curve;
calculating the pixel cumulative average M(θ) of each vertical projection curve;
calculating the font inclination angle θ = max(G(θ) × M(θ)) from G(θ) and M(θ), that is, taking as θ the angle at which the product G(θ) × M(θ) is largest;
applying rotational deformation by the angle θ to the text line image to be segmented to obtain a corrected text line image to be segmented;
the character projection curve sub-module is used for executing the following steps:
performing gray processing on the text line image to be segmented to obtain a text line gray image, and performing vertical projection on the text line gray image to obtain a gray image vertical projection curve;
performing binarization processing on the text line image to be segmented to obtain a text line binarized image, and performing vertical projection on the text line binarized image to obtain a binary image vertical projection curve;
performing edge intensity difference variance processing on the text line image to be segmented to obtain a text line edge intensity difference variance image, and performing edge intensity difference variance projection on that image to obtain an edge intensity difference variance projection curve; and
performing weighted summation of the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve to obtain the character projection data.
The technical scheme of the application has the following beneficial effects:
The projection-based text line character segmentation method determines the actual width and actual height of a single character through horizontal projection and vertical projection, so the upper and lower character boundaries are judged more accurately and robustly, and inclined font correction extends the application range of character segmentation. The algorithm has relatively low complexity and segments characters quickly and accurately, which helps improve the accuracy of character recognition. The character projection data is the weighted sum of the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve. This improves the accuracy and reliability of character boundary judgment, supports accurate and rapid character segmentation, and is equally effective for slightly adhered characters and adhered special characters.
Drawings
In order to more clearly illustrate the technical solution of the present application, the drawings that are needed in the embodiments will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a block flow diagram of one embodiment provided by the first aspect of the present application;
FIG. 2 is a schematic diagram of a coarse positioning process and a fine positioning process according to an embodiment of the present application;
FIG. 3 is a schematic diagram of G (θ) and M (θ) in the present application.
In fig. 3, G(θ) is the horizontal gap between characters in the vertical projection curve. For ease of understanding, fig. 3 shows a text line image to be segmented. Because the text in that image is not inclined, the distance between the two vertical dashed lines is G(θ). When the text is inclined, G(θ) becomes smaller or even negative: G(θ) measures the horizontal gap between two characters in the vertical projection curve, so when the vertical projections of two characters overlap, the gap between them is negative.
M(θ), also marked in fig. 3 for ease of understanding, is the pixel cumulative average of the vertical projection curve and represents the height information of the text in the text line image to be segmented. When the characters are inclined, the accumulation in the vertical direction is spread over neighbouring positions, so the maximum projection value changes. M(θ) is therefore taken as an average of the pixel accumulation values of the vertical projection curve. Specifically, the average of the extreme values within a certain percentage range may be used, for example 5% to 20% and preferably 10%: among all characters, the 10% of characters with the largest pixel accumulation extrema are selected (the extremum of any of the remaining 90% of characters is smaller than the extremum of any of the selected 10%), and their extrema are averaged.
In fig. 3, a character screening threshold is used to screen, or distinguish, single characters. It is drawn in fig. 3 as a horizontal line cutting the vertical projection curve. Cutting the vertical projection curve of each single character produces two endpoints, and conversely the span between a pair of such endpoints is a single character, which is how the threshold screens single characters. Once the threshold has screened the single characters, the horizontal gaps between characters are determined, and with them the horizontal character gap G(θ) of the vertical projection curve.
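As a rough illustration of the quantities described for fig. 3, the following Python sketch thresholds a vertical projection curve into character spans and derives G(θ) and M(θ) from them. The function names, the choice to sum all gaps into one G value, and the rounding used for the 10% selection are assumptions made for this example and are not taken from the application. A threshold applied to a single curve also cannot produce the negative gaps mentioned above.

```python
import math

def vertical_projection(binary):
    """Column sums of a binary image given as a list of rows (1 = ink)."""
    return [sum(column) for column in zip(*binary)]

def character_spans(curve, threshold):
    """(first, last) column of every run where the curve exceeds the threshold."""
    spans, start = [], None
    for x, value in enumerate(curve):
        if value > threshold and start is None:
            start = x
        elif value <= threshold and start is not None:
            spans.append((start, x - 1))
            start = None
    if start is not None:
        spans.append((start, len(curve) - 1))
    return spans

def horizontal_gap(spans):
    """G: total blank width between neighbouring spans (summing is an assumption)."""
    return sum(right[0] - left[1] - 1 for left, right in zip(spans, spans[1:]))

def cumulative_average(curve, spans, fraction=0.10):
    """M: mean of the largest per-character peaks (top `fraction` of characters)."""
    peaks = sorted((max(curve[a:b + 1]) for a, b in spans), reverse=True)
    if not peaks:
        return 0.0
    keep = max(1, math.ceil(fraction * len(peaks)))
    return sum(peaks[:keep]) / keep

curve = [0, 3, 4, 0, 0, 5, 2, 0]
spans = character_spans(curve, threshold=0)
```

With this toy curve the two characters occupy columns 1 to 2 and 5 to 6, so the gap between them is two columns wide.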
Detailed Description
Reference will now be made in detail to the embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The embodiments described in the examples below do not represent all embodiments consistent with the application; they are merely examples of systems and methods consistent with aspects of the application as set forth in the claims.
Referring to fig. 1, a flow diagram of one embodiment of the first aspect of the present application is provided.
In a first aspect of the present application, a projection-based text line character segmentation method is provided, comprising the following steps:
acquiring a text line image to be segmented;
judging, from the text line image to be segmented, whether the font in the image is an inclined font; if so, performing inclined font correction and then calculating character projection data; if not, calculating the character projection data directly;
normalizing the character projection data to obtain normalized character segmentation data; and
performing character segmentation on the text line image to be segmented according to the normalized character segmentation data.
In this embodiment, the text line image to be segmented is first checked for inclined fonts, which prevents segmentation deviations and errors caused by italics and facilitates accurate segmentation. The character projection data is then calculated, providing reliable data support for accurate segmentation.
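The last two steps can be sketched as follows. The application does not state which normalization is used or where exactly the cut is placed, so the min-max normalization, the 0.1 threshold and the choice of cutting at the middle of each gap are assumptions for illustration only, as are the function names.

```python
def normalize(curve):
    """Min-max normalize a projection curve to the range [0, 1]."""
    lo, hi = min(curve), max(curve)
    if hi == lo:
        return [0.0] * len(curve)
    return [(value - lo) / (hi - lo) for value in curve]

def cut_positions(normalized, threshold=0.1):
    """Column index at the middle of every low run that lies between two characters."""
    cuts, run_start, seen_ink = [], None, False
    for x, value in enumerate(normalized):
        if value <= threshold:
            if run_start is None:
                run_start = x
        else:
            # A low run only separates characters if ink was seen before it.
            if run_start is not None and seen_ink:
                cuts.append((run_start + x - 1) // 2)
            run_start = None
            seen_ink = True
    return cuts

projection = [0, 6, 8, 0, 0, 0, 4, 10, 0]
cuts = cut_positions(normalize(projection))
```

Leading and trailing blank margins are ignored, so only the gap between the two characters yields a cut.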
Optionally, after the step of acquiring the text line image to be segmented, the method further includes:
preprocessing the text line image to be segmented, wherein the preprocessing applies a rotation correction to the text line image to be segmented to obtain a preprocessed text image in the horizontal direction.
In this embodiment, preprocessing effectively reduces the probability of recognition errors. The rotation correction yields a preprocessed text image in the horizontal direction, which improves both the accuracy and the efficiency of character segmentation.
Optionally, the step of performing inclined font correction includes:
applying rotational deformation to the text line image to be segmented at each candidate angle θ;
performing vertical projection on each deformed text line image to obtain a group of vertical projection curves;
calculating the horizontal character gap G(θ) of each vertical projection curve;
calculating the pixel cumulative average M(θ) of each vertical projection curve;
calculating the font inclination angle θ = max(G(θ) × M(θ)) from G(θ) and M(θ), that is, taking as θ the angle at which the product G(θ) × M(θ) is largest; and
applying rotational deformation by the angle θ to the text line image to be segmented to obtain a corrected text line image to be segmented.
In this embodiment, inclined font correction requires the horizontal character gap G(θ) of the vertical projection curve and the pixel cumulative average M(θ) of the vertical projection curve, both of which are marked by way of example in fig. 3. From the font inclination angle θ = max(G(θ) × M(θ)), the rotation angle required for inclined font correction can be calculated quickly and accurately, giving a corrected text line image to be segmented that provides a good image basis for the next operation.
"Rotational deformation" and "rotation" have the same meaning in the present application. They differ from a conventional rotation or translation about a point or an axis and instead denote a shear transformation (also called shear mapping). Each single character has a circumscribed parallelogram; when the character is not inclined, this parallelogram, i.e. the outline of the character, is a rectangle. Rotational deformation keeps the lengths of the upper and lower sides of the parallelogram unchanged. For ease of understanding, the lower side can be regarded as fixed while the upper side translates horizontally, carrying the other two sides with it. The distance between the upper and lower sides remains unchanged throughout the rotational deformation.
In the present embodiment, the "=" in the font inclination angle θ = max(G(θ) × M(θ)) does not mean "equal to". The expression means that the rotational deformation angle θ at which G(θ) × M(θ) reaches its maximum during rotational deformation is the font inclination angle. Only when G(θ) × M(θ) is at its maximum is the circumscribed parallelogram of a single character a rectangle, and the corresponding θ is the font inclination angle.
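The shear described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the image is a list of rows with 1 for ink, each row is shifted by a whole number of pixels, the bottom row is held fixed, and the sign convention (a negative angle moves upper rows to the left) is chosen for the example rather than taken from the application.

```python
import math

def shear(binary, angle_deg):
    """Shift each row horizontally in proportion to its height above the bottom row."""
    height = len(binary)
    slope = math.tan(math.radians(angle_deg))
    shifts = [round((height - 1 - y) * slope) for y in range(height)]
    lowest, highest = min(shifts), max(shifts)
    sheared = []
    for row, shift in zip(binary, shifts):
        # Pad on both sides so every output row has the same width.
        sheared.append([0] * (shift - lowest) + list(row) + [0] * (highest - shift))
    return sheared

def vertical_projection(binary):
    """Column sums of a binary image given as a list of rows (1 = ink)."""
    return [sum(column) for column in zip(*binary)]

slanted = [
    [0, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
]
upright = shear(slanted, -45)
```

The slanted stroke spreads its ink over three columns, so its vertical projection peaks at 1. After the shear all ink falls into one column and the peak rises to 3, which is why the projection-based score is largest at the correcting angle.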
Optionally, the step of inclined font correction includes a coarse positioning correction process and an accurate positioning correction process, performed in sequence;
the coarse positioning correction process comprises the following steps:
inputting an angle search range;
calculating the product G(θ) × M(θ) at each angle;
selecting the angle corresponding to the maximum of G(θ) × M(θ);
determining the coarse positioning font inclination angle;
the accurate positioning correction process comprises the following steps:
calculating an accurate positioning search range;
calculating the product G(θ) × M(θ) at each angle;
selecting the angle corresponding to the maximum of G(θ) × M(θ);
determining the font inclination angle; and
applying rotational deformation by the angle θ to the text line image to be segmented to obtain a corrected text line image to be segmented.
Referring to fig. 2, the coarse positioning correction process effectively reduces the amount of computation and improves the efficiency of inclined font correction. It determines a coarse font inclination angle and thereby quickly narrows the font inclination angle to a relatively small range. The accurate positioning search range is then calculated within that range, the angle corresponding to the maximum of G(θ) × M(θ) is calculated and selected, and that angle is taken as the font inclination angle, which completes the inclined font correction.
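The two-stage search can be sketched as below. The `score` argument stands in for the product G(θ) × M(θ) evaluated on the sheared image; the search range of ±30 degrees, the 5 degree coarse step, the 0.5 degree fine step and the rule that the fine range spans one coarse step on either side are all assumptions for illustration, since the application does not give these values.

```python
def best_angle(score, start, stop, step):
    """Return the angle in [start, stop] (sampled every `step`) with the highest score."""
    count = int(round((stop - start) / step))
    angles = [start + i * step for i in range(count + 1)]
    return max(angles, key=score)

def coarse_to_fine(score, lo=-30.0, hi=30.0, coarse_step=5.0, fine_step=0.5):
    """Coarse positioning over the full range, then accurate positioning around the result."""
    coarse = best_angle(score, lo, hi, coarse_step)
    return best_angle(score, coarse - coarse_step, coarse + coarse_step, fine_step)

# Toy score with a single peak at 7.3 degrees, used only to exercise the search.
angle = coarse_to_fine(lambda theta: -(theta - 7.3) ** 2)
```

With these settings the coarse pass evaluates 13 angles and the fine pass 21, compared with 121 evaluations for a single pass at 0.5 degree resolution over the same range.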
Optionally, the step of calculating character projection data includes:
performing gray processing on the text line image to be segmented to obtain a text line gray image, and performing vertical projection on the text line gray image to obtain a gray image vertical projection curve;
performing binarization processing on the text line image to be segmented to obtain a text line binarized image, and performing vertical projection on the text line binarized image to obtain a binary image vertical projection curve;
performing edge intensity difference variance processing on the text line image to be segmented to obtain a text line edge intensity difference variance image, and performing edge intensity difference variance projection on that image to obtain an edge intensity difference variance projection curve; and
performing weighted summation of the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve to obtain the character projection data.
In this embodiment, to further reduce the error rate of character recognition and segmentation, the character projection data is obtained as the weighted sum of three projection curves. Such data reflects the character features more comprehensively and accurately, which improves the accuracy of character recognition and segmentation.
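A minimal sketch of the weighted summation follows. The three input curves are assumed to have been computed already and to have equal length; how the edge intensity difference variance curve is obtained is not reproduced here. The weights 0.5, 0.25 and 0.25 are placeholders, since the application does not specify them.

```python
def weighted_sum(curves, weights):
    """Point-by-point weighted sum of several projection curves of equal length."""
    if len(curves) != len(weights):
        raise ValueError("one weight is required per curve")
    length = len(curves[0])
    if any(len(curve) != length for curve in curves):
        raise ValueError("all curves must have the same length")
    return [
        sum(weight * curve[x] for curve, weight in zip(curves, weights))
        for x in range(length)
    ]

gray_curve = [0, 2, 4]    # gray image vertical projection (toy values)
binary_curve = [0, 1, 1]  # binary image vertical projection (toy values)
edge_curve = [1, 0, 1]    # edge intensity difference variance projection (toy values)

character_projection = weighted_sum(
    [gray_curve, binary_curve, edge_curve],
    [0.5, 0.25, 0.25],
)
```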
Optionally, before the step of performing weighted summation of the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve to obtain the character projection data, the method further includes:
performing expansion processing on the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve.
The expansion processing in this embodiment is the morphological dilation operation, a basic algorithm in image processing. Dilation joins slightly broken parts of a curve, which avoids over-segmentation of a character.
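On a one-dimensional projection curve, dilation with a flat structuring element amounts to a sliding-window maximum. The sketch below assumes a window of radius 1; the radius and the function name are illustrative choices, not values from the application.

```python
def dilate(curve, radius=1):
    """Replace every sample by the maximum within `radius` samples on either side."""
    n = len(curve)
    return [
        max(curve[max(0, i - radius):min(n, i + radius + 1)])
        for i in range(n)
    ]

broken = [3, 0, 4, 0, 0, 0, 5]
joined = dilate(broken)
```

In this toy curve the one-sample break at index 1 is bridged, while the three-sample gap between the characters narrows but still contains a zero, so the characters remain separable.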
Optionally, the character projection data includes maximum data of the character, actual width data of the character and actual height data of the character.
In a second aspect of the present application, a projection-based text line character segmentation apparatus is provided, the apparatus comprising:
a text line image acquisition module, used for acquiring a text line image to be segmented;
a character projection data calculation module, used for judging, from the text line image to be segmented, whether the font in the image is an inclined font; if so, performing inclined font correction and then calculating character projection data; if not, calculating the character projection data directly;
a data normalization module, used for normalizing the character projection data to obtain normalized character segmentation data; and
a character segmentation module, used for performing character segmentation on the text line image to be segmented according to the normalized character segmentation data.
Optionally, the text line image acquisition module further includes a preprocessing sub-module, which is configured to apply a rotation correction to the text line image to be segmented to obtain a preprocessed text image in the horizontal direction.
Optionally, the character projection data calculation module further comprises an inclined font correction sub-module and a character projection curve sub-module;
the inclined font correction sub-module is used for executing the following steps:
applying rotational deformation to the text line image to be segmented at each candidate angle θ;
performing vertical projection on each deformed text line image to obtain a group of vertical projection curves;
calculating the horizontal character gap G(θ) of each vertical projection curve;
calculating the pixel cumulative average M(θ) of each vertical projection curve;
calculating the font inclination angle θ = max(G(θ) × M(θ)) from G(θ) and M(θ), that is, taking as θ the angle at which the product G(θ) × M(θ) is largest;
applying rotational deformation by the angle θ to the text line image to be segmented to obtain a corrected text line image to be segmented;
the character projection curve sub-module is used for executing the following steps:
performing gray processing on the text line image to be segmented to obtain a text line gray image, and performing vertical projection on the text line gray image to obtain a gray image vertical projection curve;
performing binarization processing on the text line image to be segmented to obtain a text line binarized image, and performing vertical projection on the text line binarized image to obtain a binary image vertical projection curve;
performing edge intensity difference variance processing on the text line image to be segmented to obtain a text line edge intensity difference variance image, and performing edge intensity difference variance projection on that image to obtain an edge intensity difference variance projection curve; and
performing weighted summation of the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve to obtain the character projection data.
According to the projection-based text line character segmentation method, the actual width and the actual height of a single character are determined through horizontal projection and vertical projection, the judgment of the upper boundary and the lower boundary of the character is more accurate, the robustness is high, and the application range of character segmentation is expanded through inclined font correction. The method has relatively low algorithm complexity, and quick and accurate character segmentation, and is beneficial to improving the accuracy of character recognition. The character projection data adopts the weighted summation of the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve, thereby improving the accuracy and reliability of character boundary judgment, being beneficial to accurate and rapid segmentation of characters and being equally effective for segmentation of slightly adhered characters and adhered special characters.
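To illustrate how horizontal and vertical projection together give the actual width and height of a single character, the following sketch measures the extent of the non-zero part of each projection of an already isolated binary glyph. Treating any non-zero projection value as ink, and the function and key names, are assumptions made for the example.

```python
def ink_extent(profile):
    """First and last index with a non-zero projection value, or None if blank."""
    hits = [i for i, value in enumerate(profile) if value > 0]
    return (hits[0], hits[-1]) if hits else None

def character_box(binary):
    """Bounding rows and columns of a glyph from its two projections."""
    rows = ink_extent([sum(row) for row in binary])           # horizontal projection
    cols = ink_extent([sum(col) for col in zip(*binary)])     # vertical projection
    if rows is None or cols is None:
        return None
    return {
        "top": rows[0], "bottom": rows[1],
        "left": cols[0], "right": cols[1],
        "height": rows[1] - rows[0] + 1,
        "width": cols[1] - cols[0] + 1,
    }

glyph = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
box = character_box(glyph)
```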
When projection is used for text line character segmentation, the quality of the projection curve depends on how well the characteristics of the data are controlled in the preceding stage and on the extraction of key features. The projection curve may be generated by first fusing multiple feature images and then projecting the fused feature image, or it may be a variant of the projection curves of the individual key feature images, including but not limited to the gray projection curve, the binary projection curve, the edge feature difference variance projection curve, and their weighted sum. Therefore, any calculation based on gray features, or on features that express the difference between characters and background, together with the corresponding weighted-sum projection curve generation method, falls within the technical scope of the application; methods that follow the same execution strategy likewise fall within its scope of protection.
The detailed description provided above gives only a few examples under the general inventive concept and does not limit the scope of protection of the application. For a person skilled in the art, any other embodiment derived from the solution of the application without inventive effort falls within the scope of protection of the application.

Claims (8)

1. The text line character segmentation method based on projection is characterized by comprising the following steps of:
Acquiring a text line image to be segmented;
judging, according to the text line image to be segmented, whether the fonts in the text line image to be segmented are inclined fonts; if so, performing inclined font correction and then calculating character projection data; if not, calculating the character projection data directly;
carrying out normalization processing on the character projection data to obtain normalized character segmentation data;
according to the normalized character segmentation data, character segmentation is carried out on the text line image to be segmented;
wherein the step of performing the inclined font correction includes:
rotationally deforming the text line image to be segmented;
performing vertical projection on the text line image to be segmented to obtain each group of vertical projection curves;
calculating a horizontal gap G(θ) of each group of vertical projection curve characters;
calculating the pixel point cumulative average value M(θ) of each group of vertical projection curves;
calculating a font inclination angle θ = max(G(θ) × M(θ)) according to the horizontal gap G(θ) of each group of vertical projection curve characters and the pixel point cumulative average value M(θ) of each group of vertical projection curves; the meaning of the font inclination angle θ = max(G(θ) × M(θ)) is: when the value of G(θ) × M(θ) reaches its maximum during the rotational deformation process, the rotational deformation angle θ at that moment is the font inclination angle;
and carrying out rotation deformation of the angle theta on the text line image to be segmented to obtain a corrected text line image to be segmented.
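The angle search recited in claim 1 can be illustrated with a minimal Python sketch. Several points are assumptions, because the claim does not define them: the rotational deformation is modelled as a horizontal shear, G(θ) is taken as the number of blank columns lying between characters, and M(θ) is taken as the mean height of the non-blank columns. The function names are hypothetical.

```python
import math
from typing import Iterable, List


def sheared_projection(binary: List[List[int]], theta_deg: float) -> List[int]:
    """Vertical projection of a 0/1 image after a horizontal shear by theta_deg."""
    h, w = len(binary), len(binary[0])
    t = math.tan(math.radians(theta_deg))
    pad = int(math.ceil(abs(t) * h)) + 1      # room for pixels shifted past the edges
    proj = [0] * (w + 2 * pad)
    for y, row in enumerate(binary):
        # Rows shift in proportion to their distance from the middle row.
        shift = round((y - (h - 1) / 2) * t)
        for x, v in enumerate(row):
            if v:
                proj[x + shift + pad] += 1
    return proj


def gap_times_mean(proj: List[int]) -> float:
    """Score G * M for one projection curve (assumed definitions, see lead-in)."""
    filled = [i for i, v in enumerate(proj) if v > 0]
    if not filled:
        return 0.0
    inner = proj[filled[0]: filled[-1] + 1]
    g = sum(1 for v in inner if v == 0)       # G: blank columns between characters
    m = sum(inner) / len(filled)              # M: mean height of non-blank columns
    return g * m


def estimate_tilt(binary: List[List[int]], angles: Iterable[float]) -> float:
    """Return the candidate angle whose sheared projection maximizes G * M."""
    return max(angles, key=lambda a: gap_times_mean(sheared_projection(binary, a)))
```

When the shear cancels the slant, strokes line up vertically: the gaps between characters widen and the occupied columns become taller, so the product peaks at the correcting angle.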
2. The projection-based text line character segmentation method according to claim 1, characterized by further comprising, after the step of acquiring a text line image to be segmented:
preprocessing the text line image to be segmented, wherein the preprocessing rotationally corrects the text line image to be segmented to obtain a preprocessed text image in the horizontal direction.
3. The projection-based text line character segmentation method according to claim 1, wherein the step of performing the inclined font correction includes a coarse positioning correction process and a fine positioning correction process performed sequentially;
the coarse positioning correction process comprises the following steps:
inputting an angle search range;
calculating the product G(θ) × M(θ) at each angle;
selecting the angle corresponding to the maximum value of G(θ) × M(θ);
determining the coarse positioning font inclination angle;
the fine positioning correction process comprises the following steps:
calculating a fine positioning search range;
calculating the product G(θ) × M(θ) at each angle;
selecting the angle corresponding to the maximum value of G(θ) × M(θ);
determining the font inclination angle;
and carrying out rotation deformation of the angle theta on the text line image to be segmented to obtain a corrected text line image to be segmented.
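The two-stage search of claim 3 can be sketched generically in Python. This is an illustration under assumptions: the step sizes (5 and 0.5 degrees) and the choice of a fine range spanning one coarse step on either side of the coarse result are hypothetical, since the claim does not state how the fine search range is calculated. The `score` argument stands for any function returning G(θ) × M(θ) for a candidate angle.

```python
from typing import Callable


def coarse_to_fine(
    score: Callable[[float], float],
    lo: float,
    hi: float,
    coarse_step: float = 5.0,
    fine_step: float = 0.5,
) -> float:
    """Find the angle maximizing score with a coarse pass followed by a fine pass."""

    def best(start: float, stop: float, step: float) -> float:
        # Evaluate evenly spaced candidates from start to stop inclusive.
        n = int(round((stop - start) / step))
        return max((start + i * step for i in range(n + 1)), key=score)

    coarse = best(lo, hi, coarse_step)
    # Assumption: refine within one coarse step on either side of the coarse result.
    return best(coarse - coarse_step, coarse + coarse_step, fine_step)
```

With these example settings, a search over -30 to 30 degrees evaluates 13 coarse candidates and 21 fine candidates, instead of the 121 needed to scan the full range at 0.5 degree resolution.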
4. A method of projection-based text line character segmentation as claimed in any one of claims 1 to 3, wherein the step of calculating character projection data includes:
gray processing is carried out on the text line image to be segmented to obtain a text line gray image, and vertical projection is carried out on the text line gray image to obtain a gray image vertical projection curve;
performing binarization processing on the text line image to be segmented to obtain a text line binarization image, and performing vertical projection on the text line binarization image to obtain a binary image vertical projection curve;
Performing edge intensity difference variance processing on the text line image to be segmented to obtain a text line edge intensity difference variance image, and performing edge intensity difference variance projection on the text line edge intensity difference variance image to obtain an edge intensity difference variance projection curve;
and carrying out weighted summation on the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve to obtain character projection data.
5. The projection-based text line character segmentation method as set forth in claim 4, further comprising, prior to the step of carrying out weighted summation on the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve to obtain character projection data:
and performing expansion processing on the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve.
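The expansion processing of claim 5 can be sketched as below, assuming that "expansion" denotes one-dimensional morphological dilation (a sliding-window maximum). That reading, the window radius and the function name `dilate_curve` are assumptions; the claim does not specify the structuring element.

```python
from typing import List


def dilate_curve(curve: List[float], radius: int = 1) -> List[float]:
    """Replace each sample by the maximum within `radius` samples on either side."""
    n = len(curve)
    return [max(curve[max(0, i - radius): min(n, i + radius + 1)]) for i in range(n)]
```

Dilation widens each peak by `radius` samples on each side, which bridges small breaks inside a character before the three curves are summed.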
6. The method for character segmentation of text lines based on projection according to any one of claims 1 to 3 or 5, wherein the character projection data includes maximum data of characters, actual width data of characters, and actual height data of characters.
7. A projection-based text line character segmentation apparatus, the apparatus comprising:
The text line image acquisition module is used for acquiring text line images to be segmented;
The character projection data calculation module is used for judging whether the fonts in the text line images to be segmented are inclined fonts according to the text line images to be segmented, if so, correcting the inclined fonts, then calculating character projection data, and if not, directly calculating the character projection data;
The data normalization module is used for carrying out normalization processing on the character projection data to obtain normalized character segmentation data;
The character segmentation module is used for carrying out character segmentation on the text line image to be segmented according to the normalized character segmentation data;
the character projection data calculation module further comprises an inclined character correction sub-module and a character projection curve sub-module;
the inclined character correction sub-module is used for executing the following steps:
rotationally deforming the text line image to be segmented;
performing vertical projection on the text line image to be segmented to obtain each group of vertical projection curves;
calculating a horizontal gap G(θ) of each group of vertical projection curve characters;
calculating the pixel point cumulative average value M(θ) of each group of vertical projection curves;
calculating a font inclination angle θ = max(G(θ) × M(θ)) according to the horizontal gap G(θ) of each group of vertical projection curve characters and the pixel point cumulative average value M(θ) of each group of vertical projection curves; the meaning of the font inclination angle θ = max(G(θ) × M(θ)) is: when the value of G(θ) × M(θ) reaches its maximum during the rotational deformation process, the rotational deformation angle θ at that moment is the font inclination angle;
Performing rotational deformation of the angle theta on the text line image to be segmented to obtain a corrected text line image to be segmented;
The character projection curve sub-module is used for executing the following steps:
gray processing is carried out on the text line image to be segmented to obtain a text line gray image, and vertical projection is carried out on the text line gray image to obtain a gray image vertical projection curve;
performing binarization processing on the text line image to be segmented to obtain a text line binarization image, and performing vertical projection on the text line binarization image to obtain a binary image vertical projection curve;
Performing edge intensity difference variance processing on the text line image to be segmented to obtain a text line edge intensity difference variance image, and performing edge intensity difference variance projection on the text line edge intensity difference variance image to obtain an edge intensity difference variance projection curve;
and carrying out weighted summation on the gray image vertical projection curve, the binary image vertical projection curve and the edge intensity difference variance projection curve to obtain character projection data.
8. The device for text line character segmentation based on projection of claim 7, wherein the text line image acquisition module further comprises a preprocessing sub-module, and the preprocessing sub-module is used for performing rotation correction on the text line image to be segmented to obtain a preprocessed text image in a horizontal direction.
CN202010931307.9A 2020-09-07 2020-09-07 Text line character segmentation method and device based on projection Active CN111967474B (en)

Publications (2)

Publication Number Publication Date
CN111967474A 2020-11-20
CN111967474B 2024-04-26

Family

ID=73392538

Country Status (1)

Country Link
CN (1) CN111967474B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100094 Beijing city Haidian District Cui Hunan loop 13 Hospital No. 7 Building 7 room 701

Applicant after: Lingyunguang Technology Co.,Ltd.

Address before: 100094 Beijing city Haidian District Cui Hunan loop 13 Hospital No. 7 Building 7 room 701

Applicant before: Beijing lingyunguang Technology Group Co.,Ltd.

GR01 Patent grant