CN113159031A - Handwritten text detection method and device and storage medium

Publication number
CN113159031A
Authority
CN
China
Prior art keywords
text
line
text line
positioning information
detected
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110428121.6A
Other languages
Chinese (zh)
Inventor
陈鹏飞
毛亮
陈映庭
杨晓帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Huiyi Culture Technology Co ltd
Original Assignee
Guangzhou Huiyi Culture Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Huiyi Culture Technology Co ltd
Priority to CN202110428121.6A
Publication of CN113159031A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/24 Aligning, centring, orientation detection or correction of the image
    • G06V10/243 Aligning, centring, orientation detection or correction of the image by compensating for image skew or non-uniform image deformations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/24 Aligning, centring, orientation detection or correction of the image
    • G06V10/245 Aligning, centring, orientation detection or correction of the image by locating a pattern; Special marks for positioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/28 Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Abstract

The invention discloses a handwritten text detection method, a device and a storage medium, wherein the method comprises the following steps: inputting a text picture to be detected, and positioning a text line of the text picture to be detected by adopting a key point positioning algorithm to obtain text line positioning information; performing affine transformation correction on original text lines in a text picture to be detected according to the text line positioning information to obtain corrected text lines; dividing the single character of the corrected text line according to the horizontal projection to obtain a candidate character area; and calculating the average width of the surrounding frames of the whole line of characters in the candidate character area, and merging the surrounding frames of the whole line of characters according to the average width to obtain a final character detection result. The embodiment of the invention not only can effectively correct text lines with different angles and different directions, but also can accurately combine the characters divided by the left and right components, and detect the handwritten text by combining the characteristics of Chinese characters, thereby further improving the accuracy and reliability of text detection.

Description

Handwritten text detection method and device and storage medium
Technical Field
The invention relates to the technical field of computer vision, in particular to a handwritten text detection method, a handwritten text detection device and a storage medium.
Background
Text detection and recognition are widely used in daily life, for example in identity card recognition, ticket recognition, license plate recognition and form recognition. Handwritten text varies more in shape than printed text, so it is correspondingly harder to detect and recognize. Most existing text detection methods target printed text, which is more regularly arranged than handwritten text lines, so single characters are easy to detect with either traditional or deep learning methods. However, because handwritten text lines vary in height and the characters have left-right and up-down structures, conventional text detection methods have difficulty detecting handwritten text accurately.
Disclosure of Invention
The invention provides a handwritten text detection method, which aims to solve the technical problem that the conventional text detection method is difficult to accurately detect handwritten texts.
A first embodiment of the present invention provides a handwritten text detection method, including:
inputting a text picture to be detected, and positioning a text line of the text picture to be detected by adopting a key point positioning algorithm to obtain text line positioning information;
performing affine transformation correction on the original text line in the text picture to be detected according to the text line positioning information to obtain a corrected text line;
dividing the single character of the corrected text line according to horizontal projection to obtain a candidate character area;
and calculating the average width of the surrounding frames of the whole line of characters in the candidate character area, and merging the surrounding frames of the whole line of characters according to the average width to obtain a final character detection result.
Further, the method for locating the text line of the text picture to be detected by using the key point location algorithm to obtain text line location information specifically comprises the following steps:
and adding a key point output branch on the basis of the yolov3 key point positioning algorithm to improve the yolov3 key point positioning algorithm, and positioning the text line of the text picture to be detected according to the improved yolov3 key point positioning algorithm to obtain text line positioning information.
Further, the original text line includes an oblique text line and a text line with inconsistent height, and the affine transformation correction is performed on the original text line in the text picture to be detected according to the text line positioning information to obtain a corrected text line, specifically:
obtaining four-point positioning information of the text line according to the text line positioning information, obtaining each side length of a quadrangle formed by connecting the four-point positioning information, and determining a target correction rectangle according to each side length of the quadrangle;
calculating an affine transformation matrix from the key point coordinates of the text line to the target correction rectangle by using opencv;
and transforming the inclined text line and the text line of inconsistent height into the corrected text line according to the affine transformation matrix.
Further, segmenting the single characters of the corrected text line according to the horizontal projection to obtain a candidate character region specifically includes:
after binarizing the corrected text line, accumulating the pixel values in the corrected text line along the horizontal direction to obtain a wavy projection curve;
and segmenting the wave crests of the wavy projection curve by setting a threshold value to obtain candidate character areas.
Further, the calculating an average width of the bounding boxes of the whole line of characters in the candidate character region, and merging the bounding boxes of the whole line of characters according to the average width to obtain a final character detection result, specifically:
and calculating the average width of the character surrounding frames of the whole line in the candidate character area, and merging adjacent surrounding frames in the line whose widths are smaller than the average width to obtain a final character detection result.
A second embodiment of the present invention provides a handwritten text detection apparatus, including:
the positioning module is used for inputting a text picture to be detected, and positioning a text line of the text picture to be detected by adopting a key point positioning algorithm to obtain text line positioning information;
the correction module is used for carrying out affine transformation correction on the original text line in the text picture to be detected according to the text line positioning information to obtain a corrected text line;
the segmentation module is used for segmenting the single character of the corrected text line according to the horizontal projection to obtain a candidate character area;
and the merging module is used for calculating the average width of the surrounding frames of the whole line of characters in the candidate character area, and merging the surrounding frames of the whole line of characters according to the average width to obtain a final character detection result.
Further, the correction module is specifically configured to:
obtaining four-point positioning information of the text line according to the text line positioning information, obtaining each side length of a quadrangle formed by connecting the four-point positioning information, and determining a target correction rectangle according to each side length of the quadrangle;
calculating an affine transformation matrix from the key point coordinates of the text line to the target correction rectangle by using opencv;
and transforming the inclined text line and the text line of inconsistent height into the corrected text line according to the affine transformation matrix.
Further, the segmentation module is specifically configured to: after binarizing the corrected text line, accumulating the pixel values in the corrected text line along the horizontal direction to obtain a wavy projection curve;
and segmenting the wave crests of the wavy projection curve by setting a threshold value to obtain candidate character areas.
Further, the merging module is specifically configured to:
and calculating the average width of the character surrounding frames of the whole line in the candidate character area, and merging adjacent surrounding frames in the line whose widths are smaller than the average width to obtain a final character detection result.
A third embodiment of the present invention provides a computer-readable storage medium, which includes a stored computer program, wherein when the computer program runs, the apparatus where the computer-readable storage medium is located is controlled to execute a handwritten text detection method as described above.
The embodiment of the invention adopts the key point positioning algorithm to position the text line and obtain accurate text line positioning information, and corrects the text line in the text picture to be detected through affine transformation according to the text line positioning information, so that text lines at different angles and in different directions can be effectively corrected and the accuracy of text detection can be improved; the embodiment of the invention can also accurately merge characters whose left and right components were split apart in the preliminary detection, and adjusts the position of the surrounding frame of each character, so that the handwritten text is detected by combining the characteristics of Chinese characters, further improving the accuracy and reliability of text detection.
Drawings
Fig. 1 is a schematic flowchart of a handwritten text detection method provided in an embodiment of the present invention;
FIG. 2 is a diagram illustrating the effect of text line positioning according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating an effect of text line rectification according to an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating an effect of segmenting a whole line of text according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating an effect of text detection according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of a handwritten text detection apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the description of the present application, it is to be understood that the terms "first", "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implying any number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present application, "a plurality" means two or more unless otherwise specified.
In the description of the present application, it is to be noted that, unless otherwise explicitly specified or limited, the terms "mounted," "connected," and "connected" are to be construed broadly, e.g., as meaning either a fixed connection, a removable connection, or an integral connection; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meaning of the above terms in the present application can be understood in a specific case by those of ordinary skill in the art.
Referring to fig. 1-5, a first embodiment of the present invention is shown. A first embodiment of the present invention provides a handwritten text detection method as shown in fig. 1, including:
S1, inputting a text picture to be detected, and positioning a text line of the text picture to be detected by adopting a key point positioning algorithm to obtain text line positioning information;
S2, performing affine transformation correction on the original text lines in the text picture to be detected according to the text line positioning information to obtain corrected text lines;
S3, segmenting the single character of the corrected text line according to the horizontal projection to obtain a candidate character area;
and S4, calculating the average width of the surrounding frames of the whole line of characters in the candidate character area, and merging the surrounding frames of the whole line of characters according to the average width to obtain the final character detection result.
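As a rough illustration of how steps S1 to S4 chain together, the following sketch wires four caller-supplied functions into one pipeline. The class name HandwritingDetector, the field names and the type aliases are hypothetical and do not come from the patent; the stages themselves are left abstract.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[int, int]        # (left, right) column span of one character
Image = List[List[int]]      # grayscale image as rows of pixel values


@dataclass
class HandwritingDetector:
    """Chains steps S1-S4; every stage is a caller-supplied function (hypothetical names)."""
    locate: Callable[[Image], List[Sequence[Point]]]    # S1: four key points per text line
    rectify: Callable[[Image, Sequence[Point]], Image]  # S2: affine correction of one line
    segment: Callable[[Image], List[Box]]               # S3: projection-based candidates
    merge: Callable[[List[Box]], List[Box]]             # S4: width-based merging

    def detect(self, picture: Image) -> List[List[Box]]:
        """Return the merged character boxes for every located text line."""
        results = []
        for key_points in self.locate(picture):
            line = self.rectify(picture, key_points)
            candidates = self.segment(line)
            results.append(self.merge(candidates))
        return results
```

Keeping each stage behind a plain callable means the key point model of S1 can be swapped without touching the projection and merging logic.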
The embodiment of the invention adopts the key point positioning algorithm to position the text line and obtain accurate text line positioning information, and corrects the text line in the text picture to be detected through affine transformation according to the text line positioning information, so that text lines at different angles and in different directions can be effectively corrected and the accuracy of text detection can be improved; the embodiment of the invention can also accurately merge characters whose left and right components were split apart in the preliminary detection, and adjusts the position of the surrounding frame of each character, so that the handwritten text is detected by combining the characteristics of Chinese characters, further improving the accuracy and reliability of text detection.
As a specific implementation manner of the embodiment of the present invention, a key point positioning algorithm is used to position a text line of a text picture to be detected, so as to obtain text line positioning information, which specifically includes:
a key point output branch is added on the basis of the yolov3 key point positioning algorithm to improve the yolov3 key point positioning algorithm, and the text line of the text picture to be detected is positioned according to the improved yolov3 key point positioning algorithm to obtain text line positioning information.
Referring to fig. 2, an effect diagram of text line positioning according to an embodiment of the present invention is provided. In the embodiment of the invention, the improved yolov3 key point positioning algorithm can realize the detection and key point positioning of the text line at the same time, and is favorable for improving the positioning accuracy of the text line.
As a specific implementation manner of the embodiment of the present invention, the original text line includes an oblique text line and a text line with inconsistent height, and affine transformation correction is performed on the original text line in the text picture to be detected according to the text line positioning information to obtain a corrected text line, specifically:
obtaining four-point positioning information of the text line according to the text line positioning information, obtaining each side length of a quadrangle formed by connecting the four-point positioning information, and determining a target correction rectangle according to each side length of the quadrangle;
calculating an affine transformation matrix from the coordinates of the key points of the text line to the target correction rectangle by using opencv;
and transforming the oblique text lines and the text lines of inconsistent height into corrected text lines according to the affine transformation matrix.
Referring to fig. 3, an effect diagram of text line rectification according to an embodiment of the invention is shown. In the embodiment of the invention, the side lengths of the quadrangle formed by connecting the four-point positioning information are: upper side length W1, lower side length W2, left side length H1 and right side length H2. The coordinates of the target correction rectangle are then determined to be (0, 0), ((W1+W2)/2, 0), ((W1+W2)/2, (H1+H2)/2) and (0, (H1+H2)/2). An affine transformation matrix from the key point coordinates of the text line to the target correction rectangle is calculated using opencv, and the inclined text line or the text line of inconsistent height is corrected by the affine transformation into the text line corresponding to the target correction rectangle. The embodiment of the invention corrects the inclined text lines and the text lines of inconsistent height among the original text lines through affine transformation, which realizes a linear transformation from two-dimensional coordinates to two-dimensional coordinates and helps keep the straightness of the two-dimensional figure, improving the effect of text line correction. The principle of the affine transformation is as follows:
[Figure BDA0003030336290000061: formula of the affine transformation]
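The correction step can be sketched in plain Python under a few assumptions. A two-dimensional affine map has the general form x' = a11*x + a12*y + tx, y' = a21*x + a22*y + ty, so it is fully determined by three point pairs; the sketch therefore fits the matrix to three of the four key points. The key point order (top-left, top-right, bottom-right, bottom-left) and all function names are assumptions made for illustration. The patent only states that opencv is used and does not name the call; OpenCV's cv2.getAffineTransform performs the same three-point fit.

```python
import math


def target_rectangle(tl, tr, br, bl):
    """Corners (0,0), (W,0), (W,H), (0,H) with W=(W1+W2)/2 and H=(H1+H2)/2."""
    w = (math.dist(tl, tr) + math.dist(bl, br)) / 2   # mean of upper and lower sides
    h = (math.dist(tl, bl) + math.dist(tr, br)) / 2   # mean of left and right sides
    return [(0.0, 0.0), (w, 0.0), (w, h), (0.0, h)]


def _det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def affine_from_three_points(src, dst):
    """Return [[a11, a12, tx], [a21, a22, ty]] mapping src[i] to dst[i] (Cramer's rule)."""
    base = [[x, y, 1.0] for x, y in src]
    d = _det3(base)
    if abs(d) < 1e-12:
        raise ValueError("source points are collinear")
    rows = []
    for k in (0, 1):                      # k=0 solves for x', k=1 for y'
        coeffs = []
        for col in range(3):
            m = [row[:] for row in base]
            for r in range(3):
                m[r][col] = dst[r][k]     # replace one column by the right-hand side
            coeffs.append(_det3(m) / d)
        rows.append(coeffs)
    return rows


def apply_affine(matrix, point):
    """Map one (x, y) point through the 2x3 affine matrix."""
    x, y = point
    return tuple(a * x + b * y + t for a, b, t in matrix)
```

Because only three pairs fix the map, the fourth key point lands exactly on the remaining rectangle corner only when the located quadrangle is a parallelogram; otherwise it is mapped approximately.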
As a specific implementation manner of the embodiment of the present invention, a single character of the corrected text line is segmented according to the horizontal projection to obtain a candidate character region, which specifically includes:
after binarizing the corrected text line, the pixel values in the corrected text line are accumulated along the horizontal direction to obtain a wavy projection curve;
and segmenting the wave crests of the wavy projection curve by setting a threshold value to obtain a candidate character area.
Referring to fig. 4, an effect diagram of dividing a whole line of text according to an embodiment of the present invention is shown. In the embodiment of the invention, the candidate character area is obtained by dividing the single character according to the horizontal projection, and the candidate character area is the primary character detection result. The embodiment of the invention can rapidly and accurately realize the segmentation of the single character by adopting the horizontal projection method, avoids the problems of overlong time consumption and limited performance caused by detecting a large number of characters, and is beneficial to improving the character detection efficiency.
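The projection step can be illustrated with a minimal pure-Python sketch. It assumes that the projection curve has one value per horizontal position (the ink count of each pixel column), that ink is darker than paper, and that a fixed binarization threshold of 128 is acceptable; none of these details, nor the function names, are stated in the patent.

```python
def binarize(gray, threshold=128):
    """Map dark (ink) pixels to 1 and light (paper) pixels to 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def projection_profile(binary):
    """Count ink pixels in every column: one value per horizontal position."""
    return [sum(col) for col in zip(*binary)]


def segment_peaks(profile, min_ink=1):
    """Return (start, end) column spans, end exclusive, where the profile reaches min_ink."""
    spans, start = [], None
    for x, value in enumerate(profile):
        if value >= min_ink and start is None:
            start = x                     # a crest begins
        elif value < min_ink and start is not None:
            spans.append((start, x))      # the crest ended at the previous column
            start = None
    if start is not None:                 # crest touching the right edge
        spans.append((start, len(profile)))
    return spans
```

Raising min_ink suppresses thin strokes that bridge two characters, at the cost of splitting characters with faint components; the merging step that follows is what compensates for such splits.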
As a specific implementation manner of the embodiment of the present invention, calculating an average width of a bounding box of an entire row of characters in a candidate character region, and merging the bounding boxes of the entire row of characters according to the average width to obtain a final character detection result, specifically:
and calculating the average width of the character surrounding frames of the whole line in the candidate character area, and merging adjacent surrounding frames in the line whose widths are smaller than the average width to obtain a final character detection result.
Please refer to fig. 5, which is a schematic diagram illustrating an effect of text detection according to an embodiment of the present invention. The embodiment of the invention can accurately combine the characters divided from the left and right components in the preliminary detection, and the surrounding frame of each character is subjected to position adjustment, so that the handwritten text is detected by combining the characteristics of Chinese characters, and the accuracy and the reliability of text detection are further improved.
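One possible reading of the merging rule is sketched below. It assumes that a pair is merged only when both neighbouring frames are narrower than the line average, and that pairs are merged greedily from left to right; the patent does not spell out these details, and the function name is hypothetical. Frames are represented as (left, right) column spans.

```python
def merge_narrow_boxes(spans):
    """Merge neighbouring (left, right) spans when both are narrower than the line average."""
    if not spans:
        return []
    average = sum(right - left for left, right in spans) / len(spans)
    merged = []
    i = 0
    while i < len(spans):
        left, right = spans[i]
        if i + 1 < len(spans):
            next_left, next_right = spans[i + 1]
            both_narrow = (right - left) < average and (next_right - next_left) < average
            if both_narrow:
                merged.append((left, next_right))   # join the two components
                i += 2
                continue
        merged.append((left, right))
        i += 1
    return merged
```

For example, spans of widths 10, 4, 5 and 10 have an average of 7.25, so only the two middle spans are joined, which matches the case of a character whose left and right components were separated by the projection step.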
The embodiment of the invention has the following beneficial effects:
The embodiment of the invention adopts the key point positioning algorithm to position the text line and obtain accurate text line positioning information, and corrects the text line in the text picture to be detected through affine transformation according to the text line positioning information, so that text lines at different angles and in different directions can be effectively corrected and the accuracy of text detection can be improved; according to the embodiment of the invention, horizontal projection is used to segment the single characters in the corrected text line, and two adjacent character surrounding frames of small width are merged according to the widths of the surrounding frames of the whole line of characters, which effectively avoids the poor detection results caused by the left and right components of a Chinese character being split into two characters and further improves the accuracy of text detection.
Referring to fig. 6, a second embodiment of the present invention provides a handwritten text detection apparatus, including:
the positioning module 10 is configured to input a text picture to be detected, and position a text line of the text picture to be detected by using a key point positioning algorithm to obtain text line positioning information;
the correction module 20 is configured to perform affine transformation correction on an original text line in a text picture to be detected according to the text line positioning information to obtain a corrected text line;
the segmentation module 30 is configured to segment a single character of the corrected text line according to the horizontal projection to obtain a candidate character region;
and the merging module 40 is configured to calculate an average width of the bounding boxes of the entire row of characters in the candidate character region, and merge the bounding boxes of the entire row of characters according to the average width to obtain a final character detection result.
The embodiment of the invention adopts the key point positioning algorithm to position the text line and obtain accurate text line positioning information, and corrects the text line in the text picture to be detected through affine transformation according to the text line positioning information, so that text lines at different angles and in different directions can be effectively corrected and the accuracy of text detection can be improved; the embodiment of the invention can also accurately merge characters whose left and right components were split apart in the preliminary detection, and adjusts the position of the surrounding frame of each character, so that the handwritten text is detected by combining the characteristics of Chinese characters, further improving the accuracy and reliability of text detection.
As a specific implementation of the embodiment of the present invention, the positioning module 10 is specifically configured to:
a key point output branch is added on the basis of the yolov3 key point positioning algorithm to improve the yolov3 key point positioning algorithm, and the text line of the text picture to be detected is positioned according to the improved yolov3 key point positioning algorithm to obtain text line positioning information.
Referring to fig. 2, an effect diagram of text line positioning according to an embodiment of the present invention is provided. In the embodiment of the invention, the improved yolov3 key point positioning algorithm can realize the detection and key point positioning of the text line at the same time, and is favorable for improving the positioning accuracy of the text line.
As a specific implementation manner of the embodiment of the present invention, the correcting module 20 is specifically configured to:
obtaining four-point positioning information of the text line according to the text line positioning information, obtaining each side length of a quadrangle formed by connecting the four-point positioning information, and determining a target correction rectangle according to each side length of the quadrangle;
calculating an affine transformation matrix from the coordinates of the key points of the text line to the target correction rectangle by using opencv;
and transforming the oblique text lines and the text lines of inconsistent height into corrected text lines according to the affine transformation matrix.
Referring to fig. 3, an effect diagram of text line rectification according to an embodiment of the invention is shown. In the embodiment of the invention, the side lengths of the quadrangle formed by connecting the four-point positioning information are: upper side length W1, lower side length W2, left side length H1 and right side length H2. The coordinates of the target correction rectangle are then determined to be (0, 0), ((W1+W2)/2, 0), ((W1+W2)/2, (H1+H2)/2) and (0, (H1+H2)/2). An affine transformation matrix from the key point coordinates of the text line to the target correction rectangle is calculated using opencv, and the inclined text line or the text line of inconsistent height is corrected by the affine transformation into the text line corresponding to the target correction rectangle. The embodiment of the invention corrects the inclined text lines and the text lines of inconsistent height among the original text lines through affine transformation, which realizes a linear transformation from two-dimensional coordinates to two-dimensional coordinates and helps keep the straightness of the two-dimensional figure, improving the effect of text line correction. The principle of the affine transformation is as follows:
[Figure BDA0003030336290000091: formula of the affine transformation]
As a specific implementation manner of the embodiment of the present invention, the segmentation module 30 is specifically configured to: after binarizing the corrected text line, accumulate the pixel values in the corrected text line along the horizontal direction to obtain a wavy projection curve;
and segment the wave crests of the wavy projection curve by setting a threshold value to obtain a candidate character area.
Referring to fig. 4, an effect diagram of dividing a whole line of text according to an embodiment of the present invention is shown. In the embodiment of the invention, the candidate character area is obtained by dividing the single character according to the horizontal projection, and the candidate character area is the primary character detection result. The embodiment of the invention can rapidly and accurately realize the segmentation of the single character by adopting the horizontal projection method, avoids the problems of overlong time consumption and limited performance caused by detecting a large number of characters, and is favorable for improving the character detection efficiency.
As a specific implementation manner of the embodiment of the present invention, the merging module 40 is specifically configured to:
and calculating the average width of the character surrounding frames of the whole line in the candidate character area, and merging adjacent surrounding frames in the line whose widths are smaller than the average width to obtain a final character detection result.
Please refer to fig. 5, which is a schematic diagram illustrating an effect of text detection according to an embodiment of the present invention. The embodiment of the invention can accurately combine the characters divided from the left and right components in the preliminary detection, and the surrounding frame of each character is subjected to position adjustment, so that the handwritten text is detected by combining the characteristics of Chinese characters, and the accuracy and the reliability of text detection are further improved.
The embodiment of the invention has the following beneficial effects:
The embodiment of the invention uses a key point positioning algorithm to locate text lines and obtain accurate text line positioning information, and corrects the text lines in the text picture to be detected by affine transformation according to the text line positioning information, so that text lines at different angles and in different directions can be effectively corrected and the accuracy of text detection can be improved. The embodiment of the invention also segments single characters in the corrected text by horizontal projection and merges two adjacent character bounding boxes of smaller width according to the widths of the bounding boxes of the whole line of characters, which effectively avoids the poor text detection results caused when the left and right components of a Chinese character are split into two characters, and can therefore further improve the accuracy of text detection.
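The correction step can be sketched in pure Python as follows. This is an illustration under stated assumptions rather than the method of the embodiment: the four key points are taken in the order top-left, top-right, bottom-right, bottom-left; the target rectangle uses the longer of each pair of opposite sides; and the affine matrix is solved from three of the four corners, since an affine transformation is determined by three point pairs. The function names are hypothetical. With OpenCV, the corresponding calls would be `cv2.getAffineTransform` to obtain the 2x3 matrix and `cv2.warpAffine` to apply it to the image.

```python
import math


def target_rectangle(quad):
    """Return (width, height) of the correction rectangle for a quadrilateral.

    `quad` holds four (x, y) corners: top-left, top-right, bottom-right,
    bottom-left. Taking the longer opposite side is an assumption.
    """
    tl, tr, br, bl = quad

    def side(p, q):
        return math.hypot(q[0] - p[0], q[1] - p[1])

    return max(side(tl, tr), side(bl, br)), max(side(tl, bl), side(tr, br))


def affine_from_three_points(src, dst):
    """Solve the 2x3 affine matrix mapping three src points onto three dst points."""
    (x0, y0), (x1, y1), (x2, y2) = src
    det = x0 * (y1 - y2) - y0 * (x1 - x2) + (x1 * y2 - x2 * y1)
    if abs(det) < 1e-12:
        raise ValueError("source points are collinear")
    rows = []
    for k in range(2):  # k = 0 solves the u row, k = 1 the v row (Cramer's rule)
        t0, t1, t2 = dst[0][k], dst[1][k], dst[2][k]
        a = (t0 * (y1 - y2) - y0 * (t1 - t2) + (t1 * y2 - t2 * y1)) / det
        b = (x0 * (t1 - t2) - t0 * (x1 - x2) + (x1 * t2 - x2 * t1)) / det
        c = (x0 * (y1 * t2 - y2 * t1) - y0 * (x1 * t2 - x2 * t1)
             + t0 * (x1 * y2 - x2 * y1)) / det
        rows.append((a, b, c))
    return rows


def apply_affine(matrix, point):
    """Map one (x, y) point through a 2x3 affine matrix."""
    x, y = point
    return tuple(a * x + b * y + c for a, b, c in matrix)


# Toy inclined text line (assumed key points forming a parallelogram).
quad = [(10, 20), (50, 50), (50, 70), (10, 40)]
width, height = target_rectangle(quad)
tl, tr, br, bl = quad
matrix = affine_from_three_points([tl, tr, bl], [(0, 0), (width, 0), (0, height)])
corrected_br = apply_affine(matrix, br)
```

Because the toy quadrilateral is a parallelogram, the fourth corner lands exactly on the remaining rectangle corner; for a general quadrilateral an affine map fits only three corners exactly.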
A third embodiment of the present invention provides a computer-readable storage medium, which includes a stored computer program, wherein, when the computer program runs, the device in which the computer-readable storage medium is located is controlled to execute the handwritten text detection method described above.
The foregoing is a preferred embodiment of the present invention, and it should be noted that it would be apparent to those skilled in the art that various modifications and enhancements can be made without departing from the principles of the invention, and such modifications and enhancements are also considered to be within the scope of the invention.

Claims (10)

1. A method for detecting handwritten text, comprising:
inputting a text picture to be detected, and positioning a text line of the text picture to be detected by adopting a key point positioning algorithm to obtain text line positioning information;
performing affine transformation correction on the original text line in the text picture to be detected according to the text line positioning information to obtain a corrected text line;
dividing the single character of the corrected text line according to horizontal projection to obtain a candidate character area;
and calculating the average width of the surrounding frames of the whole line of characters in the candidate character area, and merging the surrounding frames of the whole line of characters according to the average width to obtain a final character detection result.
2. The handwritten text detection method according to claim 1, wherein said positioning the text line of the text picture to be detected by using a key point positioning algorithm to obtain text line positioning information, specifically:
and adding a key point output branch on the basis of the yolov3 key point positioning algorithm to improve the yolov3 key point positioning algorithm, and positioning the text line of the text picture to be detected according to the improved yolov3 key point positioning algorithm to obtain text line positioning information.
3. The handwritten text detection method according to claim 1, wherein the original text lines include inclined text lines and text lines of inconsistent height, and the performing affine transformation correction on the original text line in the text picture to be detected according to the text line positioning information to obtain a corrected text line is specifically:
obtaining four-point positioning information of the text line according to the text line positioning information, obtaining each side length of a quadrangle formed by connecting the four-point positioning information, and determining a target correction rectangle according to each side length of the quadrangle;
calculating an affine transformation matrix from the key point coordinates of the text line to the target correction rectangle by using opencv;
and transforming the inclined text lines and the text lines of inconsistent height into corrected text lines by affine transformation according to the affine transformation matrix.
4. The handwritten text detection method according to claim 1, wherein said dividing the single character of the corrected text line according to horizontal projection to obtain a candidate character area is specifically:
after binarization processing is carried out on the corrected text line, accumulating pixel values in the corrected text line in the horizontal direction to obtain a wave line;
and segmenting wave crests in the wavy lines by setting a threshold value to obtain candidate character areas.
5. The method according to claim 1, wherein the calculating an average width of bounding boxes of an entire row of characters in the candidate character region, and merging the bounding boxes of the entire row of characters according to the average width to obtain a final character detection result, specifically comprises:
calculating the average width of the character bounding boxes of the whole line in the candidate character area, and merging adjacent bounding boxes whose widths are smaller than the average width to obtain a final character detection result.
6. A handwritten text detection device, comprising:
the positioning module is used for inputting a text picture to be detected, and positioning a text line of the text picture to be detected by adopting a key point positioning algorithm to obtain text line positioning information;
the correction module is used for carrying out affine transformation correction on the original text line in the text picture to be detected according to the text line positioning information to obtain a corrected text line;
the segmentation module is used for segmenting the single character of the corrected text line according to the horizontal projection to obtain a candidate character area;
and the merging module is used for calculating the average width of the surrounding frames of the whole line of characters in the candidate character area, and merging the surrounding frames of the whole line of characters according to the average width to obtain a final character detection result.
7. The handwritten text detection device of claim 6, wherein said correction module is specifically configured to:
obtaining four-point positioning information of the text line according to the text line positioning information, obtaining each side length of a quadrangle formed by connecting the four-point positioning information, and determining a target correction rectangle according to each side length of the quadrangle;
calculating an affine transformation matrix from the key point coordinates of the text line to the target correction rectangle by using opencv;
and transforming the inclined text lines and the text lines of inconsistent height into corrected text lines by affine transformation according to the affine transformation matrix.
8. The device for detecting handwritten text according to claim 6, wherein said segmentation module is specifically configured to: after binarization processing is carried out on the corrected text line, accumulating pixel values in the corrected text line in the horizontal direction to obtain a wave line;
and segmenting wave crests in the wavy lines by setting a threshold value to obtain candidate character areas.
9. The handwritten text detection device of claim 6, wherein said merging module is specifically configured to:
calculating the average width of the character bounding boxes of the whole line in the candidate character area, and merging adjacent bounding boxes whose widths are smaller than the average width to obtain a final character detection result.
10. A computer-readable storage medium, comprising a stored computer program, wherein the computer program, when executed, controls an apparatus in which the computer-readable storage medium is located to perform a method for handwritten text detection as claimed in any of claims 1 to 5.
CN202110428121.6A 2021-04-21 2021-04-21 Handwritten text detection method and device and storage medium Pending CN113159031A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110428121.6A CN113159031A (en) 2021-04-21 2021-04-21 Handwritten text detection method and device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110428121.6A CN113159031A (en) 2021-04-21 2021-04-21 Handwritten text detection method and device and storage medium

Publications (1)

Publication Number Publication Date
CN113159031A true CN113159031A (en) 2021-07-23

Family

ID=76869121

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110428121.6A Pending CN113159031A (en) 2021-04-21 2021-04-21 Handwritten text detection method and device and storage medium

Country Status (1)

Country Link
CN (1) CN113159031A (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0991382A (en) * 1995-07-17 1997-04-04 Nippon Telegr & Teleph Corp <Ntt> Method and device for on-line handwritten character recognition
JPH1166238A (en) * 1997-08-22 1999-03-09 Hitachi Software Eng Co Ltd Handwritten character recognition method
CN104268603A (en) * 2014-09-16 2015-01-07 科大讯飞股份有限公司 Intelligent marking method and system for text objective questions
CN107688806A (en) * 2017-08-21 2018-02-13 西北工业大学 A kind of free scene Method for text detection based on affine transformation
US20180089525A1 (en) * 2016-09-29 2018-03-29 Konica Minolta Laboratory U.S.A., Inc. Method for line and word segmentation for handwritten text images
CN109993160A (en) * 2019-02-18 2019-07-09 北京联合大学 A kind of image flame detection and text and location recognition method and system
CN111259899A (en) * 2020-01-13 2020-06-09 华中科技大学 Code spraying character detection method
CN111488870A (en) * 2019-01-28 2020-08-04 富士通株式会社 Character recognition method and character recognition device
KR20200101481A (en) * 2019-01-28 2020-08-28 삼성전자주식회사 Electronic device and method for correcting handwriting
WO2021051868A1 (en) * 2019-09-20 2021-03-25 平安科技(深圳)有限公司 Target location method and apparatus, computer device, computer storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
WAKAHARA, T.: "Online handwritten character recognition using local affine transformation", TRANSACTIONS OF THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS, no. 2, pages 379 - 386 *
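LÜ Yue et al.: "Handwritten Chinese character string segmentation based on component merging", Journal of Software (软件学报), no. 11, pages 1554 - 1559 *
ZHU Jianfei et al.: "Handwritten text line extraction under a joint regression-clustering framework", Journal of Image and Graphics (中国图象图形学报), no. 08, pages 1207 - 1217 *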

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination