CN115995081A - Method for correcting curved text - Google Patents
- Publication number
- CN115995081A
- Authority
- CN
- China
- Prior art keywords
- text
- communication
- curve
- point
- points
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Character Input (AREA)
Abstract
The invention discloses a method for correcting curved text. The method corrects curved text according to the position information of the connected regions in a segmentation map. The procedure is simple and the correction effect is good; no model needs to be constructed and no single-character segmentation is required, which helps restore the original image. The better correction result in turn helps improve text recognition accuracy, and the method is suitable for a variety of text image scenes and therefore has strong generality. In addition, because the method corrects only the curved text in the segmentation map, it effectively reduces the chance that non-curved text is deformed by the correction.
Description
Technical Field
The invention belongs to the technical field of computer vision and text detection and recognition, and particularly relates to a method for correcting curved text.
Background
Existing methods for detecting and processing curved text treat curved text detection as a regression problem: they describe the boundary polygon of the curved text with multi-point coordinates and then directly predict the vertex coordinates of the polygon. For example, CTD directly predicts a boundary polygon with 14 vertices for curved text and uses a Bi-LSTM layer in the network to refine the predicted vertex coordinates, thereby realizing regression-based curved text detection. Such model-based machine learning methods require a dataset to be collected for training, have high model complexity and high cost, and are suited only to a single scene.
Chinese patent publication No. CN113989298A proposes a method for correcting curved text lines in contract documents, specifically: performing text detection on the text image to obtain a segmented binarization map of the text image; traversing the connected regions in the binarization map and finding the minimum circumscribed rectangle of each; judging whether a connected region in the binarization map has already been processed, and if so skipping it, otherwise entering the next step; calculating the actual overlap ratio between the number of pixels in the connected region and the area of the circumscribed rectangle, setting an overlap ratio threshold, and comparing the two: if the actual overlap ratio is larger than the threshold, entering the next step, otherwise returning to the previous step; performing curve fitting on the connected region and obtaining the inflection point of the curve; and correcting at the inflection point. Although this technology also deals with curved text, it is limited to the curved text of contract documents and produces a certain error when used in other scenes, such as curved text in books.
Chinese patent publication No. CN113139537A proposes an image processing method, an electronic circuit, a vision-impaired assisting device, and a medium for adjusting curved text lines, specifically: performing text line detection on the input image to obtain a text line image that includes a curved text line; determining a plurality of reference points in the text line image that describe the curvature of the text line; determining a text line curve for the curved text line based on the plurality of reference points; and adjusting the curved text line using an adjustment parameter determined from the text line curve to obtain a text line for recognition that corresponds to the curved text line and whose characters are displayed horizontally. This technology processes curved text through the cooperation of image processing and external hardware, so the required cost is high.
Disclosure of Invention
In view of the above, the invention provides a method for correcting curved text that has a simple operation process, does not require any model to be constructed, and achieves a good correction effect.
A method for correcting curved text, comprising the steps of:
(1) Performing text detection on the text image to obtain the binarization map produced by text detection and segmentation;
(2) Performing a thinning and straightening operation on each connected region in the binarization map to obtain the corresponding connected curve;
(3) Fitting the connected curve using its contour points to obtain the mathematical equation of the connected curve, and recording the coordinates of the left and right end points of the connected curve;
(4) For each connected region that requires curved text correction, determining a reference demarcation point on its connected curve;
(5) Splitting the connected region at the reference demarcation point, and correcting the text box in the connected region;
(6) Outputting the corrected text image.
Further, in step (1), a segmentation-based DBnet text detection algorithm is used to perform text detection on the text image.
Further, in step (3), a contour-point search method is used to find the contour points of each connected curve; the found contour points are de-duplicated and sorted and then used to fit the connected curve, yielding its mathematical equation; at the same time, the left and right end point coordinates of each connected curve are recorded.
Further, the least squares method is used to fit the connected curve to obtain its mathematical equation.
Further, step (4) is implemented as follows: first, 20 points are taken uniformly from the connected curve, and for each point p, the straight line L_left through the left end point of the connected curve and p and the straight line L_right through the right end point of the connected curve and p are determined; then the distance sum between the straight lines L_left and L_right and the connected curve is calculated; after all 20 points have been traversed, the point with the smallest distance sum is taken as c, and c is used as the reference demarcation point if the following condition is met; otherwise, the corresponding connected region is judged not to need curved text correction;
|k_left-k_right|>threshold
wherein: k_left is the slope of the straight line through the left end point of the connected curve and the point c, k_right is the slope of the straight line through the right end point of the connected curve and the point c, and threshold is a preset slope difference threshold.
Further, the distance sum between the straight lines L_left and L_right and the connected curve is calculated as follows: first, 50 points are taken uniformly from the connected curve; among these 50 points, the sum of the distances from all points to the left of the point p to the straight line L_left is calculated, and the sum of the distances from all points to the right of the point p to the straight line L_right is calculated; the two sums are then added to obtain the distance sum between the straight lines L_left and L_right and the connected curve.
Further, step (5) is implemented as follows: the connected region is split at the reference demarcation point, so that the original connected region becomes a left sub-connected region and a right sub-connected region; the minimum circumscribed rectangle of each of the two sub-connected regions is then calculated; finally, the text boxes at the positions in the text image corresponding to the left and right minimum circumscribed rectangles are corrected using perspective transformation.
Further, in step (6), for each connected region that requires curved text correction, the corrected text boxes at the positions in the text image corresponding to the left and right minimum circumscribed rectangles are spliced together to complete the correction, and the corrected text image is output.
The method corrects curved text according to the position information of the connected regions in the segmentation map. The operation process is simple, no model needs to be constructed, and the correction effect is good. No single-character segmentation is required and there is no dependence on the text content, which helps restore the original image. The good correction result also helps improve text recognition accuracy, and the method is suitable for a variety of text image scenes and has strong generality. In addition, because the method corrects only the curved text in the segmentation map, it effectively reduces the chance that non-curved text is deformed by the correction.
Drawings
Fig. 1 is a flow chart of the method for correcting curved text according to the present invention.
Fig. 2 is an original text image.
Fig. 3 is the binarization map obtained after text detection.
Fig. 4 is the binarization map after thinning and straightening.
Fig. 5 is a text image with the fitted connected curves.
Fig. 6 is the binarization map after a single connected region is split.
Fig. 7 is the binarization map after all connected regions are split.
Fig. 8 shows the correction effect for the left and right sub-connected regions at their corresponding positions in the original text image.
Fig. 9 shows the effect of the uncorrected text box.
Fig. 10 shows the effect of the corrected text box.
Detailed Description
To describe the present invention more specifically, its technical scheme is described in detail below with reference to the accompanying drawings and specific embodiments.
As shown in Fig. 1, the method for correcting curved text specifically comprises the following steps:
(1) Perform text detection on the text image to obtain the binarization map of the segmented image.
In this embodiment, text detection is performed using the segmentation-based DBnet text detection algorithm, which yields the segmented binarization map. The segmentation-based text detection algorithm is the base model that produces the binary segmentation map of the text image. Because the detection method directly affects the subsequent localization of text boxes, the text detection model needs to be trained. The DBnet network performs adaptive binarization on each pixel: the binarization threshold is learned by the network, and the binarization step is incorporated into the network and trained jointly with it, so the final output map is very robust to the threshold. The segmented binarization map is the final output. The original text image is shown in Fig. 2, and the binarization map after text detection is shown in Fig. 3.
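The adaptive binarization described above can be illustrated with the approximate step function published with DBnet, B = 1 / (1 + exp(-k (P - T))), where P is the predicted text probability of a pixel, T is the learned threshold for that pixel, and k is an amplifying factor. The following is a minimal sketch only: the patent does not spell out this formula, the function names are hypothetical, and k = 50 is the value reported in the DBnet paper rather than anything stated here.

```python
import math

def differentiable_binarize(prob, thresh, k=50.0):
    """Approximate step function: close to 1 when prob > thresh, close to 0 otherwise."""
    return 1.0 / (1.0 + math.exp(-k * (prob - thresh)))

def binarize_map(prob_map, thresh_map, k=50.0, cut=0.5):
    """Apply the approximate step function pixel by pixel and cut at `cut` to get a 0/1 map.

    prob_map and thresh_map are lists of rows with values in [0, 1] (an assumed layout).
    """
    return [
        [1 if differentiable_binarize(p, t, k) > cut else 0 for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob_map, thresh_map)
    ]
```

Because the function is smooth, the threshold map can be trained together with the rest of the network, which is the property the paragraph above refers to.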
(2) Perform a thinning and straightening operation on each connected region of the binarization map, obtain the de-duplicated and sorted coordinate points of each resulting connected curve, and record the left end point (x_left, y_left) and the right end point (x_right, y_right) of each connected curve.
In this embodiment, a thinning and straightening method is applied to each connected region of the binarization map to obtain a connected curve, a contour-point search method is used to find the contour points of that connected curve, and the found contour coordinate points are then de-duplicated and sorted using conventional methods. The binarization map after thinning and straightening is shown in Fig. 4.
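The de-duplication and sorting step is described only as conventional, so the following is one plausible reading rather than the patent's own procedure: repeated pixels are dropped (contour tracing of a thin curve can return the same pixel more than once), the remaining points are sorted from left to right, and the first and last points are taken as the end points. The function names are hypothetical.

```python
def dedupe_and_sort(points):
    """Drop repeated (x, y) points and sort the rest from left to right."""
    unique = set((int(x), int(y)) for x, y in points)
    return sorted(unique)  # tuples sort by x first, then by y

def endpoints(sorted_points):
    """Return the left end point and the right end point of a sorted point list."""
    return sorted_points[0], sorted_points[-1]
```

With this ordering, (x_left, y_left) is simply the first point and (x_right, y_right) the last.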
(3) Fit a curve to each connected curve using the coordinate points obtained in step (2).
The de-duplicated and sorted contour coordinate points of each connected curve are fitted with the least squares method to obtain the fitted connected curve. The least squares method finds the best-matching function for the data by minimizing the sum of squared errors, which makes the fitted curve more accurate; unknown data can be obtained simply with the least squares method, and the sum of squared errors between the obtained data and the actual data is minimized. A text image with the fitted connected curves is shown in Fig. 5.
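As an illustration of the least squares fit, the sketch below fits a polynomial by solving the normal equations with Gaussian elimination. The patent does not state which function family is fitted, so the quadratic default (`degree=2`) and the function names are assumptions made for this example.

```python
def fit_polynomial(points, degree=2):
    """Least-squares fit; returns c with y ~ c[0] + c[1]*x + ... + c[degree]*x**degree."""
    n = degree + 1
    # Normal equations (A^T A) c = A^T y, where A is the Vandermonde matrix of the x values.
    ata = [[sum(x ** (i + j) for x, _ in points) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in points) for i in range(n)]
    aug = [row + [rhs] for row, rhs in zip(ata, aty)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (aug[i][n] - tail) / aug[i][i]
    return coeffs

def evaluate(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))
```

The fit needs at least `degree + 1` distinct x values. For large image coordinates the normal equations can become ill-conditioned, so a library routine based on a more stable factorization may be preferable in practice.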
(4) Traverse each fitted connected curve and calculate the coordinate point (x_cut, y_cut) at which the connected region is to be split; set a slope difference threshold and compare it with the absolute slope difference K1 at that point (defined below); if K1 is larger than the slope difference threshold, enter step (5), otherwise move on to the next fitted connected curve. The specific implementation is as follows:
First, traverse each fitted connected curve obtained in step (3), and uniformly take 20 coordinate points (x0, y0) on the curve between its left end point (x_left, y_left) and right end point (x_right, y_right) recorded in step (2).
Then, for each of the 20 coordinate points (x0, y0), calculate the equation and slope of the straight line L_left connecting the left end point (x_left, y_left) of the curve with (x0, y0), and the equation and slope of the straight line L_right connecting the right end point (x_right, y_right) of the curve with (x0, y0). This gives 20 groups of slopes and 20 groups of line equations; each group consists of the two straight lines (or their slopes) that connect the same coordinate point (x0, y0) with the left and right end points of the curve.
The slopes and equations of the left and right straight lines are calculated from the coordinate point (x0, y0) as follows:
k_left = (y_left - y0) / (x_left - x0)
k_right = (y_right - y0) / (x_right - x0)
L_left expression: y - y0 = k_left (x - x0)
L_right expression: y - y0 = k_right (x - x0)
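The two slope formulas and the point-slope line equations can be written directly as small helpers. This is an illustrative sketch with hypothetical function names; the guard against a vertical line is an addition for safety and is not discussed in the patent.

```python
def slope(end, p):
    """Slope of the line through an end point and p: k = (y_end - y0) / (x_end - x0)."""
    (x_end, y_end), (x0, y0) = end, p
    if x_end == x0:
        raise ValueError("vertical line: slope is undefined")
    return (y_end - y0) / (x_end - x0)

def line_through(p, k):
    """Return the function y(x) for the line y - y0 = k (x - x0)."""
    x0, y0 = p
    return lambda x: k * (x - x0) + y0
```

For example, with end points (0, 0) and (10, 0) and a candidate point (5, 5), `slope` gives k_left = 1.0 and k_right = -1.0, and each line passes back through its own end point. The candidate points must lie strictly between the end points, otherwise one of the denominators is zero.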
Further, 50 coordinate points (x1, y1) are taken uniformly on the curve. For each of the 20 coordinate points (x0, y0) in turn, the sum distance_left of the distances to the straight line L_left is calculated over those of the 50 coordinate points that lie to the left of (x0, y0), and the sum distance_right of the distances to the straight line L_right is calculated over those that lie to the right of (x0, y0); the two sums are added to obtain the distance sum, distance.
When x1 < x0, the point on L_left at x = x1 has y = k_left (x1 - x0) + y0, so:
distance_left = distance_left + k_left (x1 - x0) + y0 - y1
When x1 > x0, the point on L_right at x = x1 has y = k_right (x1 - x0) + y0, so:
distance_right = distance_right + k_right (x1 - x0) + y0 - y1
The distance sum to the left and right straight lines over the 50 coordinate points (x1, y1) is then:
distance = distance_left + distance_right
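The accumulation above can be sketched for a single candidate point as follows. One assumption is made loudly here: the formulas in the text add the signed vertical offset, whereas this sketch adds its absolute value so that offsets above and below the line cannot cancel out. The function name and argument layout are hypothetical.

```python
def distance_sum(p, left_end, right_end, samples):
    """Sum of vertical distances from sampled curve points to L_left (left of p) and L_right (right of p)."""
    x0, y0 = p
    k_left = (left_end[1] - y0) / (left_end[0] - x0)
    k_right = (right_end[1] - y0) / (right_end[0] - x0)
    distance_left = 0.0
    distance_right = 0.0
    for x1, y1 in samples:
        if x1 < x0:
            # y on L_left at x = x1 is k_left * (x1 - x0) + y0
            distance_left += abs(k_left * (x1 - x0) + y0 - y1)  # abs() is an assumption
        elif x1 > x0:
            # y on L_right at x = x1 is k_right * (x1 - x0) + y0
            distance_right += abs(k_right * (x1 - x0) + y0 - y1)  # abs() is an assumption
    return distance_left + distance_right
```

For a V-shaped curve, the distance sum is zero when p sits exactly at the corner and grows as p moves away from it, which is why the minimum is used to locate the split point.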
Finally, the coordinate point (x_cut, y_cut) with the smallest distance sum among the 20 coordinate points is recorded, and the absolute value K1 of the slope difference between the two straight lines at (x_cut, y_cut) is calculated. A slope difference threshold is set and compared with K1; if K1 is larger than the slope difference threshold, step (5) is entered.
The absolute value K1 of the slope difference between the left and right straight lines is calculated as:
K1 = |k_left - k_right|
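The selection and threshold test can be sketched as below, assuming the distance sum and the two slopes have already been computed for every candidate point. The tuple layout, the function name, and the threshold value used in the usage note are hypothetical; the patent does not give a numeric threshold.

```python
def choose_demarcation(candidates, threshold):
    """Pick the candidate with the smallest distance sum and apply the K1 test.

    candidates: list of (distance, point, k_left, k_right), one entry per sampled point.
    Returns the split point (x_cut, y_cut), or None if the region needs no correction.
    """
    distance, point, k_left, k_right = min(candidates, key=lambda c: c[0])
    k1 = abs(k_left - k_right)  # K1 = |k_left - k_right|
    return point if k1 > threshold else None
```

With an illustrative threshold of 0.2, a candidate whose two lines have slopes -0.5 and 0.45 (K1 = 0.95) would be accepted as the split point, while a nearly straight curve with slopes 0.10 and 0.11 would be left uncorrected.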
(5) Split the connected region in the binarization map obtained in step (1) according to the coordinate point (x_cut, y_cut), and correct the text box at the corresponding position in the original text image. The specific implementation is as follows:
First, because the connected regions in the binarization map represent the text regions in the original image, the connected region in the binarization map obtained in step (1) can be split at the coordinate point (x_cut, y_cut), so that the curved text is split from one connected region into two sub-connected regions.
Then, the minimum circumscribed rectangle of each of the left and right sub-connected regions is determined using a minimum-circumscribed-rectangle method.
Further, the text boxes at the positions in the original text image corresponding to the left and right minimum circumscribed rectangles are corrected using perspective transformation, specifically:
Of the upper-right corner of the left minimum circumscribed rectangle and the upper-left corner of the right minimum circumscribed rectangle, the one with the smaller y value is recorded as (x2, y2); of the lower-right corner of the left minimum circumscribed rectangle and the lower-left corner of the right minimum circumscribed rectangle, the one with the larger y value is recorded as (x3, y3).
For the text box at the position in the original text image corresponding to the left minimum circumscribed rectangle, the upper-left and lower-left corners of the text box are the upper-left and lower-left corners of the left minimum circumscribed rectangle, and the upper-right and lower-right corners of the text box are (x2, y2) and (x3, y3).
For the text box at the position in the original text image corresponding to the right minimum circumscribed rectangle, the upper-left and lower-left corners of the text box are (x2, y2) and (x3, y3), and the upper-right and lower-right corners of the text box are the upper-right and lower-right corners of the right minimum circumscribed rectangle.
Perspective transformation is then performed separately on the left and right text boxes according to these coordinate points. The binarization map after a single connected region is split is shown in Fig. 6, the binarization map after all connected regions are split is shown in Fig. 7, and the correction effect for the left and right sub-connected regions at their corresponding positions in the original text image is shown in Fig. 8.
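The corner assignment described above can be sketched as follows. This is one reading of the text under stated assumptions: each minimum circumscribed rectangle is given as a dictionary with hypothetical keys `tl`, `tr`, `br`, `bl`, and image y coordinates grow downward. The perspective warp itself is not shown; it is commonly done with OpenCV's `cv2.getPerspectiveTransform` and `cv2.warpPerspective`, although the patent does not name a library.

```python
def shared_corners(left_rect, right_rect):
    """Return (x2, y2) and (x3, y3), the corners shared by the left and right text boxes."""
    # Smaller y of: upper-right corner of the left rectangle, upper-left corner of the right one.
    top = min(left_rect["tr"], right_rect["tl"], key=lambda pt: pt[1])
    # Larger y of: lower-right corner of the left rectangle, lower-left corner of the right one.
    bottom = max(left_rect["br"], right_rect["bl"], key=lambda pt: pt[1])
    return top, bottom

def text_box_quads(left_rect, right_rect):
    """Source quadrilaterals (tl, tr, br, bl) for the left and right text boxes."""
    top, bottom = shared_corners(left_rect, right_rect)
    left_quad = [left_rect["tl"], top, bottom, left_rect["bl"]]
    right_quad = [top, right_rect["tr"], right_rect["br"], bottom]
    return left_quad, right_quad
```

Because both quadrilaterals use the same (x2, y2) and (x3, y3), they meet along a common edge, which appears to be what lets the two corrected boxes be joined without a gap in step (6).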
(6) Output the corrected text image.
The text boxes corrected in step (5) at the positions in the text image corresponding to the left and right minimum circumscribed rectangles are spliced together to complete the correction, and the text image after curvature correction is finally output.
The uncorrected text box effect is shown in Fig. 9, and the corrected text box effect is shown in Fig. 10.
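The splicing step can be pictured as a side-by-side concatenation of the two corrected boxes. The sketch below assumes that both boxes were warped to the same output height and that an image is stored as a list of pixel rows; neither detail is specified in the patent, and the function name is hypothetical.

```python
def splice_horizontally(left_img, right_img):
    """Concatenate two images (lists of pixel rows) of equal height side by side."""
    if len(left_img) != len(right_img):
        raise ValueError("both corrected boxes must have the same height")
    return [l_row + r_row for l_row, r_row in zip(left_img, right_img)]
```

If the two boxes are warped to different heights, one of them would need to be resized before splicing.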
The embodiments above are described so that those skilled in the art can understand and apply the present invention. It will be apparent to those skilled in the art that various modifications can be made to these embodiments and that the general principles described herein can be applied to other embodiments without inventive effort. Therefore, the present invention is not limited to the above embodiments, and improvements and modifications made by those skilled in the art on the basis of the present disclosure shall fall within the scope of protection of the present invention.
Claims (8)
1. A method for correcting curved text, comprising the steps of:
(1) Performing text detection on the text image to obtain the binarization map produced by text detection and segmentation;
(2) Performing a thinning and straightening operation on each connected region in the binarization map to obtain the corresponding connected curve;
(3) Fitting the connected curve using its contour points to obtain the mathematical equation of the connected curve, and recording the coordinates of the left and right end points of the connected curve;
(4) For each connected region that requires curved text correction, determining a reference demarcation point on its connected curve;
(5) Splitting the connected region at the reference demarcation point, and correcting the text box in the connected region;
(6) Outputting the corrected text image.
2. The method for correcting curved text according to claim 1, wherein: in step (1), a segmentation-based DBnet text detection algorithm is used to perform text detection on the text image.
3. The method for correcting curved text according to claim 1, wherein: in step (3), a contour-point search method is used to find the contour points of each connected curve, the found contour points are de-duplicated and sorted and then used to fit the connected curve to obtain its mathematical equation, and the left and right end point coordinates of each connected curve are recorded.
4. The method for correcting curved text according to claim 3, wherein: the least squares method is used to fit the connected curve to obtain its mathematical equation.
5. The method for correcting curved text according to claim 1, wherein: step (4) is implemented as follows: first, 20 points are taken uniformly from the connected curve, and for each point p, the straight line L_left through the left end point of the connected curve and p and the straight line L_right through the right end point of the connected curve and p are determined; then the distance sum between the straight lines L_left and L_right and the connected curve is calculated; after all 20 points have been traversed, the point with the smallest distance sum is taken as c, and c is used as the reference demarcation point if the following condition is met; otherwise, the corresponding connected region is judged not to need curved text correction;
|k_left-k_right|>threshold
wherein: k_left is the slope of the straight line through the left end point of the connected curve and the point c, k_right is the slope of the straight line through the right end point of the connected curve and the point c, and threshold is a preset slope difference threshold.
6. The method for correcting curved text according to claim 5, wherein: the distance sum between the straight lines L_left and L_right and the connected curve is calculated as follows: first, 50 points are taken uniformly from the connected curve; among these 50 points, the sum of the distances from all points to the left of the point p to the straight line L_left is calculated, and the sum of the distances from all points to the right of the point p to the straight line L_right is calculated; the two sums are then added to obtain the distance sum between the straight lines L_left and L_right and the connected curve.
7. The method for correcting curved text according to claim 1, wherein: step (5) is implemented as follows: the connected region is split at the reference demarcation point, so that the original connected region becomes a left sub-connected region and a right sub-connected region; the minimum circumscribed rectangle of each of the two sub-connected regions is then calculated; finally, the text boxes at the positions in the text image corresponding to the left and right minimum circumscribed rectangles are corrected using perspective transformation.
8. The method for correcting curved text according to claim 1, wherein: in step (6), for each connected region that requires curved text correction, the corrected text boxes at the positions in the text image corresponding to the left and right minimum circumscribed rectangles are spliced together to complete the correction, and the corrected text image is output.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211520195.3A CN115995081A (en) | 2022-11-28 | 2022-11-28 | Method for correcting curved text |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211520195.3A CN115995081A (en) | 2022-11-28 | 2022-11-28 | Method for correcting curved text |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115995081A (en) | 2023-04-21 |
Family
ID=85991412
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211520195.3A Pending CN115995081A (en) | 2022-11-28 | 2022-11-28 | Method for correcting curved text |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115995081A (en) |
-
2022
- 2022-11-28 CN CN202211520195.3A patent/CN115995081A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110738207B (en) | Character detection method for fusing character area edge information in character image | |
CN110647795B (en) | Form identification method | |
US8401333B2 (en) | Image processing method and apparatus for multi-resolution feature based image registration | |
RU2621601C1 (en) | Document image curvature eliminating | |
US20090016608A1 (en) | Character recognition method | |
CN113435240A (en) | End-to-end table detection and structure identification method and system | |
CN111461113B (en) | Large-angle license plate detection method based on deformed plane object detection network | |
CN113989604B (en) | Tire DOT information identification method based on end-to-end deep learning | |
CN112883795B (en) | Rapid and automatic table extraction method based on deep neural network | |
CN111598087B (en) | Irregular character recognition method, device, computer equipment and storage medium | |
CN112541491A (en) | End-to-end text detection and identification method based on image character region perception | |
CN110738030A (en) | Table reconstruction method and device, electronic equipment and storage medium | |
CN110598698A (en) | Natural scene text detection method and system based on adaptive regional suggestion network | |
CN115546809A (en) | Table structure identification method based on cell constraint and application thereof | |
CN111274863A (en) | Text prediction method based on text peak probability density | |
JP3099771B2 (en) | Character recognition method and apparatus, and recording medium storing character recognition program | |
CN111738272A (en) | Target feature extraction method and device and electronic equipment | |
CN111832497B (en) | Text detection post-processing method based on geometric features | |
CN113628113A (en) | Image splicing method and related equipment thereof | |
CN109635798B (en) | Information extraction method and device | |
CN111612802A (en) | Re-optimization training method based on existing image semantic segmentation model and application | |
CN110826564A (en) | Small target semantic segmentation method and system in complex scene image | |
CN115995081A (en) | Method for correcting curved text | |
CN116030236A (en) | Rapid detection method for dynamic region in mobile application page | |
CN114511862B (en) | Form identification method and device and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||