CN109919155B - Inclination angle correction method for text image and terminal

Info

Publication number: CN109919155B
Application number: CN201910189109.7A
Authority: CN
Inventors: 庄国金; 陈文传; 杜保发; 林玉玲; 郝占龙; 方恒凯; 吴建杭
Original assignee: Xiamen Shangji Network Technology Co ltd
Current assignee: Xiamen Shangji Network Technology Co ltd
Priority date: 2019-03-13
Filing date: 2019-03-13
Publication date: 2021-03-12
Anticipated expiration: 2039-03-13
Also published as: CN109919155A

Abstract

The invention relates to a method and a terminal for correcting the inclination angle of a text image, and belongs to the field of data processing. The inclination angle of a first text image is corrected to obtain a processed second text image, the inclination angle being in the range of 0 to 45 degrees. An OCR recognition engine is called to recognize the second text image to obtain a first character string. The second text image is rotated by 180 degrees to obtain a processed third text image, and the OCR recognition engine is called to recognize the third text image to obtain a second character string. If the first character string contains more high-frequency words than the second character string, the second text image is marked as the final text image; otherwise, the third text image is marked as the final text image. The accuracy of correcting the inclination angle of the text image is thereby improved.

Description

Inclination angle correction method for text image and terminal

Technical Field

The invention relates to a method and a terminal for correcting an inclination angle of a text image, and belongs to the field of data processing.

Background

People often need to convert paper documents into electronic documents in daily work and life. A common electronic document conversion method is to shoot a paper document and then upload a photo of the paper document to an electronic device. However, the situation of inclination of the picture and the like often occurs in the shooting process, which affects the experience of people reading electronic documents through electronic equipment.

In the prior art, Chinese patent application No. 201510047692.X provides a method for correcting the tilt angle and analyzing the layout of a text image, and a vision-aiding device and system. The method for correcting the inclination angle of the text image comprises the following steps: carrying out edge detection on the text image to obtain an edge image; detecting the connected domains of the text image to obtain the central points of their circumscribed rectangles; detecting the connected domains of the edge image to obtain the central points of their circumscribed rectangles; applying the Hough transform to the combined set of these central points to perform inclination detection, so as to obtain the inclination angle of the text image; and, when the inclination angle is greater than or equal to a preset first inclination angle threshold, performing inclination angle correction on the text image. This method does not need to apply the Hough transform to every pixel of the text image, and has the advantages of a small amount of computation, low sensitivity to illumination and high accuracy.

However, the tilt angle correction method for the text image provided in the above patent document has a limitation in that the adjustable tilt angle range is 0 to 45 degrees, and the accuracy of the tilt angle correction for the text image having a large tilt angle is not high.

Disclosure of Invention

The technical problem to be solved by the invention is as follows: how to improve the accuracy of correcting a text image with a large inclination angle.

In order to solve the technical problems, the invention adopts the technical scheme that:

the invention provides a method for correcting the inclined angle of a text image, which comprises the following steps:

correcting the inclination angle of the first text image to obtain a processed second text image; the value range of the inclination angle is 0-45 degrees;

calling an OCR recognition engine to recognize the second text image to obtain a first character string;

rotating the second text image by 180 degrees to obtain a processed third text image;

calling an OCR recognition engine to recognize the third text image to obtain a second character string;

if the number of the high-frequency words in the first character string is larger than that in the second character string, marking the second text image as a final text image; otherwise, marking the third text image as a final text image.

Preferably, the tilt angle of the first text image is corrected to obtain a processed second text image, specifically:

rotating the first text image by 90 degrees to obtain a processed fourth text image;

performing connected domain detection on the first text image to obtain a first central point set; the first set of center points consists of center points of connected domains in the first text image; performing connected domain detection on the fourth text image to obtain a second central point set; the second set of center points consists of center points of connected domains in the fourth text image;

fitting a straight line according to the first central point set to obtain a first straight line set; fitting straight lines according to the second central point set to obtain a second straight line set; the included angle between the straight line and the horizontal direction is less than 45 degrees;

if the number of elements of the first straight line set is larger than that of elements of the second straight line set, correcting the inclination angle of the first text image according to the first straight line set to obtain a second text image; otherwise:

and correcting the inclination angle of the fourth text image according to the second straight line set to obtain a second text image.

Preferably, the connected domain detection is performed on the first text image to obtain a first center point set, which specifically is:

s11, detecting connected domains of the first text image to obtain all connected domains in the first text image;

s12, horizontally projecting the first text image, and adding connected domains projected to fall into the same section to the same connected domain set to obtain a plurality of connected domain sets;

s13, acquiring a connected domain set from the plurality of connected domain sets;

s14, acquiring a connected domain from a connected domain set to obtain a first connected domain;

s15, acquiring another connected domain adjacent to the first connected domain from the connected domain set to obtain a second connected domain;

s16, calculating the average value of the height of the first connected domain and the height of the second connected domain to obtain the average character width;

s17, if the difference between the height of the first connected domain and the height of the second connected domain is smaller than a preset height threshold value, and the horizontal distance between the circumscribed rectangle of the first connected domain and the circumscribed rectangle of the second connected domain is smaller than the average character width, adding the center point of the first connected domain and the center point of the second connected domain to a preset third center point set;

s18, repeating the steps S14 to S17 until the connected domain set is traversed;

s19, repeating the steps S13 to S18 until the plurality of connected domain sets are traversed to obtain a plurality of third central point sets; the first set of center points is comprised of a plurality of the third sets of center points.
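The row grouping of step S12 can be illustrated with a minimal Python sketch. It is not taken from the patent: it assumes each connected domain is already available as a circumscribed rectangle `(x, y, w, h)`, and it approximates the horizontal projection by merging rectangles whose vertical extents overlap. The function name `group_rows` is hypothetical.

```python
def group_rows(boxes):
    """Group rectangles (x, y, w, h) into rows.

    Assumption: two rectangles fall into the same projection section
    when their vertical extents [y, y + h) overlap, directly or through
    a chain of other rectangles.
    """
    rows = []  # each entry: [y_min, y_max, list_of_boxes]
    for box in sorted(boxes, key=lambda b: b[1]):  # top to bottom
        x, y, w, h = box
        if rows and y < rows[-1][1]:
            # Overlaps the current section: extend it and add the box.
            rows[-1][1] = max(rows[-1][1], y + h)
            rows[-1][2].append(box)
        else:
            # Gap in the projection: start a new section (a new row).
            rows.append([y, y + h, [box]])
    # Return each row sorted left to right.
    return [sorted(row[2]) for row in rows]


if __name__ == "__main__":
    sample = [(12, 11, 8, 10), (0, 10, 8, 10), (12, 42, 8, 9), (0, 40, 8, 10)]
    for row in group_rows(sample):
        print(row)
```

On a strongly tilted page the projections of neighbouring rows can overlap, so this simplification would merge them; a pixel-level projection profile may behave differently.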

Preferably, a straight line is fitted according to the first central point set to obtain a first straight line set, which specifically comprises:

s21, fitting a straight line according to the third central point set, and adding the straight line to the first straight line set;

s22, repeating the step S21 until all the third center point sets are traversed.
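Steps S21 and S22 can be sketched as follows. The patent does not name a fitting method, so ordinary least squares is assumed here; the function names `fit_line` and `fit_line_set` are hypothetical. Lines whose angle to the horizontal is 45 degrees or more are discarded, in line with the constraint stated above.

```python
import math


def fit_line(points):
    """Least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, angle_in_degrees), or None when no line
    within 45 degrees of the horizontal can be fitted.
    """
    n = len(points)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        return None  # all points share one x: vertical, not usable
    slope = sxy / sxx
    angle = math.degrees(math.atan(slope))
    if abs(angle) >= 45:
        return None
    return slope, mean_y - slope * mean_x, angle


def fit_line_set(center_point_sets):
    """Fit one line per center point set and keep the successful fits."""
    lines = []
    for points in center_point_sets:
        line = fit_line(points)
        if line is not None:
            lines.append(line)
    return lines


if __name__ == "__main__":
    gentle = [(0, 0), (10, 1), (20, 2)]   # close to horizontal
    steep = [(0, 0), (1, 10), (2, 20)]    # far steeper than 45 degrees
    print(fit_line_set([gentle, steep]))
```

Only the gently sloped set produces a line, which is why the number of fitted lines can serve as a measure of how regularly the connected domains are arranged.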

Preferably, the inclination angle of the first text image is corrected according to the first straight line set to obtain a second text image, specifically:

s1, acquiring an inclination angle of a straight line from the first straight line set to obtain a first inclination angle;

s2, if the difference value between the first inclination angle and the inclination angle of another straight line in the first straight line set except the straight line is smaller than a preset inclination angle threshold value, adding the another straight line to a preset third straight line set;

s3, repeating the step S2 until the first straight line set is traversed;

s4, repeating the steps S1 to S3 until the first straight line set is traversed to obtain a plurality of third straight line sets;

s5, obtaining the third straight line set with the largest number of elements to obtain a fourth straight line set;

s6, acquiring an optimal rotation angle according to the inclination angle of each straight line in the fourth straight line set;

and S7, rotating the first text image by the optimal rotation angle to obtain a second text image.

The present invention also provides a tilt angle correction terminal for a text image, comprising one or more processors and a memory, the memory storing a program and configured to perform the following steps by the one or more processors:

s3, repeating the step S2 until the first straight line set is traversed;

The invention has the following beneficial effects:

1. The invention provides a method and a terminal for correcting the inclination angle of a text image. To improve the accuracy of the inclination correction, the invention uses OCR (optical character recognition) to recognize both the second text image and its inverted image (the third text image), and matches each OCR result against a preset high-frequency vocabulary library; whichever image yields more high-frequency words is taken as the final correction result. By analyzing the OCR results of the corrected text images to verify the correction, the better corrected image is selected as the final text image, which substantially improves the accuracy of the inclination angle correction.

2. Further, the invention fits straight lines to the central points of the connected domains in the text image. If the connected domains are arranged in a disordered way, no straight line can be fitted; only when some of the connected domains are arranged regularly can straight lines be fitted to their central points. The method rotates the initial text image (the first text image) by 90 degrees to obtain the fourth text image and compares the two: the image from which more straight lines are fitted is the one whose connected domains are arranged more regularly, and the fine adjustment of the 0-45 degree inclination angle is then performed on that image. Before the fine adjustment, the method can thus automatically identify the approximate arrangement direction of the text in the initial text image and decide whether to rotate it by 90 degrees, which effectively improves the accuracy of the inclination angle correction of the text image.

3. Furthermore, the connected domains are grouped into rows by horizontal projection, and within each row the connected domains with a small height difference and a small horizontal distance are selected in turn as the basic elements for line fitting. This removes interference such as broken strokes, isolated dots and single horizontal strokes of Chinese characters, which do not reflect the overall structure of a character but would reduce the accuracy of the fitted line. The inclination angle of the fitted line therefore reflects the inclination of the text image more accurately, which improves the accuracy of the inclination angle correction of the text image.

4. Furthermore, a third central point set is composed of central points of the connected regions in the same row, and straight lines are fitted according to each third central point set, so that straight lines are fitted in the horizontal direction.

5. Furthermore, by traversing the inclination angle differences between every pair of fitted straight lines, abnormal fitted lines whose inclination angles deviate markedly from those of the other lines are identified and excluded, and an optimal rotation angle is calculated from the fitted lines with small inclination angle differences to correct the text image, which improves the accuracy of the inclination angle correction of the text image.

Drawings

FIG. 1 is a block diagram illustrating a flow chart of an embodiment of a method for correcting a tilt angle of a text image according to the present invention;

FIG. 2 is a diagram illustrating a first text image according to an embodiment of the present invention;

FIG. 3 is a diagram illustrating a fourth text image according to an embodiment of the present invention;

FIG. 4 is a schematic view of a fitted straight line according to an embodiment of the present invention;

FIG. 5 is a schematic view of another fitted straight line according to an embodiment of the present invention;

FIG. 6 is a diagram illustrating a second text image according to an embodiment of the present invention;

FIG. 7 is a diagram illustrating a third text image according to an embodiment of the present invention;

FIG. 8 is a block diagram illustrating a tilt angle correction terminal for text images according to an embodiment of the present invention;

FIG. 9 is a diagram illustrating a fifth text image according to an embodiment of the present invention;

FIG. 10 is a diagram illustrating a sixth text image according to an embodiment of the present invention;

Description of reference numerals:

1. processor; 2. memory.

Detailed Description

The invention is described in detail below with reference to the figures and the specific embodiments.

Referring to fig. 1 to fig. 10,

the first embodiment of the invention is as follows:

as shown in fig. 1, the present embodiment provides a method for correcting a tilt angle of a text image, including:

s1, correcting the inclination angle of the first text image to obtain a processed second text image; the range of the inclination angle is 0 to 45 degrees. The method specifically comprises the following steps:

and S11, rotating the first text image by 90 degrees to obtain a processed fourth text image.

For example, the first text image is shown in fig. 2. The characters in the first text image are inverted and inclined, so the OCR recognition engine cannot correctly recognize the corresponding characters from the first text image. The fourth text image, shown in fig. 3, is the result of rotating the first text image by 90 degrees.

S12, detecting a connected domain of the first text image to obtain a first center point set; the first set of center points consists of center points of connected domains in the first text image. The method specifically comprises the following steps:

and S121, detecting connected domains of the first text image to obtain all connected domains in the first text image.

The connected domain refers to a set of all the points which are connected with each other, the points which are connected with each other form a region, and the points which are not connected with each other form a different region. The position of each character in the text image can be preliminarily recognized through connected component detection.

And S122, horizontally projecting the first text image, and adding the connected domains projected to fall into the same section to the same connected domain set to obtain a plurality of connected domain sets.

The horizontal projection is to project the first text image along the horizontal direction, and the pixel points falling into the same section come from the same row of connected domains. Connected components in the text image are classified by lines.

S123, acquiring a connected domain set from the plurality of connected domain sets.

S124, acquiring a connected domain from the connected domain set to obtain a first connected domain.

And S125, acquiring another connected domain adjacent to the first connected domain from the connected domain set to obtain a second connected domain.

And S126, calculating the average value of the height of the first connected domain and the height of the second connected domain to obtain the average character width.

And S127, if the difference value between the height of the first connected domain and the height of the second connected domain is smaller than a preset height threshold value, and the horizontal distance between the external rectangle of the first connected domain and the external rectangle of the second connected domain is smaller than the average character width, adding the central point of the first connected domain and the central point of the second connected domain to a preset third central point set.

S128, repeating the step S124 to the step S127 until the connected domain set is traversed.

For example, a connected domain set stores connected domains located in the same row, each connected domain in the connected domain set is traversed sequentially and compared with adjacent connected domains, so that connected domains which are not greatly different in height and small in horizontal distance in the same row are selected, and the center points of the connected domains are stored in a third center point set.

S129, repeatedly executing the step S123 to the step S128 until the plurality of connected domain sets are traversed to obtain a plurality of third central point sets; the first set of center points is comprised of a plurality of the third sets of center points.

For example, each set of connected domains is traversed to obtain a plurality of sets of third center points. A third set of centroids stores centroids of a row of connected components.

In this embodiment, the connected domains are grouped into rows by horizontal projection, and within each row the connected domains with a small height difference and a small horizontal distance are selected in turn as the basic elements for line fitting. This removes interference such as broken strokes, isolated dots and single horizontal strokes of Chinese characters, which do not reflect the overall structure of a character but would reduce the accuracy of the fitted line. The inclination angle of the fitted line therefore reflects the inclination of the text image more accurately, which improves the accuracy of the inclination angle correction of the text image.
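The selection in steps S124 to S128 can be sketched for a single row as below. This is an illustration under stated assumptions rather than the patented implementation: connected domains are given as circumscribed rectangles `(x, y, w, h)`, the default `height_threshold` of 4 pixels is an arbitrary value, and the name `select_center_points` is hypothetical. Following the text, the mean of the two heights is used as the "average character width".

```python
def select_center_points(row, height_threshold=4):
    """Return center points of adjacent rectangles that look like characters.

    row: rectangles (x, y, w, h) belonging to one text row.
    """
    row = sorted(row)  # left to right
    centers = []
    for (x1, y1, w1, h1), (x2, y2, w2, h2) in zip(row, row[1:]):
        # S126: mean height of the pair, used as the average character width.
        average_width = (h1 + h2) / 2
        # Horizontal distance between the two circumscribed rectangles.
        gap = x2 - (x1 + w1)
        # S127: keep the pair only if heights are similar and the gap is small.
        if abs(h1 - h2) < height_threshold and gap < average_width:
            for center in ((x1 + w1 / 2, y1 + h1 / 2),
                           (x2 + w2 / 2, y2 + h2 / 2)):
                if center not in centers:
                    centers.append(center)
    return centers


if __name__ == "__main__":
    # Three character-sized rectangles followed by a small stray blob.
    row = [(0, 10, 10, 10), (12, 11, 10, 10), (24, 12, 10, 10), (36, 20, 3, 3)]
    print(select_center_points(row))
```

In this sample the small blob differs too much in height from its neighbour, so its center point is left out of the third central point set.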

S13, detecting a connected domain of the fourth text image to obtain a second center point set; the second set of center points consists of center points of connected domains in the fourth text image.

The specific implementation manner of step S13 is the same as that of step S12.

S14, fitting a straight line according to the first central point set to obtain a first straight line set; the included angle between the straight line and the horizontal direction is less than 45 degrees. The method specifically comprises the following steps:

s141, fitting a straight line according to the third central point set, and adding the straight line to the first straight line set;

and S142, repeatedly executing the step S141 until all the third central point sets are traversed.

For example, the result of fitting a straight line from the first set of center points is shown in FIG. 4.

S15, fitting straight lines according to the second central point set to obtain a second straight line set; the included angle between the straight line and the horizontal direction is less than 45 degrees.

The specific implementation manner of step S15 is the same as that of step S14. The result of fitting straight lines from the second central point set is shown in fig. 5.

S16, if the number of elements of the first straight line set is larger than that of elements of the second straight line set, correcting the inclination angle of the first text image according to the first straight line set to obtain a second text image; otherwise, correcting the inclination angle of the fourth text image according to the second straight line set to obtain a second text image.

For example, the number of straight lines in fig. 5 is obviously smaller than that in fig. 4, so the tilt angle of the first text image should be corrected according to the first set of straight lines to obtain the processed second text image.

And correcting the inclination angle of the first text image according to the first straight line set to obtain a second text image. The method specifically comprises the following steps:

s161, obtaining the inclination angle of a straight line from the first straight line set to obtain a first inclination angle.

And S162, if the difference value between the first inclination angle and the inclination angle of another straight line except the straight line in the first straight line set is smaller than a preset inclination angle threshold value, adding the another straight line to a preset third straight line set.

And S163, repeatedly executing the step S162 until the first straight line set is traversed.

For example, the inclination angle of one line in the first line set is 30 degrees, and the inclination angles of the other lines are 28 degrees, 29 degrees, 31 degrees, 32 degrees, and 20 degrees, respectively, and by repeatedly performing step S162, the line with the inclination angle of 20 degrees is excluded, and the other lines are taken as elements of a third line set.

S164, repeatedly executing the steps S161 to S163 until the first straight line set is traversed, so as to obtain a plurality of third straight line sets.

S165, obtaining the third straight line set with the largest number of elements to obtain a fourth straight line set.

For example, if taking the line with an inclination angle of 30 degrees as the reference yields the largest number of lines with similar inclination angles, the third straight line set obtained from that reference is used as the basis for calculating the optimal rotation angle.

And S166, acquiring an optimal rotation angle according to the inclination angle of each straight line in the fourth straight line set.

The method includes, but is not limited to, selecting an average value, a maximum value, a minimum value, or a median of the inclination angles of the lines in the fourth line set as the optimal rotation angle.

And S167, rotating the first text image by the optimal rotation angle to obtain a second text image.

In this embodiment, by traversing the inclination angle differences between every pair of fitted straight lines, abnormal fitted lines whose inclination angles deviate markedly from those of the other lines are identified and excluded, and an optimal rotation angle is calculated from the fitted lines with small inclination angle differences to correct the text image, which improves the accuracy of the inclination angle correction of the text image.
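Steps S161 to S166 can be sketched with the angles from the example above (30, 28, 29, 31, 32 and 20 degrees). The threshold of 3 degrees, the choice to include the reference line in its own set, and the function names are assumptions made for illustration; the text leaves them open.

```python
from statistics import mean, median


def largest_consistent_set(angles, angle_threshold=3.0):
    """For each reference angle, collect the other angles that differ from
    it by less than angle_threshold; return the largest such group.

    Assumption: the reference angle itself is counted in its group.
    """
    best = []
    for i, reference in enumerate(angles):
        group = [reference] + [
            angle for j, angle in enumerate(angles)
            if j != i and abs(angle - reference) < angle_threshold
        ]
        if len(group) > len(best):
            best = group
    return best


def optimal_rotation_angle(angles, angle_threshold=3.0, reducer=mean):
    """Reduce the largest consistent group to one angle.

    reducer may be mean, median, max or min, the options listed in the text.
    """
    return reducer(largest_consistent_set(angles, angle_threshold))


if __name__ == "__main__":
    angles = [30, 28, 29, 31, 32, 20]
    print(largest_consistent_set(angles))           # the 20-degree line is dropped
    print(optimal_rotation_angle(angles))           # mean of the kept angles
    print(optimal_rotation_angle(angles, reducer=median))
```

With these inputs the group built around the 30-degree reference is the largest, and both the mean and the median of that group are 30 degrees.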

And correcting the inclination angle of the fourth text image according to the second straight line set to obtain a second text image in the same implementation manner as that of correcting the inclination angle of the first text image according to the first straight line set to obtain the second text image.

In this embodiment, straight lines are fitted to the central points of the connected domains in the text image. If the connected domains are arranged in a disordered manner, no straight line can be fitted; only when some of the connected domains are arranged regularly can straight lines be fitted to their central points. The initial text image (the first text image) is rotated by 90 degrees to obtain the fourth text image, and the two are compared: the image from which more straight lines are fitted is the one whose connected domains are arranged more regularly, and the fine adjustment of the 0-45 degree inclination angle is then performed on that image. Before the fine adjustment, the method can thus automatically recognize the approximate arrangement direction of the text in the initial text image and decide whether to rotate it by 90 degrees, which effectively improves the accuracy of the inclination angle correction of the text image.

And S2, calling an OCR recognition engine to recognize the second text image to obtain a first character string.

For example, as shown in FIG. 6, the OCR recognition engine is invoked to recognize the first character string from FIG. 6 as "Wangtangtong from J Xian Yun" U dust "Jiya-Yiying i Shi".

S3, rotating the second text image by 180 degrees to obtain a processed third text image; and calling an OCR recognition engine to recognize the third text image to obtain a second character string.

For example, the third text image is shown in FIG. 7, and the OCR recognition engine is invoked to recognize the second character string from FIG. 7 as "cast-on-stem! In the morning of the next day, I and the columns of Shenzhou flying songs are grouped as the female director king."

S4, if the number of the high-frequency words in the first character string is larger than that in the second character string, marking the second text image as a final text image; otherwise, marking the third text image as a final text image.

For example, the first string does not include high-frequency words, and the number of high-frequency words in the first string is 0. The second character string contains high-frequency words such as 'the next day', 'the first morning', 'column group' and 'director', and the number of the high-frequency words in the second character string is 4. Therefore, the number of high-frequency words in the second character string is greater than that in the first character string, and the third text image is the final text image.
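The comparison in step S4 can be sketched as follows. The vocabulary and the two strings are made-up stand-ins for a preset high-frequency vocabulary library and for OCR output; they are not the strings recognized from the figures. Plain substring matching is an assumption, since the text does not say how words are matched, and Chinese text would likely need word segmentation. The function names are hypothetical.

```python
def count_high_frequency_words(text, vocabulary):
    """Count how many distinct vocabulary entries occur in the text."""
    return sum(1 for word in vocabulary if word in text)


def choose_final_image(first_string, second_string, vocabulary):
    """Return "second" if the second text image should be kept,
    otherwise "third" (the image rotated by 180 degrees)."""
    first_count = count_high_frequency_words(first_string, vocabulary)
    second_count = count_high_frequency_words(second_string, vocabulary)
    # Ties fall to the third image, as in the rule stated in step S4.
    return "second" if first_count > second_count else "third"


if __name__ == "__main__":
    vocabulary = ["the next day", "morning", "director"]
    upside_down = "Wxq ltn J uY Ud"  # illustrative garbage from an inverted image
    upright = "In the morning of the next day, the director arrived"
    print(choose_final_image(upside_down, upright, vocabulary))
```

Because the rule keeps the second text image only on a strictly larger count, a tie selects the rotated third text image.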

With the inclination correction described above, the characters in the text image are adjusted to be parallel to the horizontal direction, but the corrected second text image may still be upside down. To improve the accuracy of the inclination angle correction, the present embodiment uses OCR to recognize both the second text image and its inverted image (the third text image), and matches each OCR result against a preset high-frequency vocabulary library; whichever image yields more high-frequency words is taken as the final correction result. By analyzing the OCR results of the corrected text images to verify the correction, the better corrected image is selected as the final text image, which substantially improves the accuracy of the inclination angle correction of the text image.

For example, with the existing tilt angle correction technology, only fig. 2 is corrected to fig. 9, fig. 3 is corrected to fig. 10, and the correction range is 0-45 degrees, but the corrected text image may be inverted, or it needs to be rotated 90 degrees to make the text in the image in the correct position. The present embodiment determines whether the correction result (fig. 2 and 3) by the prior art (correction within 45 degree angle) needs to be further rotated by 180 degrees or 90 degrees by the OCR recognition technology. For example, the OCR recognition result of FIG. 2 has fewer recognized correct characters and fewer high frequency words; when the OCR recognition is performed after the image in FIG. 2 is rotated by 180 degrees, the number of correct characters in the OCR recognition result is large, and high-frequency words are large. Therefore, the text image (figure 2) obtained by correction in the prior art needs to be rotated by 180 degrees, the range of the prior art capable of automatically correcting the text inclination angle is expanded, and the accuracy of the inclination angle correction of the text image is improved.

The second embodiment of the invention is as follows:

as shown in fig. 8, the present embodiment provides a tilt angle correction terminal for a text image, including one or more processors 1 and a memory 2, where the memory 2 stores a program and is configured to be executed by the one or more processors 1 to:

For example, the first text image is shown in fig. 2, and the fourth text image is shown in fig. 3.

The specific implementation manner of step S13 is the same as that of step S12.

For example, if the first character string does not contain high-frequency words, and the second character string contains high-frequency words such as "day two", "morning", "column group", and "director", the third text image is the final text image.

The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes performed by the present specification and drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims

1. A method for correcting a tilt angle of a text image, comprising:

correcting the inclination angle of the first text image to obtain a processed second text image, specifically:

correcting the inclination angle of the fourth text image according to the second straight line set to obtain a second text image;

the value range of the inclination angle is 0-45 degrees;

2. The method for correcting the inclination angle of the text image according to claim 1, wherein the detecting the connected components of the first text image to obtain the first central point set comprises the following specific steps:

s19, repeatedly executing the S13 to the S18 until the plurality of connected domain sets are traversed to obtain a plurality of third central point sets; the first set of center points is comprised of a plurality of the third sets of center points.

3. The method for correcting the inclination angle of the text image according to claim 2, wherein a first set of straight lines is obtained by fitting straight lines according to the first set of central points, and specifically:

s22, repeating the step S21 until all the third central point sets are traversed.

4. The method for correcting the tilt angle of the text image according to claim 1, wherein the tilt angle of the first text image is corrected according to the first line set to obtain a second text image, specifically:

s3, repeatedly executing the S2 until the first straight line set is traversed;

s4, repeating the steps from S1 to S3 until the first straight line set is traversed to obtain a plurality of third straight line sets;

5. A tilt angle correction terminal for a text image, comprising one or more processors and a memory, the memory storing a program and configured to perform the following steps by the one or more processors:

correcting the inclination angle of the first text image to obtain a processed second text image; the value range of the inclination angle is 0-45 degrees, and specifically comprises the following steps:

6. The terminal for correcting an inclination angle of a text image according to claim 5, wherein the connected component detection is performed on the first text image to obtain a first central point set, specifically:

7. The terminal for correcting the inclination angle of the text image according to claim 6, wherein a first set of straight lines is obtained by fitting straight lines according to the first set of central points, and specifically:

8. The terminal for correcting the tilt angle of a text image according to claim 5, wherein the second text image is obtained by correcting the tilt angle of the first text image according to the first straight line set, and specifically comprises:

s3, repeatedly executing the S2 until the first straight line set is traversed;