CN115063279B - Method and device for preprocessing text watermark image

Info

Publication number: CN115063279B
Application number: CN202210305987.2A
Authority: CN
Inventors: 李公宝; 丛升日
Original assignee: Beijing Guoyin Technology Co ltd
Current assignee: Beijing Guoyin Technology Co ltd
Priority date: 2022-03-25
Filing date: 2022-03-25
Publication date: 2023-03-14
Anticipated expiration: 2042-03-25
Also published as: CN115063279A

Abstract

The invention relates to a method and a device for preprocessing a text watermark image. The method comprises the following steps: extracting all text lines in the text watermark image; locating the text image region to be corrected according to the relative positions of the text lines, and obtaining the left and right boundary lines of the text image region by straight-line fitting, thereby locating the minimum circumscribed quadrilateral of the text image region; and automatically correcting the text watermark image by a four-point perspective transformation using that minimum circumscribed quadrilateral. The method corrects text watermark images automatically, resolves the document skew and trapezoidal distortion that frequently occur in photographed text watermark images, improves the efficiency of extracting and recognizing text watermark information, and is simple, efficient, practical, and of low complexity.

Description

Method and device for preprocessing text watermark image

Technical Field

The invention belongs to the technical field of image processing, relates to a method and a device for preprocessing a text watermark image, and particularly relates to a method and a device for automatically correcting a photographed text watermark image.

Background

Electronic text documents are convenient to store and quick to transmit, but they are also easy to leak and hard to trace, because anyone who can view a document effectively possesses its content. Moreover, a camera is now a standard feature of the smartphone, and as smartphones have become ubiquitous, photographing with a phone has become trivial, which creates a security problem for electronic text documents. A leaker can photograph sensitive information displayed on a computer screen, or print it as a paper document, and pass the information on. On the one hand, enterprises and public institutions find it difficult to control phones, cameras, and other photographing devices; on the other hand, even when internal data is found to have leaked through a photograph of a screen, the leaker and the time of the photograph cannot be determined, so targeted measures cannot be taken to cut off the source of the leak. Embedding watermark information that is imperceptible to the naked eye in text documents displayed on a screen or printed out is therefore an important way to address these problems.

When a text document containing watermark information is photographed with a smartphone or digital camera, document tilt and an oblique shooting angle are unavoidable, so the resulting document image exhibits various deformations, which greatly reduce the efficiency of text watermark recognition. To improve the efficiency of extracting and recognizing text watermarks, geometrically deformed document images therefore need to be corrected. Most prior-art methods correct the text watermark image manually; they are inefficient and unsuited to automatic batch processing.

Disclosure of Invention

To address the shortcomings of existing text watermark image correction methods, the invention provides an automatic text watermark image correction method based on perspective transformation. It solves the problems of poor automation and low accuracy in existing deformation correction of text watermark images, and thereby improves the efficiency of automatically extracting text watermark information.

The technical scheme of the invention is as follows:

A method for preprocessing a text watermark image comprises the following steps:

extracting all text lines in the text watermark image;

locating the text image region to be corrected according to the relative positions of the text lines, and obtaining the left and right boundary lines of the text image region by straight-line fitting, thereby locating the minimum circumscribed quadrilateral of the text image region;

and automatically correcting the text watermark image by a four-point perspective transformation using the minimum circumscribed quadrilateral of the text image region.

Further, the region with the largest gradient change in the text watermark image is the text image region. The method therefore first obtains a gradient subgraph of the original image by morphological image transformation, extracts all text boxes from the gradient subgraph, and then merges all text boxes on the same line to obtain complete text lines.

Further, the text watermark image region is located by fitting the pixel points of the left (right) boundary lines of the text lines. When the text watermark image is rotated or trapezoidally deformed, all text lines exhibit a consistent gradual offset, so the text image region can be roughly located from the relative positions of the text lines. To correct the deformed text image region automatically, the minimum quadrilateral containing the text image region must first be computed. The left (right) boundary line of this quadrilateral is obtained by fitting the set of pixel points on the left (right) boundaries of all text lines in the text image region.

Further, the invention automatically corrects the text watermark image by a four-point perspective transformation. Perspective transformation is the process of projecting the text watermark image from its original plane onto a new viewing plane parallel to the plane of the image capturing device.

A device for preprocessing a text watermark image using the above method comprises:

a text line extraction module, configured to extract all text lines in the text watermark image;

a text image region locating module, configured to locate the text image region to be corrected according to the relative positions of the text lines, and to obtain the left and right boundary lines of the text image region by straight-line fitting, thereby locating the minimum circumscribed quadrilateral of the text image region;

and an automatic correction module, configured to automatically correct the text watermark image by a four-point perspective transformation using the minimum circumscribed quadrilateral of the text image region.

Compared with the prior art, the invention has the beneficial effects that:

With the method and device of the invention, text watermark images can be corrected automatically, the document skew and trapezoidal distortion that frequently occur in photographed text watermark images are resolved, and the efficiency of extracting and recognizing text watermark information is improved; the method is simple, efficient, practical, and of low complexity. In addition, the method can also effectively address technical problems such as layout analysis of complex mixed image-and-text documents, text region extraction, and automatic image correction in complex natural scenes, and can be applied in the field of conventional optical character recognition (OCR).

Drawings

FIG. 1 is an original text watermark image;

FIG. 2 is the gradient subgraph obtained from FIG. 1;

FIG. 3 is a schematic diagram of a text box region;

FIG. 4 is a schematic diagram of a text box to be searched located to the left of a known text box;

FIG. 5 is a schematic diagram of all text lines;

FIG. 6 shows the text lines retained for the computation;

FIG. 7 shows the result of locating the quadrilateral of the text region;

FIG. 8 shows the result of correction using the four-point perspective transformation.

Detailed Description

To make the above objects, features, and advantages of the invention easier to understand, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings.

The invention provides a method for preprocessing a text watermark image. The overall process is implemented as follows:

S101: First, automatically extract all text lines in the text watermark image.

The region with the largest gradient change in the text watermark image is the text image region. The method therefore first obtains a gradient subgraph of the original image by morphological image transformation, extracts all text boxes from the gradient subgraph, and then merges all text boxes on the same line to obtain complete text line regions.

1. Extract the gradient subgraph from the original text watermark image.

For a given original image $I$, as shown in FIG. 1, the gradient subgraph $I_p$ is obtained as follows:

(1) First, apply the Sobel operator to the original image $I$ to obtain the initial gradient image $I_t$, namely:

$$I_t = \mathrm{Sobel}(I) \tag{1}$$

(2) Apply Otsu binarization to $I_t$ to obtain the image $I_b$;

(3) To make the outline of the text region more prominent, apply a dilation operation to $I_b$, namely:

$$I_p = \mathrm{dilate}(I_b) \tag{2}$$

$I_p$ is the gradient subgraph obtained by processing the original image $I$, as shown in FIG. 2.
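The three operations above can be sketched in plain Python on a small grayscale image stored as a list of rows. This is only an illustration under stated assumptions: the function names are hypothetical, the gradient magnitude is approximated as |Gx| + |Gy| clipped to 255, and the dilation uses a 3x3 square structuring element, none of which is specified in the text. In practice an image library (for example OpenCV's `cv2.Sobel`, `cv2.threshold` with `cv2.THRESH_OTSU`, and `cv2.dilate`) would more likely be used.

```python
def sobel_magnitude(img):
    """Approximate gradient magnitude |Gx| + |Gy| using 3x3 Sobel kernels."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]  # border pixels stay 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = min(255, abs(gx) + abs(gy))
    return out


def otsu_threshold(img):
    """Return the threshold t that maximises the between-class variance."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def dilate3x3(img):
    """Dilate a binary image with a 3x3 square structuring element (assumed)."""
    h, w = len(img), len(img[0])
    return [[max(img[j][i]
                 for j in range(max(0, y - 1), min(h, y + 2))
                 for i in range(max(0, x - 1), min(w, x + 2)))
             for x in range(w)] for y in range(h)]


def gradient_subgraph(img):
    """I_t = Sobel(I); I_b = Otsu(I_t); I_p = dilate(I_b)."""
    grad = sobel_magnitude(img)
    t = otsu_threshold(grad)
    binary = [[255 if v > t else 0 for v in row] for row in grad]
    return dilate3x3(binary)
```

For an image containing a single bright vertical stroke, the result is a wider white band around the stroke, which is the kind of connected white region the next step searches for.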

2. Extract the regions where text boxes are located from the gradient subgraph.

Search the edge pixels of all white regions in the gradient subgraph $I_p$ to obtain the closed contour point sets $L_i$, and compute the minimum circumscribed quadrilateral of each closed curve $L_i$. The regions enclosed by these quadrilaterals are then filtered; the graphic objects satisfying the following conditions are text box regions:

(1) The circumscribed quadrilateral $S_i$ is larger than a threshold $T$;

(2) The height $h_i$ of the circumscribed quadrilateral $S_i$ is less than or equal to 2 times its width $w_i$.

After this filtering, the final text box sequence $S_1, S_2, \ldots, S_n$ is obtained.
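The two filter conditions can be illustrated with a short sketch. The `Box` type and function names are hypothetical, and condition (1) is interpreted here as a threshold on the area of the quadrilateral, which is an assumption because the text does not say which quantity is compared with $T$.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Width and height of a candidate circumscribed quadrilateral S_i."""
    width: float
    height: float


def is_text_box(box, min_area):
    # Condition (1): read here as "area greater than T" (assumption).
    # Condition (2): height h_i is at most twice the width w_i.
    return box.width * box.height > min_area and box.height <= 2 * box.width


def filter_text_boxes(boxes, min_area):
    """Keep only the candidates that satisfy both conditions."""
    return [b for b in boxes if is_text_box(b, min_area)]
```

With a threshold of 50, for example, a 40 x 10 candidate is kept, a 3 x 2 speck is rejected by condition (1), and a tall 5 x 30 stroke is rejected by condition (2).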

3. Merge the text boxes to obtain all complete text line regions.

All text boxes are divided into sets according to their positional relationships, and the text boxes in each set are merged from left to right to obtain a complete text line. To decide whether two text boxes can be merged, each text box stores the data shown in FIG. 3, where $P_1, P_2, P_3, P_4$ are the four vertices stored in clockwise order, $L_e$ is the midpoint of the left boundary of the text box, $R_i$ is the midpoint of the right boundary of the text box, and $C_e$ is the center point of the whole text box. The height of the text box is the larger of the distance between the left vertices $P_1$ and $P_4$ and the distance between the right vertices $P_2$ and $P_3$.

Let the current text box be $S_i$. If a newly found text box $S_k$ lies to the left of $S_i$, as shown in FIG. 4, then $S_k$ and $S_i$ can be merged if the following three conditions are satisfied:

(1) The center point of $S_k$ lies to the left of the center point of $S_i$;

(2) $S_k$ and $S_i$ are in the same row, that is, they satisfy:

where $P_{1i}(y)$, $R_{ik}(y)$, $P_{4i}(y)$, $P_{2k}(y)$, $L_{ei}(y)$, $P_{3k}(y)$ are the ordinates of the points $P_{1i}$, $R_{ik}$, $P_{4i}$, $P_{2k}$, $L_{ei}$, $P_{3k}$, respectively;

(3) Among all text boxes satisfying (1) and (2), the distance between the right boundary midpoint $R_{ik}$ of $S_k$ and the left boundary midpoint $L_{ei}$ of $S_i$ is the smallest.

When the text box $S_k$ to be searched lies to the right of the current text box $S_i$, the determination is made in a similar way.

All text boxes $S_1(l), S_2(l), \ldots, S_n(l)$ on line $l$ of the original document image are merged into a complete text line $LE(l)$, as shown in FIG. 5. The left boundary of $LE(l)$ is the left boundary of $S_1(l)$, the right boundary of $LE(l)$ is the right boundary of $S_n(l)$, and the lines connecting the upper and lower boundary points of the text box sequence form the upper and lower boundaries of $LE(l)$; the line height of $LE(l)$ equals the maximum height of all text boxes in the sequence.
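The left-neighbour search described by conditions (1) to (3) might look like the sketch below. The same-row inequality is not reproduced in the text, so `same_row` encodes one assumed reading: the right midpoint of $S_k$ lies vertically between $P_1$ and $P_4$ of $S_i$, and the left midpoint of $S_i$ lies vertically between $P_2$ and $P_3$ of $S_k$. The class, helper, and function names are hypothetical, and image coordinates with y growing downward are assumed.

```python
from dataclasses import dataclass
from math import dist


def midpoint(p, q):
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


@dataclass(eq=False)
class TextBox:
    """Four vertices in clockwise order starting at the top-left corner."""
    p1: tuple
    p2: tuple
    p3: tuple
    p4: tuple

    @property
    def left_mid(self):   # L_e in FIG. 3
        return midpoint(self.p1, self.p4)

    @property
    def right_mid(self):  # R_i in FIG. 3
        return midpoint(self.p2, self.p3)

    @property
    def center(self):     # C_e in FIG. 3
        return midpoint(self.left_mid, self.right_mid)

    @property
    def height(self):     # larger of the left and right side lengths
        return max(dist(self.p1, self.p4), dist(self.p2, self.p3))


def same_row(cand, cur):
    # Assumed form of condition (2); the original inequality is not shown.
    return (cur.p1[1] <= cand.right_mid[1] <= cur.p4[1]
            and cand.p2[1] <= cur.left_mid[1] <= cand.p3[1])


def find_left_neighbour(cur, boxes):
    """Return the box to merge on the left of `cur`, or None."""
    candidates = [b for b in boxes
                  if b is not cur
                  and b.center[0] < cur.center[0]   # condition (1)
                  and same_row(b, cur)]             # condition (2)
    # Condition (3): smallest distance between R of the candidate and L of cur.
    return min(candidates,
               key=lambda b: dist(b.right_mid, cur.left_mid),
               default=None)
```

Repeating the search from each newly merged box, first to the left and then to the right, would collect the boxes of one text line.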

S102: Locate the text watermark image region by fitting the pixel points of the left (right) boundary lines of the text lines.

When the text watermark image is rotated or trapezoidally deformed, all text lines exhibit a consistent gradual offset, so the text image region can be roughly located from the relative positions of the text lines. To correct the deformed text image region automatically, the minimum quadrilateral containing the text image region must first be computed. The left (right) boundary line of this quadrilateral is obtained by fitting the set of pixel points on the left (right) boundaries of all text lines in the text image region.

1. Fit straight lines to the left and right boundary lines of the text image region.

Within one text paragraph, the step by which the left (right) boundary position shifts from line to line follows essentially the same gradual pattern when the text lines are deformed, so the left (right) boundary line of the quadrilateral stays parallel to the offset direction of the text lines. The fitting method is described below for the left boundary; the right boundary is handled in the same way. To fit the boundary line more accurately, the first line of each text paragraph must be filtered out, because the first characters of these lines are usually indented and therefore break the gradual pattern of the position offset step. The filtering method is as follows:

First, compute the coordinates of the midpoint of the left boundary of each text line, i.e. the point $L_e$ shown in FIG. 3, and record them as the starting point of the text line. For the text line sequence $L_1, L_2, \ldots, L_n$, the indices of the text lines to be retained are stored in the set $S$. Suppose the currently retained text line is $L_x$, with its index $x$ stored in $S$; then search lines $x+1$ to $x+3$ for a text line $L_y$ satisfying the following condition:

where $X_x$ and $X_y$ are the starting-point coordinates of text lines $L_x$ and $L_y$, respectively, and $h_x$ is the line height of the current line $L_x$. The index $y$ of the text line $L_y$ satisfying the condition is stored in the set $S$, and the search continues downward with $L_y$ as the current line until all text lines have been filtered. All retained text lines are marked and used to fit the left boundary line, as shown in FIG. 6.
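The windowed search over lines $x+1$ to $x+3$ can be sketched as follows. The condition relating $X_x$, $X_y$, and $h_x$ is not reproduced in the text, so the default predicate used here, $|X_y - X_x| < h_x$ applied to the horizontal coordinate only, is purely an illustrative assumption. Starting from line 0 and stopping when no candidate is found in the window are likewise assumptions, and all names are hypothetical.

```python
def retained_line_indices(starts_x, heights, first=0, window=3, aligned=None):
    """Return indices of text lines kept for fitting the left boundary.

    starts_x[i] is the horizontal coordinate of the starting point of line i,
    heights[i] is its line height.
    """
    if aligned is None:
        # Assumed stand-in for the condition omitted from the text.
        def aligned(x_cur, x_next, h_cur):
            return abs(x_next - x_cur) < h_cur

    kept = [first]
    cur = first
    last = len(starts_x) - 1
    while True:
        # Look at lines cur+1 .. cur+window and take the first that qualifies.
        nxt = next((y for y in range(cur + 1, min(cur + window, last) + 1)
                    if aligned(starts_x[cur], starts_x[y], heights[cur])),
                   None)
        if nxt is None:
            break
        kept.append(nxt)
        cur = nxt
    return kept
```

In this sketch, lines whose starting point jumps far to the right of the current line, as an indented first line of a paragraph would, are skipped, while the remaining lines form the chain used for fitting.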

Let the elements of the set $S$ be $S_1, S_2, \ldots, S_k$. The offset direction of the document image, i.e. left, right, or no offset, is then judged from the text lines saved in $S$. In general, all text lines in a text image are offset in the same direction. The judgment proceeds as follows:

Step1: If the absolute value of the difference between the abscissas of the starting points of all text lines in the set $S$ is smaller than a threshold $T$, the left boundary of the text image region is judged not to be deflected and the text image needs no correction; otherwise, go to Step2;

Step2: Let the index of the current line be $S_i$. If the next text line $S_{i+1}$ satisfies the following condition, the text line is judged to be offset to the right:

where $T_L$ is a preset line deflection threshold; $S_i$ and $S_{i+1}$ are then stored in the set $Q$. If a right offset has already been determined and the following condition is satisfied:

then $S_{i+1}$ can likewise be added to the set $Q$, because the gradual change in offset between text lines $S_i$ and $S_{i+1}$ is not significant and does not affect the overall offset trend. If the following occurs for text line $S_{i+1}$:

this indicates that the positional relationship of this line is abnormal, so $S_{i+1}$ cannot be saved into the set $Q$ and the next text line $S_{i+2}$ must be examined.

If $|S_{i+2} - S_i| = 2$ and text lines $S_i$ and $S_{i+2}$ also satisfy

or

then $S_{i+2}$ can be saved into the set $Q$ and the search continues. Otherwise, the judgment process ends.

The process for judging a left offset of the text lines is similar, specifically:

For the current text line $S_i$, if the next text line $S_{i+1}$ satisfies the following condition, the text line is judged to be offset to the left:

$S_i$ and $S_{i+1}$ are then stored in the set $Q$; if a left offset has already been determined and equation (6) is also satisfied, $S_{i+1}$ is added to the set $Q$; if the following occurs for text line $S_{i+1}$:

then the text line $S_{i+2}$ must be examined further;

If $|S_{i+2} - S_i| = 2$ and text lines $S_i$ and $S_{i+2}$ satisfy

or

then $S_{i+2}$ is saved into the set $Q$ and the search continues; otherwise, the judgment process ends.

Step3: Once the text lines have been judged to be offset to the left or right, let the elements of the set $Q$ be $Q_1, Q_2, \ldots, Q_t$; the corresponding text lines are, respectively,

Obtain the point set $P$ of the left boundaries of these $t$ text lines, i.e. all pixel points from vertex $P_1$ to vertex $P_4$ as shown in FIG. 3. Fitting a straight line to these points by the least squares method gives the slope of the left boundary line of the text image region:

The corresponding offset is then:

where $(x_i, y_i)$ are the coordinates of a point $P_i$ in the point set $P$, and $N$ is the number of points in the set $P$.

The right boundary of the text image region can be fitted in the same way, except that the last line of each text paragraph needs to be filtered out.
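The slope and offset formulas are not reproduced in the text. As a hedged illustration, the standard ordinary least squares fit of a line $y = kx + b$ to $N$ points is $k = \frac{N\sum x_i y_i - \sum x_i \sum y_i}{N\sum x_i^2 - (\sum x_i)^2}$ and $b = \frac{\sum y_i - k\sum x_i}{N}$; whether the patent uses exactly this parameterization is an assumption. The function name below is hypothetical.

```python
def fit_line(points):
    """Least squares fit of y = k*x + b; returns (k, b).

    points is an iterable of (x, y) pairs, e.g. the left-boundary pixels
    of the retained text lines.
    """
    pts = list(points)
    n = len(pts)
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts)
    sxy = sum(x * y for x, y in pts)
    denom = n * sxx - sx * sx
    if denom == 0:
        # All x equal: the boundary is exactly vertical and has no finite slope.
        raise ValueError("vertical line: slope is undefined")
    k = (n * sxy - sx * sy) / denom
    b = (sy - k * sx) / n
    return k, b
```

Because a left boundary is often close to vertical, an implementation might prefer to swap the roles of x and y (fitting $x$ as a function of $y$) for numerical stability; that choice is a design consideration, not something stated in the text.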

2. Determine the minimum circumscribed quadrilateral of the text image region.

Using the slope of the left (right) boundary line, compute the straight line $l_{left}$ ($l_{right}$) that intersects the leftmost (rightmost) text line boundary of the text image region. Together with the upper boundary line of the topmost text line and the lower boundary line of the bottommost text line, this yields the minimum quadrilateral containing the text image region, whose four vertices are the top-left vertex $P_{lt}$, the top-right vertex $P_{rt}$, the bottom-right vertex $P_{rb}$, and the bottom-left vertex $P_{lb}$. The resulting quadrilateral region is shown in FIG. 7.
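One way to obtain the four vertices is to intersect the two side lines with the top and bottom boundary lines. The sketch below assumes each side line is given by its fitted slope and one point it passes through, and each horizontal boundary by two points on it; lines are stored in the general form $ax + by = c$. All function names are hypothetical.

```python
def line_with_slope(k, p):
    """Line of slope k through point p, as (a, b, c) with a*x + b*y = c."""
    return (k, -1.0, k * p[0] - p[1])


def line_through(p, q):
    """Line through points p and q, as (a, b, c) with a*x + b*y = c."""
    a = q[1] - p[1]
    b = p[0] - q[0]
    return (a, b, a * p[0] + b * p[1])


def intersect(l1, l2):
    """Intersection point of two lines in general form (Cramer's rule)."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("lines are parallel")
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)


def bounding_quad(l_left, l_right, l_top, l_bottom):
    """Return (P_lt, P_rt, P_rb, P_lb) of the enclosing quadrilateral."""
    return (intersect(l_left, l_top), intersect(l_right, l_top),
            intersect(l_right, l_bottom), intersect(l_left, l_bottom))
```

The returned order (top-left, top-right, bottom-right, bottom-left) matches the vertex naming used in the text and is convenient as input to the perspective transformation step.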

S103: Automatically correct the text watermark image by a four-point perspective transformation.

The invention automatically corrects the text watermark image by a four-point perspective transformation. Perspective transformation is the process of projecting the text image from its original plane onto a new viewing plane parallel to the plane of the image capturing device.

Let a point $(u, v)$ in the original image correspond to the coordinate point $(x, y)$ in the transformed image. The general perspective transformation formula is:

where $w$ represents the scaling of the image in the plane of the original image, $w'$ represents the scaling of the image in the plane of the transformed image, and

is the transformation matrix.

From the four vertices $P_{lt}$, $P_{rt}$, $P_{rb}$, $P_{lb}$ of the minimum circumscribed quadrilateral of the text image region and the four vertices $P'_{lt}$, $P'_{rt}$, $P'_{rb}$, $P'_{lb}$ of the corrected rectangular text image region, the transformation matrix in equation (10) can be computed. Substituting equation (10) into equations (11) and (12) gives the corrected text image region. That is, for any point $(u, v)$ in the original image, its coordinates $(x, y)$ in the corrected image are computed by:

The final automatic correction result for the original image is shown in FIG. 8.
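The transformation formulas are not reproduced in the text. The sketch below uses one common formulation, stated here as an assumption: $x = \frac{a u + b v + c}{g u + h v + 1}$ and $y = \frac{d u + e v + f}{g u + h v + 1}$, with the last matrix entry normalized to 1, so that four point pairs give eight linear equations in eight unknowns. The function names are hypothetical; a library routine such as OpenCV's `cv2.getPerspectiveTransform` followed by `cv2.warpPerspective` would typically replace this in practice.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def perspective_matrix(src, dst):
    """3x3 matrix mapping four src points (u, v) to four dst points (x, y)."""
    A, rhs = [], []
    for (u, v), (x, y) in zip(src, dst):
        A.append([u, v, 1, 0, 0, 0, -u * x, -v * x])
        rhs.append(x)
        A.append([0, 0, 0, u, v, 1, -u * y, -v * y])
        rhs.append(y)
    a, b, c, d, e, f, g, h = solve_linear(A, rhs)
    return [[a, b, c], [d, e, f], [g, h, 1.0]]


def warp_point(H, u, v):
    """Map a point (u, v) of the original image into the corrected image."""
    den = H[2][0] * u + H[2][1] * v + H[2][2]
    x = (H[0][0] * u + H[0][1] * v + H[0][2]) / den
    y = (H[1][0] * u + H[1][1] * v + H[1][2]) / den
    return x, y
```

Here `src` would be the quadrilateral vertices in the order top-left, top-right, bottom-right, bottom-left, and `dst` the corners of the target rectangle in the same order. To produce the corrected image, an implementation would usually invert the matrix and sample the original image for every pixel of the output.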

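Based on the same inventive concept, another embodiment of the invention provides a device for preprocessing a text watermark image using the method of the invention, comprising:

a text line extraction module, configured to extract all text lines in the text watermark image;

a text image region locating module, configured to locate the text image region to be corrected according to the relative positions of the text lines, and to obtain the left and right boundary lines of the text image region by straight-line fitting, thereby locating the minimum circumscribed quadrilateral of the text image region;

and an automatic correction module, configured to automatically correct the text watermark image by a four-point perspective transformation using the minimum circumscribed quadrilateral of the text image region.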

Based on the same inventive concept, another embodiment of the invention provides an electronic device (computer, server, smartphone, etc.) comprising a memory and a processor, the memory storing a computer program configured to be executed by the processor, the computer program comprising instructions for performing the steps of the method of the invention.

Based on the same inventive concept, another embodiment of the present invention provides a computer-readable storage medium (e.g., ROM/RAM, magnetic disk, optical disk) storing a computer program, which when executed by a computer, performs the steps of the inventive method.

The particular embodiments of the present invention disclosed above are illustrative only and are not intended to be limiting, since various alternatives, modifications, and variations will be apparent to those skilled in the art without departing from the spirit and scope of the invention. The invention should not be limited to the disclosure of the embodiments in the present specification, but the scope of the invention is defined by the appended claims.

Claims

1. A method for preprocessing a text watermark image, characterized by comprising the following steps:

extracting all text lines in the text watermark image;

automatically correcting the text watermark image by a four-point perspective transformation method using the minimum circumscribed quadrilateral of the text image region;

wherein the left and right boundary lines of the text image region are obtained by straight-line fitting, and obtaining the left boundary line comprises:

filtering out the first line of each text paragraph;

letting the elements of the set $S$ be $S_1, S_2, \ldots, S_k$, and judging the offset direction of the document image, i.e. left, right, or no offset, from the text lines saved in $S$, wherein the judgment proceeds as follows:

Step1: if the absolute value of the difference between the abscissas of the starting points of all text lines in the set $S$ is smaller than a threshold $T$, judging that the left boundary of the text image region is not deflected and that the text image is not to be corrected; otherwise, going to Step2;

Step2: letting the index of the current line be $S_i$, and, if the next text line $S_{i+1}$ satisfies the following condition, judging that the text line is offset to the right:

where $T_L$ is a preset line deflection threshold, and storing $S_i$ and $S_{i+1}$ in the set $Q$; if a right offset has already been determined and the following condition is satisfied:

then adding $S_{i+1}$ to the set $Q$; if the following occurs for text line $S_{i+1}$:

this indicates that the positional relationship of this line is abnormal, so $S_{i+1}$ cannot be saved into the set $Q$ and the next text line $S_{i+2}$ must be examined;

if $|S_{i+2} - S_i| = 2$ and text lines $S_i$ and $S_{i+2}$ also satisfy

or

then saving $S_{i+2}$ into the set $Q$ and continuing the search; otherwise, ending the judgment process;

a left offset of the text lines is judged as follows:

$S_i$ and $S_{i+1}$ are stored in the set $Q$; if a left offset has already been determined and equation (6) is also satisfied, $S_{i+1}$ is added to the set $Q$; if the following occurs for text line $S_{i+1}$:

the text line $S_{i+2}$ must be examined further;

if $|S_{i+2} - S_i| = 2$ and text lines $S_i$ and $S_{i+2}$ satisfy

or

Step3: once the text lines have been judged to be offset to the left or right, letting the elements of the set $Q$ be $Q_1, Q_2, \ldots, Q_t$; the corresponding text lines are, respectively,

obtaining the point set $P$ of the left boundary lines of these $t$ text lines, and fitting a straight line by the least squares method to obtain the slope of the left boundary line of the text image region:

the corresponding offset then being:

where $(x_i, y_i)$ are the coordinates of a point $P_i$ in the point set $P$, and $N$ is the number of points in the set $P$.

2. The method of claim 1, wherein extracting all lines of text in the text watermark image comprises:

first obtaining a gradient subgraph of the original image by morphological image transformation, extracting all text boxes from the gradient subgraph, and then merging all text boxes on the same line to obtain complete text lines.

3. The method of claim 1, wherein filtering out a paragraph leader line of each text paragraph comprises:

first, computing the coordinates of the midpoint of the left boundary of each text line and recording them as the starting point of the text line; for the text line sequence $L_1, L_2, \ldots, L_n$, storing the indices of the text lines to be retained in the set $S$; supposing the currently retained text line is $L_x$, with its index $x$ stored in $S$, searching lines $x+1$ to $x+3$ for a text line $L_y$ satisfying the following condition:

where $X_x$ and $X_y$ are the starting-point coordinates of text lines $L_x$ and $L_y$, respectively, and $h_k$ is the line height of the current line $L_k$; storing the index $y$ of the text line $L_y$ satisfying the condition in the set $S$, and continuing the search downward with $L_y$ as the current line until all text lines have been filtered; marking all retained text lines and using them to fit the left boundary line; the right boundary line of the text region is fitted in a similar way, except that the last line of each text paragraph is filtered out.

4. The method of claim 1, wherein locating the minimum circumscribed quadrilateral of the text image region comprises: computing, from the slope of the left boundary line, the straight line $l_{left}$ that intersects the leftmost text line boundary of the text image region; computing, from the slope of the right boundary line, the straight line $l_{right}$ that intersects the rightmost text line boundary of the text image region; and computing, together with the upper boundary line of the topmost text line and the lower boundary line of the bottommost text line of the text image region, the minimum quadrilateral containing the text image region, whose four vertices are the top-left vertex $P_{lt}$, the top-right vertex $P_{rt}$, the bottom-right vertex $P_{rb}$, and the bottom-left vertex $P_{lb}$.

5. The method of claim 1, wherein the four-point perspective transformation method is a process of projecting the text image from the original plane to a new viewing plane parallel to the plane of the image capture device.

6. The method according to claim 1, wherein the automatic rectification of the text watermark image by using a four-point perspective transformation method comprises:

letting a point $(u, v)$ in the original image correspond to the coordinate point $(x, y)$ in the transformed image, the general perspective transformation formula being:

computing the transformation matrix in equation (10) from the four vertices $P_{lt}$, $P_{rt}$, $P_{rb}$, $P_{lb}$ of the minimum circumscribed quadrilateral of the text image region and the four vertices $P'_{lt}$, $P'_{rt}$, $P'_{rb}$, $P'_{lb}$ of the corrected rectangular text image region; substituting equation (10) into equations (11) and (12) to obtain the corrected text image region; for any point $(u, v)$ in the original image, its coordinates $(x, y)$ in the corrected image are computed by:

7. An apparatus for preprocessing a text watermark image using the method of any one of claims 1 to 6, comprising:

8. An electronic apparatus, comprising a memory and a processor, the memory storing a computer program configured to be executed by the processor, the computer program comprising instructions for performing the method of any of claims 1 to 6.

9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a computer, implements the method of any one of claims 1 to 6.