CN109543525B - Table extraction method for general table image - Google Patents


Info

Publication number
CN109543525B
CN109543525B
Authority
CN
China
Prior art keywords
image, array, weight, intersection point, dimensional array
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811217691.5A
Other languages
Chinese (zh)
Other versions
CN109543525A (en)
Inventor
边赟
李天易
罗嘉礼
巫浩
李腾飞
倪浩原
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Zhongke Information Technology Co ltd
Original Assignee
Chengdu Zhongke Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Zhongke Information Technology Co ltd filed Critical Chengdu Zhongke Information Technology Co ltd
Priority to CN201811217691.5A priority Critical patent/CN109543525B/en
Publication of CN109543525A publication Critical patent/CN109543525A/en
Application granted granted Critical
Publication of CN109543525B publication Critical patent/CN109543525B/en
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40: Document-oriented image-based pattern recognition
    • G06V30/41: Analysis of document content
    • G06V30/412: Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • G06V30/413: Classification of content, e.g. text, photographs or tables
    • G06V30/414: Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the field of office automation and discloses a table extraction method for a general table image, comprising the following steps: S1, preprocess the general table image; S2, filter the characters out of the preprocessed image using morphological operations; S3, perform a reconstruction operation on the filtered image; S4, redraw the table to complete table extraction. The invention solves the problems of slow and inaccurate table extraction in the prior art.

Description

Table extraction method for general table image
Technical Field
The invention belongs to the field of office automation, and particularly relates to a table extraction method for a general table image.
Background
A form, also called a table, is both a mode of visual communication and a means of organizing and arranging data. People widely use tables of different shapes and styles in communication, scientific research, and data analysis. Tables of various kinds appear in print media, handwritten records, computer software, architectural decoration, traffic signs, and many other places. The exact conventions and terms used to describe tables vary with context. In addition, tables may differ in kind, structure, flexibility, labeling, expression, and use. In books and technical articles, tables are usually placed in a floating area with a number and a title, distinguishing them from the body of the text.
As a highly refined and concentrated form of information presentation, forms are widely used across industries. With the popularization of computers and growing enterprise informatization, more and more forms are produced on computers. In practice, because industries and application fields differ, the contents and formats of forms vary greatly, and a few fixed form styles cannot satisfy the many application requirements. Meanwhile, people increasingly use electronic documents instead of paper ones; for example, a user can photograph a paper document with a smartphone and send the image to others to transmit the information, but electronic documents obtained by photographing or scanning are stored in a picture format.
In summary, the diversity of tables and the storage of table images in picture formats make tables difficult to extract.
Disclosure of Invention
In order to solve the above problems in the prior art, the present invention aims to provide a fast and accurate table extraction method for a general table image that extracts the table from the table image, improving the efficiency of document management in office automation and overcoming the slow speed and low accuracy of table extraction in the prior art.
The technical scheme adopted by the invention is as follows:
A table extraction method for a general table image comprises the following steps:
S1: preprocess the general table image;
S2: filter the characters out of the preprocessed image using morphological operations;
S3: perform a reconstruction operation on the filtered image, comprising the following steps:
S3-1: perform open-operation image reconstruction, comprising the following steps:
A-1: initialize the input filtered image, i.e., the opened image, as the original marker image;
A-2: create a second structuring element;
A-3: perform the reconstruction operation using the previous marker image and the second structuring element, according to:
h_{k+1} = (h_k ⊕ B) ∩ G
where h_{k+1} is the current marker image; h_k is the previous marker image; k is the iteration counter; B is the second structuring element; ⊕ is the dilation operator; G is the mask image; ∩ denotes intersection;
A-4: judge whether the current marker image equals the previous marker image; if so, reconstruction is complete and the current marker image is output; otherwise, update the iteration counter and return to step A-3;
S3-2: extract the line feature coordinates of the obtained marker image, comprising the following steps:
B-1: represent the binary image as a two-dimensional array;
B-2: take the first element of the two-dimensional array as the projection origin, project the other elements in the horizontal and vertical directions, and obtain the accumulated value for each projection direction;
B-3: traverse the accumulated values and screen out all feature coordinates meeting the condition, namely those whose accumulated values exceed the minimum spacing between coordinate lines in the two projection directions;
B-4: further analyze the feature coordinates, and when more than one feature coordinate meets the condition, extract the middle value;
B-5: traverse the remaining elements of the two-dimensional array, repeating steps B-2 to B-4, to obtain the feature coordinate array;
S3-3: classify the image intersection points according to the feature coordinate array, and compute the weight array of the table in the marker image, comprising the following steps:
C-1: divide the table into different types according to how the table's ruling-line segments connect to the grid points;
C-2: traverse each intersection point according to the obtained feature coordinate array, i.e., the horizontal- and vertical-line coordinate arrays;
the intersection points are the crossings of the horizontal and vertical lines in the table;
C-3: taking the current intersection point as the starting point, extract image blocks in each of four directions in turn and judge whether a straight line is present; output 1 if a line is present and 0 otherwise;
the four directions are: transverse positive, transverse negative, longitudinal positive, and longitudinal negative;
C-4: classify the intersection points according to the output results, store the results in the corresponding pre-built three-dimensional array, and modify and check all entries of the three-dimensional array according to the modification rule;
the modification rule is: if a transverse-positive line exists at an intersection point, then a transverse-negative line must exist at the intersection point to its right, and likewise for the other directions;
C-5: obtain the weight sum of each intersection point from the weights of the corresponding directions in the three-dimensional array, and store it in the weight array, according to:
p(x,y) = q1*p(x,y,0) + q2*p(x,y,1) + q3*p(x,y,2) + q4*p(x,y,3)
where p(x,y) is the weight array; q1*p(x,y,0) is the product of the transverse-positive weight and the corresponding three-dimensional array entry; q2*p(x,y,1) is the product of the transverse-negative weight and the corresponding entry; q3*p(x,y,2) is the product of the longitudinal-positive weight and the corresponding entry; q4*p(x,y,3) is the product of the longitudinal-negative weight and the corresponding entry;
S4: redraw the table to complete table extraction;
traverse each row and each column according to the intersection-point types and the weight array, find the corresponding start and end endpoint coordinates, connect them, and redraw the table, thereby extracting the table.
Further, in step S1, the preprocessing comprises reduction, grayscale conversion, and binarization, performed in that order.
Further, in step S1, the preprocessing is computed as follows:
the grayscale conversion uses a weighted method:
F(i,j)=0.3R(i,j)+0.59G(i,j)+0.11B(i,j)
where F(i,j) is the gray value of the pixel; R(i,j), G(i,j), and B(i,j) are the red, green, and blue luminance values of the pixel; (i,j) are the pixel coordinates;
the binarization is computed as follows:
g(x,y) = 1 if f(x,y) ≥ T, and g(x,y) = 0 if f(x,y) < T
where g(x,y) is the binarized image; f(x,y) is the input grayscale image; T is the threshold.
Further, the gray value maximizing the between-class variance is used as the optimal threshold for binarization:
T* = arg max[g(t)]
where T* is the optimal threshold; g(t) is the between-class variance; t is the candidate gray value, t ∈ {0, ..., M−1}, where M is the number of gray levels in the image; arg max(·) selects the argument that maximizes.
Further, in step S2, the morphological operation performed on the preprocessed image, i.e., the binarized image, is an opening operation, i.e., an erosion followed by a dilation, computed as:
A ∘ SE = (A ⊖ SE) ⊕ SE
where ∘ is the opening operator; SE is the first structuring element; A is the input binary image; ⊕ is the dilation operator; ⊖ is the erosion operator.
The invention has the beneficial effects that:
1) the method is based on digital image processing technology and extracts the tables in general table images, improving the speed and practicality of the extraction method;
2) the method filters the characters in the general table image using morphological operations; the opening operation removes the characters together with fine noise, denoising the target, smoothing the contour of the target object, and eliminating fine protrusions, thereby improving the accuracy of table extraction;
3) the invention improves the efficiency of document management in office automation and solves the problems of low speed and low accuracy of table extraction in the prior art.
Drawings
FIG. 1 is a flow diagram of a table extraction method for a generic table image;
FIG. 2 is a flow chart of a method of a reconstruction operation;
FIG. 3 is a flow chart of a method of performing open-operation image reconstruction;
FIG. 4 is a flow chart of a method of extracting line feature coordinates of an acquired marker image;
FIG. 5 is a flowchart of a method of classifying image intersections and calculating a weight array for tables in a labeled image;
FIG. 6 is a generic form image;
FIG. 7 is a filtered image;
FIG. 8 is a redrawn table.
Detailed Description
The invention is further explained below with reference to the drawings and the specific embodiments.
Example 1:
as shown in fig. 1, the present embodiment provides a table extraction method for a general table image, comprising the following steps:
S1: preprocess the general table image shown in fig. 6, where the preprocessing comprises reduction, grayscale conversion, and binarization, performed in that order;
the grayscale conversion uses a weighted method, computed as:
F(i,j)=0.3R(i,j)+0.59G(i,j)+0.11B(i,j)
where F(i,j) is the gray value of the pixel; R(i,j), G(i,j), and B(i,j) are the red, green, and blue luminance values of the pixel; (i,j) are the pixel coordinates;
the binarization is computed as follows:
g(x,y) = 1 if f(x,y) ≥ T, and g(x,y) = 0 if f(x,y) < T
where g(x,y) is the binarized image; f(x,y) is the input grayscale image; T is the threshold;
the gray value maximizing the between-class variance is used as the optimal threshold for binarization:
T* = arg max[g(t)]
where T* is the optimal threshold; g(t) is the between-class variance; t is the candidate gray value, t ∈ {0, ..., M−1}, where M is the number of gray levels in the image, and arg max(·) selects the argument that maximizes;
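For illustration only, step S1 can be sketched in Python with OpenCV as below. The function name preprocess and the reduction scale of 0.5 are assumptions for the example; cv2.threshold with the THRESH_OTSU flag performs the between-class-variance threshold selection described above.

import cv2
import numpy as np

def preprocess(image_path, scale=0.5):
    # Reduction, grayscale conversion, and binarization, in that order (step S1).
    img = cv2.imread(image_path)                     # BGR color image
    img = cv2.resize(img, None, fx=scale, fy=scale)  # reduction (scale is an assumed value)
    b, g, r = cv2.split(img.astype(np.float64))
    gray = (0.3 * r + 0.59 * g + 0.11 * b).astype(np.uint8)  # F = 0.3R + 0.59G + 0.11B
    # Otsu's method: choose the threshold T* maximizing the between-class variance
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary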
S2: filter the characters out of the preprocessed image using morphological operations;
the morphological operation performed on the preprocessed image, i.e., the binarized image, is an opening operation, i.e., an erosion followed by a dilation;
Dilation is "domain expansion": the highlighted regions of the image grow, so the result has a larger highlighted area than the original. Mathematically, dilation is defined as a set operation; the dilation of A by SE is defined as:
A ⊕ SE = {z | (SE)_z ∩ A ≠ ∅}
that is, the dilation of A by SE is the set of points z at which the structuring element, translated to z, overlaps A in at least one pixel. Dilation merges background points adjacent to the target region into the object and fills small holes inside objects in the image;
Erosion is "domain contraction": the highlighted regions of the original image shrink, so the result has a smaller highlighted area than the original. Mathematically, both dilation and erosion can be viewed as a convolution of the image (or part of it) with a kernel; the erosion of A by SE is defined as:
A ⊖ SE = {z | (SE)_z ⊆ A}
that is, the erosion of A by SE is the set of points z at which the structuring element, translated to z, is entirely contained in A. Erosion removes burrs and protrusions in the image that are smaller than the structuring element;
In practice, when processing a binary image, dilation and erosion are combined: the opening operation first erodes A by SE and then dilates the result. The opening of A by SE is computed as:
A ∘ SE = (A ⊖ SE) ⊕ SE
where ∘ is the opening operator; SE is the first structuring element; A is the input binary image; ⊕ is the dilation operator; ⊖ is the erosion operator.
The opening operation eliminates fine noise, thereby denoising the target, smoothing the contour of the target object, and eliminating fine protrusions;
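As a non-limiting sketch of step S2 in Python with OpenCV: opening with long, thin structuring elements is one common way to keep the ruling lines while removing characters. The kernel lengths and the function name filter_characters are illustrative assumptions; the binarized image is inverted first so that the lines become foreground.

import cv2

def filter_characters(binary):
    # Invert so table lines and characters become white foreground pixels.
    inverted = cv2.bitwise_not(binary)
    # Opening = erosion then dilation (A ∘ SE = (A ⊖ SE) ⊕ SE).
    # A wide, 1-pixel-high kernel preserves horizontal lines and erodes characters away.
    h_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (40, 1))
    h_lines = cv2.morphologyEx(inverted, cv2.MORPH_OPEN, h_kernel)
    # A tall, 1-pixel-wide kernel preserves vertical lines.
    v_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 40))
    v_lines = cv2.morphologyEx(inverted, cv2.MORPH_OPEN, v_kernel)
    return cv2.bitwise_or(h_lines, v_lines)  # characters filtered out, lines kept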
s3: performing a reconstruction operation on the filtered image as shown in fig. 7;
the reconstruction operation, as shown in fig. 2, includes the following steps:
S3-1: perform open-operation image reconstruction. This is a morphological transformation involving two images and a structuring element: one image is the marker image, the starting point of the transformation, and the other is the mask image, which constrains the transformation. As shown in fig. 3, obtaining the marker image by the reconstruction operation comprises the following steps:
A-1: initialize the input filtered image, i.e., the opened image, as the original marker image;
A-2: create a second structuring element;
A-3: perform the reconstruction operation using the previous marker image and the second structuring element, according to:
h_{k+1} = (h_k ⊕ B) ∩ G
where h_{k+1} is the current marker image; h_k is the previous marker image; k is the iteration counter; B is the second structuring element; ⊕ is the dilation operator; G is the mask image; ∩ denotes intersection;
A-4: judge whether the current marker image equals the previous marker image; if so, reconstruction is complete and the current marker image is output; otherwise, update the iteration counter and return to step A-3;
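A minimal sketch of steps A-1 to A-4 in Python with OpenCV, assuming the filtered line image serves as the marker and the binarized image as the mask G; the 3×3 rectangular structuring element B and the function name reconstruct are assumed choices.

import cv2
import numpy as np

def reconstruct(marker, mask):
    # Morphological reconstruction by dilation: iterate h_{k+1} = (h_k ⊕ B) ∩ G
    # until the marker image stops changing (step A-4).
    B = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))  # second structuring element
    h = marker.copy()                                      # A-1: original marker image
    while True:
        h_next = cv2.bitwise_and(cv2.dilate(h, B), mask)   # A-3: dilate, then intersect with G
        if np.array_equal(h_next, h):                      # A-4: converged, reconstruction done
            return h_next
        h = h_next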
S3-2: extract the line feature coordinates of the obtained marker image; as shown in fig. 4, a binary image can be represented by a two-dimensional array, and the extraction comprises the following steps:
B-1: represent the binary image as a two-dimensional array;
B-2: take the first element of the two-dimensional array as the projection origin, project the other elements in the horizontal and vertical directions, and obtain the accumulated value for each projection direction;
B-3: traverse the accumulated values and screen out all feature coordinates meeting the condition, namely those whose accumulated values exceed the minimum spacing between coordinate lines in the two projection directions;
B-4: further analyze the feature coordinates, and when more than one feature coordinate meets the condition, extract the middle value;
B-5: traverse the remaining elements of the two-dimensional array, repeating steps B-2 to B-4, to obtain the feature coordinate array;
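The projection steps can be sketched as follows in Python; min_line_len, the threshold on the accumulated value, is an illustrative assumption standing in for the minimum spacing named in step B-3, and the function name line_feature_coords is hypothetical.

import numpy as np

def line_feature_coords(lines, min_line_len=50):
    binary = (lines > 0).astype(np.int32)   # B-1: binary image as a two-dimensional array
    row_sums = binary.sum(axis=1)           # B-2: horizontal projection (accumulated values)
    col_sums = binary.sum(axis=0)           # B-2: vertical projection

    def screen(sums):
        # B-3: keep coordinates whose accumulated value exceeds the threshold.
        cands = np.where(sums > min_line_len)[0]
        coords, run = [], []
        for c in cands:
            if run and c != run[-1] + 1:
                coords.append(run[len(run) // 2])  # B-4: middle value of an adjacent run
                run = []
            run.append(int(c))
        if run:
            coords.append(run[len(run) // 2])
        return coords

    # B-5: feature coordinate arrays for rows (horizontal lines) and columns (vertical lines)
    return screen(row_sums), screen(col_sums)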
S3-3: classify the image intersection points according to the feature coordinate array, and compute the weight array of the table in the marker image; as shown in fig. 5, this comprises the following steps:
C-1: divide the table into different types according to how the table's ruling-line segments connect to the grid points;
C-2: traverse each intersection point according to the obtained feature coordinate array, i.e., the horizontal- and vertical-line coordinate arrays;
the intersection points are the crossings of the horizontal and vertical lines in the table;
C-3: taking the current intersection point as the starting point, extract image blocks in each of four directions in turn and judge whether a straight line is present; output 1 if a line is present and 0 otherwise;
the four directions are: transverse positive, transverse negative, longitudinal positive, and longitudinal negative;
C-4: classify the intersection points according to the output results, store the results in the corresponding pre-built three-dimensional array, and modify and check all entries of the three-dimensional array according to the modification rule;
the modification rule is: if a transverse-positive line exists at an intersection point, then a transverse-negative line must exist at the intersection point to its right, and likewise for the other directions;
C-5: obtain the weight sum of each intersection point from the weights of the corresponding directions in the three-dimensional array, and store it in the weight array, according to:
p(x,y) = q1*p(x,y,0) + q2*p(x,y,1) + q3*p(x,y,2) + q4*p(x,y,3)
where p(x,y) is the weight array; q1*p(x,y,0) is the product of the transverse-positive weight and the corresponding three-dimensional array entry; q2*p(x,y,1) is the product of the transverse-negative weight and the corresponding entry; q3*p(x,y,2) is the product of the longitudinal-positive weight and the corresponding entry; q4*p(x,y,3) is the product of the longitudinal-negative weight and the corresponding entry;
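A sketch of steps C-2 through C-5 in Python, assuming lines is the reconstructed binary line image and rows, cols are the feature coordinate arrays from the previous sketch; the probe length, the weights q1..q4, and the function name weight_array are illustrative assumptions, and the C-4 consistency check is omitted for brevity.

import numpy as np

def weight_array(lines, rows, cols, probe=10, q=(1, 2, 4, 8)):
    # C-2: traverse every intersection of a horizontal and a vertical line coordinate.
    p3 = np.zeros((len(rows), len(cols), 4), dtype=np.int32)  # three-dimensional array
    for i, y in enumerate(rows):
        for j, x in enumerate(cols):
            # C-3: probe an image block in each of the four directions; 1 if a line exists.
            p3[i, j, 0] = int(lines[y, x:x + probe].any())           # transverse positive
            p3[i, j, 1] = int(lines[y, max(x - probe, 0):x].any())   # transverse negative
            p3[i, j, 2] = int(lines[y:y + probe, x].any())           # longitudinal positive
            p3[i, j, 3] = int(lines[max(y - probe, 0):y, x].any())   # longitudinal negative
    # C-5: weight sum p(x,y) = q1*p(x,y,0) + q2*p(x,y,1) + q3*p(x,y,2) + q4*p(x,y,3)
    p = sum(q[d] * p3[:, :, d] for d in range(4))
    return p3, p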
S4: redraw the table to extract it; as shown in fig. 8, traverse each row and each column according to the intersection-point types and the weight array, find the corresponding start and end endpoint coordinates, connect them to finish redrawing the table, and thereby extract the table.
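Finally, step S4 can be sketched as follows, reusing p3 from the previous sketch: walk each row of intersections, open a segment at a start endpoint (a transverse-positive flag), close it at an end endpoint (a transverse-negative flag with no continuation), and draw the connected segment; columns are handled symmetrically. This is a simplified reading of the weight-based traversal, not the patent's exact procedure, and the function name redraw is hypothetical.

import cv2
import numpy as np

def redraw(shape, rows, cols, p3):
    canvas = np.zeros(shape, dtype=np.uint8)
    for i, y in enumerate(rows):            # horizontal segments along each row
        start = None
        for j, x in enumerate(cols):
            if start is None and p3[i, j, 0]:                 # start endpoint
                start = x
            elif start is not None and p3[i, j, 1] and not p3[i, j, 0]:
                cv2.line(canvas, (start, y), (x, y), 255, 1)  # connect and draw
                start = None
    for j, x in enumerate(cols):            # vertical segments along each column
        start = None
        for i, y in enumerate(rows):
            if start is None and p3[i, j, 2]:
                start = y
            elif start is not None and p3[i, j, 3] and not p3[i, j, 2]:
                cv2.line(canvas, (x, start), (x, y), 255, 1)
                start = None
    return canvas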
The invention aims to provide a fast and accurate table extraction method for a general table image that extracts the table from the table image, improving the efficiency of document management in office automation and overcoming the slow speed and low accuracy of table extraction in the prior art.
The present invention is not limited to the above-described alternative embodiments; anyone may derive other products in various forms in light of the present invention. The above detailed description should not be taken as limiting the scope of the invention, which is defined by the claims; the description should be interpreted accordingly.

Claims (5)

1. A table extraction method for a general table image, characterized by comprising the following steps:
S1: preprocess the general table image;
S2: filter the characters out of the preprocessed image using morphological operations;
S3: perform a reconstruction operation on the filtered image, comprising the following steps:
S3-1: perform open-operation image reconstruction, comprising the following steps:
A-1: initialize the input filtered image, i.e., the opened image, as the original marker image;
A-2: create a second structuring element;
A-3: perform the reconstruction operation using the previous marker image and the second structuring element, according to:
h_{k+1} = (h_k ⊕ B) ∩ G
where h_{k+1} is the current marker image; h_k is the previous marker image; k is the iteration counter; B is the second structuring element; ⊕ is the dilation operator; G is the mask image; ∩ denotes intersection;
A-4: judge whether the current marker image equals the previous marker image; if so, reconstruction is complete and the current marker image is output; otherwise, update the iteration counter and return to step A-3;
S3-2: extract the line feature coordinates of the obtained marker image, comprising the following steps:
B-1: represent the binary image as a two-dimensional array;
B-2: take the first element of the two-dimensional array as the projection origin, project the other elements in the horizontal and vertical directions, and obtain the accumulated value for each projection direction;
B-3: traverse the accumulated values and screen out all feature coordinates meeting the condition, namely those whose accumulated values exceed the minimum spacing between coordinate lines in the two projection directions;
B-4: further analyze the feature coordinates, and when more than one feature coordinate meets the condition, extract the middle value;
B-5: traverse the remaining elements of the two-dimensional array, repeating steps B-2 to B-4, to obtain the feature coordinate array;
S3-3: classify the image intersection points according to the feature coordinate array, and compute the weight array of the table in the marker image, comprising the following steps:
C-1: divide the table into different types according to how the table's ruling-line segments connect to the grid points;
C-2: traverse each intersection point according to the obtained feature coordinate array, i.e., the horizontal- and vertical-line coordinate arrays;
the intersection points are the crossings of the horizontal and vertical lines in the table;
C-3: taking the current intersection point as the starting point, extract image blocks in each of four directions in turn and judge whether a straight line is present; output 1 if a line is present and 0 otherwise;
the four directions are: transverse positive, transverse negative, longitudinal positive, and longitudinal negative;
C-4: classify the intersection points according to the output results, store the results in the corresponding pre-built three-dimensional array, and modify and check all entries of the three-dimensional array according to the modification rule;
the modification rule is: if a transverse-positive line exists at an intersection point, then a transverse-negative line must exist at the intersection point to its right, and likewise for the other directions;
C-5: obtain the weight sum of each intersection point from the weights of the corresponding directions in the three-dimensional array, and store it in the weight array, according to:
p(x,y) = q1*p(x,y,0) + q2*p(x,y,1) + q3*p(x,y,2) + q4*p(x,y,3)
where p(x,y) is the weight array; q1*p(x,y,0) is the product of the transverse-positive weight and the corresponding three-dimensional array entry; q2*p(x,y,1) is the product of the transverse-negative weight and the corresponding entry; q3*p(x,y,2) is the product of the longitudinal-positive weight and the corresponding entry; q4*p(x,y,3) is the product of the longitudinal-negative weight and the corresponding entry;
S4: redraw the table to complete table extraction;
traverse each row and each column according to the intersection-point types and the weight array, find the corresponding start and end endpoint coordinates, connect them, and redraw the table, thereby extracting the table.
2. The table extraction method for a general table image according to claim 1, characterized in that: in step S1, the preprocessing comprises reduction, grayscale conversion, and binarization, performed in that order.
3. The table extraction method for a general table image according to claim 2, characterized in that: in step S1, the preprocessing is computed as follows:
the grayscale conversion uses a weighted method:
F(i,j)=0.3R(i,j)+0.59G(i,j)+0.11B(i,j)
where F(i,j) is the gray value of the pixel; R(i,j), G(i,j), and B(i,j) are the red, green, and blue luminance values of the pixel; (i,j) are the pixel coordinates;
the binarization is computed as follows:
g(x,y) = 1 if f(x,y) ≥ T, and g(x,y) = 0 if f(x,y) < T
where g(x,y) is the binarized image; f(x,y) is the input grayscale image; T is the threshold.
4. The table extraction method for a general table image according to claim 3, characterized in that: the gray value maximizing the between-class variance is used as the optimal threshold for binarization:
T* = arg max[g(t)]
where T* is the optimal threshold; g(t) is the between-class variance; t is the candidate gray value, t ∈ {0, ..., M−1}, where M is the number of gray levels in the image; arg max(·) selects the argument that maximizes.
5. The table extraction method for a general table image according to claim 4, characterized in that: in step S2, the morphological operation performed on the preprocessed image, i.e., the binarized image, is an opening operation, i.e., an erosion followed by a dilation, computed as:
A ∘ SE = (A ⊖ SE) ⊕ SE
where ∘ is the opening operator; SE is the first structuring element; A is the input binary image; ⊕ is the dilation operator; ⊖ is the erosion operator.
CN201811217691.5A 2018-10-18 2018-10-18 Table extraction method for general table image Active CN109543525B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811217691.5A CN109543525B (en) 2018-10-18 2018-10-18 Table extraction method for general table image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811217691.5A CN109543525B (en) 2018-10-18 2018-10-18 Table extraction method for general table image

Publications (2)

Publication Number Publication Date
CN109543525A CN109543525A (en) 2019-03-29
CN109543525B true CN109543525B (en) 2020-12-11

Family

ID=65844215

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811217691.5A Active CN109543525B (en) 2018-10-18 2018-10-18 Table extraction method for general table image

Country Status (1)

Country Link
CN (1) CN109543525B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111859874B (en) * 2019-04-17 2023-06-13 百度在线网络技术(北京)有限公司 Form generation method and system, video playing device and computer readable medium
CN112115111A (en) * 2019-06-20 2020-12-22 上海怀若智能科技有限公司 OCR-based document version management method and system
CN111539312A (en) * 2020-04-21 2020-08-14 罗嘉杰 Method for extracting table from image

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103258198A (en) * 2013-04-26 2013-08-21 四川大学 Extraction method for characters in form document image
CN106156761A (en) * 2016-08-10 2016-11-23 北京交通大学 The image form detection of facing moving terminal shooting and recognition methods
CN106446881A (en) * 2016-07-29 2017-02-22 北京交通大学 Method for extracting lab test result from medical lab sheet image
CN106897690A (en) * 2017-02-22 2017-06-27 南京述酷信息技术有限公司 PDF table extracting methods
CN107958201A (en) * 2017-10-13 2018-04-24 上海眼控科技股份有限公司 A kind of intelligent checking system and method for vehicle annual test insurance policy form
CN108491788A (en) * 2018-03-20 2018-09-04 上海眼控科技股份有限公司 A kind of intelligent extract method and device for financial statement cell

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103258198A (en) * 2013-04-26 2013-08-21 四川大学 Extraction method for characters in form document image
CN106446881A (en) * 2016-07-29 2017-02-22 北京交通大学 Method for extracting lab test result from medical lab sheet image
CN106156761A (en) * 2016-08-10 2016-11-23 北京交通大学 The image form detection of facing moving terminal shooting and recognition methods
CN106897690A (en) * 2017-02-22 2017-06-27 南京述酷信息技术有限公司 PDF table extracting methods
CN107958201A (en) * 2017-10-13 2018-04-24 上海眼控科技股份有限公司 A kind of intelligent checking system and method for vehicle annual test insurance policy form
CN108491788A (en) * 2018-03-20 2018-09-04 上海眼控科技股份有限公司 A kind of intelligent extract method and device for financial statement cell

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"Designing a Flexible Framework for a Table Abstraction"; H. Conrad Cunningham et al.; Springer US; Dec. 31, 2009; pp. 1-38 *
"A Morphological Algorithm for Removing Table Frame Lines Based on Adaptive Structuring Elements"; Liu Yanshun et al.; Journal of Guizhou University (Natural Science Edition); Jul. 31, 2008; vol. 25, no. 4; pp. 350-353, 361 *
"Research on Printed Table Recognition"; Liu Yu; China Masters' Theses Full-text Database, Information Science and Technology; Apr. 15, 2014; no. 4; pp. I138-979 *

Also Published As

Publication number Publication date
CN109543525A (en) 2019-03-29

Similar Documents

Publication Publication Date Title
CN104751142B (en) A kind of natural scene Method for text detection based on stroke feature
Singh et al. A study of image segmentation algorithms for different types of images
US20110052062A1 (en) System and method for identifying pictures in documents
CN109543525B (en) Table extraction method for general table image
CN104408449B (en) Intelligent mobile terminal scene literal processing method
CN102800094A (en) Fast color image segmentation method
CN101930461A (en) Digital image visualized management and retrieval for communication network
DE102018003475A1 (en) Form-based graphic search
Al Abodi et al. An effective approach to offline Arabic handwriting recognition
CN112712860B (en) Grain finite element model modeling method based on real metallographic structure
CN109409211B (en) Processing method, processing device and storage medium for Chinese character skeleton stroke segments
JP2011248702A (en) Image processing device, image processing method, image processing program, and program storage medium
US9323726B1 (en) Optimizing a glyph-based file
CN104636309B (en) Matrix recognition methods based on machine vision
CN108805139A (en) A kind of image similarity computational methods based on frequency-domain visual significance analysis
CN109271882B (en) Method for extracting color-distinguished handwritten Chinese characters
Dong et al. A parallel thinning algorithm based on stroke continuity detection
CN104809721B (en) A kind of caricature dividing method and device
CN107358244B (en) A kind of quick local invariant feature extracts and description method
CN113628113A (en) Image splicing method and related equipment thereof
CN103927533B (en) The intelligent processing method of graph text information in a kind of scanned document for earlier patents
CN106056575B (en) A kind of image matching method based on like physical property proposed algorithm
CN110490210B (en) Color texture classification method based on t sampling difference between compact channels
Takahashi et al. Region graph based text extraction from outdoor images
CN107818579B (en) Color texture feature extraction method based on quaternion Gabor filtering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40003778

Country of ref document: HK

GR01 Patent grant