CN112036232A - Image table structure identification method, system, terminal and storage medium - Google Patents

Image table structure identification method, system, terminal and storage medium Download PDF

Info

Publication number
CN112036232A
Authority
CN
China
Prior art keywords
line
lines
coordinate
longitudinal
transverse
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010662891.2A
Other languages
Chinese (zh)
Other versions
CN112036232B (en)
Inventor
刘云锴
彭程
边赟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Zhongke Information Technology Co ltd
Chengdu Information Technology Co Ltd of CAS
Original Assignee
Chengdu Zhongke Information Technology Co ltd
Chengdu Information Technology Co Ltd of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Zhongke Information Technology Co ltd, Chengdu Information Technology Co Ltd of CAS filed Critical Chengdu Zhongke Information Technology Co ltd
Priority to CN202010662891.2A priority Critical patent/CN112036232B/en
Publication of CN112036232A publication Critical patent/CN112036232A/en
Application granted granted Critical
Publication of CN112036232B publication Critical patent/CN112036232B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40Document-oriented image-based pattern recognition
    • G06V30/41Analysis of document content
    • G06V30/412Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/457Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by analysing connectivity, e.g. edge linking, connected component analysis or slices

Abstract

The application relates to an image table structure identification method, system, terminal and storage medium. The method comprises the following steps: performing frame line detection on the table image to be identified by using the LSD algorithm to obtain, respectively, the detection results of the transverse lines and the longitudinal lines of the table structure in the table image to be identified; detecting each straight line in the transverse-line and longitudinal-line detection results against a set transverse threshold and a set longitudinal threshold to obtain the straight lines that belong to collinear common segments, and merging two or more collinear common-segment straight lines belonging to the same frame line to obtain the complete transverse and longitudinal lines of the table structure; and merging the complete transverse and longitudinal lines and aligning the merged lines to obtain the table structure in the table image to be identified. Embodiments of the application require little image preprocessing, recognize faster, and produce more accurate recognition results.

Description

Image table structure identification method, system, terminal and storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a method, a system, a terminal, and a storage medium for identifying an image table structure.
Background
Tables are the most concise way of summarizing textual data records and the most common format for presenting statistics and analysis results, and they are a basic tool in all kinds of data analysis. A great deal of current network information is recorded in tables, but many tables are provided as pictures, such as scanned archives and PDF files; automatically recognizing these image tables and restoring the picture-type table content to digital data is the basis for processing and analyzing such data quickly. Because of the structural characteristics of table data, recognizing a table image is more difficult than recognizing ordinary image text.
The form image recognition method adopted in the prior art comprises the following steps:
1. Table frame line detection based on the projection method. Its disadvantages: the image preprocessing workload is large; the image must be eroded, and the requirements on the erosion structuring element are high, which directly affects the final recognition result. In addition, every pixel in the picture must be examined and the foreground pixels accumulated and summed, with frame lines judged from the magnitude of the summed values.
2. Table image recognition based on the Hough transform. Its disadvantages: the time cost of frame line detection is too high, the exact coordinate representation of a detected straight line cannot be determined, and adjacent pixels or near-parallel lines close to the same straight line cause false detections and missed detections.
3. Table frame line detection based on the run-length method. It places high demands on picture resolution, performs poorly on low-resolution images, detects frame lines belonging to different segments on the same line poorly, and has low accuracy in engineering practice.
Disclosure of Invention
The application provides an image table structure identification method, system, terminal and storage medium, which aim to solve, at least to a certain extent, the technical problems of the prior art: a large image preprocessing workload, long detection time, and poor frame line detection in low-resolution images.
In order to solve the above problems, the present application provides the following technical solutions:
an image table structure identification method comprises the following steps:
step a: carrying out frame line detection on the form image to be identified by using an LSD algorithm to respectively obtain the detection results of the transverse lines and the longitudinal lines of the form structure in the form image to be identified;
step b: respectively detecting each straight line in the detection results of the transverse lines and the longitudinal lines according to a set transverse threshold and a set longitudinal threshold to obtain the straight lines which belong to the collinear common segment in the detection results of the transverse lines and the longitudinal lines, and combining two or more than two collinear common segment straight lines which belong to the same frame line to obtain the complete transverse lines and longitudinal lines in the table structure;
step c: and combining the complete transverse lines and the complete longitudinal lines, and aligning the combined transverse lines and the combined longitudinal lines to obtain a table structure in the table image to be identified.
The technical scheme adopted by the embodiment of the application further comprises the following steps: in the step a, the frame line detection of the form image to be recognized by using the LSD algorithm includes:
calculating the level-line angle of each pixel point in the table image to be recognized;
defining an error value of the form image to be recognized, calculating an error between a level-line angle of each pixel point and a current region angle, performing region combination on the pixel points with the errors smaller than the error value, and updating the combined region;
constructing an external matrix for each updated region, calculating the NFA value of each updated region, and judging the matrix of which the NFA value meets a set threshold value as an output straight line;
the technical scheme adopted by the embodiment of the application further comprises the following steps: in the step a, the performing frame line detection on the form image to be recognized by using the LSD algorithm further includes:
and screening all the straight lines by using a set parameter threshold, and eliminating useless frame lines in all the straight lines to obtain the detection results of the horizontal lines and the vertical lines of the table structure in the table image to be identified.
The technical scheme adopted by the embodiment of the application further comprises the following steps: in step b, the detecting each straight line in the detection results of the horizontal line and the vertical line according to the set horizontal threshold and the vertical threshold respectively includes:
setting the coordinates of two adjacent straight lines to (x0, y0, x1, y1) and (x0′, y0′, x1′, y1′), giving a transverse threshold lineWidth and a longitudinal threshold lineHeight, and judging whether the straight lines need to be merged by adopting a straight line judgment rule based on double thresholds; the straight line judgment rule based on the double thresholds is specifically as follows:
two straight lines satisfying ((y0′ − y0) ≤ lineHeight) ∨ ((y1′ − y1) ≤ lineHeight) are judged to be transversely collinear;
two straight lines satisfying (((y0′ − y0) ≤ lineHeight) ∨ ((y1′ − y1) ≤ lineHeight)) ∧ ((x0′ − x1) ≤ lineWidth) are judged to be a transverse collinear common segment;
two straight lines satisfying ((x0′ − x0) ≤ lineWidth) ∨ ((x1′ − x1) ≤ lineWidth) are judged to be longitudinally collinear;
two straight lines satisfying (((x0′ − x0) ≤ lineWidth) ∨ ((x1′ − x1) ≤ lineWidth)) ∧ ((y0′ − y1) ≤ lineHeight) are judged to be a longitudinal collinear common segment.
The technical scheme adopted by the embodiment of the application further comprises the following steps: in the step b, the merging two or more collinear co-segment straight lines belonging to the same frame line further includes:
sorting the horizontal line coordinate sets in the horizontal line detection results from small to large according to the vertical coordinates of the horizontal line coordinate sets to obtain horizontal line sets, and sorting the vertical line coordinate sets in the vertical line detection results from small to large according to the horizontal coordinates of the vertical line coordinate sets to obtain vertical line sets;
traversing each transverse line and each longitudinal line in the transverse line set and the longitudinal line set respectively, and detecting by using the straight line judgment rule based on the double thresholds to obtain a transverse line collinear set and a longitudinal line collinear set;
traversing each transverse line and each longitudinal line in the transverse line collinear set and the longitudinal line collinear set respectively, and detecting the transverse lines and the longitudinal lines of the collinear common sections by using the straight line judgment rule based on the double thresholds;
respectively merging and thinning the transverse lines and the longitudinal lines of the collinear common sections;
and screening the combined and refined transverse lines and longitudinal lines according to a set length threshold to obtain a final transverse line combined and refined result set heng _ final and a final longitudinal line combined and refined result set zong _ final.
The technical scheme adopted by the embodiment of the application further comprises the following steps: in the step c, the aligning the combined horizontal line and vertical line includes:
introducing heng_left, heng_right and y_is_visited into the horizontal line merging and refining result set heng_final, wherein heng_left, heng_right and y_is_visited are respectively used for judging whether each horizontal line meets the requirements of left coordinate alignment, right coordinate alignment and vertical coordinate alignment;
traversing each transverse line in heng_final, and respectively performing left coordinate alignment, right coordinate alignment and vertical coordinate alignment on each transverse line by using heng_left, heng_right and y_is_visited to obtain the aligned heng_final;
introducing zong_up, zong_below and x_is_visited into the vertical line merging and refining result set zong_final, wherein they are respectively used for judging whether each vertical line meets the requirements of upper coordinate alignment, lower coordinate alignment and horizontal coordinate alignment;
traversing each vertical line in zong_final, and respectively performing upper coordinate alignment, lower coordinate alignment and horizontal coordinate alignment on each vertical line by using zong_up, zong_below and x_is_visited to obtain the coordinate-aligned zong_final.
The technical scheme adopted by the embodiment of the application further comprises the following steps: the operation of performing left coordinate alignment, right coordinate alignment and vertical coordinate alignment on each transverse line comprises the following steps:
respectively obtaining a left coordinate, a right coordinate and a vertical coordinate of a current transverse line, if the heng _ left, heng _ right and y _ is _ found of the current transverse line are 0, indicating that the transverse line is not subjected to left alignment, right alignment and vertical alignment, traversing the residual transverse lines by taking the transverse line as a reference transverse line, finding out all transverse lines meeting preset conditions, respectively calculating a left coordinate mean value, a right coordinate mean value and a vertical coordinate mean value of all transverse lines meeting the preset conditions, respectively updating the left coordinate, the right coordinate and the vertical coordinate of the reference transverse line and all transverse lines meeting the preset conditions by using the left coordinate mean value, the right coordinate mean value and the vertical coordinate mean value, and obtaining heng _ final after the transverse lines are aligned;
the preset conditions are as follows: and the left coordinate difference, the right coordinate difference and the vertical coordinate difference with the reference transverse line are all within a set alignment threshold, and the left alignment, the right alignment and the vertical alignment are not carried out.
The technical scheme adopted by the embodiment of the application further comprises the following steps: the operation of performing upper coordinate alignment, lower coordinate alignment and horizontal coordinate alignment on each longitudinal line comprises the following steps:
respectively acquiring an upper coordinate, a lower coordinate and an abscissa of a current longitudinal line, if zong _ up, zong _ below and x _ is _ determined are 0, indicating that the longitudinal line is not subjected to upper alignment, lower alignment and abscissa alignment, traversing the remaining longitudinal lines by taking the longitudinal line as a reference longitudinal line, finding out all longitudinal lines meeting preset conditions, respectively calculating an upper coordinate mean value, a lower coordinate mean value and an abscissa mean value of all the longitudinal lines meeting the preset conditions, respectively updating the upper coordinate, the lower coordinate and the abscissa of the reference longitudinal line and all the longitudinal lines meeting the preset conditions by using the upper coordinate mean value, the lower coordinate mean value and the abscissa mean value, and obtaining zong _ final after the longitudinal lines are aligned;
the preset conditions are as follows: and the upper coordinate difference, the lower coordinate difference and the abscissa difference with the reference vertical line are within a set alignment threshold, and the upper alignment, the lower alignment and the abscissa alignment are not carried out.
The technical scheme adopted by the embodiment of the application further comprises the following steps: the method for obtaining heng _ final after the alignment of the horizontal lines further comprises the following steps:
traversing each transverse line in the heng _ final after the transverse lines are aligned, taking the current transverse line as a reference transverse line, traversing each longitudinal line in the zong _ final, finding out all longitudinal lines of which the horizontal coordinates and the horizontal coordinates of the reference transverse line are within an alignment threshold value, and assigning the horizontal coordinates of the reference transverse line as the horizontal coordinates of all the longitudinal lines within the alignment threshold value; if no vertical line with the horizontal coordinate within the alignment threshold value is found, the horizontal line is judged to be a false frame line, and the horizontal line is removed.
The technical scheme adopted by the embodiment of the application further comprises the following steps: the step of obtaining the zong _ final after the alignment of the longitudinal lines further comprises:
traversing each longitudinal line in the aligned zong _ final of the longitudinal lines, taking the current longitudinal line as a reference longitudinal line, traversing each transverse line in the heng _ final, finding out all transverse lines of which the longitudinal coordinates and the longitudinal coordinates of the reference longitudinal line are within an alignment threshold, and assigning the longitudinal coordinates of the reference longitudinal line as the longitudinal coordinates of all transverse lines within the alignment threshold; and if no horizontal line with the vertical coordinate within the alignment threshold value of the vertical coordinate and the vertical coordinate of the reference vertical line is found, judging that the vertical line is a false frame line, and rejecting the false frame line.
Another technical scheme adopted by the embodiment of the application is as follows: an image table structure recognition system, comprising:
the frame line detection module: the method comprises the steps that frame line detection is carried out on a form image to be recognized by using an LSD algorithm, and detection results of transverse lines and longitudinal lines of a form structure in the form image to be recognized are obtained;
a threshold detection module: the system comprises a table structure, a horizontal line detection result and a vertical line detection result, wherein the table structure is used for respectively detecting each straight line in the horizontal line detection result and the vertical line detection result according to a set horizontal threshold and a set vertical threshold to respectively obtain straight lines which belong to a collinear common segment in the horizontal line detection result and the vertical line detection result, and combining two or more than two collinear common segments which belong to the same frame line to obtain a complete horizontal line and a complete vertical line in the table structure;
a frame line merging module: for merging the complete transverse and longitudinal lines;
a wire alignment module: and aligning the combined transverse line and longitudinal line to obtain a table structure in the table image to be identified.
The embodiment of the application adopts another technical scheme that: a terminal comprising a processor, a memory coupled to the processor, wherein,
the memory stores program instructions for implementing the image table structure identification method;
the processor is to execute the program instructions stored by the memory to control image table structure identification.
The embodiment of the application adopts another technical scheme that: a storage medium storing processor-executable program instructions for performing the image table structure identification method.
Compared with the prior art, the embodiment of the application has the advantages that: the method, the system, the terminal and the storage medium for identifying the image table structure detect the frame line in the table structure by using an LSD algorithm, merge and refine a plurality of line segments belonging to the same frame line by adopting a straight line judgment rule based on double thresholds to obtain a complete frame line, and perform alignment and correction operations on the complete frame line to obtain a final table frame line. Compared with the prior art, the embodiment of the application has at least the following advantages:
1. the algorithm parameters are strong in universality and generalization performance; in the actual experiment process, the image resolution is close, and in the images with similar table structures, parameters are hardly required to be adjusted, so that the images can be directly packaged for use.
2. Compared with the traditional Hough-transform frame line detection method, the projection-based frame line detection method and the run-length-based line segment detection method, the embodiments of the application apply the LSD line segment detection algorithm, widely used on remote sensing images, to the identification of image table structures; they require less image preprocessing, recognize faster, and produce more accurate recognition results.
Drawings
FIG. 1 is a flow chart of an image table structure identification method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of the detection results of horizontal and vertical lines in the table structure according to the embodiment of the present application;
FIGS. 3(a) and 3(b) are diagrams illustrating the merged refining result of collinear co-segment straight lines according to the embodiment of the present application;
fig. 4(a) and 4(b) are schematic structural diagrams of rough tables obtained by combining horizontal lines and vertical lines according to embodiments of the present application;
FIG. 5 is a table structure diagram illustrating the alignment of the horizontal and vertical lines according to an embodiment of the present application;
FIG. 6(a) is a schematic diagram of a non-strictly defined form image, and FIG. 6(b) is a schematic diagram of a recognition result of recognizing the non-strictly defined form image by using an algorithm according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of an image table structure recognition system according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of a terminal according to an embodiment of the present application;
fig. 9 is a schematic structural diagram of a storage medium according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Please refer to fig. 1, which is a flowchart illustrating an image table structure recognition method according to an embodiment of the present application. The image table structure identification method of the embodiment of the application comprises the following steps:
Step 100: acquiring a form image to be identified;
Step 200: carrying out frame line detection (covering both the horizontal lines and the vertical lines of the table) on the table image to be identified by using the LSD (Line Segment Detector) algorithm to obtain the horizontal line and vertical line detection results of the table structure;
In step 200, the LSD algorithm is an image straight-line processing algorithm that is widely applied to the detection and identification of geometric targets in remote sensing images because of its strong sensitivity to straight lines. The LSD algorithm sets an error value for each image and merges pixels on the basis of this error value; a region into which a number of pixels have been merged is a detected frame line, so detection of the image frame lines is completed in linear time.
Specifically, in the embodiment of the present application, the table frame line detection algorithm based on the LSD algorithm includes the following steps:
step 201: calculating the level-line angle LLA and the gradient value of each pixel point in the table image I to be recognized
Let i(x, y) be the gray value at (x, y) in the form image I to be recognized, Ix(x, y) the gradient value of the pixel point i(x, y) in the x direction, Iy(x, y) the gradient value in the y direction, and G(x, y) the total gradient value. Following the standard LSD formulation, these are computed as:

Ix(x, y) = [i(x+1, y) + i(x+1, y+1) − i(x, y) − i(x, y+1)] / 2

Iy(x, y) = [i(x, y+1) + i(x+1, y+1) − i(x, y) − i(x+1, y)] / 2

G(x, y) = sqrt(Ix(x, y)² + Iy(x, y)²)

LLA(x, y) = arctan(Ix(x, y) / (−Iy(x, y)))    (1)

Step 202: pixel point region growing. An error value τ is defined for the table image to be recognized; the error between the level-line angle of each pixel point and the current region angle is calculated, pixel points whose error is smaller than τ are merged into the region, and the merged region is updated.

Further, assuming that the current region is region, let Sx denote the accumulated cos(θregion) contribution and Sy the accumulated sin(θregion) contribution of the region; the current region angle is then calculated as:

θregion = arctan(Sy / Sx)

Sx = Sx + cos(LLA(i(x, y)))

Sy = Sy + sin(LLA(i(x, y)))    (2)

Step 203: a circumscribed rectangle is constructed for each updated region, the NFA value of each rectangle is calculated, and a rectangle whose NFA value meets the set threshold is judged to be an output straight line; otherwise the region is not output as a straight line.

Further, assuming that the rows and columns of the table image I to be recognized are M and N respectively, that n denotes the number of pixel points in the circumscribed rectangle, and that k denotes the number of aligned points in the rectangle (aligned points are pixel points whose level-line angle differs from the principal direction angle of the rectangle by no more than the error value τ), the NFA value of the rectangle rect is calculated, in the standard LSD form, as:

NFA(rect) = (M·N)^(5/2) · γ · Σ_{j=k..n} C(n, j) · p^j · (1 − p)^(n−j)    (3)

In formula (3), p is the probability that a pixel point is an aligned point (equal to τ/π in the standard LSD setting), the parameter γ represents the number of possible p values, and τ is the error value.

Step 204: all straight lines in the output frame line regions are obtained from the LSD straight line detection, the detection results are screened with a set parameter threshold, and useless frame lines are removed to obtain the horizontal line and vertical line detection results of the table structure in the table image to be identified.

Specifically, each frame line is represented by coordinates (x0, y0, x1, y1), where (x0, y0) is the starting position of a straight line and (x1, y1) is its end position. The table image contains useless frame lines such as character strokes and non-table frame lines, so after all straight lines have been detected, the detection results are further screened with the set parameter threshold and useless frame lines are eliminated, yielding the horizontal line and vertical line detection results of the table structure. Fig. 2 is a schematic diagram of the detection results of the horizontal and vertical lines in the table structure according to the embodiment of the present application.
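For illustration only, a minimal sketch of this detection stage is given below. It assumes OpenCV's built-in LSD implementation (cv2.createLineSegmentDetector, which is not available in every OpenCV build); the function name, the orientation test used to separate horizontal from vertical lines, and the length threshold are illustrative assumptions rather than the exact screening rule of the embodiment:

    import cv2
    import numpy as np

    def detect_table_lines(gray, min_len=70.0, angle_tol_deg=5.0):
        """Detect candidate frame lines with LSD and split them into horizontal
        and vertical sets, discarding short (useless) segments."""
        lsd = cv2.createLineSegmentDetector()
        segments = lsd.detect(gray)[0]          # array of shape (N, 1, 4): x0, y0, x1, y1
        horizontal, vertical = [], []
        if segments is None:
            return horizontal, vertical
        for x0, y0, x1, y1 in segments.reshape(-1, 4):
            dx, dy = abs(x1 - x0), abs(y1 - y0)
            if np.hypot(dx, dy) < min_len:      # parameter-threshold screening of useless lines
                continue
            angle = np.degrees(np.arctan2(dy, dx))
            if angle < angle_tol_deg:           # nearly horizontal segment
                horizontal.append((min(x0, x1), y0, max(x0, x1), y1))
            elif angle > 90.0 - angle_tol_deg:  # nearly vertical segment
                vertical.append((x0, min(y0, y1), x1, max(y0, y1)))
        return horizontal, vertical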
Step 300: carrying out binarization processing on the frame line result of the table image detected by the LSD based on the maximum inter-class variance method of the global threshold;
in step 300, since the difference between the background and the foreground of the table image is large, in the embodiment of the present application, the inter-class variance between the background and the foreground is first calculated, and the threshold obtained when the inter-class variance between the background and the foreground reaches the maximum value is used as the global threshold for the image binarization.
Further, assuming that the table image to be recognized has M gray values, given the initial gray value t0Dividing the form image to be recognized into G1And G2Two groups, wherein G1Containing all values less than the grey value t0Pixel of (2), G2Containing all values greater than the grey value t0N represents the total number of pixels in the form image to be recognized, NiThe number of pixels representing the gray value i, the probability of each gray value i appearing in the form image to be identified is:
p(i) = Ni / N

The probabilities of occurrence ω1(t0) and ω2(t0) of the groups G1 and G2 and their mean gray values μ1(t0) and μ2(t0) are respectively:

ω1(t0) = Σ_{i=0..t0} p(i)

ω2(t0) = Σ_{i=t0+1..M−1} p(i) = 1 − ω1(t0)

μ1(t0) = Σ_{i=0..t0} i·p(i) / ω1(t0)

μ2(t0) = Σ_{i=t0+1..M−1} i·p(i) / ω2(t0)

The between-class variance σ(t0)² is:

σ(t0)² = ω1(t0) · ω2(t0) · [μ1(t0) − μ2(t0)]²

The global threshold is the value that maximizes the between-class variance, i.e. argmax σ(t0)², t0 ∈ [0, M−1].
Based on the above, in the embodiment of the present application, the maximum inter-class variance method based on the global threshold is used to perform binarization processing on the frame line result of the table image detected by the LSD, and since the threshold selected by the maximum inter-class variance method is relatively stable and the segmentation efficiency is relatively high, a better binarization effect can be obtained.
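Assuming OpenCV is available, this step can be sketched in a few lines, since cv2.threshold with the THRESH_OTSU flag selects exactly the global threshold that maximizes the between-class variance derived above (the inverted binarization assumes dark frame lines on a light background):

    import cv2

    def binarize_table_image(gray):
        """Otsu (maximum between-class variance) binarization of an 8-bit
        single-channel table image with a global threshold."""
        t0, binary = cv2.threshold(gray, 0, 255,
                                   cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
        return t0, binary   # t0 is the automatically selected global threshold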
Step 400: adopting a straight line judgment rule based on double thresholds to respectively merge and refine two or more straight lines belonging to the same frame line in the detection results of the transverse lines and the longitudinal lines to obtain complete transverse lines and longitudinal lines in a table structure;
in step 400, due to a printing process, human interference or other external factors, the frame lines of the scanned form in the form image to be recognized have problems of coarsening, fracture, even missing, and the like, so that one frame line is detected as a plurality of different line segments. Therefore, each line segment needs to be detected respectively to obtain a collinear common segment line segment and a collinear non-common segment line segment, and a plurality of line segments belonging to the same frame line are merged and thinned into a complete frame line.
Further, the embodiment of the application adopts a straight line judgment rule based on double thresholds to judge whether straight lines need to be combined. Let the coordinates of two adjacent straight lines be (x)0,y0,x1,y1) And (x'0,y′0,x′1,y′1) Given a horizontal threshold lineWidth and a vertical threshold lineHeight (preferably, the lineWidth value is set to be 15, the lineHeight value is set to be 20, and the lineHeight value can be specifically adjusted according to practical applications), a straight line judgment rule based on the double thresholds is defined as follows:
Definition 1: if ((y0′ − y0) ≤ lineHeight) ∨ ((y1′ − y1) ≤ lineHeight), the two line segments are transversely collinear.
Definition 2: if (((y0′ − y0) ≤ lineHeight) ∨ ((y1′ − y1) ≤ lineHeight)) ∧ ((x0′ − x1) ≤ lineWidth), the two line segments are a transverse collinear common segment.
Definition 3: if ((x0′ − x0) ≤ lineWidth) ∨ ((x1′ − x1) ≤ lineWidth), the two line segments are longitudinally collinear.
Definition 4: if (((x0′ − x0) ≤ lineWidth) ∨ ((x1′ − x1) ≤ lineWidth)) ∧ ((y0′ − y1) ≤ lineHeight), the two line segments are a longitudinal collinear common segment.
Wherein, the definitions 1 and 2 can detect the straight line of the collinear common segment and the straight line of the collinear non-common segment in the horizontal line detection result, and the definitions 3 and 4 can detect the straight line of the collinear common segment and the straight line of the collinear non-common segment in the vertical line detection result. When two or more straight lines are collinear common segments, the straight lines are in accordance with a transverse threshold or a longitudinal threshold, and the collinear common segments can be merged and refined by using a frame line refining algorithm.
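Definitions 1 to 4 translate directly into predicate functions. The sketch below assumes each line is stored as a tuple (x0, y0, x1, y1) and that adjacent lines are compared in sorted order, as in the text (the coordinate differences are therefore taken signed, not as absolute values); the function names are illustrative:

    LINE_WIDTH = 15    # transverse threshold lineWidth (value suggested in the text)
    LINE_HEIGHT = 20   # longitudinal threshold lineHeight (value suggested in the text)

    def h_collinear(a, b, line_height=LINE_HEIGHT):
        """Definition 1: the two segments are transversely collinear."""
        return (b[1] - a[1]) <= line_height or (b[3] - a[3]) <= line_height

    def h_co_segment(a, b, line_width=LINE_WIDTH, line_height=LINE_HEIGHT):
        """Definition 2: transversely collinear and the horizontal gap is small."""
        return h_collinear(a, b, line_height) and (b[0] - a[2]) <= line_width

    def v_collinear(a, b, line_width=LINE_WIDTH):
        """Definition 3: the two segments are longitudinally collinear."""
        return (b[0] - a[0]) <= line_width or (b[2] - a[2]) <= line_width

    def v_co_segment(a, b, line_width=LINE_WIDTH, line_height=LINE_HEIGHT):
        """Definition 4: longitudinally collinear and the vertical gap is small."""
        return v_collinear(a, b, line_width) and (b[1] - a[3]) <= line_height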
Further, taking the merging and refinement of the straight lines of the transverse collinear co-segment as an example, the merging and refinement algorithm specifically includes:
step 401: sorting horizontal line coordinate sets in the horizontal line detection result according to the vertical coordinates of the horizontal line coordinate sets from small to large to obtain a horizontal line set heng _ lines;
step 402: traversing each transverse line in the heng _ lines, judging whether two adjacent transverse lines are collinear by using definition 1, adding the transverse lines of which the judgment results are collinear into a transverse line collinear set lineWidth _ lines, and sequencing according to the starting transverse coordinate of each transverse line from small to large;
step 403: traversing each transverse line in the transverse line collinear set lineWidth _ lines, judging whether two adjacent transverse lines are collinear and common sections by using definition 2, and if so, executing a step 404; otherwise, go to step 405;
step 404: merging and thinning the collinear common-section transverse lines according to the formulas (13) to (15), and adding merged and thinned results into an output result set lines _ location;
step 405: directly adding the horizontal lines which are not collinear and are in the same section into an output result set lines _ location;
step 406: screening each transverse line in the output result set lines _ location according to a set transverse line length threshold value to obtain a final transverse line merging and refining result set heng _ final;
the threshold value of the length of the transverse line set in the embodiment of the present application is 70, and it is understood that the threshold value may be determined according to the table structure in the table image.
The vertical line merging and thinning algorithm is similar to the horizontal line merging and thinning algorithm, and a final vertical line merging and thinning result set is zong _ final, which is not described herein again.
Please refer to fig. 3(a) and fig. 3(b) together, which are schematic diagrams of the merged refining result of the collinear co-segment straight lines, wherein fig. 3(a) is the merged refining result of the horizontal lines and fig. 3(b) is the merged refining result of the vertical lines. In fig. 3(a), (new_x0, new_x1, new_y) denotes the new coordinates of a merged horizontal line, and in fig. 3(b), (new_y0, new_y1, new_x) denotes the new coordinates of a merged vertical line. The coordinates are calculated as follows:

new_x0 = min(x0, x0′)    (13)

new_x1 = max(x1, x1′)    (14)

new_y = (y0 + y1 + y0′ + y1′) / 4    (15)

new_y0 = min(y0, y0′)

new_y1 = max(y1, y1′)

new_x = (x0 + x1 + x0′ + x1′) / 4
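A simplified sketch of the horizontal-line merging and refinement loop of steps 401 to 406 is given below. It reuses the h_co_segment predicate from the earlier sketch, performs a single pass over the y-sorted lines instead of the intermediate collinear-set grouping described above, and takes the merged y coordinate as the mean of the four original y values, an assumption consistent with the refinement described above; names and default values are illustrative:

    def merge_two_horizontal(a, b):
        """Merge two transversely collinear co-segment lines into one refined line."""
        new_x0 = min(a[0], b[0])
        new_x1 = max(a[2], b[2])
        new_y = (a[1] + a[3] + b[1] + b[3]) / 4.0
        return (new_x0, new_y, new_x1, new_y)

    def merge_horizontal_lines(h_lines, min_len=70):
        """Sort by y (step 401), merge collinear co-segments (steps 402-405),
        then screen by the length threshold (step 406)."""
        merged = []
        for line in sorted(h_lines, key=lambda l: (l[1], l[0])):
            if merged and h_co_segment(merged[-1], line):
                merged[-1] = merge_two_horizontal(merged[-1], line)
            else:
                merged.append(line)
        return [l for l in merged if (l[2] - l[0]) >= min_len]   # heng_final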
based on the above, in the embodiment of the present application, the straight line judgment rule based on the dual threshold is used to detect the collinear straight line and the straight line of the collinear non-common segment, and the frame line thinning algorithm is used to merge a plurality of collinear common segment straight lines belonging to the same frame line, so that a complex table with multiple merged cells and multiple numbers of cells can be processed.
Step 500: merging the horizontal line merging and thinning result and the vertical line merging and thinning result to obtain a rough table structure;
in step 500, the rough table structure obtained by merging the horizontal lines and the vertical lines is as shown in fig. 4(a) and 4(b), and it can be seen in fig. 4(a) that a pseudo frame line (i.e., a vertical line not controlled by any two horizontal lines or a horizontal line not controlled by any two vertical lines) exists in the wire frame labeling area at the lower right side, and it can be seen in fig. 4(b) that a missing or overlong line segment also exists at the intersection point of the horizontal line and the vertical line in the plurality of wire frame labeling areas. Further elaboration of the table structure is therefore required.
Step 600: respectively aligning and correcting the horizontal lines and the vertical lines in the rough table structure to obtain a final table structure in the table image to be identified;
in step 600, the length of any one of the frame lines in the table structure is determined by the remaining two frame lines, taking the horizontal line as an example, and any one of the horizontal line is determined by the two vertical lines as the left and right end points, the horizontal coordinates of the two vertical lines directly determine the horizontal coordinates and the length of the horizontal line using the horizontal line as the start point and the end point, and the horizontal coordinates of all the horizontal lines using the same vertical line as the start point and the end point are the same, as are the vertical lines. If a certain frame line exists alone, the frame line must be a false frame line that is misdetected. Therefore, the embodiment of the application corrects the missing or overlong line segment existing at the intersection point of the horizontal line and the vertical line in the rough table structure by using the characteristics, and eliminates the false frame line to obtain the complete table structure.
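One plausible reading of this pseudo-frame-line check, sketched for horizontal lines with an illustrative alignment threshold (vertical lines are handled symmetrically):

    ALIGN_THRESHOLD = 15   # alignment threshold suggested in the text

    def is_pseudo_horizontal(h_line, zong_final, tol=ALIGN_THRESHOLD):
        """A horizontal line standing alone is a misdetected (pseudo) frame line:
        no vertical line lies within the alignment threshold of either of its
        end-point x coordinates."""
        x0, _, x1, _ = h_line
        near_endpoint = any(abs(v[0] - x0) <= tol or abs(v[0] - x1) <= tol
                            for v in zong_final)
        return not near_endpoint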
Further, taking the alignment and correction of the horizontal lines as an example, the algorithm specifically includes:
Step 601: three identifiers are introduced for each frame line in the horizontal-line and vertical-line merging and refinement result sets heng_final and zong_final. The three identifiers introduced for the horizontal line set heng_final are heng_left, heng_right and y_is_visited, respectively used for judging whether each horizontal line has already undergone left coordinate alignment, right coordinate alignment and vertical coordinate alignment; the three identifiers introduced for the vertical line set zong_final are zong_up, zong_below and x_is_visited, respectively used for judging whether each vertical line has already undergone upper coordinate alignment, lower coordinate alignment and horizontal coordinate alignment. All six identifiers are initially assigned 0.
Step 602: traversing each transverse line in the heng _ final, and respectively performing left coordinate alignment, right coordinate alignment and vertical coordinate alignment on each transverse line by using heng _ left, heng _ right and y _ is _ visited to obtain heng _ final after coordinate alignment;
further, taking the left coordinate alignment of the horizontal line as an example, the specific algorithm is as follows: and obtaining the left coordinate of the current transverse line, if the heng _ left value of the current transverse line is 0, namely, the transverse line is not subjected to left alignment treatment, taking the transverse line as a reference transverse line, traversing the rest transverse lines, finding out all transverse lines meeting preset conditions, calculating the left coordinate mean value of all transverse lines meeting the preset conditions, updating the left coordinates of the reference transverse line and all transverse lines meeting the preset conditions by using the left coordinate mean value, and assigning the heng _ left value to be 1. Wherein, the preset conditions for finding out all the transverse lines are as follows: the difference value of the left coordinate with the reference horizontal line is within a set alignment threshold (preferably, the alignment threshold is set to 15 in the embodiment of the present application, and may be specifically set according to the actual application), and the left alignment process is not performed.
It can be understood that the right coordinate alignment and the vertical coordinate alignment of the horizontal line are the same as the left coordinate alignment algorithm, and are not described herein again.
Step 603: traversing each transverse line in heng _ final after coordinate alignment, taking the current transverse line as a reference transverse line, traversing each longitudinal line in zong _ final, finding out all longitudinal lines of which the transverse coordinates and the transverse coordinates of the reference transverse line are within an alignment threshold value, and assigning the transverse coordinates of the reference transverse line as the transverse coordinates of all the longitudinal lines within the alignment threshold value; if no vertical line with the horizontal coordinate within the alignment threshold value is found, the horizontal line is judged to be a false frame line, and the horizontal line is removed.
Step 604: traversing each longitudinal line in the zong _ final, and respectively performing upper coordinate alignment, lower coordinate alignment and horizontal coordinate alignment on each longitudinal line by using zong _ up, zong _ below and x _ is _ viewed to obtain zong _ final after coordinate alignment;
the vertical line alignment algorithm is the same as the horizontal line alignment algorithm, and is not described herein again.
Based on the above, the final table structure obtained by performing the alignment and correction operations on the horizontal and vertical lines is shown in fig. 5.
Table 1 shows the recognition results and recognition times of the embodiment of the present application on table images with different resolutions and different complexity levels. The more cells a table has, the more complex it is and the higher the probability of recognition errors. The frame line recognition rate is (N/M) × 100%, where N is the number of recognized frame lines and M is the number of actual frame lines.
TABLE 1 different types of form image recognition results
As can be seen from the recognition results in Table 1, on high-resolution table images the recognition rates of both the simple table (with few cells) and the complex table (with many cells) reach one hundred percent, and the time to process one image is on the order of milliseconds. On low-resolution table images, the recognition rate of the simple table is still one hundred percent; the complex table is not completely recognized, but its recognition rate still reaches 78.3%, and the recognition time of both remains very short. The table frame line identification of the embodiment of the application is therefore both fast and accurate.
The embodiment of the application can also correct form images within a certain inclination angle range. Table 2 shows the processing results and recognition times of the embodiment of the present application on oblique form images:
TABLE 2 different Tilt Angle Table image detection results
As can be seen from table 2, as the inclination angle of the image increases, the difficulty of table identification increases, the maximum tolerance inclination angle of the embodiment of the present application is ± 1.7 degrees, the frame line of the table structure can be accurately identified within the inclination angle range, and the identification speed is high.
In order to verify the feasibility and the effectiveness of the embodiment of the application, the form image recognition algorithm of the embodiment of the application is applied to the community vote project of the key technical research and industrialization demonstration of the 'two committees' transition election intelligent service management platform of the Sichuan province. In the actual item of community voting, since a higher image resolution would increase the processing time and the cost for the ticketing machine would also increase, a high resolution image like 300dpi (image resolution) would not be used. However, the recognition error rate is increased due to the fact that the image resolution (for example, 100dpi) of the votes is too low, so that the image resolution of the votes in the actual project is generally between 100dpi and 300dpi, namely, the recognition accuracy is guaranteed, and the equipment cost can also be reduced. The test result shows that the form image recognition algorithm of the embodiment of the application can be well adapted to the project, and has the following advantages:
1. the recognition result with high accuracy can be obtained in the resolution of the lower vote image, and the recognition speed is high. Since the community vote of the base layer is generally a simple table with few cells, the identification time for identifying the table by applying the embodiment of the application is less than 0.2 second.
2. The vote inclination angle within a certain range can be tolerated, and the manual processing work is reduced. When the votes pass through the ticket counting device, the image is skewed, the image inclination angle is artificially set in the item, and the skewed image exceeding the angle is manually identified. By applying the method and the device, the image inclination angle can be enlarged, so that the algorithm can automatically deal with and correct the skew image in the inclination angle, and manual operation is reduced.
3. The method has strong universality, and can quickly and accurately identify the table structure even in a non-strictly defined table image. In the mode of community vote, a non-strictly defined form image exists, the non-strictly defined form image is shown in fig. 6(a), for the image, accurate identification can be performed through the embodiment of the application, and the identification result is shown in fig. 6 (b).
Please refer to fig. 7, which is a schematic structural diagram of an image table structure recognition system according to an embodiment of the present application. The image table structure recognition system of the embodiment of the application comprises:
the frame line detection module: used for carrying out frame line detection on the table image to be recognized by using the LSD algorithm to obtain the detection results of the horizontal lines and the vertical lines of the table structure in the table image to be identified;
a threshold detection module: the system comprises a table structure, a horizontal line detection result and a vertical line detection result, wherein the table structure is used for respectively detecting each straight line in the horizontal line detection result and the vertical line detection result according to a set horizontal threshold and a set vertical threshold to respectively obtain straight lines which belong to a collinear common segment in the horizontal line detection result and the vertical line detection result, and combining two or more than two collinear common segments which belong to the same frame line to obtain a complete horizontal line and a complete vertical line in the table structure;
a frame line merging module: for merging the complete transverse and longitudinal lines;
a wire alignment module: and aligning the combined transverse line and longitudinal line to obtain a table structure in the table image to be identified.
Please refer to fig. 8, which is a schematic diagram of a terminal structure according to an embodiment of the present application. The terminal 50 comprises a processor 51, a memory 52 coupled to the processor 51.
The memory 52 stores program instructions for implementing the above-described image table structure identification method.
The processor 51 is operable to execute program instructions stored by the memory 52 to control the image table structure identification.
The processor 51 may also be referred to as a CPU (Central Processing Unit). The processor 51 may be an integrated circuit chip having signal processing capabilities. The processor 51 may also be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), an off-the-shelf programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
Fig. 9 is a schematic structural diagram of a storage medium according to an embodiment of the present application. The storage medium of the embodiment of the present application stores a program file 61 capable of implementing all the methods described above, where the program file 61 may be stored in the storage medium in the form of a software product, and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device) or a processor (processor) to execute all or part of the steps of the methods of the embodiments of the present invention. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a mobile hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, or terminal devices, such as a computer, a server, a mobile phone, and a tablet.
The image table structure identification method, the image table structure identification system, the image table structure identification terminal and the storage medium detect the frame line in the table structure by using the LSD algorithm, merge and refine a plurality of line segments belonging to the same frame line by adopting a straight line judgment rule based on double thresholds to obtain a complete frame line, and perform alignment and correction operations on the complete frame line to obtain a final table frame line. Compared with the prior art, the embodiment of the application has at least the following advantages:
1. the algorithm parameters are strong in universality and generalization performance; in the actual experiment process, the image resolution is close, and in the images with similar table structures, parameters are hardly required to be adjusted, so that the images can be directly packaged for use.
2. Compared with the traditional Hough-transform frame line detection method, the projection-based frame line detection method and the run-length-based line segment detection method, the embodiments of the application apply the LSD line segment detection algorithm, widely used on remote sensing images, to the identification of image table structures; they require less image preprocessing, recognize faster, and produce more accurate recognition results.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (13)

1. An image table structure identification method is characterized by comprising the following steps:
step a: carrying out frame line detection on the form image to be identified by using an LSD algorithm to respectively obtain the detection results of the transverse lines and the longitudinal lines of the form structure in the form image to be identified;
step b: respectively detecting each straight line in the detection results of the transverse lines and the longitudinal lines according to a set transverse threshold and a set longitudinal threshold to obtain straight lines which belong to a collinear common segment in the detection results of the transverse lines and the longitudinal lines, and combining two or more straight lines which belong to a collinear common segment of the same frame line to obtain complete transverse lines and longitudinal lines in the table structure;
step c: and combining the complete transverse lines and the complete longitudinal lines, and aligning the combined transverse lines and the combined longitudinal lines to obtain a table structure in the table image to be identified.
2. The method according to claim 1, wherein the step a of performing the frame line detection on the table image to be recognized by using the LSD algorithm comprises:
calculating the level-line angle of each pixel point in the table image to be recognized;
defining an error value of the form image to be recognized, calculating an error between a level-line angle of each pixel point and a current region angle, performing region merging on the pixel points with the errors smaller than the error value, and updating the merged region;
constructing an external matrix for each updated region, calculating the NFA value of each updated region, and judging the matrix of which the NFA value meets a set threshold value as an output straight line.
3. The method according to claim 2, wherein in step a, the detecting the frame line of the form image to be recognized by using the LSD algorithm further comprises:
and screening all the straight lines by using a set parameter threshold, and eliminating useless frame lines in all the straight lines to obtain the detection results of the horizontal lines and the vertical lines of the table structure in the table image to be identified.
4. The method according to claim 1, wherein in the step b, the detecting each straight line in the horizontal line and the vertical line detection results according to the set horizontal threshold and the vertical threshold comprises:
let the coordinates of two adjacent straight lines be (x0, y0, x1, y1) and (x′0, y′0, x′1, y′1), give a transverse threshold lineWidth and a longitudinal threshold lineHeight, and judge whether the straight lines need to be merged by adopting a straight line judgment rule based on double thresholds; the straight line judgment rule based on the double thresholds is specifically as follows:
straight lines satisfying ((y′0 - y0) ≤ lineHeight) ∨ ((y′1 - y1) ≤ lineHeight) are judged as transverse collinear;
straight lines satisfying (((y′0 - y0) ≤ lineHeight) ∨ ((y′1 - y1) ≤ lineHeight)) ∧ ((x′0 - x1) ≤ lineWidth) are judged as a transverse collinear common segment;
straight lines satisfying ((x′0 - x0) ≤ lineWidth) ∨ ((x′1 - x1) ≤ lineWidth) are judged as longitudinally collinear;
straight lines satisfying (((x′0 - x0) ≤ lineWidth) ∨ ((x′1 - x1) ≤ lineWidth)) ∧ ((y′0 - y1) ≤ lineHeight) are judged as a longitudinal collinear common segment.
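Read as code, the four predicates of claim 4 might look as follows; each line is taken to be a 4-tuple (x0, y0, x1, y1), and the absolute values are an added assumption so that the tests also hold when neighboring lines are not strictly ordered.

```python
def is_transverse_collinear(a, b, line_height):
    """Two transverse lines lie on the same horizontal rule."""
    (_, y0, _, y1), (_, yp0, _, yp1) = a, b
    return abs(yp0 - y0) <= line_height or abs(yp1 - y1) <= line_height

def is_transverse_common_segment(a, b, line_width, line_height):
    """Transverse collinear AND the horizontal gap is small enough to merge."""
    (_, _, x1, _), (xp0, _, _, _) = a, b
    return is_transverse_collinear(a, b, line_height) and abs(xp0 - x1) <= line_width

def is_longitudinal_collinear(a, b, line_width):
    """Two longitudinal lines lie on the same vertical rule."""
    (x0, _, x1, _), (xp0, _, xp1, _) = a, b
    return abs(xp0 - x0) <= line_width or abs(xp1 - x1) <= line_width

def is_longitudinal_common_segment(a, b, line_width, line_height):
    """Longitudinal collinear AND the vertical gap is small enough to merge."""
    (_, _, _, y1), (_, yp0, _, _) = a, b
    return is_longitudinal_collinear(a, b, line_width) and abs(yp0 - y1) <= line_height
```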
5. The image table structure recognition method according to claim 4, wherein in the step b, the merging of two or more straight lines belonging to a collinear common segment of the same frame line further comprises:
sorting the horizontal line coordinate sets in the horizontal line detection results from small to large according to the vertical coordinates of the horizontal line coordinate sets to obtain horizontal line sets, and sorting the vertical line coordinate sets in the vertical line detection results from small to large according to the horizontal coordinates of the vertical line coordinate sets to obtain vertical line sets;
traversing each transverse line and each longitudinal line in the transverse line set and the longitudinal line set respectively, and detecting by using the straight line judgment rule based on the double thresholds to obtain a transverse line collinear set and a longitudinal line collinear set;
traversing each transverse line and each longitudinal line in the transverse line collinear set and the longitudinal line collinear set respectively, and detecting the transverse lines and the longitudinal lines of the collinear common sections by using the straight line judgment rule based on the double thresholds;
respectively merging and thinning the transverse lines and the longitudinal lines of the collinear common sections;
and screening the merged and refined transverse lines and longitudinal lines according to a set length threshold to obtain a final transverse line merging and refining result set heng_final and a final longitudinal line merging and refining result set zong_final.
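A minimal sketch of claim 5 for the transverse case (the longitudinal case is symmetric): sort by the vertical coordinate, group lines that the claim-4 predicates above mark as collinear common segments, merge and refine each group into a single line, and keep only lines that pass a length threshold. The single-pass grouping and the use of the mean y as the refined coordinate are simplifying assumptions.

```python
def merge_transverse_lines(transverse, line_width, line_height, min_length):
    """Claim 5 (sketch): merge collinear common segments into complete transverse lines."""
    lines = sorted(transverse, key=lambda l: l[1])   # sort by vertical coordinate
    groups = []
    for seg in lines:
        for group in groups:
            if is_transverse_common_segment(group[-1], seg, line_width, line_height):
                group.append(seg)
                break
        else:
            groups.append([seg])

    heng_final = []
    for group in groups:
        x_left = min(s[0] for s in group)
        x_right = max(s[2] for s in group)
        y_mean = sum(s[1] for s in group) / len(group)   # refine to a single y value
        if x_right - x_left >= min_length:
            heng_final.append([x_left, y_mean, x_right, y_mean])
    return heng_final
```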
6. The image table structure recognition method according to claim 5, wherein in the step c, the aligning the merged horizontal line and vertical line includes:
introducing heng_left, heng_right and y_is_visited into the horizontal line merging and refining result set heng_final, wherein heng_left, heng_right and y_is_visited are respectively used for judging whether each horizontal line meets the requirements of left coordinate alignment, right coordinate alignment and vertical coordinate alignment;
traversing each horizontal line in heng_final, and respectively performing left coordinate alignment, right coordinate alignment and vertical coordinate alignment on each horizontal line by using heng_left, heng_right and y_is_visited to obtain the aligned heng_final;
introducing zong_up, zong_below and x_is_visited into the longitudinal line merging and refining result set zong_final, wherein zong_up, zong_below and x_is_visited are respectively used for judging whether each longitudinal line meets the requirements of upper coordinate alignment, lower coordinate alignment and horizontal coordinate alignment;
traversing each longitudinal line in zong_final, and respectively performing upper coordinate alignment, lower coordinate alignment and horizontal coordinate alignment on each longitudinal line by using zong_up, zong_below and x_is_visited to obtain zong_final after coordinate alignment.
7. The image table structure recognition method according to claim 6, wherein the performing left coordinate alignment, right coordinate alignment, and vertical coordinate alignment operations on each horizontal line comprises:
respectively obtaining the left coordinate, the right coordinate and the vertical coordinate of the current transverse line; if heng_left, heng_right and y_is_visited of the current transverse line are 0, indicating that the transverse line has not undergone left alignment, right alignment and vertical alignment, taking the current transverse line as a reference transverse line and traversing the remaining transverse lines to find all transverse lines meeting preset conditions; respectively calculating the left coordinate mean value, right coordinate mean value and vertical coordinate mean value of all transverse lines meeting the preset conditions, and respectively updating the left coordinate, right coordinate and vertical coordinate of the reference transverse line and of all transverse lines meeting the preset conditions with these mean values, so as to obtain heng_final after the transverse lines are aligned;
the preset conditions are as follows: the left coordinate difference, the right coordinate difference and the vertical coordinate difference with respect to the reference transverse line are all within a set alignment threshold, and the transverse line has not yet undergone left alignment, right alignment or vertical alignment.
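Claims 6 and 7 describe the alignment in terms of visited flags and coordinate means; the sketch below shows only the left-coordinate case for transverse lines stored as mutable [x_left, y, x_right, y] lists (right alignment, vertical alignment and the longitudinal case of claim 8 follow the same pattern). heng_left plays the role of the flag introduced in claim 6, and align_threshold is an illustrative parameter.

```python
def align_left_coordinates(heng_final, align_threshold):
    """Claims 6-7 (sketch): snap nearly equal left x coordinates onto their mean."""
    n = len(heng_final)
    heng_left = [0] * n                       # 0 means not yet left-aligned
    for i in range(n):
        if heng_left[i]:
            continue
        cluster = [i] + [
            j for j in range(n)
            if j != i and not heng_left[j]
            and abs(heng_final[j][0] - heng_final[i][0]) <= align_threshold
        ]
        mean_left = sum(heng_final[j][0] for j in cluster) / len(cluster)
        for j in cluster:
            heng_final[j][0] = mean_left      # update the reference line and its matches
            heng_left[j] = 1
    return heng_final
```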
8. The image table structure recognition method according to claim 6, wherein the performing of the upper coordinate alignment, the lower coordinate alignment, and the abscissa alignment operation on each vertical line includes:
respectively acquiring the upper coordinate, the lower coordinate and the abscissa of the current longitudinal line; if zong_up, zong_below and x_is_visited are 0, indicating that the longitudinal line has not undergone upper alignment, lower alignment and abscissa alignment, taking the current longitudinal line as a reference longitudinal line and traversing the remaining longitudinal lines to find all longitudinal lines meeting preset conditions; respectively calculating the upper coordinate mean value, lower coordinate mean value and abscissa mean value of all longitudinal lines meeting the preset conditions, and respectively updating the upper coordinate, lower coordinate and abscissa of the reference longitudinal line and of all longitudinal lines meeting the preset conditions with these mean values, so as to obtain zong_final after the longitudinal lines are aligned;
the preset conditions are as follows: the upper coordinate difference, the lower coordinate difference and the abscissa difference with respect to the reference longitudinal line are all within a set alignment threshold, and the longitudinal line has not yet undergone upper alignment, lower alignment or abscissa alignment.
9. The image table structure recognition method of claim 7, wherein obtaining heng_final after the transverse lines are aligned further comprises:
traversing each transverse line in the aligned heng_final, taking the current transverse line as a reference transverse line, traversing each longitudinal line in zong_final, finding all longitudinal lines whose abscissa is within an alignment threshold of the abscissa of the reference transverse line, and assigning the abscissa of the reference transverse line to be the abscissa of those longitudinal lines; if no longitudinal line whose abscissa is within the alignment threshold is found, the transverse line is judged to be a false frame line and is removed.
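Claims 9 and 10 prune false frame lines by requiring each line to meet at least one perpendicular line; the sketch below covers the transverse case only (claim 10 is symmetric), represents longitudinal lines as [x, y_top, x, y_bottom] lists, and snapping an endpoint to the first matching longitudinal line is a simplification of the assignment rule in the claim.

```python
def snap_and_prune_transverse(heng_final, zong_final, align_threshold):
    """Claim 9 (sketch): snap transverse endpoints onto nearby longitudinal lines,
    and discard transverse lines that meet no longitudinal line at all."""
    pruned = []
    for x_left, y, x_right, _ in heng_final:
        near_left = [v for v in zong_final if abs(v[0] - x_left) <= align_threshold]
        near_right = [v for v in zong_final if abs(v[0] - x_right) <= align_threshold]
        if not near_left and not near_right:
            continue                              # false frame line, reject it
        if near_left:
            x_left = near_left[0][0]
        if near_right:
            x_right = near_right[0][0]
        pruned.append([x_left, y, x_right, y])
    return pruned
```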
10. The method according to claim 8, wherein obtaining zong_final after the longitudinal lines are aligned further comprises:
traversing each longitudinal line in the aligned zong_final, taking the current longitudinal line as a reference longitudinal line, traversing each transverse line in heng_final, finding all transverse lines whose vertical coordinate is within an alignment threshold of the vertical coordinate of the reference longitudinal line, and assigning the vertical coordinate of the reference longitudinal line to be the vertical coordinate of those transverse lines; if no such transverse line is found, the longitudinal line is judged to be a false frame line and is removed.
11. An image table structure recognition system, comprising:
a frame line detection module: configured to perform frame line detection on the table image to be identified by using an LSD algorithm to obtain the detection results of the transverse lines and the longitudinal lines of the table structure in the table image to be identified;
a threshold detection module: configured to respectively detect each straight line in the transverse line and longitudinal line detection results according to a set transverse threshold and a set longitudinal threshold to obtain the straight lines belonging to a collinear common segment in the transverse line and longitudinal line detection results, and to merge two or more straight lines belonging to a collinear common segment of the same frame line to obtain the complete transverse lines and longitudinal lines in the table structure;
a frame line merging module: configured to merge the complete transverse lines and longitudinal lines;
a frame line alignment module: configured to align the merged transverse lines and longitudinal lines to obtain the table structure in the table image to be identified.
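The module split of claim 11 maps naturally onto a thin wrapper class. The skeleton below is purely illustrative: it composes the hypothetical helpers from the earlier sketches, and merge_longitudinal_lines is assumed to be the symmetric counterpart of merge_transverse_lines.

```python
class ImageTableStructureRecognizer:
    """Hypothetical wiring of the four modules in claim 11."""

    def __init__(self, line_width, line_height, min_length, align_threshold):
        self.line_width = line_width
        self.line_height = line_height
        self.min_length = min_length
        self.align_threshold = align_threshold

    def recognize(self, image_path):
        # frame line detection module
        transverse, longitudinal = detect_table_lines(image_path)
        # threshold detection + frame line merging modules
        heng_final = merge_transverse_lines(transverse, self.line_width,
                                            self.line_height, self.min_length)
        zong_final = merge_longitudinal_lines(longitudinal, self.line_width,
                                              self.line_height, self.min_length)
        # frame line alignment module
        heng_final = align_left_coordinates(heng_final, self.align_threshold)
        heng_final = snap_and_prune_transverse(heng_final, zong_final, self.align_threshold)
        return heng_final, zong_final
```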
12. A terminal, comprising a processor, a memory coupled to the processor, wherein,
the memory stores program instructions for implementing the image table structure identification method of any one of claims 1-10;
the processor is configured to execute the program instructions stored in the memory to control image table structure identification.
13. A storage medium storing program instructions executable by a processor to perform the image table structure recognition method according to any one of claims 1 to 10.
CN202010662891.2A 2020-07-10 2020-07-10 Image table structure identification method, system, terminal and storage medium Active CN112036232B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010662891.2A CN112036232B (en) 2020-07-10 2020-07-10 Image table structure identification method, system, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010662891.2A CN112036232B (en) 2020-07-10 2020-07-10 Image table structure identification method, system, terminal and storage medium

Publications (2)

Publication Number Publication Date
CN112036232A true CN112036232A (en) 2020-12-04
CN112036232B CN112036232B (en) 2023-07-18

Family

ID=73579041

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010662891.2A Active CN112036232B (en) 2020-07-10 2020-07-10 Image table structure identification method, system, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN112036232B (en)

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006000103A1 (en) * 2004-06-29 2006-01-05 Universite De Sherbrooke Spiking neural network and use thereof
CN101657825A (en) * 2006-05-11 2010-02-24 普莱姆传感有限公司 Modeling of humanoid forms from depth maps
US20130290376A1 (en) * 2012-04-27 2013-10-31 Beijing Founder Apabi Technology Ltd. Methods and apparatus for identifying tables in digital files
US20170083762A1 (en) * 2015-06-22 2017-03-23 Photomyne Ltd. System and Method for Detecting Objects in an Image
CN107784301A (en) * 2016-08-31 2018-03-09 百度在线网络技术(北京)有限公司 Method and apparatus for identifying character area in image
CN107491730A (en) * 2017-07-14 2017-12-19 浙江大学 A kind of laboratory test report recognition methods based on image procossing
CN109446487A (en) * 2018-11-01 2019-03-08 北京神州泰岳软件股份有限公司 A kind of method and device parsing portable document format document table
CN109583324A (en) * 2018-11-12 2019-04-05 武汉大学 A kind of pointer meters reading automatic identifying method based on the more box detectors of single-point
CN110428414A (en) * 2019-08-02 2019-11-08 杭州睿琪软件有限公司 The method and device of bill quantity in a kind of identification image
CN110516208A (en) * 2019-08-12 2019-11-29 深圳智能思创科技有限公司 A kind of system and method extracted for PDF document table
CN111027429A (en) * 2019-11-29 2020-04-17 陈韬文 Data preprocessing method and system for intelligent identification of electrical drawings
CN111091090A (en) * 2019-12-11 2020-05-01 上海眼控科技股份有限公司 Bank report OCR recognition method, device, platform and terminal

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
JAEWOONG KIM et al.: "Extracting Major Lines by Recruiting Zero-Threshold Canny Edge Links along Sobel Highlights", IEEE SIGNAL PROCESSING LETTERS, vol. 22, no. 10, pages 1689, XP011579914, DOI: 10.1109/LSP.2015.2400211 *
LIU YUNKAI et al.: "Multi-threshold image table frame line extraction algorithm based on a line segment detector", Journal of Computer Applications, vol. 41, no. 1, pages 250 *
LIAO JUNHUA; LIU ZIQIANG; BAI RUJIANG; CHEN JUNYING: "Research on citation sentiment recognition based on citation content analysis", Library and Information Service, no. 15, pages 113-122 *
ZENG FANFENG; GUO YUYANG; XIAO KE: "An edge-preserving binarization algorithm for unevenly illuminated text images", Computer Engineering and Design, no. 03, pages 148-152 *
LI HANG: "Research and implementation of an intelligent recognition system for medical laboratory report data", China Master's Theses Full-text Database, Medicine and Health Sciences, no. 9, pages 054-48 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112700407A (en) * 2020-12-14 2021-04-23 贝壳技术有限公司 Method and device for determining image definition and storage medium
CN113139399A (en) * 2021-05-13 2021-07-20 阳光电源股份有限公司 Image line frame identification method and server
CN113139399B (en) * 2021-05-13 2024-04-12 阳光电源股份有限公司 Image wire frame identification method and server

Also Published As

Publication number Publication date
CN112036232B (en) 2023-07-18

Similar Documents

Publication Publication Date Title
KR101690981B1 (en) Form recognition method and device
CN110619333B (en) Text line segmentation method, text line segmentation device and electronic equipment
JP2008217347A (en) License plate recognition device, its control method and computer program
CN109948521B (en) Image deviation rectifying method and device, equipment and storage medium
US11657644B2 (en) Automatic ruler detection
US11074443B2 (en) Method and device for acquiring slant value of slant image, terminal and storage medium
CN110596121A (en) Keyboard appearance detection method and device and electronic system
US9418446B2 (en) Method and apparatus for determining a building location based on a building image
WO2017161636A1 (en) Fingerprint-based terminal payment method and device
CN112613506A (en) Method and device for recognizing text in image, computer equipment and storage medium
CN112036232B (en) Image table structure identification method, system, terminal and storage medium
CN113343740A (en) Table detection method, device, equipment and storage medium
US10395090B2 (en) Symbol detection for desired image reconstruction
CN114283435A (en) Table extraction method and device, computer equipment and storage medium
CN111223078B (en) Method for determining flaw level and storage medium
US20180268212A1 (en) Information processing apparatus, information processing system, and non-transitory computer readable medium
JP3471578B2 (en) Line direction determining device, image tilt detecting device, and image tilt correcting device
CN109035285B (en) Image boundary determining method and device, terminal and storage medium
CN112926564A (en) Picture analysis method, system, computer device and computer-readable storage medium
CN117115823A (en) Tamper identification method and device, computer equipment and storage medium
CN108364024B (en) Image matching method and device, computer equipment and storage medium
CN113313092B (en) Handwritten signature recognition method, and claims settlement automation processing method, device and equipment
JP3303246B2 (en) Image processing device
CN112308061B (en) License plate character recognition method and device
CN114092542A (en) Bolt measuring method and system based on two-dimensional vision

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant