CN111797685B - Method and device for identifying a table structure


Info

Publication number
CN111797685B
Authority
CN
China
Prior art keywords
line
horizontal
vertical
line segments
line segment
Prior art date
Legal status
Active
Application number
CN202010462936.1A
Other languages
Chinese (zh)
Other versions
CN111797685A (en)
Inventor
Li Zhuang (李壮)
Current Assignee
Seashell Housing Beijing Technology Co Ltd
Original Assignee
Seashell Housing (Beijing) Technology Co., Ltd. (贝壳找房(北京)科技有限公司)
Priority date
Filing date
Publication date
Application filed by Seashell Housing (Beijing) Technology Co., Ltd.
Priority to CN202010462936.1A
Publication of CN111797685A
Application granted
Publication of CN111797685B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/40 Document-oriented image-based pattern recognition
    • G06V 30/41 Analysis of document content
    • G06V 30/412 Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/40 Document-oriented image-based pattern recognition
    • G06V 30/41 Analysis of document content
    • G06V 30/414 Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text

Abstract

The embodiment of the invention provides a method and a device for identifying a table structure, wherein the identification method comprises the following steps: inputting a target image with a table structure into a pre-trained neural network model to obtain the table line segments in the target image; and obtaining the table structure according to the table line segments in the target image. The neural network model is an improved L-CNN model, trained by taking sample images with table structures as samples and taking the table line segments in the sample images as sample labels; a curved table line segment in a sample image is characterized as a broken line formed by a plurality of continuous straight line segments; and the line feature map loss in the improved L-CNN model accounts for a greater proportion of the calculated loss than the point feature map loss. The embodiment of the invention enables the improved L-CNN model to be applied well to the field of table recognition, with recognition efficiency and accuracy markedly improved over the prior art.

Description

Method and device for identifying a table structure
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method and an apparatus for identifying a table structure.
Background
Tables are widely used as an effective way to organize and display data, and have become a common page object in many kinds of documents, such as scientific journals, reports and financial statements. People often need to process table documents manually to extract and collect the required information, which consumes enormous labor cost as the number of documents grows explosively.
With the development of Optical Character Recognition (OCR) technology, people have begun to use deep learning to extract useful text from images, with some success. A table, however, is a special information carrier: unlike plain-text OCR, table extraction requires a strict row-column correspondence and must recover both the text content of each cell and the relationships between cells. Table structure extraction is therefore a prerequisite for the structured extraction of table information, and how to efficiently obtain content and structure information from a table, i.e., table structure identification, is a problem to be solved.
Disclosure of Invention
Embodiments of the present invention provide a method and an apparatus for identifying a table structure, which overcome, or at least partially solve, the above problems.
In a first aspect, an embodiment of the present invention provides a method for identifying a table structure, including:
inputting a target image with a table structure into a pre-trained neural network model to obtain a table line segment in the target image;
obtaining a table structure according to the table line segments in the target image;
the neural network model is an improved L-CNN model, and the improved L-CNN model is trained by taking a sample image with a table structure as a sample and taking a table line segment in the sample image as a sample label;
the curved table line segments in the sample image are characterized by a broken line formed by a plurality of continuous straight line segments;
the line feature map loss in the improved L-CNN model accounts for a greater proportion of the calculated loss than the point feature map loss.
Further, the obtaining a table structure according to the table line segment in the target image specifically includes:
defining each table line segment as a horizontal line segment or a vertical line segment according to its inclination angle; and merging the horizontal/vertical line segments that meet the included-angle and distance requirements to obtain horizontal/vertical frame lines;
and determining all intersection points of the horizontal frame line and the vertical frame line, selecting one intersection point from all the intersection points as a starting point, and searching other intersection points which form the minimum table unit with the intersection point until the minimum table unit where all the intersection points are located is searched.
Further, the table line segment is defined as a horizontal line segment or a vertical line segment according to the inclination angle of the table line segment, specifically:
determining a horizontal datum line in the target image, and calculating an included angle between a table line segment and the datum line to serve as an inclination angle of the table line segment; and defining table line segments with the inclination angles larger than 45 degrees as vertical line segments, and defining table line segments with the inclination angles smaller than 45 degrees as horizontal line segments.
Further, the horizontal/vertical line segments meeting the requirements of the included angle and the distance are combined to obtain a horizontal/vertical frame line, and the method specifically comprises the following steps:
creating a first set, and performing the following operations on the horizontal line segments until every horizontal line segment has been processed: for the horizontal line segment currently being processed, judging whether it meets a merging condition with the horizontal line segments in each subset of the first set; if the merging condition is met, placing the horizontal line segment into the subset that meets the merging condition, and if not, creating a new subset in the first set and placing the horizontal line segment into the new subset; and after all the horizontal line segments have been processed, merging the horizontal line segments of each subset in the first set into one horizontal frame line;
creating a second set, and performing the following operations on the vertical line segments until every vertical line segment has been processed: for the vertical line segment currently being processed, judging whether it meets a merging condition with the vertical line segments in each subset of the second set; if the merging condition is met, placing the vertical line segment into the subset that meets the merging condition, and if not, creating a new subset in the second set and placing the vertical line segment into the new subset; and after all the vertical line segments have been processed, merging the vertical line segments of each subset in the second set into one vertical frame line.
Further, the merging condition is that the included angle of the two line segments is smaller than 15 degrees and the minimum distance of the two line segments is smaller than 2 pixels.
Further, the merging of the horizontal line segments of each subset in the first set into one horizontal frame line specifically includes:
establishing a coordinate system in the target image, wherein the horizontal axis of the coordinate system is parallel to the datum line and the vertical axis is perpendicular to the datum line;
for any subset in the first set, determining the horizontal-axis coordinate of the left or right vertex of each horizontal line segment in the subset, sorting the horizontal line segments in ascending or descending order of that coordinate, and connecting each pair of adjacent sorted horizontal line segments to obtain one horizontal frame line;
and the merging of the vertical line segments of each subset in the second set into one vertical frame line specifically includes:
for any subset in the second set, determining the vertical-axis coordinate of the upper or lower vertex of each vertical line segment in the subset, sorting the vertical line segments in ascending or descending order of that coordinate, and connecting each pair of adjacent sorted vertical line segments to obtain one vertical frame line.
Further, the determining of all intersection points of the horizontal frame lines and the vertical frame lines, selecting one intersection point from all the intersection points as a starting point, and searching for the other intersection points that form a minimum table unit with it until the minimum table unit containing every intersection point has been found, specifically includes:
determining all intersection points of the horizontal frame lines and the vertical frame lines, wherein the intersection points comprise real intersection points, formed where a horizontal frame line intersects a vertical frame line, and virtual intersection points, formed where the extension line of a horizontal frame line intersects a vertical frame line, where a horizontal frame line intersects the extension line of a vertical frame line, or where the extension line of a horizontal frame line intersects the extension line of a vertical frame line;
setting a frame line index for each intersection point, wherein the frame line index comprises a first element and a second element, the first element recording the sequence number of the straight line on which the horizontal frame line corresponding to the intersection point lies, and the second element recording the sequence number of the straight line on which the vertical frame line corresponding to the intersection point lies; the sequence numbers of the straight lines of the horizontal frame lines increase from top to bottom, and those of the vertical frame lines increase from left to right;
taking the row end point, column start point or column end point of the last minimum table unit found as the row start point of the minimum table unit currently being searched for;
increasing the second element of the frame line index of the row start point one by one while keeping the first element unchanged, until the intersection point corresponding to the resulting frame line index is a real intersection point, which serves as the row end point of the minimum table unit currently being searched for;
increasing the first element of the frame line index of the row start point one by one while keeping the second element unchanged, until the intersection point corresponding to the resulting frame line index is a real intersection point, which serves as the column start point of the minimum table unit currently being searched for;
taking as the column end point of the minimum table unit currently being searched for the intersection point whose frame line index has its first element equal to the first element of the column start point and its second element equal to the second element of the row end point, thereby obtaining the minimum table unit currently being searched for;
and the row start point of the first minimum table unit is the real intersection point whose frame line index has the smallest first and second elements in the target image.
Further, in the improved L-CNN model, the ratio of the line feature map loss to the point feature map loss in the calculated loss is 8:1.
In a second aspect, an embodiment of the present invention provides an apparatus for identifying a table structure, including:
the input module is used for inputting a target image with a table structure into a pre-trained neural network model to obtain a table line segment in the target image;
the reprocessing module is used for obtaining a table structure according to the table line segments in the target image;
the neural network model is an improved L-CNN model, and the improved L-CNN model is trained by taking a sample image with a table structure as a sample and taking a table line segment in the sample image as a sample label;
the curved table line segments in the sample image are characterized by a broken line formed by a plurality of continuous straight line segments;
the line feature map loss in the improved L-CNN model accounts for a greater proportion of the calculated loss than the point feature map loss.
Further, the reprocessing module includes:
the frame line module is used for defining each table line segment as a horizontal line segment or a vertical line segment according to its inclination angle, and merging the horizontal/vertical line segments that meet the included-angle and distance requirements to obtain horizontal/vertical frame lines;
and the table module is used for determining all intersection points of the horizontal frame line and the vertical frame line, selecting one intersection point from all the intersection points as a starting point, and searching other intersection points which form the minimum table unit with the intersection points until the minimum table unit where all the intersection points are located is searched.
Further, the frame line module comprises a line segment defining unit for defining the table line segment as a horizontal line segment or a vertical line segment according to the inclination angle of the table line segment;
the line segment definition unit is specifically configured to: determining a horizontal datum line in the target image, and calculating an included angle between a table line segment and the datum line to serve as an inclination angle of the table line segment; and defining table line segments with the inclination angles larger than 45 degrees as vertical line segments, and defining table line segments with the inclination angles smaller than 45 degrees as horizontal line segments.
Furthermore, the frame line module comprises a merging unit for merging the horizontal/vertical line segments that meet the included-angle and distance requirements to obtain horizontal/vertical frame lines;
the merging unit further includes:
the first merging subunit is configured to create a first set and perform the following operations on the horizontal line segments until every horizontal line segment has been processed: for the horizontal line segment currently being processed, judging whether it meets a merging condition with the horizontal line segments in each subset of the first set; if the merging condition is met, placing the horizontal line segment into the subset that meets the merging condition, and if not, creating a new subset in the first set and placing the horizontal line segment into the new subset; and after all the horizontal line segments have been processed, merging the horizontal line segments of each subset in the first set into one horizontal frame line;
the second merging subunit is configured to create a second set and perform the following operations on the vertical line segments until every vertical line segment has been processed: for the vertical line segment currently being processed, judging whether it meets a merging condition with the vertical line segments in each subset of the second set; if the merging condition is met, placing the vertical line segment into the subset that meets the merging condition, and if not, creating a new subset in the second set and placing the vertical line segment into the new subset; and after all the vertical line segments have been processed, merging the vertical line segments of each subset in the second set into one vertical frame line.
Further, the merging condition is that the included angle of the two line segments is smaller than 15 degrees and the minimum distance of the two line segments is smaller than 2 pixels.
Further, the first merging subunit includes a horizontal merging secondary unit for merging the horizontal line segments of each subset in the first set into one horizontal frame line, and the horizontal merging secondary unit specifically includes:
a coordinate system establishing tertiary unit, configured to establish a coordinate system in the target image, wherein the horizontal axis of the coordinate system is parallel to the datum line and the vertical axis is perpendicular to the datum line;
a sorting and merging tertiary unit, configured to, for any subset in the first set, determine the horizontal-axis coordinate of the left or right vertex of each horizontal line segment in the subset, sort the horizontal line segments in ascending or descending order of that coordinate, and connect each pair of adjacent sorted horizontal line segments to obtain one horizontal frame line;
the second merging subunit includes a vertical merging secondary unit for merging the vertical line segments of each subset in the second set into one vertical frame line, and the vertical merging secondary unit is specifically configured to: for any subset in the second set, determine the vertical-axis coordinate of the upper or lower vertex of each vertical line segment in the subset, sort the vertical line segments in ascending or descending order of that coordinate, and connect each pair of adjacent sorted vertical line segments to obtain one vertical frame line.
Further, the table module specifically includes:
an intersection point determination unit, configured to determine all intersection points of the horizontal frame lines and the vertical frame lines, wherein the intersection points comprise real intersection points, formed where a horizontal frame line intersects a vertical frame line, and virtual intersection points, formed where the extension line of a horizontal frame line intersects a vertical frame line, where a horizontal frame line intersects the extension line of a vertical frame line, or where the extension line of a horizontal frame line intersects the extension line of a vertical frame line;
an index unit, configured to set a frame line index for each intersection point, wherein the frame line index comprises a first element and a second element, the first element recording the sequence number of the straight line on which the horizontal frame line corresponding to the intersection point lies, and the second element recording the sequence number of the straight line on which the vertical frame line corresponding to the intersection point lies; the sequence numbers of the straight lines of the horizontal frame lines increase from top to bottom, and those of the vertical frame lines increase from left to right;
a row start point unit, configured to take the row end point, column start point or column end point of the last minimum table unit found as the row start point of the minimum table unit currently being searched for;
a row end point unit, configured to increase the second element of the frame line index of the row start point one by one while keeping the first element unchanged, until the intersection point corresponding to the resulting frame line index is a real intersection point, which serves as the row end point of the minimum table unit currently being searched for;
a column start point unit, configured to increase the first element of the frame line index of the row start point one by one while keeping the second element unchanged, until the intersection point corresponding to the resulting frame line index is a real intersection point, which serves as the column start point of the minimum table unit currently being searched for;
a column end point unit, configured to take as the column end point of the minimum table unit currently being searched for the intersection point whose frame line index has its first element equal to the first element of the column start point and its second element equal to the second element of the row end point, thereby obtaining the minimum table unit currently being searched for;
and the row start point of the first minimum table unit is the real intersection point whose frame line index has the smallest first and second elements in the target image.
Further, in the improved L-CNN model, the ratio of the line feature map loss to the point feature map loss in the calculated loss is 8:1.
In a third aspect, an embodiment of the present invention provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the steps of the method provided in the first aspect when executing the program.
In a fourth aspect, an embodiment of the present invention provides a non-transitory computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements the steps of the method as provided in the first aspect.
According to the method and device for identifying a table structure provided by the embodiments of the invention, the original L-CNN model, commonly applied to wireframe parsing of urban outdoor scenes, is optimized in three ways, namely reselecting the sample features, improving the sample labels and adjusting the training parameters of the model, so that the improved L-CNN model can be applied well to the field of table recognition, with recognition efficiency and accuracy markedly improved over the prior art.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flowchart illustrating a table structure recognition method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a target image according to an embodiment of the invention;
FIG. 3 is a schematic structural diagram of a table structure identification apparatus according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the physical structure of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to overcome the above problems in the prior art, the inventive concept of the embodiments of the present invention is as follows: the original L-CNN model, commonly applied to wireframe parsing of urban outdoor scenes, is optimized in three ways, namely reselecting the sample features, improving the sample labels and adjusting the training parameters of the model, so that the improved L-CNN model can be applied well to the field of table recognition, with recognition efficiency and accuracy markedly improved over the prior art.
FIG. 1 is a flowchart illustrating a table structure identification method according to an embodiment of the present invention. As shown in FIG. 1, the method includes:
S101, inputting a target image with a table structure into a pre-trained neural network model to obtain the table line segments in the target image;
the neural network model of the embodiment of the invention is an improved L-CNN model (also called a frame line analysis model), the original L-CNN model is a neural network model proposed by Shanghai science and technology university team in 2018, and the model is suitable for frame line identification of scenes without curve segments and scenes with few intersection points between the line segments, so the original L-CNN model is widely used for frame line (building outline) identification of urban outdoor scenes. It can be understood that, because the distance from the building when the camera shoots the outside city scene is far greater than the distance from the text when the camera shoots the text with the table, and because the text itself is not necessarily flat when placed, and the camera usually has a certain degree of deformation (bending) when shooting a short-distance object, the frame line of the vertical building in the outside city scene is far straighter than the line segment of the table in the text with the table, the table in the text usually has various bends in the photo, and the original L-CNN model cannot solve the identification of the bend line segment; on the other hand, the intersection points of the table are far more than the intersection points of the frame lines of the building, and the table is not suitable for the original L-CNN model identification, and based on the characteristics, no public report exists at present for applying the L-CNN model to the table identification completely different from the building identification.
In order to overcome the defects of the original L-CNN model in table identification, the invention adopts three improvements:
(1) The improved L-CNN model is trained by taking sample images with table structures as samples and taking the table line segments in the sample images as sample labels. That is, the training samples are replaced: the ordinary images of urban outdoor scenes are replaced with images containing table structures, and the model is trained with the table line segments in each sample image as its sample labels, so that after this first step the model can already output table line segments when a target image with a table structure is input.
(2) On the basis of the first improvement, in order to overcome the inability of the original L-CNN model to identify curved line segments, each curved table line segment is characterized as a broken line formed by a plurality of continuous straight line segments, so that a curved line segment in the original image becomes a series of continuous straight line segments.
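As an illustration of this labeling scheme, the sketch below approximates a curved table line, given as a dense sequence of sampled points, by a broken line of straight segments. It is a minimal sketch rather than code from the patent: the Douglas-Peucker simplification and the 2-pixel tolerance are assumptions about how such polyline labels could be produced.

    import numpy as np

    def polyline_label(curve_pts, tol=2.0):
        """Approximate a curved table line (dense point samples) with a
        broken line of straight segments, returned as endpoint pairs.

        Douglas-Peucker simplification; `tol` is the maximum allowed
        deviation in pixels (an assumed value, not from the patent)."""
        pts = np.asarray(curve_pts, dtype=float)

        def simplify(lo, hi):
            chord = pts[hi] - pts[lo]
            norm = np.linalg.norm(chord) + 1e-12
            rel = pts[lo + 1:hi] - pts[lo]
            # Perpendicular distance of interior points to the chord lo->hi.
            dist = np.abs(chord[0] * rel[:, 1] - chord[1] * rel[:, 0]) / norm
            if dist.size == 0 or dist.max() <= tol:
                return [lo, hi]               # chord is straight enough
            k = lo + 1 + int(dist.argmax())   # split at the farthest point
            return simplify(lo, k)[:-1] + simplify(k, hi)

        idx = simplify(0, len(pts) - 1)
        return [(tuple(pts[i]), tuple(pts[j])) for i, j in zip(idx, idx[1:])]

Each curved line in a sample image then contributes several consecutive straight segments to the label set, matching the representation the model is trained to predict.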
(3) The line feature map loss in the improved L-CNN model accounts for a greater proportion of the calculated loss than the point feature map loss.
In the original L-CNN model, the losses calculated during training comprise the line feature map loss (line map loss), the point feature map loss (junction map loss) and the line segment classification loss. Among the three, the line feature map loss and the point feature map loss account for a markedly greater proportion than the line segment classification loss, and the point feature map loss is markedly greater than the line feature map loss; specifically, the weights are line feature map loss : point feature map loss : line segment classification loss = 1 : 8 : 0.25. The invention adjusts the proportions of the line feature map loss and the point feature map loss in the calculated loss, enlarging the weight of the line segment feature map and reducing the weight of the junction map. Experiments show that with line feature map loss : point feature map loss : line segment classification loss = 8 : 1 : 0.25, the improved L-CNN model predicts line segments more accurately.
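The reweighting itself amounts to a one-line change in how the total training loss is assembled. The minimal sketch below makes the change explicit; the tensor names line_map_loss, junction_map_loss and cls_loss are assumptions, and only the weights 8 : 1 : 0.25 (versus the original 1 : 8 : 0.25) come from the text above.

    # Original L-CNN weights:  line : junction : classification = 1 : 8 : 0.25
    # Improved weights here:                                      8 : 1 : 0.25
    W_LINE, W_JUNCTION, W_CLS = 8.0, 1.0, 0.25

    def total_loss(line_map_loss, junction_map_loss, cls_loss):
        """Weighted sum of the three L-CNN training losses."""
        return (W_LINE * line_map_loss
                + W_JUNCTION * junction_map_loss
                + W_CLS * cls_loss)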
S102, obtaining a table structure according to the table line segments in the target image.
The output of the improved L-CNN model of the embodiment of the invention is the set of table line segments in the target image. Since the model has been improved specifically to suit table identification, the table line segments it outputs accurately represent all the frame lines of the table, and the table structure can be obtained from these frame lines.
According to the embodiment of the invention, the original L-CNN model, commonly applied to wireframe parsing of urban outdoor scenes, is optimized by reselecting the sample features, improving the sample labels and adjusting the training parameters of the model, so that the improved L-CNN model can be applied well to the field of table recognition, with recognition efficiency and accuracy markedly improved over the prior art.
On the basis of the foregoing embodiments, as an optional embodiment, the obtaining a table structure according to a table line segment in the target image specifically includes:
S201, defining each table line segment as a horizontal line segment or a vertical line segment according to its inclination angle, and merging the horizontal/vertical line segments that meet the included-angle and distance requirements to obtain horizontal/vertical frame lines;
the table line segments are defined as horizontal line segments or vertical line segments by calculating the inclination angles of the table line segments, and a foundation is laid for subsequently and respectively determining complete horizontal frame lines and complete vertical frame lines. Taking a common 3 × 3 table structure as an example, a 3 × 3 table refers to a table structure with 3 rows and 3 table cells, and through prediction of an improved L-CNN model, the most ideal situation is that 4 sides of each table are identified, so that there are 3 line segments in each row or each column in the 3 × 3 table structure, but actually, due to problems of a shooting angle, a shooting environment, and the table itself, a considerable portion of line segments originally belonging to one table will be presented as multiple line segments in a final identification result, so that the accuracy of determining the table cells by the line segments is not high.
S202, determining all intersection points of the horizontal frame line and the vertical frame line, selecting one intersection point from all the intersection points as a starting point, and searching other intersection points which form the minimum table unit with the intersection points until the minimum table unit where all the intersection points are located is searched.
After the horizontal frame lines and the vertical frame lines are obtained, their intersection points are further computed, and the table cell represented by each minimum table unit can be obtained by searching for the minimum table unit clockwise or counterclockwise from an intersection point.
After all the minimum table units are obtained, the table structure has in effect been obtained. Further, the found minimum table units are sorted according to their coordinates, so that the serial number of each table unit in the table structure can be assigned and the total number of table units in the table structure can be counted.
On the basis of the foregoing embodiments, as an optional embodiment, the table line segment is defined as a horizontal line segment or a vertical line segment according to the inclination angle of the table line segment, specifically:
determining a horizontal datum line in the target image, and calculating an included angle between a table line segment and the datum line to serve as an inclination angle of the table line segment; and defining table line segments with the inclination angles larger than 45 degrees as vertical line segments, and defining table line segments with the inclination angles smaller than 45 degrees as horizontal line segments.
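A minimal sketch of this classification step follows, assuming each table line segment is given by its two endpoints in image coordinates and the horizontal datum line is the image's x-axis; the function name and the handling of the exactly-45-degree case are assumptions.

    import math

    def split_by_tilt(segments):
        """Classify table line segments as horizontal or vertical.

        Each segment is ((x0, y0), (x1, y1)). The inclination angle is
        measured against a horizontal datum line: > 45 degrees means
        vertical, < 45 degrees horizontal (ties go to horizontal here,
        since the patent leaves the 45-degree case unspecified)."""
        horizontal, vertical = [], []
        for (x0, y0), (x1, y1) in segments:
            angle = math.degrees(math.atan2(abs(y1 - y0), abs(x1 - x0)))
            (vertical if angle > 45.0 else horizontal).append(((x0, y0), (x1, y1)))
        return horizontal, vertical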
On the basis of the above embodiments, as an optional embodiment, the step of combining the horizontal/vertical line segments meeting the requirements of the included angle and the distance to obtain the horizontal/vertical frame line specifically includes:
creating a first set, and performing the following operations on the horizontal line segments until every horizontal line segment has been processed: for the horizontal line segment currently being processed, judging whether it meets a merging condition with the horizontal line segments in each subset of the first set; if the merging condition is met, placing the horizontal line segment into the subset that meets the merging condition, and if not, creating a new subset in the first set and placing the horizontal line segment into the new subset; and after all the horizontal line segments have been processed, merging the horizontal line segments of each subset in the first set into one horizontal frame line.
Creating a second set, and performing the following operations on the vertical line segments until every vertical line segment has been processed: for the vertical line segment currently being processed, judging whether it meets a merging condition with the vertical line segments in each subset of the second set; if the merging condition is met, placing the vertical line segment into the subset that meets the merging condition, and if not, creating a new subset in the second set and placing the vertical line segment into the new subset; and after all the vertical line segments have been processed, merging the vertical line segments of each subset in the second set into one vertical frame line.
Taking the horizontal line segments as an example, suppose there are 4 horizontal line segments a, b, c and d. Starting from segment a: since the first set has no subset at this point, a new subset is created and segment a is placed into it, giving {a}. It is then judged whether segment b and segment a in subset {a} meet the merging condition; if so, the subset becomes {a, b}, otherwise the subsets {a} and {b} are obtained, and so on, until segment d has either been merged into some subset or placed into a new one.
On the basis of the above embodiments, as an optional embodiment, the merging condition is that an included angle between two line segments is less than 15 ° and a minimum distance between the two line segments is less than 2 pixels.
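Put together, the grouping loop can be sketched as follows. This is a minimal sketch under stated assumptions: the geometry helpers angle_between and min_distance are simplified stand-ins (the distance used here is an endpoint-to-endpoint lower bound, whereas a full implementation would use point-to-segment distance), while the 15-degree and 2-pixel thresholds are the merging condition from the text.

    import math

    def angle_between(a, b):
        """Included angle between two segments, in degrees (0 to 90)."""
        def direction(seg):
            (x0, y0), (x1, y1) = seg
            return math.atan2(y1 - y0, x1 - x0)
        d = abs(direction(a) - direction(b)) % math.pi
        return math.degrees(min(d, math.pi - d))

    def min_distance(a, b):
        """Simplified minimum distance: smallest endpoint-to-endpoint gap."""
        return min(math.dist(p, q) for p in a for q in b)

    def group_segments(segments, max_angle=15.0, max_dist=2.0):
        """Greedily group horizontal (or vertical) segments into subsets;
        each subset is later merged into one frame line."""
        subsets = []                           # the "first set" / "second set"
        for seg in segments:
            for subset in subsets:
                if any(angle_between(seg, s) < max_angle and
                       min_distance(seg, s) < max_dist for s in subset):
                    subset.append(seg)         # merging condition met
                    break
            else:
                subsets.append([seg])          # no match: open a new subset
        return subsets

Applied to the walk-through above, group_segments([a, b, c, d]) reproduces it: segment a opens the subset {a}, and each later segment either joins the first subset whose members satisfy the merging condition or opens a new one.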
On the basis of the foregoing embodiments, as an optional embodiment, the merging of the horizontal line segments of each subset in the first set into one horizontal frame line specifically includes:
establishing a coordinate system in the target image, wherein the horizontal axis of the coordinate system is parallel to the datum line and the vertical axis is perpendicular to the datum line;
for any subset in the first set, determining the horizontal-axis coordinate of the left or right vertex of each horizontal line segment in the subset, sorting the horizontal line segments in ascending or descending order of that coordinate, and connecting each pair of adjacent sorted horizontal line segments to obtain one horizontal frame line;
the merging of the vertical line segments of each subset in the second set into one vertical frame line specifically includes:
for any subset in the second set, determining the vertical-axis coordinate of the upper or lower vertex of each vertical line segment in the subset, sorting the vertical line segments in ascending or descending order of that coordinate, and connecting each pair of adjacent sorted vertical line segments to obtain one vertical frame line.
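The sorting-and-connecting step for one subset can be sketched as follows. Representing the merged frame line as an ordered polyline of endpoints is an assumption of this sketch; the vertical case simply sorts on the y-coordinate instead of the x-coordinate.

    def merge_subset(subset, axis=0):
        """Merge one subset of segments into a single frame line.

        Sorts the segments by the `axis` coordinate of their first
        endpoint (axis=0 for horizontal subsets, axis=1 for vertical
        ones) and connects each pair of adjacent segments, returning
        the frame line as an ordered list of points (a polyline)."""
        ordered = sorted(subset, key=lambda seg: seg[0][axis])
        frame_line = list(ordered[0])
        for seg in ordered[1:]:
            frame_line.extend(seg)   # adjacent segments joined end to start
        return frame_line

For example, merge_subset([((10, 5), (40, 6)), ((0, 5), (8, 5))]) orders the two segments left to right and returns [(0, 5), (8, 5), (10, 5), (40, 6)], bridging the small gap between x = 8 and x = 10.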
On the basis of the foregoing embodiments, as an optional embodiment, an embodiment of the present invention provides a new method for searching a minimum table unit, and specifically, the determining all intersection points of a horizontal frame line and a vertical frame line, selecting one intersection point from the all intersection points as a starting point, and searching for other intersection points that form the minimum table unit with the intersection point until the minimum table unit in which all the intersection points are located is searched, including:
S301, determining all intersection points of the horizontal frame lines and the vertical frame lines, wherein the intersection points comprise real intersection points, formed where a horizontal frame line intersects a vertical frame line, and virtual intersection points, formed where the extension line of a horizontal frame line intersects a vertical frame line, where a horizontal frame line intersects the extension line of a vertical frame line, or where the extension line of a horizontal frame line intersects the extension line of a vertical frame line;
s302, setting a frame line index for each intersection point, wherein the frame line index comprises a first element and a second element, the first element is used for recording the serial number of the straight line where the horizontal frame line corresponding to the intersection point is located, and the second element is used for recording the serial number of the straight line where the vertical frame line corresponding to the intersection point is located; the serial numbers of the straight lines of the horizontal frame lines are increased progressively from top to bottom, and the straight lines of the vertical frame lines are increased progressively from left to right;
s303, taking the searched row end point, column start point or column end point of the last minimum table unit as the row start point of the currently searched minimum table unit;
s304, increasing the second elements of the frame line indexes of the line starting points one by one and keeping the first elements unchanged until the found intersection points corresponding to the frame line indexes are real intersection points and serve as line ending points of the minimum table unit searched currently;
s305, increasing the first elements of the frame line indexes of the row starting point one by one and keeping the second elements unchanged until the found intersection points corresponding to the frame line indexes are real intersection points and serve as the column starting points of the minimum table units searched currently;
S306, taking as the column end point of the minimum table unit currently being searched for the intersection point whose frame line index has its first element equal to the first element of the column start point and its second element equal to the second element of the row end point, thereby obtaining the minimum table unit currently being searched for;
and the row start point of the first minimum table unit is the real intersection point whose frame line index has the smallest first and second elements in the target image.
According to the above method for searching for minimum table units, by defining virtual intersection points, real intersection points and frame line indexes, the minimum table units are searched outward starting from the intersection point at the upper-left corner of the target image.
The method of determining the minimum table cell according to the present invention is described below with a specific example. FIG. 2 is a schematic diagram of a target image according to an embodiment of the present invention. As shown in FIG. 2, the target image has two rows of table cells. The first row has three table cells: a table cell consisting of intersections 1, 2, 6 and 7, a table cell consisting of intersections 2, 4, 7 and 9, and a table cell consisting of intersections 4, 5, 9 and 10. The second row has two table cells: a table cell consisting of intersections 6, 8, 11 and 13, and a table cell consisting of intersections 8, 10, 13 and 15. The intersection points 1 to 15 in FIG. 2 are indicated by open circles; the solid lines represent horizontal or vertical frame lines, and the dotted lines represent extension lines of the vertical frame lines.
As can be seen from the above embodiment, intersections 3, 12 and 14 in FIG. 2 are virtual intersection points and the other intersections are real intersection points. It should be noted that intersections 7 and 9 are not virtual: intersection 7 is generated by the real 2nd vertical frame line from the left and the middle horizontal frame line, and likewise intersection 9 is generated by the real 3rd vertical frame line from the left and the middle horizontal frame line. Since intersection 1 is a real intersection located at the upper left of the target image, it is first used as the row start point of the first minimum table cell. The frame line index of intersection 1 is defined as (1,1), the frame line index of intersection 2 is (1,2), …, the frame line index of intersection 5 is (1,5), the index of intersection 6 is (2,1), …, the frame line index of intersection 13 is (3,3), and the frame line index of intersection 15 is (3,5).
The second element of the frame line index of intersection 1 is increased while the first element is kept unchanged, i.e., the vertex of the minimum table unit is sought along the horizontal frame line; since intersection 2 is a real intersection, it is taken as the row end point. The first element of the frame line index of intersection 1 is then increased while the second element is kept unchanged, i.e., the vertex is sought along the vertical frame line; since intersection 6 is a real intersection, it is taken as the column start point, and the column end point is then obtained as intersection 7. Intersections 1, 2, 7 and 6 thus constitute a minimum table unit. The search continues from intersection 2 for the next minimum table unit: intersection 3 is a virtual intersection, so the second element keeps increasing until intersection 4 is found to be real, giving the row end point of the next minimum table unit; similarly, intersection 7 serves as the column start point, and intersection 9 as the column end point. As the example shows, an intersection whose first element and/or second element is the maximum value cannot serve as a row start point; for example, none of intersections 5, 10, 11, 12, 13, 14 and 15 can be used as a row start point. When every intersection has either been assigned to its minimum table unit or determined to be unusable as a row start point, the search for minimum table units ends.
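Under the frame line indexing above, the whole search reduces to set lookups over the grid of indices. The sketch below is a simplification under stated assumptions: instead of chaining each row start from the corners of the previously found cell as in S303, it tries every real intersection as a candidate row start, which yields the same set of minimum table cells for grids like FIG. 2; the set `real` of index pairs (i, j) marking the real intersections, and the names generally, are assumptions of this sketch.

    def find_cells(real, n_h, n_v):
        """Enumerate minimum table cells from a grid of intersections.

        `real` holds frame line index pairs (i, j): i numbers the
        horizontal frame lines top to bottom, j the vertical frame
        lines left to right, both from 1; n_h and n_v are the line
        counts. Every grid pair not in `real` is a virtual point.
        Each cell is returned as (row start, row end, column start,
        column end)."""
        cells = []
        for (i, j) in sorted(real):            # candidate row start points
            # Column start: increase the first element until a real point.
            i2 = next((r for r in range(i + 1, n_h + 1) if (r, j) in real), None)
            if i2 is None:
                continue
            # Row end: increase the second element until the intersection
            # is real AND the implied column end (i2, j2) is real too.
            for j2 in range(j + 1, n_v + 1):
                if (i, j2) in real and (i2, j2) in real:
                    cells.append(((i, j), (i, j2), (i2, j), (i2, j2)))
                    break
        return cells

    # FIG. 2: 3 horizontal x 5 vertical frame lines; points 3, 12 and 14
    # (indices (1, 3), (3, 2), (3, 4)) are virtual.
    real = {(r, c) for r in range(1, 4) for c in range(1, 6)} \
           - {(1, 3), (3, 2), (3, 4)}
    assert len(find_cells(real, 3, 5)) == 5   # three cells in row 1, two in row 2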
FIG. 3 is a schematic structural diagram of a table structure identification apparatus according to an embodiment of the present invention. As shown in FIG. 3, the identification apparatus includes an input module 201 and a reprocessing module 202, wherein:
an input module 201, configured to input a target image with a table structure into a pre-trained neural network model, and obtain a table line segment in the target image;
a reprocessing module 202, configured to obtain a table structure according to the table line segment in the target image;
the neural network model is an improved L-CNN model, and the improved L-CNN model is trained by taking a sample image with a table structure as a sample and taking a table line segment in the sample image as a sample label;
the curved table line segments in the sample image are characterized by a broken line formed by a plurality of continuous straight line segments;
the line feature map loss in the improved L-CNN model accounts for a greater proportion of the calculated loss than the point feature map loss.
The table structure identification device provided in the embodiment of the present invention specifically executes the flow of the above table structure identification method embodiments; for details, please refer to the contents of those embodiments, which are not repeated here. The device optimizes the original L-CNN model, commonly applied to wireframe parsing of urban outdoor scenes, through three kinds of optimization, namely reselecting the sample features, improving the sample labels and adjusting the training parameters of the model, so that the improved L-CNN model can be applied well to the field of table recognition, with recognition efficiency and accuracy markedly improved over the prior art.
On the basis of the foregoing embodiments, as an optional embodiment, the reprocessing module includes:
the frame line module is used for defining each table line segment as a horizontal line segment or a vertical line segment according to its inclination angle, and merging the horizontal/vertical line segments that meet the included-angle and distance requirements to obtain horizontal/vertical frame lines;
and the table module is used for determining all intersection points of the horizontal frame line and the vertical frame line, selecting one intersection point from all the intersection points as a starting point, and searching other intersection points which form the minimum table unit with the intersection points until the minimum table unit where all the intersection points are located is searched.
On the basis of the above embodiments, as an optional embodiment, the frame line module includes a line segment definition unit configured to define each table line segment as a horizontal line segment or a vertical line segment according to its inclination angle;
the line segment definition unit is specifically configured to: determining a horizontal datum line in the target image, and calculating an included angle between a table line segment and the datum line to serve as an inclination angle of the table line segment; and defining table line segments with the inclination angles larger than 45 degrees as vertical line segments, and defining table line segments with the inclination angles smaller than 45 degrees as horizontal line segments.
On the basis of the above embodiments, as an optional embodiment, the frame line module includes a merging unit for merging the horizontal/vertical line segments that meet the included-angle and distance requirements to obtain horizontal/vertical frame lines;
the merging unit further includes:
the first merging subunit is configured to create a first set and perform the following operations on the horizontal line segments until every horizontal line segment has been processed: for the horizontal line segment currently being processed, judging whether it meets a merging condition with the horizontal line segments in each subset of the first set; if the merging condition is met, placing the horizontal line segment into the subset that meets the merging condition, and if not, creating a new subset in the first set and placing the horizontal line segment into the new subset; and after all the horizontal line segments have been processed, merging the horizontal line segments of each subset in the first set into one horizontal frame line;
the second merging subunit is configured to create a second set and perform the following operations on the vertical line segments until every vertical line segment has been processed: for the vertical line segment currently being processed, judging whether it meets a merging condition with the vertical line segments in each subset of the second set; if the merging condition is met, placing the vertical line segment into the subset that meets the merging condition, and if not, creating a new subset in the second set and placing the vertical line segment into the new subset; and after all the vertical line segments have been processed, merging the vertical line segments of each subset in the second set into one vertical frame line.
On the basis of the above embodiments, as an optional embodiment, the merging condition is that an included angle between two line segments is less than 15 ° and a minimum distance between the two line segments is less than 2 pixels.
On the basis of the foregoing embodiments, as an optional embodiment, the first merging subunit includes a horizontal merging secondary unit for merging the horizontal line segments of each subset in the first set into one horizontal frame line, and the horizontal merging secondary unit specifically includes:
a coordinate system establishing tertiary unit, configured to establish a coordinate system in the target image, wherein the horizontal axis of the coordinate system is parallel to the datum line and the vertical axis is perpendicular to the datum line;
a sorting and merging tertiary unit, configured to, for any subset in the first set, determine the horizontal-axis coordinate of the left or right vertex of each horizontal line segment in the subset, sort the horizontal line segments in ascending or descending order of that coordinate, and connect each pair of adjacent sorted horizontal line segments to obtain one horizontal frame line;
the second merging subunit includes a vertical merging secondary unit for merging the vertical line segments of each subset in the second set into one vertical frame line, and the vertical merging secondary unit is specifically configured to: for any subset in the second set, determine the vertical-axis coordinate of the upper or lower vertex of each vertical line segment in the subset, sort the vertical line segments in ascending or descending order of that coordinate, and connect each pair of adjacent sorted vertical line segments to obtain one vertical frame line.
On the basis of the foregoing embodiments, as an optional embodiment, the table module is specifically configured to:
an intersection point determination unit for determining all intersection points of the horizontal frame line and the vertical frame line, the intersection points including a real intersection point formed by intersecting the horizontal frame line and the vertical frame line, a virtual intersection point formed by intersecting extension lines of the horizontal frame line and the vertical frame line, a virtual intersection point formed by intersecting the extension line of the horizontal frame line and the vertical frame line as a virtual intersection point, and a virtual intersection point formed by intersecting the extension line of the horizontal frame line and the extension line of the vertical frame line;
the index unit is used for setting a frame line index for each intersection point, wherein the frame line index comprises a first element and a second element, the first element is used for recording the serial number of the straight line where the horizontal frame line corresponding to the intersection point is located, and the second element is used for recording the serial number of the straight line where the vertical frame line corresponding to the intersection point is located; the serial numbers of the straight lines of the horizontal frame lines are increased progressively from top to bottom, and the straight lines of the vertical frame lines are increased progressively from left to right;
a row starting point unit, configured to use the row end point, column starting point, or column end point of the last found minimum table unit as the row starting point of the currently searched minimum table unit;
a row end point unit, configured to increase the second element of the frame line index of the row starting point one by one while keeping the first element unchanged, until the intersection point corresponding to the frame line index is a real intersection point, and to use that intersection point as the row end point of the currently searched minimum table unit;
a column starting point unit, configured to increase the first element of the frame line index of the row starting point one by one while keeping the second element unchanged, until the intersection point corresponding to the frame line index is a real intersection point, and to use that intersection point as the column starting point of the currently searched minimum table unit;
a column end point unit, configured to use, as the column end point of the currently searched minimum table unit, the intersection point whose frame line index has a first element equal to that of the column starting point and a second element equal to that of the row end point, thereby obtaining the currently searched minimum table unit;
and the row starting point of the first minimum table unit is the real intersection point whose frame line index has the smallest first element and second element in the target image.
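To make the unit descriptions above concrete, here is a rough sketch of the intersection classification and the minimum-table-unit search; the data layout (a dictionary keyed by the (horizontal, vertical) frame line index) and the near-axis-aligned simplification are assumptions of this sketch:

    def classify_intersections(h_lines, v_lines):
        """Record, for each (horizontal line i, vertical line j) pair, whether the
        crossing is a real intersection (inside both segments) or only a virtual
        one (on an extension). Lines are segments ((x1, y1), (x2, y2)), assumed
        near-axis-aligned and pre-sorted top-to-bottom / left-to-right."""
        is_real = {}
        for i, h in enumerate(h_lines):
            y = (h[0][1] + h[1][1]) / 2              # row coordinate of line i
            x_lo, x_hi = sorted((h[0][0], h[1][0]))
            for j, v in enumerate(v_lines):
                x = (v[0][0] + v[1][0]) / 2          # column coordinate of line j
                y_lo, y_hi = sorted((v[0][1], v[1][1]))
                is_real[(i, j)] = x_lo <= x <= x_hi and y_lo <= y <= y_hi
        return is_real

    def find_cell(row_start, is_real, n_h, n_v):
        """From a row starting point (i, j), increase the second element to find
        the row end, increase the first element to find the column start, and
        combine the two to obtain the column end, per the units above."""
        i, j = row_start
        row_end_j = next((jj for jj in range(j + 1, n_v) if is_real[(i, jj)]), None)
        col_start_i = next((ii for ii in range(i + 1, n_h) if is_real[(ii, j)]), None)
        if row_end_j is None or col_start_i is None:
            return None                              # no minimum table unit closes here
        return {"row_start": (i, j), "row_end": (i, row_end_j),
                "col_start": (col_start_i, j), "col_end": (col_start_i, row_end_j)}

Starting from the real intersection with the smallest frame line index and reusing each found cell's row end, column start, and column end as later row starting points reproduces the search order described above.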
On the basis of the above embodiments, as an alternative embodiment, the ratio of the proportion of the line feature map loss to the proportion of the point feature map loss in the calculated loss of the improved L-CNN model is 8:1.
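If the total training loss is taken as a weighted sum of the two feature map losses, the 8:1 proportion might be realized as in the following one-line sketch (the function and argument names are illustrative; the individual loss terms come from the model's line and point heads):

    def total_loss(line_feature_map_loss, point_feature_map_loss):
        # 8:1 weighting of the line feature map loss against the point feature map loss.
        return 8.0 * line_feature_map_loss + 1.0 * point_feature_map_loss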
Fig. 4 is a schematic diagram of the entity structure of an electronic device according to an embodiment of the present invention. As shown in Fig. 4, the electronic device may include: a processor 310, a communication interface 320, a memory 330, and a communication bus 340, wherein the processor 310, the communication interface 320, and the memory 330 communicate with each other via the communication bus 340. The processor 310 may call a computer program stored in the memory 330 and executable on the processor 310 to perform the table structure identification method provided by the above embodiments, the method including, for example: inputting a target image with a table structure into a pre-trained neural network model to obtain table line segments in the target image; and obtaining a table structure according to the table line segments in the target image; wherein the neural network model is an improved L-CNN model trained by taking a sample image with a table structure as a sample and the table line segments in the sample image as sample labels; curved table line segments in the sample image are characterized as broken lines formed by a plurality of continuous straight line segments; and the line feature map loss in the improved L-CNN model accounts for a greater proportion of the calculated loss than the point feature map loss.
In addition, the logic instructions in the memory 330 may be implemented in the form of software functional units and, when sold or used as independent products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
An embodiment of the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored, the computer program, when executed by a processor, performing the table structure identification method provided in the foregoing embodiments, the method including: inputting a target image with a table structure into a pre-trained neural network model to obtain table line segments in the target image; and obtaining a table structure according to the table line segments in the target image; wherein the neural network model is an improved L-CNN model trained by taking a sample image with a table structure as a sample and the table line segments in the sample image as sample labels; curved table line segments in the sample image are characterized as broken lines formed by a plurality of continuous straight line segments; and the line feature map loss in the improved L-CNN model accounts for a greater proportion of the calculated loss than the point feature map loss.
The above-described apparatus embodiments are merely illustrative; units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units, that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the embodiments without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, or alternatively by hardware. Based on this understanding, the above technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute the methods described in the embodiments or in parts of the embodiments.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced, and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (9)

1. A method for identifying a table structure, comprising:
inputting a target image with a table structure into a pre-trained neural network model to obtain a table line segment in the target image;
obtaining a table structure according to the table line segments in the target image;
the neural network model is an improved L-CNN model, and the improved L-CNN model is trained by taking a sample image with a table structure as a sample and taking a table line segment in the sample image as a sample label;
the curved table line segments in the sample image are characterized by a broken line formed by a plurality of continuous straight line segments;
the line feature map loss in the improved L-CNN model accounts for a greater proportion of the calculated loss than the point feature map loss;
the obtaining of the table structure according to the table line segment in the target image specifically includes:
defining the table line segment as a horizontal line segment or a vertical line segment according to the inclination angle of the table line segment; and combining the horizontal/vertical line segments meeting the included angle and distance requirements to obtain a horizontal/vertical frame line;
and determining all intersection points of the horizontal frame lines and the vertical frame lines, selecting one intersection point from all the intersection points as a starting point, and searching for the other intersection points that form a minimum table unit with that intersection point, until the minimum table units where all the intersection points are located have been found.
2. The method for identifying a table structure according to claim 1, wherein the table line segment is defined as a horizontal line segment or a vertical line segment according to the inclination angle of the table line segment, specifically as follows:
determining a horizontal datum line in the target image, and calculating an included angle between a table line segment and the datum line to serve as an inclination angle of the table line segment; and defining table line segments with the inclination angles larger than 45 degrees as vertical line segments, and defining table line segments with the inclination angles smaller than 45 degrees as horizontal line segments.
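A minimal sketch of this classification rule, assuming a horizontal datum line and segments given as endpoint pairs; the handling of the exact 45-degree boundary is an assumption of the sketch:

    import math

    def classify_by_tilt(segment):
        # Tilt angle of the segment against a horizontal datum line, folded into [0, 90].
        (x1, y1), (x2, y2) = segment
        tilt = abs(math.degrees(math.atan2(y2 - y1, x2 - x1)))
        tilt = min(tilt, 180.0 - tilt)
        return "vertical" if tilt > 45.0 else "horizontal"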
3. The method for identifying a table structure according to claim 2, wherein the step of combining the horizontal/vertical line segments meeting the requirements of the included angle and the distance to obtain the horizontal/vertical frame line comprises:
creating a first set, and performing the following operations on the horizontal line segments until all the horizontal line segments have been processed: for the currently processed horizontal line segment, judging whether it meets a merging condition with the horizontal line segments in each subset of the first set; if the merging condition is met, classifying the horizontal line segment into the subset that meets the merging condition, and if not, creating a new subset in the first set and classifying the horizontal line segment into the new subset; and after all the horizontal line segments have been processed, merging the horizontal line segments of each subset in the first set into one horizontal frame line;
creating a second set, and performing the following operations on the vertical line segments until all the vertical line segments have been processed: for the currently processed vertical line segment, judging whether it meets a merging condition with the vertical line segments in each subset of the second set; if the merging condition is met, classifying the vertical line segment into the subset that meets the merging condition, and if not, creating a new subset in the second set and classifying the vertical line segment into the new subset; and after all the vertical line segments have been processed, merging the vertical line segments of each subset in the second set into one vertical frame line.
4. The method according to claim 3, wherein the merging condition is that the included angle between two line segments is less than 15° and the minimum distance between the two line segments is less than 2 pixels.
5. The method for identifying a table structure according to claim 3, wherein the step of merging the horizontal line segments of each subset in the first set into a horizontal frame line comprises:
establishing a coordinate system in the target image, wherein the transverse axis of the coordinate system is parallel to the datum line, and the longitudinal axis of the coordinate system is perpendicular to the datum line;
determining, for any subset in the first set, the horizontal axis coordinate of the left vertex or the right vertex of each horizontal line segment, sorting the horizontal line segments in ascending or descending order of the horizontal axis coordinate, and connecting every two adjacent sorted horizontal line segments to obtain one horizontal frame line;
the merging of the vertical line segments of each subset in the second set into one vertical frame line specifically includes:
and determining, for any subset in the second set, the longitudinal axis coordinate of the left vertex or the right vertex of each vertical line segment in the subset, sorting the vertical line segments in ascending or descending order of the longitudinal axis coordinate, and connecting every two adjacent sorted vertical line segments to obtain one vertical frame line.
6. The method for identifying a table structure according to claim 1, wherein the determining all intersection points of the horizontal frame line and the vertical frame line, selecting one intersection point from all the intersection points as a starting point, and searching for the other intersection points that form a minimum table unit with that intersection point until the minimum table units where all the intersection points are located have been found specifically comprises:
determining all intersection points of the horizontal frame lines and the vertical frame lines, the intersection points including real intersection points, formed where a horizontal frame line intersects a vertical frame line, and virtual intersection points, formed where the extension line of a horizontal frame line intersects a vertical frame line, where a horizontal frame line intersects the extension line of a vertical frame line, or where the extension line of a horizontal frame line intersects the extension line of a vertical frame line;
setting a frame line index for each intersection point, wherein the frame line index comprises a first element and a second element, the first element recording the serial number of the straight line where the horizontal frame line corresponding to the intersection point is located, and the second element recording the serial number of the straight line where the vertical frame line corresponding to the intersection point is located; the serial numbers of the straight lines where the horizontal frame lines are located increase from top to bottom, and the serial numbers of the straight lines where the vertical frame lines are located increase from left to right;
taking the row end point, column starting point, or column end point of the last found minimum table unit as the row starting point of the currently searched minimum table unit;
increasing the second element of the frame line index of the row starting point one by one while keeping the first element unchanged, until the intersection point corresponding to the frame line index is a real intersection point, and taking that intersection point as the row end point of the currently searched minimum table unit;
increasing the first element of the frame line index of the row starting point one by one while keeping the second element unchanged, until the intersection point corresponding to the frame line index is a real intersection point, and taking that intersection point as the column starting point of the currently searched minimum table unit;
taking, as the column end point of the currently searched minimum table unit, the intersection point whose frame line index has a first element equal to that of the column starting point and a second element equal to that of the row end point, thereby obtaining the currently searched minimum table unit;
and the row starting point of the first minimum table unit is the real intersection point whose frame line index has the smallest first element and second element in the target image.
7. An apparatus for identifying a table structure, comprising:
the input module is used for inputting a target image with a table structure into a pre-trained neural network model to obtain a table line segment in the target image;
the reprocessing module is used for obtaining a table structure according to the table line segments in the target image;
the neural network model is an improved L-CNN model, and the improved L-CNN model is trained by taking a sample image with a table structure as a sample and taking a table line segment in the sample image as a sample label;
the curved table line segments in the sample image are characterized by a broken line formed by a plurality of continuous straight line segments;
the line feature map loss in the improved L-CNN model accounts for a greater proportion of the calculated loss than the point feature map loss;
the reprocessing module includes: a frame line module for defining the table line segment as a horizontal line segment or a vertical line segment according to the inclination angle of the table line segment, and combining the horizontal/vertical line segments meeting the included angle and distance requirements to obtain a horizontal/vertical frame line; and a table module for determining all intersection points of the horizontal frame lines and the vertical frame lines, selecting one intersection point from all the intersection points as a starting point, and searching for the other intersection points that form a minimum table unit with that intersection point, until the minimum table units where all the intersection points are located have been found.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method for identifying a table structure according to any of claims 1 to 6 are implemented by the processor when executing the program.
9. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the table structure identification method according to any one of claims 1 to 6.
CN202010462936.1A 2020-05-27 2020-05-27 Identification method and device of table structure Active CN111797685B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010462936.1A CN111797685B (en) 2020-05-27 2020-05-27 Identification method and device of table structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010462936.1A CN111797685B (en) 2020-05-27 2020-05-27 Identification method and device of table structure

Publications (2)

Publication Number Publication Date
CN111797685A CN111797685A (en) 2020-10-20
CN111797685B true CN111797685B (en) 2022-04-15

Family

ID=72805943

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010462936.1A Active CN111797685B (en) 2020-05-27 2020-05-27 Identification method and device of table structure

Country Status (1)

Country Link
CN (1) CN111797685B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113516103A (en) * 2021-08-07 2021-10-19 山东微明信息技术有限公司 Table image inclination angle determining method based on support vector machine
CN113836878A (en) * 2021-09-02 2021-12-24 北京来也网络科技有限公司 Table generation method and device combining RPA and AI, electronic equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108470021B (en) * 2018-03-26 2022-06-03 阿博茨德(北京)科技有限公司 Method and device for positioning table in PDF document
CN110163198B (en) * 2018-09-27 2022-03-08 腾讯科技(深圳)有限公司 Table identification reconstruction method and device and storage medium
CN110363095B (en) * 2019-06-20 2023-07-04 华南农业大学 Identification method for form fonts
CN110502985B (en) * 2019-07-11 2022-06-07 新华三大数据技术有限公司 Form identification method and device and form identification equipment

Also Published As

Publication number Publication date
CN111797685A (en) 2020-10-20

Similar Documents

Publication Publication Date Title
CN111401371A (en) Text detection and identification method and system and computer equipment
CN111640130A (en) Table reduction method and device
CN111797685B (en) Identification method and device of table structure
CN111460927A (en) Method for extracting structured information of house property certificate image
CN111259854A (en) Method and device for identifying structured information of table in text image
CN107784321A (en) Numeral paints this method for quickly identifying, system and computer-readable recording medium
CN113657404B (en) Image processing method of Dongba pictograph
CN110543895A (en) image classification method based on VGGNet and ResNet
CN114529773A (en) Form identification method, system, terminal and medium based on structural unit
CN111144411A (en) Method and system for correcting and identifying irregular text based on saliency map
CN113111880A (en) Certificate image correction method and device, electronic equipment and storage medium
CN111104941B (en) Image direction correction method and device and electronic equipment
CN113971644A (en) Image identification method and device based on data enhancement strategy selection
CN113283432A (en) Image recognition and character sorting method and equipment
CN111832497B (en) Text detection post-processing method based on geometric features
CN110163256B (en) Automatic examination paper image checking method based on joint probability matrix
CN115984875B (en) Stroke similarity evaluation method and system for hard-tipped pen regular script copy work
CN115471849B (en) Handwritten Chinese character image evaluation method and system
CN117011856A (en) Handwriting skeleton refining method, system, equipment and medium based on deep reinforcement learning
CN113850238B (en) Document detection method and device, electronic equipment and storage medium
CN115937537A (en) Intelligent identification method, device and equipment for target image and storage medium
CN114694159A (en) Engineering drawing BOM identification method and device, electronic equipment and storage medium
CN113743251B (en) Target searching method and device based on weak supervision scene
CN109635798A (en) A kind of information extracting method and device
CN112016419A (en) Intelligent handwritten Chinese character planimetric algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20210316

Address after: 100085 Floor 101 102-1, No. 35 Building, No. 2 Hospital, Xierqi West Road, Haidian District, Beijing

Applicant after: Seashell Housing (Beijing) Technology Co.,Ltd.

Address before: Unit 05, room 112, 1st floor, office building, Nangang Industrial Zone, economic and Technological Development Zone, Binhai New Area, Tianjin 300457

Applicant before: BEIKE TECHNOLOGY Co.,Ltd.

GR01 Patent grant
GR01 Patent grant