CN112712014B - Method, system, device and readable storage medium for parsing table picture structure - Google Patents

Method, system, device and readable storage medium for parsing table picture structure

Info

Publication number
CN112712014B
CN112712014B (application CN202011589490.5A)
Authority
CN
China
Prior art keywords
target
picture
spliced
coordinates
cell
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011589490.5A
Other languages
Chinese (zh)
Other versions
CN112712014A (en)
Inventor
孙杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Health Insurance Company of China Ltd
Original Assignee
Ping An Health Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Health Insurance Company of China Ltd filed Critical Ping An Health Insurance Company of China Ltd
Priority to CN202011589490.5A priority Critical patent/CN112712014B/en
Publication of CN112712014A publication Critical patent/CN112712014A/en
Application granted granted Critical
Publication of CN112712014B publication Critical patent/CN112712014B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/412 Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/177 Editing, e.g. inserting or deleting of tables; using ruled lines
    • G06F40/18 Editing, e.g. inserting or deleting of tables; using ruled lines of spreadsheets
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20112 Image segmentation details
    • G06T2207/20132 Image cropping

Abstract

The invention relates to the field of picture processing and provides a table picture structure analysis method comprising the following steps: acquiring a target picture comprising a target table; detecting a linear structure of the target table in the target picture, and determining the target table structure of the target picture according to the linear structure; determining the position coordinates of the text content of the target picture according to the target table structure, and extracting the text content; creating a target table document according to the target table structure; and filling the text content into the target table document according to the position coordinates of the text content. The invention improves the efficiency and accuracy of converting table pictures into tables such as Excel, improves the efficiency of table picture conversion in related scenes, and reduces processing cost.

Description

Method, system, device and readable storage medium for parsing table picture structure
Technical Field
The embodiment of the invention relates to the field of picture processing, in particular to a method, a system, equipment and a readable storage medium for analyzing a table picture structure.
Background
To ensure the integrity of the contents of a form document, the form document is often converted into a form document picture before transmission. The form document picture may be a fax, a scan, or a mobile phone photograph of a business-related document. The receiver of the form document picture often needs to archive the data in the picture, and the current common practice when doing so is: extract the text content in the form document picture with text OCR technology, and then arrange the table into an electronic document such as Excel by manual supplementary entry. This is very inefficient, and the accuracy is difficult to guarantee. Therefore, how to improve the accuracy of parsing table pictures while improving the efficiency of arranging the tables into electronic documents such as Excel has become a technical problem to be solved.
Disclosure of Invention
In view of this, there is a need to provide a method, a system, a device and a readable storage medium for parsing a table picture structure, so as to solve the problems of low efficiency and low accuracy in converting a table document picture into an electronic document such as Excel.
In order to achieve the above object, an embodiment of the present invention provides a method for parsing a table picture structure, where the method includes:
Acquiring a target picture comprising a target table;
Detecting a linear structure of a target table in the target picture, and determining the target table structure of the target picture according to the linear structure;
Determining the position coordinates of the text content of the target picture according to the target table structure, and extracting the text content;
creating a target table document according to the target table structure; and
filling the text content into the target table document according to the position coordinates of the text content.
Illustratively, the step of obtaining a target picture including a target table includes:
Acquiring a picture to be processed comprising a table;
Preprocessing the picture to be processed to obtain a picture to be cut;
Detecting a form area of the picture to be cut; and
cutting the picture to be cut according to the table area of the picture to be cut to obtain the target picture.
Illustratively, the preprocessing operation includes an angle correction operation, a size scaling operation, and a pixel value normalization operation.
Illustratively, the linear structure includes a plurality of cell line segments, each cell line segment including two endpoint coordinates; the step of detecting the linear structure of the target table in the target picture and determining the target table structure of the target picture according to the linear structure comprises the following steps:
detecting the intersection points of the table lines of the target table to obtain a plurality of intersection points of the table lines;
Extracting a cell line segment according to the intersection point of every two adjacent table lines to obtain a plurality of cell line segments;
Defining coordinates of grid line intersection points at two ends of each cell line segment according to the position of each cell line segment to obtain two endpoint coordinates of each cell line segment; and
constructing the target table structure according to the plurality of cell line segments and the coordinates of two endpoints of each cell line segment.
Illustratively, the constructing the target table structure according to the plurality of cell line segments and the coordinates of two endpoints of each cell line segment includes:
splicing the plurality of unit grid line sections into a plurality of spliced line sections according to the coordinate position sequence of each endpoint coordinate, wherein the plurality of spliced line sections comprise a plurality of spliced transverse line sections and a plurality of spliced longitudinal line sections; and
determining the target table structure according to the plurality of spliced horizontal line segments and the plurality of spliced vertical line segments.
Illustratively, the determining the target table structure according to the plurality of spliced transverse line segments and the plurality of spliced longitudinal line segments includes:
constructing an initial table according to the plurality of spliced transverse line segments and the plurality of spliced longitudinal line segments, and acquiring a plurality of intersection coordinates in the initial table;
constructing a target intersection point graph matrix according to the intersection point coordinates; and
determining a plurality of target cells according to the target cross point diagram matrix, and determining the target table structure according to the target cells.
The text content comprises a plurality of cell text contents, and each cell text content corresponds to one cell position coordinate;
The step of filling the text content into the target table document according to the position coordinates of the text content comprises the following steps:
Matching corresponding target cells for the text content of each cell according to the position coordinates of each cell; and
filling the text content of each cell into the corresponding target cell.
In order to achieve the above object, an embodiment of the present invention further provides a system for parsing a table picture structure, including:
the acquisition module is used for acquiring the picture to be processed comprising the form;
The detection module is used for detecting the linear structure of the target table in the target picture and determining the target table structure of the target picture according to the linear structure;
the determining module is used for determining the position coordinates of the text content of the target picture according to the target table structure and extracting the text content;
The creation module is used for creating a target table document according to the target table structure; and
the filling module is used for filling the text content into the target form document according to the position coordinates of the text content.
To achieve the above object, an embodiment of the present invention further provides a computer device, where the computer device includes a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the computer program when executed by the processor implements the steps of the table picture structure parsing method as described above.
To achieve the above object, an embodiment of the present invention further provides a computer readable storage medium having a computer program stored therein, the computer program being executable by at least one processor to cause the at least one processor to perform the steps of the table picture structure parsing method as described above.
According to the table picture structure analysis method, system, computer device and computer-readable storage medium, the intersection points of the table lines of the target table are detected, the target table structure is constructed from the detected linear structure, and the table structure is accurately restored through the target table structure, which improves the efficiency and accuracy of converting table pictures into tables such as Excel; the text content is filled into the target table document according to the position coordinates of the text content, which improves the table picture conversion efficiency in related scenes and reduces processing cost.
Drawings
FIG. 1 is a flow chart of a method for resolving a table picture structure according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a picture to be processed according to a first embodiment of the invention;
FIG. 3 is a schematic diagram of intersecting points of a plurality of grid lines according to a first embodiment of the present invention;
FIG. 4 is a schematic diagram of a plurality of spliced line segments according to a first embodiment of the present invention;
FIG. 5 is a schematic diagram of an initial table in accordance with a first embodiment of the present invention;
FIG. 6 is a schematic diagram showing a sequence of estimating a plurality of cells according to a first embodiment of the present invention;
FIG. 7 is a diagram of a target table structure in accordance with a first embodiment of the present invention;
FIG. 8 is a schematic diagram showing cell merging according to a first embodiment of the present invention;
FIG. 9 is a diagram of a target table document according to a first embodiment of the present invention;
FIG. 10 is a schematic diagram illustrating a program module of a second embodiment of a table-picture structure parsing system according to the present invention;
fig. 11 is a schematic diagram of a hardware structure of a third embodiment of the computer device of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that the descriptions of "first", "second", etc. in this disclosure are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implying the number of technical features indicated. Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with each other, provided that the combination can be realized by those skilled in the art; when the combined technical solutions are contradictory or cannot be realized, the combination should be considered absent and not within the scope of protection claimed in the present invention.
In the following embodiments, an exemplary description will be made with the computer device 2 as an execution subject.
Example 1
Referring to fig. 1, a flowchart illustrating steps of a table picture structure parsing method according to an embodiment of the invention is shown. It will be appreciated that the flow charts in the method embodiments are not intended to limit the order in which the steps are performed. An exemplary description will be made below with the computer device 2 as an execution subject. Specifically, the following is described.
Step S100, obtaining a target picture including a target table.
The target picture may be any picture containing a table.
In an actual business scenario, the form document may be converted from a document format such as Excel into a picture format. For example, the picture containing the table may be a fax, a scan, or a mobile phone photograph of a business-related document.
In an exemplary embodiment, the step S100 may further include steps S200 to S204, wherein:
Step S200, obtaining a to-be-processed picture including a table.
The target picture may also be a picture cut out of the pictures to be processed, which are obtained in advance by the computer device 2.
When the computer device 2 receives the picture to be processed, it may first detect the picture to be processed to determine whether a table exists in it. If a table exists in the picture to be processed, a target picture comprising the table is cropped from the picture to be processed.
The picture to be processed may also be a picture having a plurality of tables, for example. When the computer device 2 detects that the to-be-processed picture has a plurality of tables, the computer device 2 may intercept a corresponding target picture according to each table to obtain a plurality of target pictures.
Step S202, preprocessing the picture to be processed to obtain the picture to be cut.
In an exemplary embodiment, the preprocessing operation includes an angle correction operation, a size scaling operation, and a pixel value normalization operation.
The picture to be processed may be a manually taken photograph; due to factors such as the shooting angle and lighting, the original picture may be skewed and the text content in the cropped target picture may be unclear.
In order to avoid the above problems, the computer device 2 may preprocess the picture to be processed after receiving it, for example by performing angle correction, size scaling, pixel value normalization and similar operations, to obtain the picture to be cut (i.e., the processed picture to be processed).
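For illustration only, a minimal Python/OpenCV sketch of such a preprocessing step is shown below; the skew angle and working width are assumed inputs, not values specified by this embodiment.

```python
import cv2
import numpy as np

def preprocess(image: np.ndarray, skew_angle: float, target_width: int = 1024) -> np.ndarray:
    """Angle correction, size scaling and pixel value normalization (illustrative sketch)."""
    # Angle correction: rotate the picture back by the detected skew angle.
    h, w = image.shape[:2]
    rotation = cv2.getRotationMatrix2D((w / 2, h / 2), skew_angle, 1.0)
    deskewed = cv2.warpAffine(image, rotation, (w, h), borderValue=(255, 255, 255))

    # Size scaling: resize to a fixed working width while keeping the aspect ratio.
    scale = target_width / w
    resized = cv2.resize(deskewed, (target_width, int(h * scale)))

    # Pixel value normalization: map pixel values into [0, 1] for the later detection models.
    return resized.astype(np.float32) / 255.0
```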
Step S204, detecting the table area of the picture to be cut.
The computer device 2 may detect the table area where the table of the picture to be cut is located. This step may be implemented by a table area detection algorithm, for example the YOLO algorithm, the CascadeTabNet algorithm, and the like.
Step S206, clipping the picture to be clipped according to the table area of the picture to be clipped to obtain the target picture.
It can be understood that the tables of some pictures occupy only a small part of the pictures, and in order to improve the accuracy of the later picture detection, the computer device 2 may also crop the picture to be cropped according to the table area, so as to obtain the target picture only including the target table.
If the to-be-cut picture includes a plurality of tables, the computer device 2 may further cut the to-be-cut picture according to the table areas of the tables, so as to obtain a plurality of target pictures.
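Assuming the table area detector returns axis-aligned bounding boxes in (x1, y1, x2, y2) form (an assumption about its output format), the cropping step could look roughly as follows; each detected table yields one target picture.

```python
from typing import List, Tuple
import numpy as np

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2), assumed detector output format

def crop_target_pictures(picture: np.ndarray, table_boxes: List[Box]) -> List[np.ndarray]:
    """Cut one target picture out of the picture to be cut for every detected table area."""
    return [picture[y1:y2, x1:x2].copy() for (x1, y1, x2, y2) in table_boxes]
```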
In this embodiment, the computer device 2 performs the preprocessing operation and the cropping operation on the to-be-processed picture, so that the accuracy of picture detection is improved, and the efficiency of picture detection is further improved.
As shown in fig. 2, for better understanding, a schematic diagram of the picture to be processed is also provided in the present embodiment.
Step S102, detecting a linear structure of a target table in the target picture, and determining the target table structure of the target picture according to the linear structure.
The computer device 2 may input the target picture into a line detection model, and determine the target table structure of the target picture from the linear structure detected by the line detection model for the target picture. The line detection result may include a plurality of cell line segments, each cell line segment including two endpoint coordinates.
In an exemplary embodiment, the linear structure includes a plurality of cell line segments, each cell line segment including two endpoint coordinates. The cell line segments are line segments of table cells of a target table in the target picture.
The step S102 may further include steps S300 to S306, where:
Step S300, detecting the intersection points of the table lines of the target table to obtain a plurality of intersection points of the table lines.
The line detection model may be, for example, an L-CNN deep learning model, in which high-dimensional image features are extracted with HourglassNet (a convolutional neural network structure) as the backbone.
Specifically, the computer device 2 may input the target picture into the line detection model, so as to extract suspected table line intersections in the target picture through an intersection detection link preset by the line detection model, so as to obtain the plurality of table line intersections.
As shown in fig. 3, for better understanding, the present embodiment further provides a schematic diagram of intersection points of a plurality of grid lines of the target picture.
Step S302, extracting a cell line segment according to the intersection point of every two adjacent table lines to obtain the plurality of cell line segments.
The computer device 2 may extract from each two adjacent table line intersections one cell line segment by means of the line detection model to obtain a plurality of cell line segments. After obtaining the plurality of cell line segments, the line detection model may output the plurality of table line intersections and the plurality of cell line segments to the computer device 2.
And step S304, defining coordinates of intersection points of table grid lines at two ends of each cell line segment according to the position of each cell line segment so as to obtain coordinates of two end points of each cell line segment.
After receiving the intersection points of the table grid lines and the plurality of cell line segments output by the line detection model, the computer device 2 may define coordinates of the intersection points of the table grid lines at two ends of each cell line segment according to the position of each cell line segment, so as to obtain coordinates of two end points of each cell line segment.
Step S306, constructing the target table structure according to the plurality of cell line segments and the coordinates of two end points of each cell line segment.
For example, after the computer device 2 obtains the plurality of cell line segments and the two endpoint coordinates of each cell line segment, the target table structure may be constructed according to the plurality of cell line segments and the two endpoint coordinates of each cell line segment. In this implementation, the L-CNN deep learning model is used to perform line detection on the target table of the target picture, which avoids the problems of difficult annotation, complex processing, and weak learning and expression capability on complex cases found in existing table line extraction methods (such as table line extraction methods based on image-segmentation deep learning models).
When line detection is performed on the target table of the target picture through the L-CNN deep learning model, only the line segment endpoints need to be annotated, without pixel-level annotation; the L-CNN deep learning model can also detect the intersection points in the table and judge whether they form a table boundary line, so the learning and expression capability of the model is stronger. In addition, the output of the L-CNN deep learning model (the table line intersection points and the cell line segments) is an accurate representation of the endpoints of the table lines of the target table in the target picture and can be used directly as the input of the next step.
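The data this step works with can be pictured as follows. This is only a sketch of the assumed model output (intersection points plus index pairs naming which two intersections bound each cell line segment); `segment_index_pairs` is a hypothetical name, not part of any real L-CNN API.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]

@dataclass
class CellSegment:
    p1: Point  # table-line intersection point at one end of the cell line segment
    p2: Point  # table-line intersection point at the other end

def to_cell_segments(junctions: List[Point],
                     segment_index_pairs: List[Tuple[int, int]]) -> List[CellSegment]:
    """Define the two endpoint coordinates of each cell line segment from the detected intersections."""
    return [CellSegment(junctions[i], junctions[j]) for i, j in segment_index_pairs]
```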
The step S306 may further include steps S400 to S402, where:
and step S400, splicing the plurality of unit grid line sections into a plurality of spliced line sections according to the coordinate position sequence of the coordinates of each endpoint, wherein the plurality of spliced line sections comprise a plurality of spliced transverse line sections and a plurality of spliced longitudinal line sections.
After the computer device 2 obtains the plurality of cell line segments and the two endpoint coordinates of each cell line segment, it may merge the plurality of cell line segments according to the coordinate order of their endpoint coordinates, so that consecutive cell line segments are merged into one spliced line segment. A spliced line segment is a complete long table line of the target table.
Illustratively, the cell line segments include cell horizontal line segments and cell vertical line segments; several consecutive cell horizontal line segments may be combined into one spliced transverse line segment, and several consecutive cell vertical line segments may be combined into one spliced longitudinal line segment.
For example, the computer device 2 may further determine whether any misjudged line segments exist among the plurality of spliced line segments, and if so, the computer device 2 may eliminate the misjudged line segments.
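A minimal sketch of the horizontal splicing (the vertical case is symmetric), under the assumption that the segments are axis-aligned after preprocessing: cell segments are grouped by their row coordinate, sorted by position, and consecutive segments whose ends meet (within a small tolerance, an assumed value) are merged into one spliced transverse line segment.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]

def splice_horizontal(cell_segments: List[Segment], tol: float = 2.0) -> List[Segment]:
    """Merge consecutive horizontal cell line segments into spliced transverse line segments."""
    rows: Dict[float, List[Segment]] = defaultdict(list)
    for (x1, y1), (x2, y2) in cell_segments:
        if abs(y1 - y2) <= tol:                          # keep only (nearly) horizontal segments
            y = round((y1 + y2) / 2)
            rows[y].append(((min(x1, x2), y), (max(x1, x2), y)))

    spliced: List[Segment] = []
    for y, segs in rows.items():
        segs.sort(key=lambda s: s[0][0])                 # left to right along the row
        current = segs[0]
        for seg in segs[1:]:
            if seg[0][0] - current[1][0] <= tol:         # ends meet: extend the current spliced segment
                current = (current[0], (max(current[1][0], seg[1][0]), y))
            else:                                        # gap: close the current segment, start a new one
                spliced.append(current)
                current = seg
        spliced.append(current)
    return spliced
```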
Step S402, determining the target table structure according to the plurality of spliced horizontal line segments and the plurality of spliced vertical line segments.
For example, after the computer device 2 obtains the plurality of stitching lateral line segments and the plurality of stitching longitudinal line segments, the target table structure may also be determined according to the plurality of stitching lateral line segments and the plurality of stitching longitudinal line segments.
In this embodiment, the computer device 2 may determine the target table structure according to the plurality of spliced line segments by splicing the plurality of unit grid line segments into the plurality of spliced line segments, so as to improve accuracy of constructing the target table structure and efficiency of constructing the target table structure.
As shown in fig. 4, for better understanding, the present solution also provides a schematic diagram of a plurality of spliced line segments of the target picture.
In some embodiments, to ensure the integrity of the target table, the computer device 2 may also lengthen or supplement the four boundary lines of the initial table so that they form a closed line frame. This prevents the outer-frame boundary lines from being lost because of scanning quality or shooting conditions, and improves the completeness of the table analysis.
The step S402 may further include steps S500 to S504, where:
And S500, constructing an initial table according to the plurality of spliced transverse line segments and the plurality of spliced longitudinal line segments, and acquiring a plurality of intersection coordinates in the initial table.
The computer device 2 constructs an initial table according to the plurality of spliced transverse line segments and the plurality of spliced longitudinal line segments, and extracts a plurality of intersection coordinates in the initial table.
The target table may also be an irregular table, for example.
Taking the irregular table as an example, where the line segments are represented by endpoint coordinates:
the plurality of spliced transverse line segments are:
[((10,10),(60,10)), ((20,20),(60,20)), ((10,30),(30,30)), ((50,30),(60,30)), ((10,40),(50,40)), ((10,50),(60,50))];
the plurality of spliced longitudinal line segments are:
[((10,10),(10,50)), ((20,10),(20,30)), ((30,20),(30,40)), ((40,10),(40,20)), ((50,20),(50,50)), ((60,10),(60,50))].
The table structure of the initial table constructed according to the above spliced transverse line segments and spliced longitudinal line segments may be as shown in fig. 5, where the upper-left point of fig. 5 is taken as the origin and the coordinates of the plurality of intersection points are as shown in fig. 5.
And step S502, constructing a target intersection point diagram matrix according to the plurality of intersection point coordinates.
After obtaining the plurality of intersection coordinates, the computer device 2 may construct the target intersection point graph matrix M according to the plurality of intersection coordinates:
M(i, j) = Intersection(li, lj)
In the target intersection point graph matrix M, i and j are the serial numbers of the spliced transverse line segments and the spliced longitudinal line segments respectively, and Intersection(l1, l2) represents the coordinate value of the intersection point of the two straight lines l1 and l2.
Taking the example table as an example, "×" indicates no intersection points, and the intersection point diagram matrix is shown in table 1.
      A        B        C        D        E        F
a   (10,10)  (20,10)    ×      (40,10)    ×      (60,10)
b     ×      (20,20)  (30,20)  (40,20)  (50,20)  (60,20)
c     ×        ×        ×        ×      (50,30)  (60,30)
d   (10,30)  (20,30)  (30,30)    ×        ×        ×
e   (10,40)    ×      (30,40)    ×      (50,40)    ×
f   (10,50)    ×        ×        ×      (50,50)  (60,50)
TABLE 1
The letters A to F denote the serial numbers of the spliced longitudinal line segments, and the letters a to f denote the serial numbers of the spliced transverse line segments. The spliced transverse line segments c and d lie at the same vertical position (they are collinear), and the numbering order of c and d does not affect the final result of the method.
In some embodiments, to ensure the integrity of the target table, the computer device 2 may also lengthen or supplement the four boundary lines of the initial table so that they form a closed line frame. For example, each line segment may be extended slightly so that segments which should meet do intersect, and the corresponding intersection coordinates are obtained; that is, the corresponding "×" positions in Table 1 are filled with intersection coordinates.
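As a sketch under the same axis-aligned assumption as above, the target intersection point graph matrix can be built by testing every spliced transverse segment against every spliced longitudinal segment; None plays the role of the "×" entries in Table 1, and the optional `extend` margin mimics the slight lengthening of segments described above.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]

def intersection(h: Segment, v: Segment, extend: float = 0.0) -> Optional[Point]:
    """Intersection of a spliced transverse segment h and a spliced longitudinal segment v, if any."""
    (hx1, hy), (hx2, _) = h          # horizontal segment: constant y
    (vx, vy1), (_, vy2) = v          # vertical segment: constant x
    if hx1 - extend <= vx <= hx2 + extend and vy1 - extend <= hy <= vy2 + extend:
        return (vx, hy)
    return None

def build_intersection_matrix(h_segs: List[Segment],
                              v_segs: List[Segment]) -> List[List[Optional[Point]]]:
    """M[i][j] = Intersection(h_i, v_j), i.e. the matrix shown in Table 1."""
    return [[intersection(h, v) for v in v_segs] for h in h_segs]
```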
Step S504, determining a plurality of target cells according to the target intersecting point map matrix, and determining the target table structure according to the plurality of target cells.
The computer device 2 may calculate a plurality of target cells based on the target intersection point graph matrix M. The calculation may start from the upper left corner of Table 1 and, following a dynamic programming idea, search for existing intersection points that in turn form the upper-left, upper-right, lower-right and lower-left corners of a target cell. If all four corner points can be found, the rectangle with the smallest area is taken as a valid cell, and the position of the target cell can be represented by its four corner coordinates.
For example, the order of deriving the plurality of target cells may be as shown in fig. 6, where numbers ① to ⑨ in fig. 6 are the target cell detection orders.
Recording the detected target cells with four corner coordinates, the following target cells can be obtained, wherein the target cell set can be expressed as:
boxes=[((10,10),(20,10),(20,30),(10,30)), ((20,10),(40,10),(40,20),(20,20)), ((40,10),(60,10),(60,20),(40,20)), ((20,20),(30,20),(30,30),(20,30)),((30,20),(50,20),(50,40),(30,40)), ((50,20),(60,20),(60,30),(50,30)), ((10,30),(30,30),(30,40),(10,40)), ((50,30),(60,30),(60,50),(50,50)),((10,40),(50,40),(50,50),(10,50))]
the target cell coordinates described above can be used to obtain the target table structure shown in fig. 7.
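A sketch of this cell search: for each existing intersection taken as a candidate upper-left corner, scan rightwards for the nearest column giving an upper-right corner and downwards for the nearest row that closes the rectangle. Applied to the matrix of Table 1, this reproduces the nine target cells listed above; it is an illustrative reading of the search, not the exact patented procedure.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Matrix = List[List[Optional[Point]]]          # output of build_intersection_matrix above
Cell = Tuple[Point, Point, Point, Point]      # upper-left, upper-right, lower-right, lower-left

def find_cells(m: Matrix) -> List[Cell]:
    """Search the intersection point graph matrix for the smallest closed rectangles (target cells)."""
    n_rows, n_cols = len(m), len(m[0])
    cells: List[Cell] = []
    for i in range(n_rows):
        for j in range(n_cols):
            if m[i][j] is None:
                continue                                   # no candidate upper-left corner here
            closed = False
            for jj in range(j + 1, n_cols):                # nearest column with an upper-right corner
                if m[i][jj] is None:
                    continue
                for ii in range(i + 1, n_rows):            # nearest row that closes all four corners
                    if m[ii][jj] is not None and m[ii][j] is not None:
                        cells.append((m[i][j], m[i][jj], m[ii][jj], m[ii][j]))
                        closed = True
                        break
                if closed:
                    break
    return cells
```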
According to this embodiment, the target cells can be searched for through the target intersection point graph matrix, and the restoration of the target table structure can be achieved on the basis of complete line detection; the algorithm is simple and of low complexity, which improves the efficiency of parsing the table structure.
Step S104, determining the position coordinates of the text content of the target picture according to the target table structure, and extracting the text content.
Character recognition is performed on the target picture, and the recognized character information is labelled according to the target table structure so as to determine its actual position. The table picture structure analysis system can locate the coordinates of each text slice in the table picture with a general OCR algorithm and recognize its text content; the position of a text slice can be represented by its center coordinates or corner coordinates.
Step S106, creating a target table document according to the target table structure.
And step S108, filling the text content into the target table document according to the position coordinates of the text content.
The text content comprises a plurality of cell text contents, and each cell text content corresponds to one cell position coordinate;
the step S108 may further include steps S600 to S602, where: step S600, matching corresponding target cells for the text content of each cell according to the position coordinates of each cell; and step S602, filling the text content of each cell into the corresponding target cell.
The computer device 2 may match the text content of each cell with a target cell, and fill the text information into the target table document according to the actual position of the text information.
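A sketch of this matching step, assuming the OCR engine returns each text slice together with its center coordinates: a slice is assigned to the target cell whose rectangle contains that center, and slices falling into the same cell are concatenated.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Cell = Tuple[Point, Point, Point, Point]      # upper-left, upper-right, lower-right, lower-left

def match_text_to_cells(cells: List[Cell],
                        ocr_results: List[Tuple[Point, str]]) -> Dict[int, str]:
    """Map each target cell index to the text whose slice centers fall inside that cell."""
    filled: Dict[int, str] = {}
    for (cx, cy), text in ocr_results:
        for idx, (ul, _ur, lr, _ll) in enumerate(cells):
            if ul[0] <= cx <= lr[0] and ul[1] <= cy <= lr[1]:
                filled[idx] = (filled.get(idx, "") + " " + text).strip()
                break
    return filled
```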
Illustratively, the horizontal and vertical lines are first combined in rows and columns, and the line segments that are nearly collinear within a certain range should be combined into a table line of the same row (column).
In this embodiment, the spliced transverse line segment c and the spliced transverse line segment d are determined to be collinear, and thus both are the lower edge lines of the second row of target cells. Based on the number of merged horizontal lines N and vertical lines M, a table range of (N-1) × (M-1) cells is first determined in Excel. The crossed row and column range is obtained from the peripheral edge lines of each target cell, and the corresponding regions are merged. The target cell in fig. 8 spans spliced transverse line segments 2 to 4 and spliced longitudinal line segments 3 to 5, so the target cell is found to span rows 2 to 3 and columns 3 to 4. The target cell region merging can be completed with the write_merge() function of xlwt. Finally, an Excel file is generated; taking fig. 2 as an example, the result is shown in fig. 9.
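A sketch of generating the Excel file with xlwt, assuming the 0-based inclusive row/column span of each target cell has already been derived from the merged line numbering as described above; write_merge() merges the region and writes the matched cell text into it.

```python
import xlwt

def write_table(cells, output_path="table.xls"):
    """cells: iterable of (row1, row2, col1, col2, text), 0-based inclusive spans (assumed precomputed)."""
    workbook = xlwt.Workbook()
    sheet = workbook.add_sheet("Sheet1")
    for row1, row2, col1, col2, text in cells:
        if row1 == row2 and col1 == col2:
            sheet.write(row1, col1, text)                   # ordinary single cell
        else:
            sheet.write_merge(row1, row2, col1, col2, text) # merged target cell region
    workbook.save(output_path)

# Example: one ordinary cell and one cell spanning two rows and two columns.
write_table([(0, 0, 0, 0, "header"), (1, 2, 2, 3, "merged cell")])
```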
According to the line-based structure reasoning method provided by the embodiment of the invention, the search for cells is realized by establishing an intersection point graph matrix of the transverse and longitudinal lines, and the restoration of an arbitrary table structure can be achieved on the basis of complete line detection with a simple, low-complexity algorithm. Combined with a general character OCR technique, the method can match the characters in the table picture with the cells, and the finally generated electronic table document accurately restores both the table content and the table structure. This improves the efficiency of digitizing table pictures in related scenes, increases the degree of business automation of enterprises, and reduces manual processing costs.
The table picture structure parsing method further includes: uploading the target table structure and the target table document into a blockchain.
Illustratively, uploading the target table structure and the target table document to a blockchain may ensure its security and fair transparency. The blockchain referred to in this example is a novel mode of application for computer technology such as distributed data storage, point-to-point transmission, consensus mechanisms, encryption algorithms, and the like. The blockchain (Blockchain), essentially a de-centralized database, is a string of data blocks that are generated in association using cryptographic methods, each of which contains information from a batch of network transactions for verifying the validity (anti-counterfeit) of its information and generating the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
Example two
Fig. 10 is a schematic diagram of a program module of a table-picture structure parsing system according to a second embodiment of the invention. The form picture structure parsing system 20 may include or be partitioned into one or more program modules stored in a storage medium and executed by one or more processors to accomplish the present invention and to implement the form picture structure parsing method described above. Program modules in accordance with the embodiments of the present invention are directed to a series of computer program instruction segments capable of performing particular functions, and are more suitable than programs themselves for describing the execution of the form picture structure parsing system 20 in a storage medium. The following description will specifically describe functions of each program module of the present embodiment:
the obtaining module 200 is configured to obtain a to-be-processed picture including a table.
Illustratively, the acquiring module 200 is further configured to: acquiring a picture to be processed comprising a table; preprocessing the picture to be processed to obtain a picture to be cut; detecting a form area of the picture to be cut; and cutting the picture to be cut according to the table area of the picture to be cut to obtain the target picture.
Illustratively, the preprocessing operation includes an angle correction operation, a size scaling operation, and a pixel value normalization operation.
The detection module 202 is configured to detect a linear structure of a target table in the target picture, and determine a target table structure of the target picture according to the linear structure.
Illustratively, the linear structure includes a plurality of cell line segments, each cell line segment including two endpoint coordinates; the detection module 202 is further configured to: detecting the intersection points of the table lines of the target table to obtain a plurality of intersection points of the table lines; extracting a cell line segment according to the intersection point of every two adjacent table lines to obtain a plurality of cell line segments; defining coordinates of grid line intersection points at two ends of each cell line segment according to the position of each cell line segment to obtain two endpoint coordinates of each cell line segment; and constructing the target table structure according to the plurality of cell line segments and the coordinates of two endpoints of each cell line segment.
Illustratively, the detection module 202 is further configured to: splicing the plurality of unit grid line sections into a plurality of spliced line sections according to the coordinate position sequence of each endpoint coordinate, wherein the plurality of spliced line sections comprise a plurality of spliced transverse line sections and a plurality of spliced longitudinal line sections; and determining the target table structure according to the plurality of spliced horizontal line segments and the plurality of spliced vertical line segments.
Illustratively, the detection module 202 is further configured to: Constructing an initial table according to the plurality of spliced transverse line segments and the plurality of spliced longitudinal line segments, and acquiring a plurality of intersection coordinates in the initial table; constructing a target intersection point graph matrix according to the intersection point coordinates; and determining a plurality of target cells according to the target cross point diagram matrix, and determining the target table structure according to the plurality of target cells.
The text content comprises a plurality of cell text contents, and one cell position coordinate corresponding to each cell text content; the detection module 202 is further configured to: matching corresponding target cells for the text content of each cell according to the position coordinates of each cell; and filling the text content of each cell into the corresponding target cell.
The determining module 204 is configured to determine a position coordinate of the text content of the target picture according to the target table structure, and extract the text content.
A creating module 206, configured to create a target table document according to the target table structure.
And a filling module 208, configured to fill the text content into the target table document according to the position coordinates of the text content.
Illustratively, the table picture structure parsing system 20 further includes an uploading module, configured to upload the target table structure and the target table document into a blockchain.
Example III
Referring to fig. 11, a hardware architecture diagram of a computer device according to a third embodiment of the invention is shown. In this embodiment, the computer device 2 is a device capable of automatically performing numerical calculation and/or information processing according to a preset or stored instruction. The computer device 2 may be a rack server, a blade server, a tower server, or a cabinet server (including a stand-alone server, or a server cluster made up of multiple servers), or the like. As shown, the computer device 2 includes, but is not limited to, a memory 21, a processor 22, a network interface 23, and a table picture structure parsing system 20 communicatively coupled to each other via a system bus.
In this embodiment, the memory 21 includes at least one type of computer-readable storage medium including flash memory, a hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 21 may be an internal storage unit of the computer device 2, such as a hard disk or a memory of the computer device 2. In other embodiments, the memory 21 may also be an external storage device of the computer device 2, such as a plug-in hard disk provided on the computer device 2, a smart memory card (SMART MEDIA CARD, SMC), a Secure Digital (SD) card, a flash memory card (FLASH CARD), or the like. Of course, the memory 21 may also include both internal storage units of the computer device 2 and external storage devices. In this embodiment, the memory 21 is generally used for storing an operating system and various application software installed on the computer device 2, for example, program codes of the table picture structure analysis system 20 of the second embodiment. Further, the memory 21 may be used to temporarily store various types of data that have been output or are to be output.
Processor 22 may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor 22 is typically used to control the overall operation of the computer device 2. In this embodiment, the processor 22 is configured to execute the program code or process data stored in the memory 21, for example, execute the table picture structure parsing system 20, so as to implement the table picture structure parsing method of the first embodiment.
The network interface 23 may comprise a wireless network interface or a wired network interface, and the network interface 23 is typically used for establishing a communication connection between the computer device 2 and other electronic devices. For example, the network interface 23 is used to connect the computer device 2 to an external terminal through a network, and to establish a data transmission channel and a communication connection between the computer device 2 and the external terminal. The network may be an Intranet, the Internet, the Global System for Mobile communication (GSM), Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth, Wi-Fi, or another wireless or wired network.
It is noted that fig. 11 shows only the computer device 2 with components 20-23, but it is understood that not all of the illustrated components are required to be implemented, and that more or fewer components may be implemented instead.
In the present embodiment, the table picture structure parsing system 20 stored in the memory 21 may be further divided into one or more program modules, which are stored in the memory 21 and executed by one or more processors (the processor 22 in the present embodiment) to complete the present invention.
For example, fig. 10 shows a schematic diagram of the program modules implementing the form picture structure parsing system 20 according to the second embodiment of the present invention, where the form picture structure parsing system 20 may be divided into an obtaining module 200, a detecting module 202, a determining module 204, a creating module 206 and a filling module 208. A program module in the present invention is understood to be a series of computer program instruction segments capable of performing a specific function, and is more suitable than a program itself for describing the execution of the table picture structure parsing system 20 in the computer device 2. The specific functions of the program modules 200-208 are described in detail in the second embodiment and are not repeated here.
Example IV
The present embodiment also provides a computer-readable storage medium such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, an App application store, etc., on which a computer program is stored, which when executed by a processor, performs the corresponding functions. The computer readable storage medium of the present embodiment is used in the table picture structure parsing system 20, and when executed by a processor, implements the table picture structure parsing method of the first embodiment.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment.
The foregoing description is only of the preferred embodiments of the present invention, and is not intended to limit the scope of the invention, but rather is intended to cover any equivalents of the structures or equivalent processes disclosed herein or in the alternative, which may be employed directly or indirectly in other related arts.

Claims (7)

1. A method for parsing a table picture structure, the method comprising:
Acquiring a target picture comprising a target table;
Detecting a linear structure of a target table in the target picture, and determining the target table structure of the target picture according to the linear structure;
Determining the position coordinates of the text content of the target picture according to the target table structure, and extracting the text content;
creating a target table document according to the target table structure; and
Filling the text content into the target table document according to the position coordinates of the text content;
The linear structure comprises a plurality of unit grid line segments, wherein each unit grid line segment comprises two endpoint coordinates; the step of detecting the linear structure of the target table in the target picture and determining the target table structure of the target picture according to the linear structure comprises the following steps:
detecting the intersection points of the table lines of the target table to obtain a plurality of intersection points of the table lines;
Extracting a cell line segment according to the intersection point of every two adjacent table lines to obtain a plurality of cell line segments;
Defining coordinates of grid line intersection points at two ends of each cell line segment according to the position of each cell line segment to obtain two endpoint coordinates of each cell line segment; and
Constructing the target table structure according to the plurality of cell line segments and the coordinates of two endpoints of each cell line segment;
wherein the constructing the target table structure according to the plurality of cell line segments and coordinates of two end points of each cell line segment includes:
splicing the plurality of unit grid line sections into a plurality of spliced line sections according to the coordinate position sequence of each endpoint coordinate, wherein the plurality of spliced line sections comprise a plurality of spliced transverse line sections and a plurality of spliced longitudinal line sections;
Determining the target table structure according to the plurality of spliced transverse line segments and the plurality of spliced longitudinal line segments;
Wherein the determining the target table structure according to the plurality of spliced horizontal line segments and the plurality of spliced vertical line segments includes:
Constructing an initial table according to the plurality of spliced transverse line segments and the plurality of spliced longitudinal line segments, and acquiring a plurality of intersection coordinates in the initial table; extending or supplementing four boundary lines of the initial form to form a closed line frame;
constructing a target intersection point graph matrix according to the intersection point coordinates; and
determining a plurality of target cells according to the target cross point diagram matrix, and determining the target table structure according to the target cells.
2. The method of claim 1, wherein the step of obtaining a target picture including a target table comprises:
Acquiring a picture to be processed comprising a table;
Preprocessing the picture to be processed to obtain a picture to be cut;
Detecting a form area of the picture to be cut; and
cutting the picture to be cut according to the table area of the picture to be cut to obtain the target picture.
3. The tabular picture structure parsing method of claim 2, wherein the preprocessing operation includes an angle correction operation, a size scaling operation, and a pixel value normalization operation.
4. The method for parsing a table picture structure according to claim 1, wherein the text contents include a plurality of cell text contents, and each cell text content corresponds to a cell position coordinate;
The step of filling the text content into the target table document according to the position coordinates of the text content comprises the following steps:
Matching corresponding target cells for the text content of each cell according to the position coordinates of each cell; and
filling the text content of each cell into the corresponding target cell.
5. A form picture structure parsing system, comprising:
the acquisition module is used for acquiring the picture to be processed comprising the form;
The detection module is used for detecting the linear structure of the target table in the target picture and determining the target table structure of the target picture according to the linear structure;
the determining module is used for determining the position coordinates of the text content of the target picture according to the target table structure and extracting the text content;
The creation module is used for creating a target table document according to the target table structure; and
The filling module is used for filling the text content into the target form document according to the position coordinates of the text content;
The linear structure comprises a plurality of unit grid line segments, wherein each unit grid line segment comprises two endpoint coordinates; the detection module is used for:
detecting the intersection points of the table lines of the target table to obtain a plurality of intersection points of the table lines;
Extracting a cell line segment according to the intersection point of every two adjacent table lines to obtain a plurality of cell line segments;
defining coordinates of grid line intersection points at two ends of each cell line segment according to the position of each cell line segment to obtain two endpoint coordinates of each cell line segment;
splicing the plurality of unit grid line sections into a plurality of spliced line sections according to the coordinate position sequence of each endpoint coordinate, wherein the plurality of spliced line sections comprise a plurality of spliced transverse line sections and a plurality of spliced longitudinal line sections;
Constructing an initial table according to the plurality of spliced transverse line segments and the plurality of spliced longitudinal line segments, and acquiring a plurality of intersection coordinates in the initial table; extending or supplementing four boundary lines of the initial form to form a closed line frame;
constructing a target intersection point graph matrix according to the intersection point coordinates; and
determining a plurality of target cells according to the target cross point diagram matrix, and determining the target table structure according to the target cells.
6. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the computer program when executed by the processor implements the steps of the table picture structure parsing method according to any one of claims 1 to 4.
7. A computer-readable storage medium, in which a computer program is stored, the computer program being executable by at least one processor to cause the at least one processor to perform the steps of the table picture structure parsing method according to any one of claims 1 to 4.
CN202011589490.5A 2020-12-29 2020-12-29 Method, system, device and readable storage medium for parsing table picture structure Active CN112712014B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011589490.5A CN112712014B (en) 2020-12-29 2020-12-29 Method, system, device and readable storage medium for parsing table picture structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011589490.5A CN112712014B (en) 2020-12-29 2020-12-29 Method, system, device and readable storage medium for parsing table picture structure

Publications (2)

Publication Number Publication Date
CN112712014A CN112712014A (en) 2021-04-27
CN112712014B true CN112712014B (en) 2024-04-30

Family

ID=75546243

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011589490.5A Active CN112712014B (en) 2020-12-29 2020-12-29 Method, system, device and readable storage medium for parsing table picture structure

Country Status (1)

Country Link
CN (1) CN112712014B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113343866A (en) * 2021-06-15 2021-09-03 杭州数梦工场科技有限公司 Identification method and device of form information and electronic equipment
CN113536951B (en) * 2021-06-22 2023-11-24 科大讯飞股份有限公司 Form identification method, related device, electronic equipment and storage medium
CN113591746A (en) * 2021-08-05 2021-11-02 上海金仕达软件科技有限公司 Document table structure detection method and device
CN116612487B (en) * 2023-07-21 2023-10-13 亚信科技(南京)有限公司 Table identification method and device, electronic equipment and storage medium

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07182459A (en) * 1993-12-22 1995-07-21 Matsushita Electric Ind Co Ltd Table structure extracting device
JPH1153466A (en) * 1997-07-31 1999-02-26 Hitachi Ltd Table recognizing method
KR101811581B1 (en) * 2016-11-15 2017-12-26 주식회사 셀바스에이아이 Aparatus and method for cell decomposition for a table recognition in document image
CN108132916A (en) * 2017-11-30 2018-06-08 厦门市美亚柏科信息股份有限公司 Parse method, the storage medium of PDF list datas
CN109726643A (en) * 2018-12-13 2019-05-07 北京金山数字娱乐科技有限公司 The recognition methods of form data, device, electronic equipment and storage medium in image
CN110008809A (en) * 2019-01-04 2019-07-12 阿里巴巴集团控股有限公司 Acquisition methods, device and the server of list data
CN110059687A (en) * 2019-03-19 2019-07-26 平安科技(深圳)有限公司 Pictorial information recognition methods, device, computer equipment and storage medium
CN110110313A (en) * 2019-05-09 2019-08-09 上海微投股权投资基金管理有限公司 Document form analysis method and device
CN110472208A (en) * 2019-06-26 2019-11-19 上海恒生聚源数据服务有限公司 The method, system of form analysis, storage medium and electronic equipment in PDF document
WO2020164281A1 (en) * 2019-02-13 2020-08-20 平安科技(深圳)有限公司 Form parsing method based on character location and recognition, and medium and computer device
WO2020232872A1 (en) * 2019-05-22 2020-11-26 平安科技(深圳)有限公司 Table recognition method and apparatus, computer device, and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4235286B2 (en) * 1998-09-11 2009-03-11 キヤノン株式会社 Table recognition method and apparatus
JP6690089B2 (en) * 2016-04-25 2020-04-28 富士通株式会社 Form recognition method, form recognition device, and form recognition program
US11650970B2 (en) * 2018-03-09 2023-05-16 International Business Machines Corporation Extracting structure and semantics from tabular data
CN108446264B (en) * 2018-03-26 2022-02-15 阿博茨德(北京)科技有限公司 Method and device for analyzing table vector in PDF document

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07182459A (en) * 1993-12-22 1995-07-21 Matsushita Electric Ind Co Ltd Table structure extracting device
JPH1153466A (en) * 1997-07-31 1999-02-26 Hitachi Ltd Table recognizing method
KR101811581B1 (en) * 2016-11-15 2017-12-26 주식회사 셀바스에이아이 Aparatus and method for cell decomposition for a table recognition in document image
CN108132916A (en) * 2017-11-30 2018-06-08 厦门市美亚柏科信息股份有限公司 Parse method, the storage medium of PDF list datas
CN109726643A (en) * 2018-12-13 2019-05-07 北京金山数字娱乐科技有限公司 The recognition methods of form data, device, electronic equipment and storage medium in image
WO2020140698A1 (en) * 2019-01-04 2020-07-09 阿里巴巴集团控股有限公司 Table data acquisition method and apparatus, and server
CN110008809A (en) * 2019-01-04 2019-07-12 阿里巴巴集团控股有限公司 Acquisition methods, device and the server of list data
WO2020164281A1 (en) * 2019-02-13 2020-08-20 平安科技(深圳)有限公司 Form parsing method based on character location and recognition, and medium and computer device
CN110059687A (en) * 2019-03-19 2019-07-26 平安科技(深圳)有限公司 Pictorial information recognition methods, device, computer equipment and storage medium
WO2020186779A1 (en) * 2019-03-19 2020-09-24 平安科技(深圳)有限公司 Image information identification method and apparatus, and computer device and storage medium
CN110110313A (en) * 2019-05-09 2019-08-09 上海微投股权投资基金管理有限公司 Document form analysis method and device
WO2020232872A1 (en) * 2019-05-22 2020-11-26 平安科技(深圳)有限公司 Table recognition method and apparatus, computer device, and storage medium
CN110472208A (en) * 2019-06-26 2019-11-19 上海恒生聚源数据服务有限公司 The method, system of form analysis, storage medium and electronic equipment in PDF document

Also Published As

Publication number Publication date
CN112712014A (en) 2021-04-27

Similar Documents

Publication Publication Date Title
CN112712014B (en) Method, system, device and readable storage medium for parsing table picture structure
WO2020192391A1 (en) Ocr-based image conversion method and apparatus, device and readable storage medium
CN111931784B (en) Bill recognition method, system, computer device and computer-readable storage medium
CN111595850B (en) Slice defect detection method, electronic device and readable storage medium
CN107688789B (en) Document chart extraction method, electronic device and computer readable storage medium
CN110569341B (en) Method and device for configuring chat robot, computer equipment and storage medium
CN113343740B (en) Table detection method, device, equipment and storage medium
CN110728687B (en) File image segmentation method and device, computer equipment and storage medium
CN111858977B (en) Bill information acquisition method, device, computer equipment and storage medium
CN112613506A (en) Method and device for recognizing text in image, computer equipment and storage medium
CN111639648A (en) Certificate identification method and device, computing equipment and storage medium
CN113837151A (en) Table image processing method and device, computer equipment and readable storage medium
CN114241499A (en) Table picture identification method, device and equipment and readable storage medium
CN111310710A (en) Face detection method and system
CN113592720A (en) Image scaling processing method, device, equipment, storage medium and program product
CN113537184A (en) OCR (optical character recognition) model training method and device, computer equipment and storage medium
CN117115823A (en) Tamper identification method and device, computer equipment and storage medium
CN111651971A (en) Form information transcription method, system, electronic equipment and storage medium
CN112418206A (en) Picture classification method based on position detection model and related equipment thereof
CN113936130A (en) Document information intelligent acquisition and error correction method, system and equipment based on OCR technology
CN113901768A (en) Standard file generation method, device, equipment and storage medium
CN112257718A (en) Text recognition method and device for radiology department films
CN111222517A (en) Test sample generation method, system, computer device and storage medium
CN115797942B (en) Propaganda information interaction method and system
CN113177532B (en) Method, device, equipment and medium for identifying paragraph boundary of characters in image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant