CN112712014A - Table picture structure analysis method, system, equipment and readable storage medium - Google Patents

Table picture structure analysis method, system, equipment and readable storage medium

Info

Publication number
CN112712014A
Authority
CN
China
Prior art keywords
target
picture
line segments
cell
text content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011589490.5A
Other languages
Chinese (zh)
Other versions
CN112712014B (en)
Inventor
孙杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Health Insurance Company of China Ltd
Original Assignee
Ping An Health Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Health Insurance Company of China Ltd filed Critical Ping An Health Insurance Company of China Ltd
Priority to CN202011589490.5A
Publication of CN112712014A
Application granted
Publication of CN112712014B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/412 Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/177 Editing, e.g. inserting or deleting of tables; using ruled lines
    • G06F40/18 Editing, e.g. inserting or deleting of tables; using ruled lines of spreadsheets
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20112 Image segmentation details
    • G06T2207/20132 Image cropping

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the field of picture processing and provides a table picture structure analysis method comprising the following steps: acquiring a target picture comprising a target table; detecting a linear structure of the target table in the target picture, and determining the target table structure of the target picture according to the linear structure; determining the position coordinates of the text content of the target picture according to the target table structure, and extracting the text content; creating a target table document according to the target table structure; and filling the text content into the target table document according to the position coordinates of the text content. The invention improves the efficiency and accuracy of converting table pictures into Excel and similar spreadsheets, improves the efficiency of table picture conversion in related business scenarios, and reduces processing cost.

Description

Table picture structure analysis method, system, equipment and readable storage medium
Technical Field
The embodiment of the invention relates to the field of picture processing, in particular to a table picture structure analysis method, a table picture structure analysis system, table picture structure analysis equipment and a readable storage medium.
Background
To ensure the integrity of the contents of a form document, the document is often converted into a picture and transmitted. The form document picture may be a fax, a scan, or a mobile-phone photograph of a business document. The receiving party of the form document picture usually needs to archive the data it contains. The current common practice is to extract the character content from the form document picture using OCR and then organize the table into an electronic document such as Excel by manual supplementary entry; this is very inefficient and its accuracy is difficult to guarantee. Therefore, how to improve the accuracy of analyzing a table picture and the efficiency of organizing the table into an electronic document such as Excel has become a technical problem that needs to be solved.
Disclosure of Invention
In view of the above, it is desirable to provide a method, a system, a device and a readable storage medium for analyzing a table picture structure, so as to solve the problems of low efficiency and low accuracy in converting a table document picture into an electronic document such as Excel.
In order to achieve the above object, an embodiment of the present invention provides a table picture structure parsing method, where the method includes:
acquiring a target picture comprising a target table;
detecting a linear structure of a target table in the target picture, and determining the target table structure of the target picture according to the linear structure;
determining the position coordinates of the text content of the target picture according to the target table structure, and extracting the text content;
creating a target form document according to the target form structure; and
and filling the text content into the target table document according to the position coordinates of the text content.
Illustratively, the step of obtaining a target picture including a target table includes:
acquiring a picture to be processed comprising a form;
preprocessing the picture to be processed to obtain a picture to be cut;
detecting a table area of the picture to be cut; and
and according to the table area of the picture to be cut, cutting the picture to be cut to obtain the target picture.
Illustratively, the preprocessing operations include an angle rectification operation, a size scaling operation, and a pixel value normalization operation.
Illustratively, the linear structure comprises a plurality of cell line segments, each cell line segment comprising two end point coordinates; the step of detecting a linear structure of a target table in the target picture and determining a target table structure of the target picture according to the linear structure includes:
performing table line intersection point detection on the target table to obtain a plurality of table line intersection points;
extracting a cell line segment according to the intersection point of every two adjacent table lines to obtain a plurality of cell line segments;
defining coordinates of table line intersection points at two ends of each cell line segment according to the position of each cell line segment to obtain two endpoint coordinates of each cell line segment; and
and constructing the target table structure according to the plurality of cell line segments and the two endpoint coordinates of each cell line segment.
Illustratively, the constructing the target table structure according to the plurality of cell line segments and the two endpoint coordinates of each cell line segment includes:
splicing the plurality of cell line segments into a plurality of spliced line segments according to the coordinate position sequence of each endpoint coordinate, wherein the spliced line segments comprise a plurality of spliced transverse line segments and a plurality of spliced longitudinal line segments; and
and determining the target table structure according to the splicing transverse line segments and the splicing longitudinal line segments.
Illustratively, the determining the target table structure according to the plurality of splicing horizontal line segments and the plurality of splicing vertical line segments includes:
constructing an initial table according to the splicing horizontal line segments and the splicing vertical line segments, and acquiring a plurality of intersection point coordinates in the initial table;
constructing a target intersection point graph matrix according to the intersection point coordinates; and
and determining a plurality of target cells according to the target intersection point matrix, and determining the target table structure according to the target cells.
Illustratively, the text content comprises a plurality of unit cell text contents, and each unit cell text content corresponds to one unit cell position coordinate;
the step of filling the text content into the target form document according to the position coordinates of the text content comprises the following steps:
matching a corresponding target cell for the text content of each cell according to the position coordinate of each cell; and
and filling the text content of each cell into the corresponding target cell.
In order to achieve the above object, an embodiment of the present invention further provides a table picture structure parsing system, including:
the acquisition module is used for acquiring a picture to be processed comprising a form;
the detection module is used for detecting a linear structure of a target table in the target picture and determining the target table structure of the target picture according to the linear structure;
the determining module is used for determining the position coordinates of the text content of the target picture according to the target table structure and extracting the text content;
a creation module for creating a target form document according to the target form structure; and
and the filling module is used for filling the text content into the target form document according to the position coordinates of the text content.
In order to achieve the above object, an embodiment of the present invention further provides a computer device, where the computer device includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and when the computer program is executed by the processor, the computer program implements the steps of the table picture structure parsing method as described above.
To achieve the above object, an embodiment of the present invention further provides a computer-readable storage medium, in which a computer program is stored, where the computer program is executable by at least one processor, so as to cause the at least one processor to execute the steps of the table picture structure parsing method as described above.
According to the table picture structure analysis method, system, computer device and computer-readable storage medium provided by the embodiments of the invention, table line intersection detection is performed on the target table, the target table structure is constructed according to the detected linear structure, and the table structure is accurately restored through the target table structure, so that the efficiency and accuracy of converting the table picture into Excel and other spreadsheets are improved; and the text content is filled into the target table document according to the position coordinates of the text content, so that the conversion efficiency of table pictures in the related scenarios is improved and the processing cost is reduced.
Drawings
Fig. 1 is a schematic flow chart of a table picture structure analysis method according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating a to-be-processed picture according to a first embodiment of the present invention;
FIG. 3 is a schematic diagram of intersection points of a plurality of table lines according to a first embodiment of the present invention;
FIG. 4 is a schematic diagram of a plurality of splicing segments according to a first embodiment of the present invention;
FIG. 5 is a diagram illustrating an initial table according to a first embodiment of the present invention;
FIG. 6 is a schematic diagram illustrating a sequence of calculating a plurality of cells according to a first embodiment of the present invention;
FIG. 7 is a diagram of a target table structure according to a first embodiment of the present invention;
FIG. 8 is a diagram illustrating cell merging according to a first embodiment of the present invention;
FIG. 9 is a diagram of a target form document according to a first embodiment of the present invention;
FIG. 10 is a schematic diagram of program modules of a table picture structure parsing system according to a second embodiment of the present invention;
fig. 11 is a schematic diagram of a hardware structure of a third embodiment of the computer apparatus according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the description relating to "first", "second", etc. in the present invention is for descriptive purposes only and is not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, technical solutions between various embodiments may be combined with each other, but must be realized by a person skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination should not be considered to exist, and is not within the protection scope of the present invention.
In the following embodiments, the computer device 2 will be exemplarily described as an execution subject.
Example one
Referring to fig. 1, a flowchart illustrating steps of a table picture structure parsing method according to an embodiment of the invention is shown. It is to be understood that the flow charts in the embodiments of the present method are not intended to limit the order in which the steps are performed. The following description is made by way of example with the computer device 2 as the execution subject. The details are as follows.
Step S100, a target picture including a target table is acquired.
The target picture can be any picture containing a table.
In an actual business scenario, a form document in a format such as Excel may be converted into a picture format. For example, the picture containing the table may be a fax, a scan, or a mobile-phone photograph of a business document.
In an exemplary embodiment, the step S100 may further include a step S200 to a step S204, where:
step S200, acquiring a picture to be processed including a table.
The target picture may also be a cut picture in a picture to be processed acquired in advance by the computer device 2.
When the computer device 2 receives the to-be-processed picture, it may perform a detection on the to-be-processed picture to determine whether a table exists in the to-be-processed picture. And if the table exists in the picture to be processed, intercepting a target picture comprising the table from the picture to be processed.
For example, the picture to be processed may also be a picture with a plurality of tables. When the computer device 2 detects that there are multiple tables in the to-be-processed picture, the computer device 2 may intercept a corresponding target picture according to each table to obtain multiple target pictures.
Step S202, preprocessing the picture to be processed to obtain a picture to be cut.
In an exemplary embodiment, the preprocessing operations include an angle rectification operation, a size scaling operation, and a pixel value normalization operation.
The picture to be processed may be a manually taken photograph; that is, factors such as shooting angle and lighting may cause the captured target picture to be skewed or its text content to be unclear.
In order to avoid the above problem, after receiving the to-be-processed picture, the computer device 2 may perform preprocessing on the to-be-processed picture, for example, perform operations such as angle correction, size scaling, and pixel value normalization on the to-be-processed picture, so as to obtain the to-be-cropped picture (i.e., the processed to-be-processed picture).
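For illustration only, a minimal Python/OpenCV sketch of such a preprocessing step is given below; the deskew heuristic based on cv2.minAreaRect, the fixed target size, and the function name are assumptions rather than the patented implementation.

```python
import cv2
import numpy as np

def preprocess(image_bgr, target_long_side=1024):
    """Angle rectification, size scaling and pixel-value normalization (illustrative sketch)."""
    # angle rectification: estimate the skew of the ink pixels with a minimum-area rectangle
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    coords = np.column_stack(np.where(binary > 0)).astype(np.float32)
    angle = cv2.minAreaRect(coords)[-1]
    if angle < -45:
        angle += 90
    elif angle > 45:
        angle -= 90
    h, w = gray.shape
    rot = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    deskewed = cv2.warpAffine(image_bgr, rot, (w, h),
                              flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_REPLICATE)

    # size scaling: keep the aspect ratio, bound the long side
    scale = target_long_side / max(h, w)
    resized = cv2.resize(deskewed, (int(w * scale), int(h * scale)))

    # pixel value normalization to [0, 1]
    return resized.astype(np.float32) / 255.0
```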
And step S204, detecting a table area of the picture to be cut.
The computer device 2 may detect the local table area where the table of the picture to be cropped is located. This step is implemented by a table area detection algorithm, for example a YOLO algorithm, a CascadeTabNet algorithm, or the like.
Step S206, according to the table area of the picture to be cut, cutting the picture to be cut to obtain the target picture.
It can be understood that tables of some pictures only occupy a small part of the pictures, and in order to improve the accuracy of the later-stage picture detection, the computer device 2 may further crop the picture to be cropped according to the table area, so as to obtain the target picture only including the target table.
If the picture to be cropped includes a plurality of tables, the computer device 2 may also crop the picture to be cropped according to the table area of each table to obtain a plurality of target pictures.
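As a rough illustration of steps S204 to S206, the sketch below assumes that a table-region detector (for example a YOLO-style model or CascadeTabNet, not shown here) has already produced one bounding box per detected table; each box is then cropped into a target picture. The box format and the margin value are assumptions.

```python
def crop_target_pictures(picture_to_crop, table_boxes, margin=5):
    """Crop one target picture per detected table region.

    table_boxes: list of (x1, y1, x2, y2) boxes produced by a table-region detector
    (the detector itself is not shown here).
    """
    h, w = picture_to_crop.shape[:2]
    targets = []
    for x1, y1, x2, y2 in table_boxes:
        # a small margin keeps the outer table border inside the crop
        x1, y1 = max(0, int(x1) - margin), max(0, int(y1) - margin)
        x2, y2 = min(w, int(x2) + margin), min(h, int(y2) + margin)
        targets.append(picture_to_crop[y1:y2, x1:x2].copy())
    return targets
```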
In this embodiment, the computer device 2 performs the preprocessing operation and the clipping operation on the to-be-processed picture, so that the accuracy of picture detection is improved, and the efficiency of picture detection is further improved.
As shown in fig. 2, for better understanding, a schematic diagram of a picture to be processed is also provided in the present embodiment.
Step S102, detecting a linear structure of a target table in the target picture, and determining the target table structure of the target picture according to the linear structure.
The computer device 2 may input the target picture into a line detection model and determine the target table structure of the target picture from the linear structure that the model detects in the target picture. The line detection result may include a plurality of cell line segments, each cell line segment including two endpoint coordinates.
In an exemplary embodiment, the linear structure includes a plurality of cell line segments, each cell line segment including two endpoint coordinates. And the cell line segment is a line segment of a table cell of a target table in the target picture.
The step S102 may further include a step S300 to a step S306, where:
step S300, performing table line intersection detection on the target table to obtain a plurality of table line intersections.
For example, the line detection model may be an L-CNN deep learning model, in which high-dimensional image features are extracted with HourglassNet (a stacked-hourglass convolutional neural network) as the backbone.
Specifically, the computer device 2 may input the target picture into the line detection model, so as to extract suspected table line intersections in the target picture through an intersection detection link preset by the line detection model, so as to obtain the plurality of table line intersections.
As shown in fig. 3, for better understanding, the present embodiment also provides a schematic diagram of intersection points of a plurality of table lines of the target picture.
Step S302, extracting a cell line segment according to the intersection point of every two adjacent table lines to obtain the plurality of cell line segments.
The computer device 2 may extract one cell line segment between every two adjacent table line intersections by means of the line detection model to obtain the plurality of cell line segments. After obtaining the plurality of cell line segments, the line detection model may output the plurality of table line intersections and the plurality of cell line segments to the computer device 2.
Step S304, defining coordinates of intersection points of table lines at two ends of each cell line segment according to the position of each cell line segment to obtain coordinates of two end points of each cell line segment.
After receiving the intersection points of the table lines and the cell line segments output by the line detection model, the computer device 2 may define coordinates of the intersection points of the table lines at two ends of each cell line segment according to the position of each cell line segment to obtain two endpoint coordinates of each cell line segment.
Step S306, constructing the target table structure according to the plurality of cell line segments and the two endpoint coordinates of each cell line segment.
For example, after the computer device 2 obtains the cell line segments and the endpoint coordinates of each cell line segment, the table structure may be further constructed from them. The method performs line detection on the target table of the target picture through the L-CNN deep learning model, which avoids the problems of existing table line extraction methods (such as deep learning models based on image segmentation): difficult annotation, complex processing, and weak capacity to learn and express complex cases.
When line detection is carried out on the target table of the target picture through the L-CNN deep learning model, only line segment endpoints need to be annotated, and pixel-level annotation is not required; the L-CNN deep learning model can also detect intersection points in the table and judge whether they form a table border line, giving the model stronger learning and expression capability. In addition, the output of the L-CNN deep learning model (the plurality of table line intersections and the plurality of cell line segments) is an accurate representation of the table line endpoints of the target table in the target picture and can be used directly as the input of the next step, as sketched below.
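A minimal sketch of step S304, assuming the line detection model returns junction (intersection) coordinates and raw line segments in pixel coordinates; each raw endpoint is simply replaced by the nearest detected junction. The distance threshold and function names are illustrative.

```python
import math

def snap_segments_to_junctions(segments, junctions, max_dist=8.0):
    """Define each segment's two endpoints as the nearest table-line intersections.

    segments:  list of ((x1, y1), (x2, y2)) raw segments from the line model
    junctions: list of (x, y) table-line intersection points from the line model
    """
    def nearest(pt):
        jx, jy = min(junctions, key=lambda j: math.dist(j, pt))
        return (jx, jy) if math.dist((jx, jy), pt) <= max_dist else pt

    return [(nearest(p1), nearest(p2)) for p1, p2 in segments]
```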
The step S306 may further include a step S400 to a step S402, where:
and S400, splicing the plurality of cell line segments into a plurality of spliced line segments according to the coordinate position sequence of each endpoint coordinate, wherein the spliced line segments comprise a plurality of spliced transverse line segments and a plurality of spliced longitudinal line segments.
After the computer device 2 obtains the plurality of cell line segments and the two endpoint coordinates of each cell line segment, the cell line segments can be merged according to the coordinate order of their endpoints, so that consecutive cell line segments are merged into a spliced line segment, i.e. a full-length table line of the target table.
Illustratively, the cell line segments include cell horizontal line segments and cell vertical line segments, wherein the cell horizontal line segments can be combined into one splicing horizontal line segment, and the cell vertical line segments can be combined into one splicing vertical line segment.
For example, the computer device 2 may further determine whether there is a misjudged line segment in the plurality of spliced line segments, and if there is the misjudged line segment, the computer device 2 may remove the misjudged line.
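The splicing of step S400 might look like the following sketch for the horizontal case (the vertical case is symmetric). It assumes axis-aligned segments with snapped endpoint coordinates; the grouping tolerance is an assumption.

```python
from collections import defaultdict

def splice_horizontal_segments(cell_segments, tol=2):
    """Merge consecutive horizontal cell line segments into spliced horizontal segments.

    cell_segments: list of ((x1, y), (x2, y)) horizontal segments with snapped endpoints.
    Segments on the same row whose ends touch (within `tol`) are chained together.
    """
    rows = defaultdict(list)
    for (x1, y1), (x2, y2) in cell_segments:
        rows[round(y1 / tol) * tol].append((min(x1, x2), max(x1, x2), y1))

    spliced = []
    for _, segs in sorted(rows.items()):
        segs.sort()
        cur_x1, cur_x2, y = segs[0]
        for x1, x2, _ in segs[1:]:
            if x1 - cur_x2 <= tol:            # touching / overlapping: extend the run
                cur_x2 = max(cur_x2, x2)
            else:                             # gap: close the current spliced segment
                spliced.append(((cur_x1, y), (cur_x2, y)))
                cur_x1, cur_x2 = x1, x2
        spliced.append(((cur_x1, y), (cur_x2, y)))
    return spliced
```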
Step S402, determining the target table structure according to the splicing horizontal line segments and the splicing vertical line segments.
For example, after the computer device 2 obtains the plurality of splicing horizontal line segments and the plurality of splicing vertical line segments, the target table structure may be determined according to the plurality of splicing horizontal line segments and the plurality of splicing vertical line segments.
In this embodiment, the computer device 2 may determine the target table structure according to the plurality of splicing segments by splicing the plurality of cell segments into the plurality of splicing segments, so as to improve the accuracy of constructing the target table structure and the efficiency of constructing the target table structure.
As shown in fig. 4, for better understanding, the present solution also provides a schematic diagram of a plurality of spliced segments of a target picture.
In some embodiments, to ensure the integrity of the target table, the computer device 2 may also extend or supplement the four-sided boundary lines of the initial table so that it forms a closed wire frame. This prevents the outer border lines from being lost because of scanning quality or shooting conditions and improves the completeness of the table analysis, as sketched below.
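A possible sketch of this boundary supplementation, assuming the spliced segments are represented as endpoint pairs; a border line is only added where no existing segment already lies close to that side. The tolerance and names are assumptions.

```python
def close_table_frame(h_segments, v_segments, tol=3):
    """Supplement missing outer boundary lines so the initial table is a closed wire frame."""
    xs = [p[0] for seg in h_segments + v_segments for p in seg]
    ys = [p[1] for seg in h_segments + v_segments for p in seg]
    left, right, top, bottom = min(xs), max(xs), min(ys), max(ys)

    def missing(segs, axis, coord):
        # True when no existing segment already lies on this border line
        return all(abs(seg[0][axis] - coord) > tol for seg in segs)

    h_closed, v_closed = list(h_segments), list(v_segments)
    if missing(h_segments, 1, top):
        h_closed.append(((left, top), (right, top)))
    if missing(h_segments, 1, bottom):
        h_closed.append(((left, bottom), (right, bottom)))
    if missing(v_segments, 0, left):
        v_closed.append(((left, top), (left, bottom)))
    if missing(v_segments, 0, right):
        v_closed.append(((right, top), (right, bottom)))
    return h_closed, v_closed
```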
The step S402 may further include a step S500 to a step S504, where:
and S500, constructing an initial table according to the splicing horizontal line segments and the splicing vertical line segments, and acquiring a plurality of intersection point coordinates in the initial table.
The computer device 2 constructs an initial table according to the plurality of splicing horizontal line segments and the plurality of splicing vertical line segments, and extracts a plurality of intersection point coordinates in the initial table.
Illustratively, the target table may also be an irregular table.
Take the table of the target picture as an example of an irregular table, where each line segment is represented by its endpoint coordinates:
the plurality of splicing horizontal line segments are:
[((10,10),(60,10)), ((20,20),(60,20)), ((10,30),(30,30)), ((50,30),(60,30)), ((10,40),(50,40)), ((10,50),(60,50))];
the plurality of splicing vertical line segments are:
[((10,10),(10,50)), ((20,10),(20,30)), ((30,20),(30,40)), ((40,10),(40,20)), ((50,20),(50,50)), ((60,10),(60,50))].
The table structure of the initial table constructed from the plurality of splicing horizontal line segments and the plurality of splicing vertical line segments may be as shown in fig. 5, in which the upper-left point of fig. 5 is taken as the origin, and the coordinates of the plurality of intersection points are as shown in fig. 5.
And step S502, constructing a target intersection point graph matrix according to the intersection point coordinates.
After obtaining the intersection coordinates, the computer device 2 may construct the target intersection point map matrix M according to the intersection coordinates:
M(i, j) = intersection(l_i, l_j)
where i and j in the target intersection point map matrix M are the serial numbers of the splicing horizontal line segments and the splicing vertical line segments respectively, and intersection(l_i, l_j) represents the coordinate of the crossing point of the two straight lines ("×" when no intersection exists).
For this example table, "×" indicates that no intersection exists; the resulting intersection map matrix is shown in Table 1.
        A        B        C        D        E        F
a   (10,10)  (20,10)    ×      (40,10)    ×      (60,10)
b     ×      (20,20)  (30,20)  (40,20)  (50,20)  (60,20)
c     ×        ×        ×        ×      (50,30)  (60,30)
d   (10,30)  (20,30)  (30,30)    ×        ×        ×
e   (10,40)    ×      (30,40)    ×      (50,40)    ×
f   (10,50)    ×        ×        ×      (50,50)  (60,50)
TABLE 1
The letters A-F denote the numbers of the splicing vertical line segments, and the letters a-f denote the numbers of the splicing horizontal line segments. The splicing horizontal line segments c and d lie in the same row (at the same height); the order in which they are numbered does not affect the final result of the method.
In some embodiments, to ensure the integrity of the target table, the computer device 2 may also extend or supplement the four-sided boundary lines of the initial table so that it forms a closed wire frame. For example, each line segment may be extended slightly when computing the intersection coordinates, to ensure that the corresponding intersections are obtained; that is, the "×" positions in Table 1 are filled with the corresponding intersection coordinates, as in the sketch below.
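A sketch of step S502 under these assumptions: the matrix entry (i, j) holds intersection(h_i, v_j), the crossing point of splicing horizontal line segment i and splicing vertical line segment j, or None (the "×" of Table 1) when they do not cross; each segment is extended by a small epsilon so borderline crossings are not lost. The names and epsilon value are assumptions.

```python
def build_intersection_matrix(h_segments, v_segments, eps=2):
    """M[i][j] = intersection of horizontal segment i and vertical segment j, else None.

    h_segments: list of ((x1, y), (x2, y)) splicing horizontal line segments
    v_segments: list of ((x, y1), (x, y2)) splicing vertical line segments
    eps: small extension applied to each segment so near-misses still count
    """
    matrix = [[None] * len(v_segments) for _ in h_segments]
    for i, ((hx1, hy), (hx2, _)) in enumerate(h_segments):
        for j, ((vx, vy1), (_, vy2)) in enumerate(v_segments):
            x_lo, x_hi = min(hx1, hx2) - eps, max(hx1, hx2) + eps
            y_lo, y_hi = min(vy1, vy2) - eps, max(vy1, vy2) + eps
            if x_lo <= vx <= x_hi and y_lo <= hy <= y_hi:
                matrix[i][j] = (vx, hy)       # the axis-aligned crossing point
    return matrix
```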
Step S504, a plurality of target cells are determined according to the target intersection point diagram matrix, and the target table structure is determined according to the target cells.
The computer device 2 may derive the plurality of target cells from the target intersection point map matrix M. The calculation may start from the top-left corner of Table 1 and, following a dynamic-programming idea, search for intersection points that form, in sequence, the top-left, top-right, bottom-right, and bottom-left corners of a target cell. If a rectangle of smallest area whose four corner points are all intersection points can be found, a valid cell is obtained, and the position of the target cell can be represented by the coordinates of its four corner points.
For example, the order in which the target cells are calculated may be as shown in fig. 6, where the circled numbers in fig. 6 indicate the target cell detection order.
Recording each detected target cell by the coordinates of its four corner points, the following target cells can be obtained; the target cell set can be expressed as:
boxes = [((10,10),(20,10),(20,30),(10,30)),
         ((20,10),(40,10),(40,20),(20,20)),
         ((40,10),(60,10),(60,20),(40,20)),
         ((20,20),(30,20),(30,30),(20,30)),
         ((30,20),(50,20),(50,40),(30,40)),
         ((50,20),(60,20),(60,30),(50,30)),
         ((10,30),(30,30),(30,40),(10,40)),
         ((50,30),(60,30),(60,50),(50,50)),
         ((10,40),(50,40),(50,50),(10,50))]
from the target cell coordinates, the target table structure diagram shown in fig. 7 can be obtained.
According to this embodiment, the target cells can be searched through the target intersection point map matrix, and the target table structure can be restored on the basis of complete line detection; the algorithm is simple and of low complexity, which improves the efficiency of table structure analysis. A sketch of the cell search is given below.
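An illustrative sketch of the cell search of step S504, assuming the intersection matrix built above: each intersection is tried as an upper-left corner, and the nearest vertical and horizontal lines for which all four corners exist yield the smallest valid rectangle in the scan order, recorded as a target cell.

```python
def find_target_cells(matrix):
    """Search the intersection map matrix for target cells.

    matrix[i][j] is the (x, y) crossing of horizontal line i and vertical line j,
    or None when no intersection exists. Cells are returned as
    (top_left, top_right, bottom_right, bottom_left) corner coordinates.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    cells = []
    for i in range(n_rows - 1):
        for j in range(n_cols - 1):
            if matrix[i][j] is None:          # no upper-left corner here
                continue
            # nearest vertical line to the right that also crosses horizontal line i
            for j2 in range(j + 1, n_cols):
                if matrix[i][j2] is None:
                    continue
                # nearest horizontal line below that crosses both vertical lines
                for i2 in range(i + 1, n_rows):
                    if matrix[i2][j] is not None and matrix[i2][j2] is not None:
                        cells.append((matrix[i][j], matrix[i][j2],
                                      matrix[i2][j2], matrix[i2][j]))
                        break                  # smallest rectangle for this corner found
                else:
                    continue
                break
    return cells
```

Applied to the example matrix of Table 1, this scan reproduces the nine target cells listed in the set boxes above.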
And step S104, determining the position coordinates of the text content of the target picture according to the target table structure, and extracting the text content.
Character recognition is performed on the target picture, and the recognized character information is marked according to the target table structure so as to determine its actual position. The table picture structure analysis system can locate the coordinates of the text slices in the table picture with a general-purpose OCR algorithm and recognize the text content; each text slice is then positioned based on its center coordinates or corner coordinates.
And step S106, creating a target table document according to the target table structure.
And step S108, filling the text content into the target table document according to the position coordinates of the text content.
Illustratively, the text content comprises a plurality of unit cell text contents, and each unit cell text content corresponds to one unit cell position coordinate;
the step S108 may further include a step S600 to a step S602, where: step S600, matching a corresponding target cell for the text content of each cell according to the position coordinate of each cell; and step S602, filling the character content of each cell into the corresponding target cell.
The computer device 2 may match the text content of each cell with a target cell, and fill the text information into the target form document according to the actual position of the text information.
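A sketch of this matching step, assuming a generic OCR engine returns each text slice with its bounding box; a slice is assigned to the target cell that contains its center point. The data formats are assumptions.

```python
def match_text_to_cells(text_slices, cells):
    """Assign each recognized text slice to the target cell that contains its center.

    text_slices: list of (text, (x1, y1, x2, y2)) from a generic OCR engine
    cells:       list of (top_left, top_right, bottom_right, bottom_left) corner tuples
    Returns {cell_index: concatenated text}.
    """
    filled = {}
    for text, (x1, y1, x2, y2) in text_slices:
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2          # center of the text slice
        for idx, (tl, tr, br, bl) in enumerate(cells):
            if tl[0] <= cx <= br[0] and tl[1] <= cy <= br[1]:
                filled[idx] = (filled.get(idx, '') + ' ' + text).strip()
                break
    return filled
```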
Illustratively, the horizontal lines and the vertical lines are merged into rows and columns; line segments that are approximately collinear within a certain tolerance should be merged into the table line of the same row (or column).
In the present embodiment, the splicing horizontal line segments c and d are determined to be collinear, and therefore both belong to the lower border of the target cells in the second row. From the number N of merged horizontal lines and the number M of merged vertical lines, a table range of (N-1) × (M-1) is determined in Excel. The spanned row-column range of each target cell is obtained from its surrounding lines, and the corresponding regions are merged. The target cell in fig. 8 spans splicing horizontal line segments 2 to 4 and splicing vertical line segments 3 to 5, so it spans rows 2 to 3 and columns 3 to 4. Merging the target cell regions can be completed with the write_merge() function of xlwt, as in the sketch below. Finally, the Excel file is generated; taking fig. 2 as an example, the result is shown in fig. 9.
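The following sketch shows how the cell corner coordinates and matched texts could be written to an .xls file with xlwt, using write_merge() for spanned regions; the coordinate-to-row/column mapping and the function names are assumptions.

```python
import xlwt

def write_cells_to_excel(cells, cell_texts, path='table.xls'):
    """Write matched cell texts into an .xls file, merging spanned regions with xlwt.

    cells:      list of four-corner tuples as produced by the cell search
    cell_texts: {cell_index: text} as produced by the text matching step
    """
    # map distinct border coordinates to row / column indices
    ys = sorted({p[1] for cell in cells for p in cell})
    xs = sorted({p[0] for cell in cells for p in cell})
    row_of = {y: i for i, y in enumerate(ys)}
    col_of = {x: i for i, x in enumerate(xs)}

    book = xlwt.Workbook()
    sheet = book.add_sheet('Sheet1')
    for idx, (tl, tr, br, bl) in enumerate(cells):
        r1, r2 = row_of[tl[1]], row_of[bl[1]] - 1    # spanned row range (inclusive)
        c1, c2 = col_of[tl[0]], col_of[tr[0]] - 1    # spanned column range (inclusive)
        text = cell_texts.get(idx, '')
        if r1 == r2 and c1 == c2:
            sheet.write(r1, c1, text)
        else:
            sheet.write_merge(r1, r2, c1, c2, text)  # merged cell region
    book.save(path)
```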
The line-based structure inference method provided by this embodiment realizes the search for cells by establishing an intersection point map matrix of the horizontal and vertical lines; it can restore an arbitrary table structure on the basis of complete line detection, and the algorithm is simple and of low complexity. The scheme can also match the characters in the table picture to the cells in combination with a general-purpose OCR technology, so that the finally generated spreadsheet document accurately restores both the table content and the table structure. The method improves the efficiency of digitizing table pictures in the related scenarios, increases the degree of automation of enterprise business, and reduces manual processing cost.
Illustratively, the table picture structure parsing method further includes: uploading the target table structure and the target table document into a blockchain.
For example, uploading the target table structure and the target table document to a blockchain may ensure security and fair transparency. The blockchain referred to in this example is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, encryption algorithm, and the like. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
Example two
Fig. 10 is a schematic diagram of program modules of a table picture structure analysis system according to a second embodiment of the present invention. The table picture structure parsing system 20 may include or be divided into one or more program modules, and the one or more program modules are stored in a storage medium and executed by one or more processors to implement the present invention and implement the above table picture structure parsing method. The program module referred to in the embodiments of the present invention refers to a series of computer program instruction segments capable of performing specific functions, and is more suitable for describing the execution process of the table picture structure parsing system 20 in the storage medium than the program itself. The following description will specifically describe the functions of the program modules of the present embodiment:
an obtaining module 200, configured to obtain a to-be-processed picture including a table.
Illustratively, the obtaining module 200 is further configured to: acquiring a picture to be processed comprising a form; preprocessing the picture to be processed to obtain a picture to be cut; detecting a table area of the picture to be cut; and according to the table area of the picture to be cut, cutting the picture to be cut to obtain the target picture.
Illustratively, the preprocessing operations include an angle rectification operation, a size scaling operation, and a pixel value normalization operation.
The detecting module 202 is configured to detect a linear structure of a target table in the target picture, and determine a target table structure of the target picture according to the linear structure.
Illustratively, the linear structure comprises a plurality of cell line segments, each cell line segment comprising two end point coordinates; the detection module 202 is further configured to: performing table line intersection point detection on the target table to obtain a plurality of table line intersection points; extracting a cell line segment according to the intersection point of every two adjacent table lines to obtain a plurality of cell line segments; defining coordinates of table line intersection points at two ends of each cell line segment according to the position of each cell line segment to obtain two endpoint coordinates of each cell line segment; and constructing the target table structure according to the plurality of cell line segments and the two endpoint coordinates of each cell line segment.
Illustratively, the detecting module 202 is further configured to: splicing the plurality of cell line segments into a plurality of spliced line segments according to the coordinate position sequence of each endpoint coordinate, wherein the spliced line segments comprise a plurality of spliced transverse line segments and a plurality of spliced longitudinal line segments; and determining the target table structure according to the splicing horizontal line segments and the splicing vertical line segments.
Illustratively, the detecting module 202 is further configured to: constructing an initial table according to the splicing horizontal line segments and the splicing vertical line segments, and acquiring a plurality of intersection point coordinates in the initial table; constructing a target intersection point graph matrix according to the intersection point coordinates; and determining a plurality of target cells according to the target intersection point matrix, and determining the target table structure according to the plurality of target cells.
Illustratively, the text content comprises a plurality of unit cell text contents, and each unit cell text content corresponds to one unit cell position coordinate; the detection module 202 is further configured to: matching a corresponding target cell for the text content of each cell according to the position coordinate of each cell; and filling the text content of each cell into the corresponding target cell.
The determining module 204 is configured to determine the position coordinates of the text content of the target picture according to the target table structure, and extract the text content.
A creation module 206 for creating a target form document according to the target form structure.
And a filling module 208, configured to fill the text content into the target form document according to the position coordinates of the text content.
Illustratively, the table picture structure parsing system 20 further includes an uploading module, which is configured to upload the target table structure and the target form document to a blockchain.
EXAMPLE III
Fig. 11 is a schematic diagram of a hardware architecture of a computer device according to a third embodiment of the present invention. In the present embodiment, the computer device 2 is a device capable of automatically performing numerical calculation and/or information processing in accordance with preset or stored instructions. The computer device 2 may be a rack server, a blade server, or a tower server (including an independent server or a server cluster composed of a plurality of servers), and the like. As shown, the computer device 2 includes, but is not limited to, at least a memory 21, a processor 22, a network interface 23, and a table picture structure parsing system 20, which are communicatively connected to each other through a system bus.
In this embodiment, the memory 21 includes at least one type of computer-readable storage medium including a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the storage 21 may be an internal storage unit of the computer device 2, such as a hard disk or a memory of the computer device 2. In other embodiments, the memory 21 may also be an external storage device of the computer device 2, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), or the like provided on the computer device 2. Of course, the memory 21 may also comprise both internal and external memory units of the computer device 2. In this embodiment, the memory 21 is generally used for storing an operating system installed in the computer device 2 and various application software, such as a program code of the table picture structure parsing system 20 in the second embodiment. Further, the memory 21 may also be used to temporarily store various types of data that have been output or are to be output.
Processor 22 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 22 is typically used to control the overall operation of the computer device 2. In this embodiment, the processor 22 is configured to run a program code stored in the memory 21 or process data, for example, run the table picture structure parsing system 20, so as to implement the table picture structure parsing method according to the first embodiment.
The network interface 23 may comprise a wireless network interface or a wired network interface, and is generally used for establishing a communication connection between the computer device 2 and other electronic apparatuses. For example, the network interface 23 is used to connect the computer device 2 to an external terminal through a network, and to establish a data transmission channel and a communication connection between the computer device 2 and the external terminal. The network may be a wireless or wired network such as an Intranet, the Internet, a Global System for Mobile communications (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, Wi-Fi, and the like.
It is noted that fig. 11 only shows the computer device 2 with components 20-23, but it is to be understood that not all shown components are required to be implemented, and that more or less components may be implemented instead.
In this embodiment, the table picture structure parsing system 20 stored in the memory 21 can be further divided into one or more program modules, and the one or more program modules are stored in the memory 21 and executed by one or more processors (in this embodiment, the processor 22) to complete the present invention.
For example, fig. 10 is a schematic diagram of the program modules implementing the table picture structure parsing system 20 according to the second embodiment of the present invention, in which the table picture structure parsing system 20 may be divided into an obtaining module 200, a detecting module 202, a determining module 204, a creating module 206, and a filling module 208. The program module referred to in the present invention refers to a series of computer program instruction segments capable of performing specific functions, and is more suitable than a program for describing the execution process of the table picture structure parsing system 20 in the computer device 2. The specific functions of the program modules 200 to 208 have been described in detail in the second embodiment and are not repeated here.
Example four
The present embodiment also provides a computer-readable storage medium, such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, an App application mall, etc., on which a computer program is stored, which when executed by a processor implements corresponding functions. The computer-readable storage medium of this embodiment is used in the table picture structure parsing system 20, and when executed by a processor, the table picture structure parsing method of the first embodiment is implemented.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A table picture structure analysis method is characterized by comprising the following steps:
acquiring a target picture comprising a target table;
detecting a linear structure of a target table in the target picture, and determining the target table structure of the target picture according to the linear structure;
determining the position coordinates of the text content of the target picture according to the target table structure, and extracting the text content;
creating a target form document according to the target form structure; and
and filling the text content into the target table document according to the position coordinates of the text content.
2. The table picture structure parsing method of claim 1, wherein the step of obtaining a target picture including a target table comprises:
acquiring a picture to be processed comprising a form;
preprocessing the picture to be processed to obtain a picture to be cut;
detecting a table area of the picture to be cut; and
and according to the table area of the picture to be cut, cutting the picture to be cut to obtain the target picture.
3. A table picture structure parsing method as claimed in claim 2, wherein the preprocessing operations include an angle rectification operation, a size scaling operation, and a pixel value normalization operation.
4. The table picture structure parsing method of claim 1, wherein the linear structure includes a plurality of cell line segments, each cell line segment including two end point coordinates; the step of detecting a linear structure of a target table in the target picture and determining a target table structure of the target picture according to the linear structure includes:
performing table line intersection point detection on the target table to obtain a plurality of table line intersection points;
extracting a cell line segment according to the intersection point of every two adjacent table lines to obtain a plurality of cell line segments;
defining coordinates of table line intersection points at two ends of each cell line segment according to the position of each cell line segment to obtain two endpoint coordinates of each cell line segment; and
and constructing the target table structure according to the plurality of cell line segments and the two endpoint coordinates of each cell line segment.
5. The table picture structure parsing method according to claim 4, wherein the constructing the target table structure according to the plurality of cell line segments and the two endpoint coordinates of each cell line segment comprises:
splicing the plurality of cell line segments into a plurality of spliced line segments according to the coordinate position sequence of each endpoint coordinate, wherein the spliced line segments comprise a plurality of spliced transverse line segments and a plurality of spliced longitudinal line segments; and
and determining the target table structure according to the splicing transverse line segments and the splicing longitudinal line segments.
6. The table picture structure parsing method according to claim 5, wherein the determining the target table structure according to the plurality of splicing horizontal line segments and the plurality of splicing vertical line segments comprises:
constructing an initial table according to the splicing horizontal line segments and the splicing vertical line segments, and acquiring a plurality of intersection point coordinates in the initial table;
constructing a target intersection point graph matrix according to the intersection point coordinates; and
and determining a plurality of target cells according to the target intersection point matrix, and determining the target table structure according to the plurality of target cells.
7. The table picture structure parsing method of claim 6, wherein the text content includes a plurality of cell text contents, and a cell position coordinate corresponding to each cell text content;
the step of filling the text content into the target form document according to the position coordinates of the text content comprises the following steps:
matching a corresponding target cell for the text content of each cell according to the position coordinate of each cell; and
and filling the text content of each cell into the corresponding target cell.
8. A table picture structure parsing system, comprising:
the acquisition module is used for acquiring a picture to be processed comprising a form;
the detection module is used for detecting a linear structure of a target table in the target picture and determining the target table structure of the target picture according to the linear structure;
the determining module is used for determining the position coordinates of the text content of the target picture according to the target table structure and extracting the text content;
a creation module for creating a target form document according to the target form structure; and
and the filling module is used for filling the text content into the target form document according to the position coordinates of the text content.
9. A computer arrangement comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the computer program, when executed by the processor, implements the steps of the table picture structure parsing method according to any one of claims 1 to 7.
10. A computer-readable storage medium, having stored therein a computer program, the computer program being executable by at least one processor to cause the at least one processor to perform the steps of the table picture structure parsing method according to any one of claims 1-7.
CN202011589490.5A 2020-12-29 2020-12-29 Method, system, device and readable storage medium for parsing table picture structure Active CN112712014B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011589490.5A CN112712014B (en) 2020-12-29 2020-12-29 Method, system, device and readable storage medium for parsing table picture structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011589490.5A CN112712014B (en) 2020-12-29 2020-12-29 Method, system, device and readable storage medium for parsing table picture structure

Publications (2)

Publication Number Publication Date
CN112712014A true CN112712014A (en) 2021-04-27
CN112712014B CN112712014B (en) 2024-04-30

Family

ID=75546243

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011589490.5A Active CN112712014B (en) 2020-12-29 2020-12-29 Method, system, device and readable storage medium for parsing table picture structure

Country Status (1)

Country Link
CN (1) CN112712014B (en)

Patent Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07182459A (en) * 1993-12-22 1995-07-21 Matsushita Electric Ind Co Ltd Table structure extracting device
JPH1153466A (en) * 1997-07-31 1999-02-26 Hitachi Ltd Table recognizing method
US20030123727A1 (en) * 1998-09-11 2003-07-03 Tomotoshi Kanatsu Table recognition method and apparatus, and storage medium
US20170308746A1 (en) * 2016-04-25 2017-10-26 Fujitsu Limited Form recognition method, form recognition device, and non-transitory computer-readable medium
KR101811581B1 (en) * 2016-11-15 2017-12-26 주식회사 셀바스에이아이 Aparatus and method for cell decomposition for a table recognition in document image
CN108132916A (en) * 2017-11-30 2018-06-08 厦门市美亚柏科信息股份有限公司 Method and storage medium for parsing PDF table data
US20190278853A1 (en) * 2018-03-09 2019-09-12 International Business Machines Corporation Extracting Structure and Semantics from Tabular Data
US20190294399A1 (en) * 2018-03-26 2019-09-26 Abc Fintech Co., Ltd. Method and device for parsing tables in pdf document
CN109726643A (en) * 2018-12-13 2019-05-07 北京金山数字娱乐科技有限公司 Method, device, electronic equipment and storage medium for recognizing form data in an image
WO2020140698A1 (en) * 2019-01-04 2020-07-09 阿里巴巴集团控股有限公司 Table data acquisition method and apparatus, and server
CN110008809A (en) * 2019-01-04 2019-07-12 阿里巴巴集团控股有限公司 Table data acquisition method, device and server
WO2020164281A1 (en) * 2019-02-13 2020-08-20 平安科技(深圳)有限公司 Form parsing method based on character location and recognition, and medium and computer device
CN110059687A (en) * 2019-03-19 2019-07-26 平安科技(深圳)有限公司 Picture information recognition method, device, computer equipment and storage medium
WO2020186779A1 (en) * 2019-03-19 2020-09-24 平安科技(深圳)有限公司 Image information identification method and apparatus, and computer device and storage medium
CN110110313A (en) * 2019-05-09 2019-08-09 上海微投股权投资基金管理有限公司 Document form analysis method and device
WO2020232872A1 (en) * 2019-05-22 2020-11-26 平安科技(深圳)有限公司 Table recognition method and apparatus, computer device, and storage medium
CN110472208A (en) * 2019-06-26 2019-11-19 上海恒生聚源数据服务有限公司 Method, system, storage medium and electronic equipment for form analysis in PDF documents

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113343866A (en) * 2021-06-15 2021-09-03 杭州数梦工场科技有限公司 Method and device for identifying form information, and electronic equipment
CN113536951A (en) * 2021-06-22 2021-10-22 科大讯飞股份有限公司 Form recognition method and related device, electronic equipment and storage medium
CN113536951B (en) * 2021-06-22 2023-11-24 科大讯飞股份有限公司 Form identification method, related device, electronic equipment and storage medium
CN113591739A (en) * 2021-08-04 2021-11-02 广联达科技股份有限公司 Method and device for identifying face area in drawing, computer equipment and storage medium
CN113591746A (en) * 2021-08-05 2021-11-02 上海金仕达软件科技有限公司 Document table structure detection method and device
CN116612487A (en) * 2023-07-21 2023-08-18 亚信科技(南京)有限公司 Table identification method and device, electronic equipment and storage medium
CN116612487B (en) * 2023-07-21 2023-10-13 亚信科技(南京)有限公司 Table identification method and device, electronic equipment and storage medium
CN118095202A (en) * 2024-04-29 2024-05-28 江西易卫云信息技术有限公司 Experimental data recording method and system

Also Published As

Publication number Publication date
CN112712014B (en) 2024-04-30

Similar Documents

Publication Publication Date Title
CN112712014B (en) Method, system, device and readable storage medium for parsing table picture structure
CN109933756B (en) Image file transferring method, device and equipment based on OCR (optical character recognition), and readable storage medium
CN111595850B (en) Slice defect detection method, electronic device and readable storage medium
CN107688789B (en) Document chart extraction method, electronic device and computer readable storage medium
US9710704B2 (en) Method and apparatus for finding differences in documents
CN110569341B (en) Method and device for configuring chat robot, computer equipment and storage medium
CN107689070B (en) Chart data structured extraction method, electronic device and computer-readable storage medium
CN110807454B (en) Text positioning method, device, equipment and storage medium based on image segmentation
CN110728687B (en) File image segmentation method and device, computer equipment and storage medium
CN113221632A (en) Document picture identification method and device and computer equipment
CN111858977B (en) Bill information acquisition method, device, computer equipment and storage medium
CN114241499A (en) Table picture identification method, device and equipment and readable storage medium
CN111222517A (en) Test sample generation method, system, computer device and storage medium
CN113837151A (en) Table image processing method and device, computer equipment and readable storage medium
CN114005126A (en) Table reconstruction method and device, computer equipment and readable storage medium
CN112232336A (en) Certificate identification method, device, equipment and storage medium
CN113592720A (en) Image scaling processing method, device, equipment, storage medium and program product
CN113177542A (en) Method, device and equipment for identifying characters of seal and computer readable medium
CN117351511A (en) High-precision map detection method, device and equipment
CN116050379A (en) Document comparison method and storage medium
CN114511862A (en) Form identification method and device and electronic equipment
CN113259538A (en) Electronic file transferring system
CN112464892A (en) Bill region identification method and device, electronic equipment and readable storage medium
CN116090422B (en) Method and device for inputting electric power industry expansion form
CN113177995B (en) Text reorganization method of CAD drawing and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant