CN112712014B - Method, system, device and readable storage medium for parsing table picture structure - Google Patents

Method, system, device and readable storage medium for parsing table picture structure

Info

Publication number
CN112712014B
CN112712014B (application CN202011589490.5A)
Authority
CN
China
Prior art keywords
target
picture
spliced
coordinates
cell
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011589490.5A
Other languages
Chinese (zh)
Other versions
CN112712014A (en)
Inventor
孙杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Health Insurance Company of China Ltd
Original Assignee
Ping An Health Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Health Insurance Company of China Ltd filed Critical Ping An Health Insurance Company of China Ltd
Priority to CN202011589490.5A priority Critical patent/CN112712014B/en
Publication of CN112712014A publication Critical patent/CN112712014A/en
Application granted granted Critical
Publication of CN112712014B publication Critical patent/CN112712014B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/412 Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/177 Editing, e.g. inserting or deleting of tables; using ruled lines
    • G06F40/18 Editing, e.g. inserting or deleting of tables; using ruled lines of spreadsheets
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20112 Image segmentation details
    • G06T2207/20132 Image cropping

Abstract

The invention relates to the field of picture processing and provides a table picture structure analysis method comprising the following steps: acquiring a target picture comprising a target table; detecting a linear structure of the target table in the target picture, and determining the target table structure of the target picture according to the linear structure; determining the position coordinates of the text content of the target picture according to the target table structure, and extracting the text content; creating a target table document according to the target table structure; and filling the text content into the target table document according to the position coordinates of the text content. The invention improves the efficiency and accuracy of converting table pictures into tables such as Excel, improves the efficiency of table picture conversion in related scenes, and reduces processing cost.

Description

Method, system, device and readable storage medium for parsing table picture structure
Technical Field
The embodiment of the invention relates to the field of picture processing, in particular to a method, a system, equipment and a readable storage medium for analyzing a table picture structure.
Background
To ensure the integrity of the contents of a form document, the form document is often converted into a form document picture before transmission. The form document picture may be a fax, a scan, or a mobile phone photograph of a business-related document. The receiver of the form document picture often needs to archive the data in the picture, and the current common practice when doing so is: extract the text content in the form document picture with text OCR technology, and then arrange the table into an electronic document such as Excel by manual supplementary entry. This is very inefficient, and the accuracy is difficult to guarantee. Therefore, how to improve the accuracy of parsing table pictures while improving the efficiency of arranging the tables into electronic documents such as Excel has become a technical problem to be solved.
Disclosure of Invention
In view of this, there is a need to provide a method, a system, a device and a readable storage medium for parsing a table picture structure, so as to solve the problems of low efficiency and low accuracy in converting a table document picture into an electronic document such as Excel.
In order to achieve the above object, an embodiment of the present invention provides a method for parsing a table picture structure, where the method includes:
Acquiring a target picture comprising a target table;
Detecting a linear structure of a target table in the target picture, and determining the target table structure of the target picture according to the linear structure;
Determining the position coordinates of the text content of the target picture according to the target table structure, and extracting the text content;
creating a target table document according to the target table structure; and
filling the text content into the target table document according to the position coordinates of the text content.
Illustratively, the step of obtaining a target picture including a target table includes:
Acquiring a picture to be processed comprising a table;
Preprocessing the picture to be processed to obtain a picture to be cut;
Detecting a form area of the picture to be cut; and
cutting the picture to be cut according to the table area of the picture to be cut to obtain the target picture.
Illustratively, the preprocessing operation includes an angle correction operation, a size scaling operation, and a pixel value normalization operation.
Illustratively, the linear structure includes a plurality of cell line segments, each cell line segment including two endpoint coordinates; the step of detecting the linear structure of the target table in the target picture and determining the target table structure of the target picture according to the linear structure comprises the following steps:
detecting the intersection points of the table lines of the target table to obtain a plurality of intersection points of the table lines;
Extracting a cell line segment according to the intersection point of every two adjacent table lines to obtain a plurality of cell line segments;
Defining coordinates of grid line intersection points at two ends of each cell line segment according to the position of each cell line segment to obtain two endpoint coordinates of each cell line segment; and
constructing the target table structure according to the plurality of cell line segments and the coordinates of two endpoints of each cell line segment.
Illustratively, the constructing the target table structure according to the plurality of cell line segments and the coordinates of two endpoints of each cell line segment includes:
splicing the plurality of unit grid line sections into a plurality of spliced line sections according to the coordinate position sequence of each endpoint coordinate, wherein the plurality of spliced line sections comprise a plurality of spliced transverse line sections and a plurality of spliced longitudinal line sections; and
determining the target table structure according to the plurality of spliced horizontal line segments and the plurality of spliced vertical line segments.
Illustratively, the determining the target table structure according to the plurality of spliced transverse line segments and the plurality of spliced longitudinal line segments includes:
constructing an initial table according to the plurality of spliced transverse line segments and the plurality of spliced longitudinal line segments, and acquiring a plurality of intersection coordinates in the initial table;
constructing a target intersection point graph matrix according to the intersection point coordinates; and
determining a plurality of target cells according to the target cross point diagram matrix, and determining the target table structure according to the target cells.
The text content comprises a plurality of cell text contents, and each cell text content corresponds to one cell position coordinate;
The step of filling the text content into the target table document according to the position coordinates of the text content comprises the following steps:
Matching corresponding target cells for the text content of each cell according to the position coordinates of each cell; and
filling the text content of each cell into the corresponding target cell.
In order to achieve the above object, an embodiment of the present invention further provides a system for parsing a table picture structure, including:
the acquisition module is used for acquiring the picture to be processed comprising the form;
The detection module is used for detecting the linear structure of the target table in the target picture and determining the target table structure of the target picture according to the linear structure;
the determining module is used for determining the position coordinates of the text content of the target picture according to the target table structure and extracting the text content;
The creation module is used for creating a target table document according to the target table structure; and
the filling module is used for filling the text content into the target form document according to the position coordinates of the text content.
To achieve the above object, an embodiment of the present invention further provides a computer device, where the computer device includes a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the computer program when executed by the processor implements the steps of the table picture structure parsing method as described above.
To achieve the above object, an embodiment of the present invention further provides a computer readable storage medium having a computer program stored therein, the computer program being executable by at least one processor to cause the at least one processor to perform the steps of the table picture structure parsing method as described above.
According to the table picture structure analysis method, system, computer device and computer-readable storage medium, the intersection points of the table lines of the target table are detected, the target table structure is constructed from the detected linear structure, and the table structure is accurately restored through the target table structure, which improves the efficiency and accuracy of converting table pictures into tables such as Excel; the text content is filled into the target table document according to the position coordinates of the text content, which improves the table picture conversion efficiency in related scenes and reduces processing cost.
Drawings
FIG. 1 is a flow chart of a method for resolving a table picture structure according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a picture to be processed according to a first embodiment of the invention;
FIG. 3 is a schematic diagram of intersecting points of a plurality of grid lines according to a first embodiment of the present invention;
FIG. 4 is a schematic diagram of a plurality of spliced line segments according to a first embodiment of the present invention;
FIG. 5 is a schematic diagram of an initial table in accordance with a first embodiment of the present invention;
FIG. 6 is a schematic diagram showing a sequence of estimating a plurality of cells according to a first embodiment of the present invention;
FIG. 7 is a diagram of a target table structure in accordance with a first embodiment of the present invention;
FIG. 8 is a schematic diagram showing cell merging according to a first embodiment of the present invention;
FIG. 9 is a diagram of a target table document according to a first embodiment of the present invention;
FIG. 10 is a schematic diagram illustrating a program module of a second embodiment of a table-picture structure parsing system according to the present invention;
fig. 11 is a schematic diagram of a hardware structure of a third embodiment of the computer device of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that the descriptions of "first", "second", etc. in this disclosure are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implying the number of technical features indicated. Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with each other, provided that the combination can be realized by those skilled in the art; when the combined technical solutions are contradictory or cannot be realized, the combination should be considered absent and not within the scope of protection claimed in the present invention.
In the following embodiments, an exemplary description will be made with the computer device 2 as an execution subject.
Example 1
Referring to fig. 1, a flowchart illustrating steps of a table picture structure parsing method according to an embodiment of the invention is shown. It will be appreciated that the flow charts in the method embodiments are not intended to limit the order in which the steps are performed. An exemplary description will be made below with the computer device 2 as an execution subject. Specifically, the following is described.
Step S100, obtaining a target picture including a target table.
The target picture may be any picture containing a table.
In an actual business scenario, the form document may be converted from a document format such as Excel into a picture format. For example, the picture containing the table may be a fax, a scan, or a mobile phone photograph of a business-related document.
In an exemplary embodiment, the step S100 may further include steps S200 to S204, wherein:
Step S200, obtaining a to-be-processed picture including a table.
The target picture may also be a picture cut out of the pictures to be processed, which are obtained in advance by the computer device 2.
When the computer device 2 receives the picture to be processed, it may first detect the picture to be processed to determine whether a table exists in it. If a table exists in the picture to be processed, a target picture comprising the table is cropped from the picture to be processed.
The picture to be processed may also be a picture having a plurality of tables, for example. When the computer device 2 detects that the to-be-processed picture has a plurality of tables, the computer device 2 may intercept a corresponding target picture according to each table to obtain a plurality of target pictures.
Step S202, preprocessing the picture to be processed to obtain the picture to be cut.
In an exemplary embodiment, the preprocessing operation includes an angle correction operation, a size scaling operation, and a pixel value normalization operation.
The picture to be processed may be a manually taken photograph; due to factors such as the shooting angle and lighting, the original picture may be skewed and the text content in the cropped target picture may be unclear.
In order to avoid the above problems, the computer device 2 may preprocess the picture to be processed after receiving it, for example by performing angle correction, size scaling, pixel value normalization and similar operations, to obtain the picture to be cut (i.e., the processed picture to be processed).
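For illustration only, a minimal Python/OpenCV sketch of such a preprocessing step is shown below; the skew angle and working width are assumed inputs, not values specified by this embodiment.

```python
import cv2
import numpy as np

def preprocess(image: np.ndarray, skew_angle: float, target_width: int = 1024) -> np.ndarray:
    """Angle correction, size scaling and pixel value normalization (illustrative sketch)."""
    # Angle correction: rotate the picture back by the detected skew angle.
    h, w = image.shape[:2]
    rotation = cv2.getRotationMatrix2D((w / 2, h / 2), skew_angle, 1.0)
    deskewed = cv2.warpAffine(image, rotation, (w, h), borderValue=(255, 255, 255))

    # Size scaling: resize to a fixed working width while keeping the aspect ratio.
    scale = target_width / w
    resized = cv2.resize(deskewed, (target_width, int(h * scale)))

    # Pixel value normalization: map pixel values into [0, 1] for the later detection models.
    return resized.astype(np.float32) / 255.0
```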
Step S204, detecting the table area of the picture to be cut.
The computer device 2 may detect the table area where the table of the picture to be cut is located. This step may be implemented by a table area detection algorithm, for example the YOLO algorithm, the CascadeTabNet algorithm, and the like.
Step S206, clipping the picture to be clipped according to the table area of the picture to be clipped to obtain the target picture.
It can be understood that the tables of some pictures occupy only a small part of the pictures, and in order to improve the accuracy of the later picture detection, the computer device 2 may also crop the picture to be cropped according to the table area, so as to obtain the target picture only including the target table.
If the to-be-cut picture includes a plurality of tables, the computer device 2 may further cut the to-be-cut picture according to the table areas of the tables, so as to obtain a plurality of target pictures.
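Assuming the table area detector returns axis-aligned bounding boxes in (x1, y1, x2, y2) form (an assumption about its output format), the cropping step could look roughly as follows; each detected table yields one target picture.

```python
from typing import List, Tuple
import numpy as np

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2), assumed detector output format

def crop_target_pictures(picture: np.ndarray, table_boxes: List[Box]) -> List[np.ndarray]:
    """Cut one target picture out of the picture to be cut for every detected table area."""
    return [picture[y1:y2, x1:x2].copy() for (x1, y1, x2, y2) in table_boxes]
```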
In this embodiment, the computer device 2 performs the preprocessing operation and the cropping operation on the to-be-processed picture, so that the accuracy of picture detection is improved, and the efficiency of picture detection is further improved.
As shown in fig. 2, for better understanding, a schematic diagram of the picture to be processed is also provided in the present embodiment.
Step S102, detecting a linear structure of a target table in the target picture, and determining the target table structure of the target picture according to the linear structure.
The computer device 2 may input the target picture into a line detection model, and determine the target table structure of the target picture from the linear structure detected by the line detection model for the target picture. The line detection result may include a plurality of cell line segments, each cell line segment including two endpoint coordinates.
In an exemplary embodiment, the linear structure includes a plurality of cell line segments, each cell line segment including two endpoint coordinates. The cell line segments are line segments of table cells of a target table in the target picture.
The step S102 may further include steps S300 to S306, where:
Step S300, detecting the intersection points of the table lines of the target table to obtain a plurality of intersection points of the table lines.
The line detection model may be, for example, an L-CNN deep learning model, in which high-dimensional image features are extracted with HourglassNet (a convolutional neural network structure) as the backbone.
Specifically, the computer device 2 may input the target picture into the line detection model, so as to extract suspected table line intersections in the target picture through an intersection detection link preset by the line detection model, so as to obtain the plurality of table line intersections.
As shown in fig. 3, for better understanding, the present embodiment further provides a schematic diagram of intersection points of a plurality of grid lines of the target picture.
Step S302, extracting a cell line segment according to the intersection point of every two adjacent table lines to obtain the plurality of cell line segments.
The computer device 2 may extract from each two adjacent table line intersections one cell line segment by means of the line detection model to obtain a plurality of cell line segments. After obtaining the plurality of cell line segments, the line detection model may output the plurality of table line intersections and the plurality of cell line segments to the computer device 2.
And step S304, defining coordinates of intersection points of table grid lines at two ends of each cell line segment according to the position of each cell line segment so as to obtain coordinates of two end points of each cell line segment.
After receiving the intersection points of the table grid lines and the plurality of cell line segments output by the line detection model, the computer device 2 may define coordinates of the intersection points of the table grid lines at two ends of each cell line segment according to the position of each cell line segment, so as to obtain coordinates of two end points of each cell line segment.
Step S306, constructing the target table structure according to the plurality of cell line segments and the coordinates of two end points of each cell line segment.
For example, after the computer device 2 obtains the plurality of cell line segments and the two endpoint coordinates of each cell line segment, the target table structure may be constructed according to the plurality of cell line segments and the two endpoint coordinates of each cell line segment. In this implementation, the L-CNN deep learning model is used to perform line detection on the target table of the target picture, which avoids the problems of difficult annotation, complex processing, and weak learning and expression capability on complex cases found in existing table line extraction methods (such as table line extraction methods based on image-segmentation deep learning models).
When line detection is performed on the target table of the target picture through the L-CNN deep learning model, only the line segment endpoints need to be annotated, without pixel-level annotation; the L-CNN deep learning model can also detect the intersection points in the table and judge whether they form a table boundary line, so the learning and expression capability of the model is stronger. In addition, the output of the L-CNN deep learning model (the table line intersection points and the cell line segments) is an accurate representation of the endpoints of the table lines of the target table in the target picture and can be used directly as the input of the next step.
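The data this step works with can be pictured as follows. This is only a sketch of the assumed model output (intersection points plus index pairs naming which two intersections bound each cell line segment); `segment_index_pairs` is a hypothetical name, not part of any real L-CNN API.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]

@dataclass
class CellSegment:
    p1: Point  # table-line intersection point at one end of the cell line segment
    p2: Point  # table-line intersection point at the other end

def to_cell_segments(junctions: List[Point],
                     segment_index_pairs: List[Tuple[int, int]]) -> List[CellSegment]:
    """Define the two endpoint coordinates of each cell line segment from the detected intersections."""
    return [CellSegment(junctions[i], junctions[j]) for i, j in segment_index_pairs]
```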
The step S306 may further include steps S400 to S402, where:
and step S400, splicing the plurality of unit grid line sections into a plurality of spliced line sections according to the coordinate position sequence of the coordinates of each endpoint, wherein the plurality of spliced line sections comprise a plurality of spliced transverse line sections and a plurality of spliced longitudinal line sections.
After the computer device 2 obtains the plurality of cell line segments and the two endpoint coordinates of each cell line segment, it may merge the plurality of cell line segments according to the coordinate order of their endpoint coordinates, so that consecutive cell line segments are merged into one spliced line segment. A spliced line segment is a complete long table line of the target table.
Illustratively, the cell line segments include cell horizontal line segments and cell vertical line segments; several consecutive cell horizontal line segments may be combined into one spliced transverse line segment, and several consecutive cell vertical line segments may be combined into one spliced longitudinal line segment.
For example, the computer device 2 may further determine whether any misjudged line segments exist among the plurality of spliced line segments, and if so, the computer device 2 may eliminate the misjudged line segments.
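A minimal sketch of the horizontal splicing (the vertical case is symmetric), under the assumption that the segments are axis-aligned after preprocessing: cell segments are grouped by their row coordinate, sorted by position, and consecutive segments whose ends meet (within a small tolerance, an assumed value) are merged into one spliced transverse line segment.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]

def splice_horizontal(cell_segments: List[Segment], tol: float = 2.0) -> List[Segment]:
    """Merge consecutive horizontal cell line segments into spliced transverse line segments."""
    rows: Dict[float, List[Segment]] = defaultdict(list)
    for (x1, y1), (x2, y2) in cell_segments:
        if abs(y1 - y2) <= tol:                          # keep only (nearly) horizontal segments
            y = round((y1 + y2) / 2)
            rows[y].append(((min(x1, x2), y), (max(x1, x2), y)))

    spliced: List[Segment] = []
    for y, segs in rows.items():
        segs.sort(key=lambda s: s[0][0])                 # left to right along the row
        current = segs[0]
        for seg in segs[1:]:
            if seg[0][0] - current[1][0] <= tol:         # ends meet: extend the current spliced segment
                current = (current[0], (max(current[1][0], seg[1][0]), y))
            else:                                        # gap: close the current segment, start a new one
                spliced.append(current)
                current = seg
        spliced.append(current)
    return spliced
```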
Step S402, determining the target table structure according to the plurality of spliced horizontal line segments and the plurality of spliced vertical line segments.
For example, after the computer device 2 obtains the plurality of stitching lateral line segments and the plurality of stitching longitudinal line segments, the target table structure may also be determined according to the plurality of stitching lateral line segments and the plurality of stitching longitudinal line segments.
In this embodiment, the computer device 2 may determine the target table structure according to the plurality of spliced line segments by splicing the plurality of unit grid line segments into the plurality of spliced line segments, so as to improve accuracy of constructing the target table structure and efficiency of constructing the target table structure.
As shown in fig. 4, for better understanding, the present solution also provides a schematic diagram of a plurality of spliced line segments of the target picture.
In some embodiments, to ensure the integrity of the target table, the computer device 2 may also lengthen or supplement the four boundary lines of the initial table so that they form a closed line frame. This prevents the outer-frame boundary lines from being lost because of scanning quality or shooting conditions, and improves the completeness of the table analysis.
The step S402 may further include steps S500 to S504, where:
And S500, constructing an initial table according to the plurality of spliced transverse line segments and the plurality of spliced longitudinal line segments, and acquiring a plurality of intersection coordinates in the initial table.
The computer device 2 constructs an initial table according to the plurality of spliced transverse line segments and the plurality of spliced longitudinal line segments, and extracts a plurality of intersection coordinates in the initial table.
The target table may also be an irregular table, for example.
Taking the irregular table as an example, where the line segments are represented by endpoint coordinates:
the plurality of spliced transverse line segments are:
[((10,10),(60,10)), ((20,20),(60,20)), ((10,30),(30,30)), ((50,30),(60,30)), ((10,40),(50,40)), ((10,50),(60,50))];
the plurality of spliced longitudinal line segments are:
[((10,10),(10,50)), ((20,10),(20,30)), ((30,20),(30,40)), ((40,10),(40,20)), ((50,20),(50,50)), ((60,10),(60,50))].
The table structure of the initial table constructed according to the above spliced transverse line segments and spliced longitudinal line segments may be as shown in fig. 5, where the upper-left point of fig. 5 is taken as the origin and the coordinates of the plurality of intersection points are as shown in fig. 5.
And step S502, constructing a target intersection point diagram matrix according to the plurality of intersection point coordinates.
After obtaining the plurality of intersection coordinates, the computer device 2 may construct the target intersection point graph matrix M according to the plurality of intersection coordinates:
M(i, j) = Intersection(li, lj)
In the target intersection point graph matrix M, i and j are the serial numbers of the spliced transverse line segments and the spliced longitudinal line segments respectively, and Intersection(l1, l2) represents the coordinate value of the intersection point of the two straight lines l1 and l2.
Taking the example table as an example, "×" indicates no intersection points, and the intersection point diagram matrix is shown in table 1.
      A        B        C        D        E        F
a   (10,10)  (20,10)    ×      (40,10)    ×      (60,10)
b     ×      (20,20)  (30,20)  (40,20)  (50,20)  (60,20)
c     ×        ×        ×        ×      (50,30)  (60,30)
d   (10,30)  (20,30)  (30,30)    ×        ×        ×
e   (10,40)    ×      (30,40)    ×      (50,40)    ×
f   (10,50)    ×        ×        ×      (50,50)  (60,50)
TABLE 1
The letters A to F denote the serial numbers of the spliced longitudinal line segments, and the letters a to f denote the serial numbers of the spliced transverse line segments. The spliced transverse line segments c and d lie at the same vertical position (they are collinear), and the numbering order of c and d does not affect the final result of the method.
In some embodiments, to ensure the integrity of the target table, the computer device 2 may also lengthen or supplement the four boundary lines of the initial table so that they form a closed line frame. For example, each line segment may be extended slightly so that segments which should meet do intersect, and the corresponding intersection coordinates are obtained; that is, the corresponding "×" positions in Table 1 are filled with intersection coordinates.
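As a sketch under the same axis-aligned assumption as above, the target intersection point graph matrix can be built by testing every spliced transverse segment against every spliced longitudinal segment; None plays the role of the "×" entries in Table 1, and the optional `extend` margin mimics the slight lengthening of segments described above.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]

def intersection(h: Segment, v: Segment, extend: float = 0.0) -> Optional[Point]:
    """Intersection of a spliced transverse segment h and a spliced longitudinal segment v, if any."""
    (hx1, hy), (hx2, _) = h          # horizontal segment: constant y
    (vx, vy1), (_, vy2) = v          # vertical segment: constant x
    if hx1 - extend <= vx <= hx2 + extend and vy1 - extend <= hy <= vy2 + extend:
        return (vx, hy)
    return None

def build_intersection_matrix(h_segs: List[Segment],
                              v_segs: List[Segment]) -> List[List[Optional[Point]]]:
    """M[i][j] = Intersection(h_i, v_j), i.e. the matrix shown in Table 1."""
    return [[intersection(h, v) for v in v_segs] for h in h_segs]
```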
Step S504, determining a plurality of target cells according to the target intersecting point map matrix, and determining the target table structure according to the plurality of target cells.
The computer device 2 may calculate a plurality of target cells based on the target intersection point graph matrix M. The calculation may start from the upper left corner of Table 1 and, following a dynamic programming idea, search for existing intersection points that in turn form the upper-left, upper-right, lower-right and lower-left corners of a target cell. If all four corner points can be found, the rectangle with the smallest area is taken as a valid cell, and the position of the target cell can be represented by its four corner coordinates.
For example, the order of deriving the plurality of target cells may be as shown in fig. 6, where numbers ① to ⑨ in fig. 6 are the target cell detection orders.
Recording the detected target cells with four corner coordinates, the following target cells can be obtained, wherein the target cell set can be expressed as:
boxes=[((10,10),(20,10),(20,30),(10,30)), ((20,10),(40,10),(40,20),(20,20)), ((40,10),(60,10),(60,20),(40,20)), ((20,20),(30,20),(30,30),(20,30)),((30,20),(50,20),(50,40),(30,40)), ((50,20),(60,20),(60,30),(50,30)), ((10,30),(30,30),(30,40),(10,40)), ((50,30),(60,30),(60,50),(50,50)),((10,40),(50,40),(50,50),(10,50))]
the target cell coordinates described above can be used to obtain the target table structure shown in fig. 7.
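A sketch of this cell search: for each existing intersection taken as a candidate upper-left corner, scan rightwards for the nearest column giving an upper-right corner and downwards for the nearest row that closes the rectangle. Applied to the matrix of Table 1, this reproduces the nine target cells listed above; it is an illustrative reading of the search, not the exact patented procedure.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Matrix = List[List[Optional[Point]]]          # output of build_intersection_matrix above
Cell = Tuple[Point, Point, Point, Point]      # upper-left, upper-right, lower-right, lower-left

def find_cells(m: Matrix) -> List[Cell]:
    """Search the intersection point graph matrix for the smallest closed rectangles (target cells)."""
    n_rows, n_cols = len(m), len(m[0])
    cells: List[Cell] = []
    for i in range(n_rows):
        for j in range(n_cols):
            if m[i][j] is None:
                continue                                   # no candidate upper-left corner here
            closed = False
            for jj in range(j + 1, n_cols):                # nearest column with an upper-right corner
                if m[i][jj] is None:
                    continue
                for ii in range(i + 1, n_rows):            # nearest row that closes all four corners
                    if m[ii][jj] is not None and m[ii][j] is not None:
                        cells.append((m[i][j], m[i][jj], m[ii][jj], m[ii][j]))
                        closed = True
                        break
                if closed:
                    break
    return cells
```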
According to this embodiment, the target cells can be searched for through the target intersection point graph matrix, and the restoration of the target table structure can be achieved on the basis of complete line detection; the algorithm is simple and of low complexity, which improves the efficiency of parsing the table structure.
Step S104, determining the position coordinates of the text content of the target picture according to the target table structure, and extracting the text content.
Character recognition is performed on the target picture, and the recognized character information is labelled according to the target table structure so as to determine its actual position. The table picture structure analysis system can locate the coordinates of each text slice in the table picture with a general OCR algorithm and recognize its text content; the position of a text slice can be represented by its center coordinates or corner coordinates.
Step S106, creating a target table document according to the target table structure.
And step S108, filling the text content into the target table document according to the position coordinates of the text content.
The text content comprises a plurality of cell text contents, and each cell text content corresponds to one cell position coordinate;
the step S108 may further include steps S600 to S602, where: step S600, matching corresponding target cells for the text content of each cell according to the position coordinates of each cell; and step S602, filling the text content of each cell into the corresponding target cell.
The computer device 2 may match the text content of each cell with a target cell, and fill the text information into the target table document according to the actual position of the text information.
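A sketch of this matching step, assuming the OCR engine returns each text slice together with its center coordinates: a slice is assigned to the target cell whose rectangle contains that center, and slices falling into the same cell are concatenated.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Cell = Tuple[Point, Point, Point, Point]      # upper-left, upper-right, lower-right, lower-left

def match_text_to_cells(cells: List[Cell],
                        ocr_results: List[Tuple[Point, str]]) -> Dict[int, str]:
    """Map each target cell index to the text whose slice centers fall inside that cell."""
    filled: Dict[int, str] = {}
    for (cx, cy), text in ocr_results:
        for idx, (ul, _ur, lr, _ll) in enumerate(cells):
            if ul[0] <= cx <= lr[0] and ul[1] <= cy <= lr[1]:
                filled[idx] = (filled.get(idx, "") + " " + text).strip()
                break
    return filled
```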
Illustratively, the horizontal and vertical lines are first combined in rows and columns, and the line segments that are nearly collinear within a certain range should be combined into a table line of the same row (column).
In this embodiment, the spliced transverse line segment c and the spliced transverse line segment d are determined to be collinear, and thus both are the lower edge lines of the second row of target cells. Based on the number of merged horizontal lines N and vertical lines M, a table range of (N-1) × (M-1) cells is first determined in Excel. The crossed row and column range is obtained from the peripheral edge lines of each target cell, and the corresponding regions are merged. The target cell in fig. 8 spans spliced transverse line segments 2 to 4 and spliced longitudinal line segments 3 to 5, so the target cell is found to span rows 2 to 3 and columns 3 to 4. The target cell region merging can be completed with the write_merge() function of xlwt. Finally, an Excel file is generated; taking fig. 2 as an example, the result is shown in fig. 9.
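A sketch of generating the Excel file with xlwt, assuming the 0-based inclusive row/column span of each target cell has already been derived from the merged line numbering as described above; write_merge() merges the region and writes the matched cell text into it.

```python
import xlwt

def write_table(cells, output_path="table.xls"):
    """cells: iterable of (row1, row2, col1, col2, text), 0-based inclusive spans (assumed precomputed)."""
    workbook = xlwt.Workbook()
    sheet = workbook.add_sheet("Sheet1")
    for row1, row2, col1, col2, text in cells:
        if row1 == row2 and col1 == col2:
            sheet.write(row1, col1, text)                   # ordinary single cell
        else:
            sheet.write_merge(row1, row2, col1, col2, text) # merged target cell region
    workbook.save(output_path)

# Example: one ordinary cell and one cell spanning two rows and two columns.
write_table([(0, 0, 0, 0, "header"), (1, 2, 2, 3, "merged cell")])
```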
According to the line-based structure reasoning method provided by the embodiment of the invention, the search for cells is realized by establishing an intersection point graph matrix of the transverse and longitudinal lines, and the restoration of an arbitrary table structure can be achieved on the basis of complete line detection with a simple, low-complexity algorithm. Combined with a general character OCR technique, the method can match the characters in the table picture with the cells, and the finally generated electronic table document accurately restores both the table content and the table structure. This improves the efficiency of digitizing table pictures in related scenes, increases the degree of business automation of enterprises, and reduces manual processing costs.
The table picture structure parsing method further includes: uploading the target table structure and the target table document into a blockchain.
Illustratively, uploading the target table structure and the target table document to a blockchain may ensure its security and fair transparency. The blockchain referred to in this example is a novel mode of application for computer technology such as distributed data storage, point-to-point transmission, consensus mechanisms, encryption algorithms, and the like. The blockchain (Blockchain), essentially a de-centralized database, is a string of data blocks that are generated in association using cryptographic methods, each of which contains information from a batch of network transactions for verifying the validity (anti-counterfeit) of its information and generating the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
Example two
Fig. 10 is a schematic diagram of a program module of a table-picture structure parsing system according to a second embodiment of the invention. The form picture structure parsing system 20 may include or be partitioned into one or more program modules stored in a storage medium and executed by one or more processors to accomplish the present invention and to implement the form picture structure parsing method described above. Program modules in accordance with the embodiments of the present invention are directed to a series of computer program instruction segments capable of performing particular functions, and are more suitable than programs themselves for describing the execution of the form picture structure parsing system 20 in a storage medium. The following description will specifically describe functions of each program module of the present embodiment:
the obtaining module 200 is configured to obtain a to-be-processed picture including a table.
Illustratively, the acquiring module 200 is further configured to: acquiring a picture to be processed comprising a table; preprocessing the picture to be processed to obtain a picture to be cut; detecting a form area of the picture to be cut; and cutting the picture to be cut according to the table area of the picture to be cut to obtain the target picture.
Illustratively, the preprocessing operation includes an angle correction operation, a size scaling operation, and a pixel value normalization operation.
The detection module 202 is configured to detect a linear structure of a target table in the target picture, and determine a target table structure of the target picture according to the linear structure.
Illustratively, the linear structure includes a plurality of cell line segments, each cell line segment including two endpoint coordinates; the detection module 202 is further configured to: detecting the intersection points of the table lines of the target table to obtain a plurality of intersection points of the table lines; extracting a cell line segment according to the intersection point of every two adjacent table lines to obtain a plurality of cell line segments; defining coordinates of grid line intersection points at two ends of each cell line segment according to the position of each cell line segment to obtain two endpoint coordinates of each cell line segment; and constructing the target table structure according to the plurality of cell line segments and the coordinates of two endpoints of each cell line segment.
Illustratively, the detection module 202 is further configured to: splicing the plurality of unit grid line sections into a plurality of spliced line sections according to the coordinate position sequence of each endpoint coordinate, wherein the plurality of spliced line sections comprise a plurality of spliced transverse line sections and a plurality of spliced longitudinal line sections; and determining the target table structure according to the plurality of spliced horizontal line segments and the plurality of spliced vertical line segments.
Illustratively, the detection module 202 is further configured to: Constructing an initial table according to the plurality of spliced transverse line segments and the plurality of spliced longitudinal line segments, and acquiring a plurality of intersection coordinates in the initial table; constructing a target intersection point graph matrix according to the intersection point coordinates; and determining a plurality of target cells according to the target cross point diagram matrix, and determining the target table structure according to the plurality of target cells.
The text content comprises a plurality of cell text contents, and one cell position coordinate corresponding to each cell text content; the detection module 202 is further configured to: matching corresponding target cells for the text content of each cell according to the position coordinates of each cell; and filling the text content of each cell into the corresponding target cell.
The determining module 204 is configured to determine a position coordinate of the text content of the target picture according to the target table structure, and extract the text content.
A creating module 206, configured to create a target table document according to the target table structure.
And a filling module 208, configured to fill the text content into the target table document according to the position coordinates of the text content.
Illustratively, the table picture structure parsing system 20 further includes an uploading module, configured to upload the target table structure and the target table document into a blockchain.
Example III
Referring to fig. 11, a hardware architecture diagram of a computer device according to a third embodiment of the invention is shown. In this embodiment, the computer device 2 is a device capable of automatically performing numerical calculation and/or information processing according to a preset or stored instruction. The computer device 2 may be a rack server, a blade server, a tower server, or a cabinet server (including a stand-alone server, or a server cluster made up of multiple servers), or the like. As shown, the computer device 2 includes, but is not limited to, a memory 21, a processor 22, a network interface 23, and a table picture structure parsing system 20 communicatively coupled to each other via a system bus.
In this embodiment, the memory 21 includes at least one type of computer-readable storage medium including flash memory, a hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 21 may be an internal storage unit of the computer device 2, such as a hard disk or a memory of the computer device 2. In other embodiments, the memory 21 may also be an external storage device of the computer device 2, such as a plug-in hard disk provided on the computer device 2, a smart memory card (SMART MEDIA CARD, SMC), a Secure Digital (SD) card, a flash memory card (FLASH CARD), or the like. Of course, the memory 21 may also include both internal storage units of the computer device 2 and external storage devices. In this embodiment, the memory 21 is generally used for storing an operating system and various application software installed on the computer device 2, for example, program codes of the table picture structure analysis system 20 of the second embodiment. Further, the memory 21 may be used to temporarily store various types of data that have been output or are to be output.
Processor 22 may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor 22 is typically used to control the overall operation of the computer device 2. In this embodiment, the processor 22 is configured to execute the program code or process data stored in the memory 21, for example, execute the table picture structure parsing system 20, so as to implement the table picture structure parsing method of the first embodiment.
The network interface 23 may comprise a wireless network interface or a wired network interface, and the network interface 23 is typically used for establishing a communication connection between the computer device 2 and other electronic devices. For example, the network interface 23 is used to connect the computer device 2 to an external terminal through a network, and to establish a data transmission channel and a communication connection between the computer device 2 and the external terminal. The network may be an Intranet, the Internet, the Global System for Mobile communication (GSM), Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth, Wi-Fi, or another wireless or wired network.
It is noted that fig. 11 shows only the computer device 2 with components 20-23, but it is understood that not all of the illustrated components are required to be implemented, and that more or fewer components may be implemented instead.
In the present embodiment, the table picture structure parsing system 20 stored in the memory 21 may be further divided into one or more program modules, which are stored in the memory 21 and executed by one or more processors (the processor 22 in the present embodiment) to complete the present invention.
For example, fig. 10 shows a schematic diagram of the program modules implementing the form picture structure parsing system 20 according to the second embodiment of the present invention, where the form picture structure parsing system 20 may be divided into an obtaining module 200, a detecting module 202, a determining module 204, a creating module 206 and a filling module 208. A program module in the present invention is understood to be a series of computer program instruction segments capable of performing a specific function, and is more suitable than a program itself for describing the execution of the table picture structure parsing system 20 in the computer device 2. The specific functions of the program modules 200-208 are described in detail in the second embodiment and are not repeated here.
Example IV
The present embodiment also provides a computer-readable storage medium such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, an App application store, etc., on which a computer program is stored, which when executed by a processor, performs the corresponding functions. The computer readable storage medium of the present embodiment is used in the table picture structure parsing system 20, and when executed by a processor, implements the table picture structure parsing method of the first embodiment.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment.
The foregoing description is only of the preferred embodiments of the present invention, and is not intended to limit the scope of the invention, but rather is intended to cover any equivalents of the structures or equivalent processes disclosed herein or in the alternative, which may be employed directly or indirectly in other related arts.

Claims (7)

1. A method for parsing a table picture structure, the method comprising:
Acquiring a target picture comprising a target table;
Detecting a linear structure of a target table in the target picture, and determining the target table structure of the target picture according to the linear structure;
Determining the position coordinates of the text content of the target picture according to the target table structure, and extracting the text content;
creating a target table document according to the target table structure; and
Filling the text content into the target table document according to the position coordinates of the text content;
The linear structure comprises a plurality of unit grid line segments, wherein each unit grid line segment comprises two endpoint coordinates; the step of detecting the linear structure of the target table in the target picture and determining the target table structure of the target picture according to the linear structure comprises the following steps:
detecting the intersection points of the table lines of the target table to obtain a plurality of intersection points of the table lines;
Extracting a cell line segment according to the intersection point of every two adjacent table lines to obtain a plurality of cell line segments;
Defining coordinates of grid line intersection points at two ends of each cell line segment according to the position of each cell line segment to obtain two endpoint coordinates of each cell line segment; and
Constructing the target table structure according to the plurality of cell line segments and the coordinates of two endpoints of each cell line segment;
wherein the constructing the target table structure according to the plurality of cell line segments and coordinates of two end points of each cell line segment includes:
splicing the plurality of unit grid line sections into a plurality of spliced line sections according to the coordinate position sequence of each endpoint coordinate, wherein the plurality of spliced line sections comprise a plurality of spliced transverse line sections and a plurality of spliced longitudinal line sections;
Determining the target table structure according to the plurality of spliced transverse line segments and the plurality of spliced longitudinal line segments;
Wherein the determining the target table structure according to the plurality of spliced horizontal line segments and the plurality of spliced vertical line segments includes:
Constructing an initial table according to the plurality of spliced transverse line segments and the plurality of spliced longitudinal line segments, and acquiring a plurality of intersection coordinates in the initial table; extending or supplementing four boundary lines of the initial form to form a closed line frame;
constructing a target intersection point graph matrix according to the intersection point coordinates; and
determining a plurality of target cells according to the target cross point diagram matrix, and determining the target table structure according to the target cells.
2. The method of claim 1, wherein the step of obtaining a target picture including a target table comprises:
Acquiring a picture to be processed comprising a table;
Preprocessing the picture to be processed to obtain a picture to be cut;
Detecting a form area of the picture to be cut; and
cutting the picture to be cut according to the table area of the picture to be cut to obtain the target picture.
3. The tabular picture structure parsing method of claim 2, wherein the preprocessing operation includes an angle correction operation, a size scaling operation, and a pixel value normalization operation.
4. The method for parsing a table picture structure according to claim 1, wherein the text contents include a plurality of cell text contents, and each cell text content corresponds to a cell position coordinate;
The step of filling the text content into the target table document according to the position coordinates of the text content comprises the following steps:
Matching corresponding target cells for the text content of each cell according to the position coordinates of each cell; and
filling the text content of each cell into the corresponding target cell.
5. A form picture structure parsing system, comprising:
the acquisition module is used for acquiring the picture to be processed comprising the form;
The detection module is used for detecting the linear structure of the target table in the target picture and determining the target table structure of the target picture according to the linear structure;
the determining module is used for determining the position coordinates of the text content of the target picture according to the target table structure and extracting the text content;
The creation module is used for creating a target table document according to the target table structure; and
The filling module is used for filling the text content into the target form document according to the position coordinates of the text content;
The linear structure comprises a plurality of unit grid line segments, wherein each unit grid line segment comprises two endpoint coordinates; the detection module is used for:
detecting the intersection points of the table lines of the target table to obtain a plurality of intersection points of the table lines;
Extracting a cell line segment according to the intersection point of every two adjacent table lines to obtain a plurality of cell line segments;
defining coordinates of grid line intersection points at two ends of each cell line segment according to the position of each cell line segment to obtain two endpoint coordinates of each cell line segment;
splicing the plurality of unit grid line sections into a plurality of spliced line sections according to the coordinate position sequence of each endpoint coordinate, wherein the plurality of spliced line sections comprise a plurality of spliced transverse line sections and a plurality of spliced longitudinal line sections;
Constructing an initial table according to the plurality of spliced transverse line segments and the plurality of spliced longitudinal line segments, and acquiring a plurality of intersection coordinates in the initial table; extending or supplementing four boundary lines of the initial form to form a closed line frame;
constructing a target intersection point graph matrix according to the intersection point coordinates; and
determining a plurality of target cells according to the target cross point diagram matrix, and determining the target table structure according to the target cells.
6. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the computer program when executed by the processor implements the steps of the table picture structure parsing method according to any one of claims 1 to 4.
7. A computer-readable storage medium, in which a computer program is stored, the computer program being executable by at least one processor to cause the at least one processor to perform the steps of the table picture structure parsing method according to any one of claims 1 to 4.
CN202011589490.5A 2020-12-29 2020-12-29 Method, system, device and readable storage medium for parsing table picture structure Active CN112712014B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011589490.5A CN112712014B (en) 2020-12-29 2020-12-29 Method, system, device and readable storage medium for parsing table picture structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011589490.5A CN112712014B (en) 2020-12-29 2020-12-29 Method, system, device and readable storage medium for parsing table picture structure

Publications (2)

Publication Number Publication Date
CN112712014A CN112712014A (en) 2021-04-27
CN112712014B true CN112712014B (en) 2024-04-30

Family

ID=75546243

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011589490.5A Active CN112712014B (en) 2020-12-29 2020-12-29 Method, system, device and readable storage medium for parsing table picture structure

Country Status (1)

Country Link
CN (1) CN112712014B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113343866A (en) * 2021-06-15 2021-09-03 杭州数梦工场科技有限公司 Identification method and device of form information and electronic equipment
CN113536951B (en) * 2021-06-22 2023-11-24 科大讯飞股份有限公司 Form identification method, related device, electronic equipment and storage medium
CN113591746A (en) * 2021-08-05 2021-11-02 上海金仕达软件科技有限公司 Document table structure detection method and device
CN116612487B (en) * 2023-07-21 2023-10-13 亚信科技(南京)有限公司 Table identification method and device, electronic equipment and storage medium

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07182459A (en) * 1993-12-22 1995-07-21 Matsushita Electric Ind Co Ltd Table structure extracting device
JPH1153466A (en) * 1997-07-31 1999-02-26 Hitachi Ltd Table recognizing method
KR101811581B1 (en) * 2016-11-15 2017-12-26 주식회사 셀바스에이아이 Aparatus and method for cell decomposition for a table recognition in document image
CN108132916A (en) * 2017-11-30 2018-06-08 厦门市美亚柏科信息股份有限公司 Parse method, the storage medium of PDF list datas
CN109726643A (en) * 2018-12-13 2019-05-07 北京金山数字娱乐科技有限公司 The recognition methods of form data, device, electronic equipment and storage medium in image
CN110008809A (en) * 2019-01-04 2019-07-12 阿里巴巴集团控股有限公司 Acquisition methods, device and the server of list data
CN110059687A (en) * 2019-03-19 2019-07-26 平安科技(深圳)有限公司 Pictorial information recognition methods, device, computer equipment and storage medium
CN110110313A (en) * 2019-05-09 2019-08-09 上海微投股权投资基金管理有限公司 Document form analysis method and device
CN110472208A (en) * 2019-06-26 2019-11-19 上海恒生聚源数据服务有限公司 The method, system of form analysis, storage medium and electronic equipment in PDF document
WO2020164281A1 (en) * 2019-02-13 2020-08-20 平安科技(深圳)有限公司 Form parsing method based on character location and recognition, and medium and computer device
WO2020232872A1 (en) * 2019-05-22 2020-11-26 平安科技(深圳)有限公司 Table recognition method and apparatus, computer device, and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4235286B2 (en) * 1998-09-11 2009-03-11 キヤノン株式会社 Table recognition method and apparatus
JP6690089B2 (en) * 2016-04-25 2020-04-28 富士通株式会社 Form recognition method, form recognition device, and form recognition program
US11650970B2 (en) * 2018-03-09 2023-05-16 International Business Machines Corporation Extracting structure and semantics from tabular data
CN108446264B (en) * 2018-03-26 2022-02-15 阿博茨德(北京)科技有限公司 Method and device for analyzing table vector in PDF document

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07182459A (en) * 1993-12-22 1995-07-21 Matsushita Electric Ind Co Ltd Table structure extracting device
JPH1153466A (en) * 1997-07-31 1999-02-26 Hitachi Ltd Table recognizing method
KR101811581B1 (en) * 2016-11-15 2017-12-26 주식회사 셀바스에이아이 Aparatus and method for cell decomposition for a table recognition in document image
CN108132916A (en) * 2017-11-30 2018-06-08 厦门市美亚柏科信息股份有限公司 Parse method, the storage medium of PDF list datas
CN109726643A (en) * 2018-12-13 2019-05-07 北京金山数字娱乐科技有限公司 The recognition methods of form data, device, electronic equipment and storage medium in image
WO2020140698A1 (en) * 2019-01-04 2020-07-09 阿里巴巴集团控股有限公司 Table data acquisition method and apparatus, and server
CN110008809A (en) * 2019-01-04 2019-07-12 阿里巴巴集团控股有限公司 Acquisition methods, device and the server of list data
WO2020164281A1 (en) * 2019-02-13 2020-08-20 平安科技(深圳)有限公司 Form parsing method based on character location and recognition, and medium and computer device
CN110059687A (en) * 2019-03-19 2019-07-26 平安科技(深圳)有限公司 Pictorial information recognition methods, device, computer equipment and storage medium
WO2020186779A1 (en) * 2019-03-19 2020-09-24 平安科技(深圳)有限公司 Image information identification method and apparatus, and computer device and storage medium
CN110110313A (en) * 2019-05-09 2019-08-09 上海微投股权投资基金管理有限公司 Document form analysis method and device
WO2020232872A1 (en) * 2019-05-22 2020-11-26 平安科技(深圳)有限公司 Table recognition method and apparatus, computer device, and storage medium
CN110472208A (en) * 2019-06-26 2019-11-19 上海恒生聚源数据服务有限公司 The method, system of form analysis, storage medium and electronic equipment in PDF document

Also Published As

Publication number Publication date
CN112712014A (en) 2021-04-27

Similar Documents

Publication Publication Date Title
CN112712014B (en) Method, system, device and readable storage medium for parsing table picture structure
WO2020192391A1 (en) Ocr-based image conversion method and apparatus, device and readable storage medium
CN111931784B (en) Bill recognition method, system, computer device and computer-readable storage medium
CN111595850B (en) Slice defect detection method, electronic device and readable storage medium
CN107688789B (en) Document chart extraction method, electronic device and computer readable storage medium
CN110569341B (en) Method and device for configuring chat robot, computer equipment and storage medium
CN113343740B (en) Table detection method, device, equipment and storage medium
CN110728687B (en) File image segmentation method and device, computer equipment and storage medium
CN111858977B (en) Bill information acquisition method, device, computer equipment and storage medium
CN112613506A (en) Method and device for recognizing text in image, computer equipment and storage medium
CN111639648A (en) Certificate identification method and device, computing equipment and storage medium
CN113837151A (en) Table image processing method and device, computer equipment and readable storage medium
CN114241499A (en) Table picture identification method, device and equipment and readable storage medium
CN111310710A (en) Face detection method and system
CN113592720A (en) Image scaling processing method, device, equipment, storage medium and program product
CN113537184A (en) OCR (optical character recognition) model training method and device, computer equipment and storage medium
CN117115823A (en) Tamper identification method and device, computer equipment and storage medium
CN111651971A (en) Form information transcription method, system, electronic equipment and storage medium
CN112418206A (en) Picture classification method based on position detection model and related equipment thereof
CN113936130A (en) Document information intelligent acquisition and error correction method, system and equipment based on OCR technology
CN113901768A (en) Standard file generation method, device, equipment and storage medium
CN112257718A (en) Text recognition method and device for radiology department films
CN111222517A (en) Test sample generation method, system, computer device and storage medium
CN115797942B (en) Propaganda information interaction method and system
CN113177532B (en) Method, device, equipment and medium for identifying paragraph boundary of characters in image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant