CN113836878A - Table generation method and device combining RPA and AI, electronic equipment and storage medium - Google Patents


Publication number
CN113836878A
Authority
CN
China
Prior art keywords
target
cell
rpa system
cells
rpa
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111026974.3A
Other languages
Chinese (zh)
Inventor
黄安
汪冠春
胡一川
褚瑞
李玮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Laiye Network Technology Co Ltd
Laiye Technology Beijing Co Ltd
Original Assignee
Beijing Laiye Network Technology Co Ltd
Laiye Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Laiye Network Technology Co Ltd, Laiye Technology Beijing Co Ltd filed Critical Beijing Laiye Network Technology Co Ltd
Priority to CN202111026974.3A priority Critical patent/CN113836878A/en
Publication of CN113836878A publication Critical patent/CN113836878A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/174 Form filling; Merging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/177 Editing, e.g. inserting or deleting of tables; using ruled lines
    • G06F40/18 Editing, e.g. inserting or deleting of tables; using ruled lines of spreadsheets

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)

Abstract

The disclosure provides a table generation method and device combining RPA and AI, electronic equipment and a storage medium, and relates to the field of artificial intelligence. The scheme is as follows: an RPA system extracts horizontal lines and vertical lines of a first table from an image based on artificial intelligence (AI); the RPA system acquires an intersection point set of the horizontal lines and the vertical lines, wherein the intersection point set comprises first type intersection points formed where a horizontal line and a vertical line intersect and second type intersection points formed where an extension line of a horizontal line and/or an extension line of a vertical line intersects; the RPA system generates a blank second table consistent with the first table according to the intersection point set; and the RPA system fills the text entries recognized from the image based on OCR into the blank second table to obtain a target table. The method uses RPA technology to recognize the table in a picture and restore it as a table document with the same table structure, so that offline data are automatically converted into online data, a tedious manual processing flow is replaced, and table generation efficiency is improved.

Description

Table generation method and device combining RPA and AI, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of artificial intelligence, and in particular, to a table generating method and apparatus, an electronic device, and a storage medium in combination with an RPA and an AI.
Background
Robotic Process Automation (RPA) uses specific "robot software" to simulate human operations on a computer and automatically execute process tasks according to rules.
Artificial Intelligence (AI) is a technical science that studies and develops theories, methods, techniques and application systems for simulating, extending and expanding human intelligence.
In the related art, a table in a picture is converted into an online table document by workers who type, copy and paste its contents. This work is mechanical and repetitive, wastes labor, and is inefficient. How to improve the efficiency of table generation and free up manpower is therefore an urgent problem.
Disclosure of Invention
The disclosure provides a table generation method and device combining RPA and AI, an electronic device and a storage medium.
According to an aspect of the present disclosure, there is provided a table generating method combining an RPA and an AI, including:
the RPA system extracts the horizontal line and the vertical line of the first table from the image based on artificial intelligence AI;
the RPA system acquires an intersection point set of a horizontal line and a vertical line, wherein the intersection point set comprises a first type intersection point formed by intersecting the horizontal line and the vertical line and a second type intersection point formed by intersecting an extension line of the horizontal line and/or an extension line of the vertical line;
the RPA system generates a blank second table consistent with the first table according to the intersection point set;
and the RPA system fills the text entries recognized from the image based on the OCR into the blank second table to obtain a target table.
The embodiment of the disclosure uses RPA technology to recognize the table in the picture and restore it as a table document with the same table structure, automatically converting offline data into online data, thereby replacing the tedious manual processing flow and improving the efficiency of table generation.
According to another aspect of the present disclosure, there is provided a table generating apparatus combining an RPA and an AI, including:
the extraction module is used for extracting the horizontal line and the vertical line of the first table from the image based on artificial intelligence AI;
the intersection point acquisition module is used for acquiring an intersection point set of the horizontal lines and the vertical lines, wherein the intersection point set comprises a first type of intersection point formed by intersecting the horizontal lines and the vertical lines and a second type of intersection point formed by intersecting extension lines of the horizontal lines and/or extension lines of the vertical lines;
the generating module is used for generating a blank second table consistent with the first table according to the intersection point set;
and the filling module is used for filling the text entries recognized from the image based on the OCR into the blank second table to obtain the target table.
According to another aspect of the present disclosure, there is provided an electronic device comprising a memory, a processor;
the processor reads the executable program code stored in the memory to run a program corresponding to the executable program code, so as to implement the table generating method combining the RPA and the AI according to the embodiment of the first aspect of the present disclosure.
According to another aspect of the present disclosure, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements a table generation method combining RPA and AI of an embodiment of the first aspect of the present disclosure.
According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the table generation method in combination with RPA and AI of the embodiment of the first aspect of the present disclosure.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
FIG. 1 is a flow diagram of a table generation method incorporating RPA and AI according to one embodiment of the present disclosure;
FIG. 2 is a schematic illustration of a table to be identified and detected horizontal and vertical lines;
FIG. 3 is a schematic view of a set of intersections of detected horizontal and vertical lines;
FIG. 4 is a flow diagram of a table generation method incorporating RPA and AI according to one embodiment of the present disclosure;
FIG. 5 is a schematic diagram of enumerating candidate cells;
FIG. 6 is a flow diagram of a table generation method incorporating RPA and AI according to one embodiment of the present disclosure;
FIG. 7 is a flow diagram of a table generation method incorporating RPA and AI according to one embodiment of the present disclosure;
FIG. 8 is a flow diagram of a table generation method incorporating RPA and AI according to one embodiment of the present disclosure;
FIG. 9 is a schematic diagram of a pair of intersections formed by intersections;
FIG. 10 is a schematic diagram of a base cell composed of pairs of intersections;
FIG. 11 is a flow diagram of a table generation method incorporating RPA and AI according to one embodiment of the present disclosure;
FIG. 12 is a schematic diagram of updating matrix element values after a target cell is determined;
FIG. 13 is a schematic diagram of a target row being detected and split into two sub-target Boolean matrices;
FIG. 14 is a flow diagram of a table generation method incorporating RPA and AI according to one embodiment of the present disclosure;
FIG. 15 is a flow diagram of a table generation method incorporating RPA and AI according to one embodiment of the present disclosure;
FIG. 16 is a flow diagram of a table generation method incorporating RPA and AI according to one embodiment of the present disclosure;
FIG. 17 is a block diagram of a table creation device incorporating RPA and AI according to one embodiment of the present disclosure;
FIG. 18 is a block diagram of an electronic device for implementing a table generation method incorporating RPA and AI according to an embodiment of the present disclosure.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals refer throughout to the same or similar elements or to elements having the same or similar function. The embodiments described below with reference to the drawings are exemplary, are intended to explain the invention, and are not to be construed as limiting the invention.
A table generating method, apparatus, electronic device, and storage medium combining RPA and AI of the present disclosure are described below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a table generation method combining RPA and AI according to an embodiment of the present disclosure, as shown in fig. 1, the method including the steps of:
S101, the RPA system extracts horizontal lines and vertical lines of the first table from the image based on artificial intelligence (AI).
RPA is a technology that simulates human operation on a PC and is beginning to be used in enterprise production and office work. At its core, RPA uses automation to take over fixed-flow operations that are repetitive, low in value, and require no manual decision, which effectively improves working efficiency and reduces errors.
The RPA system acquires an image containing a table; the first table is the table to be recognized in the image. The horizontal and vertical table lines in the image are detected separately using a table model, and the table lines may be inclined or bent to a certain degree. The horizontal and vertical lines of the first table are extracted based on AI; alternatively, they can be extracted using a conventional Computer Vision (CV) algorithm. Line segments whose end points are close to each other are then connected to prevent recognition errors, yielding the final horizontal and vertical lines. Taking fig. 2 as an example, the original image, the detected horizontal lines, and the detected vertical lines are shown from left to right.
S102, the RPA system acquires an intersection point set of the horizontal line and the vertical line, wherein the intersection point set comprises a first type intersection point formed by intersecting the horizontal line and the vertical line and a second type intersection point formed by intersecting an extension line of the horizontal line and/or an extension line of the vertical line.
The RPA system obtains the set of intersection points of the horizontal lines and the vertical lines. Assuming that x horizontal lines and y vertical lines are obtained in the above step, the RPA system calculates the coordinates of the intersection point of each pair of one horizontal line and one vertical line, and obtains x × y points regardless of whether the line segments themselves actually intersect, as shown in fig. 3.
The intersection point set comprises first type intersection points formed where a horizontal line and a vertical line intersect; these points are passed through by both a horizontal line and a vertical line.
The intersection point set also comprises second type intersection points formed where an extension line of a horizontal line and/or an extension line of a vertical line intersects; these points are passed through by only a horizontal line or only a vertical line.
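The computation of the x × y intersection points can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: it assumes each detected line is approximated by a single straight segment given by its two end points, and the names line_intersection, on_segment and intersection_set are hypothetical. A point is labeled kind 1 when it lies on both segments and kind 2 when at least one extension line is needed to reach it.

```python
def line_intersection(h, v):
    """Intersect the infinite lines through segments h and v (assumed not parallel)."""
    (x1, y1), (x2, y2) = h
    (x3, y3), (x4, y4) = v
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if denom == 0:
        raise ValueError("lines are parallel")
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    px = (a * (x3 - x4) - (x1 - x2) * b) / denom
    py = (a * (y3 - y4) - (y1 - y2) * b) / denom
    return (px, py)


def on_segment(p, s, tol=1e-6):
    """True when p, already known to lie on the line through s, falls within s."""
    (x1, y1), (x2, y2) = s
    return (min(x1, x2) - tol <= p[0] <= max(x1, x2) + tol
            and min(y1, y2) - tol <= p[1] <= max(y1, y2) + tol)


def intersection_set(horizontals, verticals):
    """Return grid[row][col] = (point, kind) for every horizontal/vertical pair."""
    grid = []
    for h in horizontals:
        row = []
        for v in verticals:
            p = line_intersection(h, v)
            # kind 1: both segments pass through p; kind 2: an extension line is needed
            kind = 1 if on_segment(p, h) and on_segment(p, v) else 2
            row.append((p, kind))
        grid.append(row)
    return grid
```

With two horizontal and two vertical segments this yields a 2 × 2 grid of points, even when a short horizontal segment never reaches one of the vertical segments.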
S103, the RPA system generates a blank second table consistent with the first table according to the intersection point set.
Referring to fig. 3, the intersection points can form a plurality of possible tables. The RPA system identifies the table structure of the first table according to the set of first type and second type intersection points, and performs operations such as filtering and deduplication to obtain the second table, which is a blank table whose structure is consistent with that of the first table.
S104, the RPA system fills the text entries recognized from the image based on OCR into the blank second table to obtain a target table.
Optical Character Recognition (OCR) is a computer input technology that converts the characters of bills, newspapers, books, manuscripts, and other printed matter into image information by an optical input method such as scanning, and then converts the image information into usable computer information by using character recognition technology.
The RPA system recognizes the text entries of the first table in the picture based on OCR technology and fills each text entry into the corresponding position of the second table to obtain the target table. The table format and table content of the target table are the same as those of the first table; the first table is paper table data, and the target table is electronic table data.
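One possible way to place recognized text at "the corresponding position" is sketched below. The text does not specify the matching rule, so this sketch assumes that each OCR entry comes with the center coordinates of its bounding box and that each cell is an axis-aligned box; the function name fill_table and both data layouts are hypothetical.

```python
def fill_table(cells, entries):
    """Assign each OCR entry to the cell whose box contains the entry's center.

    cells:   list of (x0, y0, x1, y1) boxes, one per target cell (assumed layout)
    entries: list of (text, cx, cy) tuples from an OCR step (assumed layout)
    Returns one string per cell, in the same order as cells.
    """
    contents = [[] for _ in cells]
    for text, cx, cy in entries:
        for i, (x0, y0, x1, y1) in enumerate(cells):
            if x0 <= cx < x1 and y0 <= cy < y1:
                contents[i].append(text)
                break  # an entry belongs to at most one cell
    return [" ".join(parts) for parts in contents]
```

Entries whose center falls outside every cell are simply dropped in this sketch; a real system would need a policy for such cases.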
In the embodiment of the disclosure, an RPA system extracts the horizontal lines and vertical lines of a first table from an image based on artificial intelligence (AI); the RPA system acquires an intersection point set of the horizontal lines and the vertical lines, wherein the intersection point set comprises first type intersection points formed where a horizontal line and a vertical line intersect and second type intersection points formed where an extension line of a horizontal line and/or an extension line of a vertical line intersects; the RPA system generates a blank second table consistent with the first table according to the intersection point set; and the RPA system fills the text entries recognized from the image based on OCR into the blank second table to obtain a target table. RPA technology is thus used to recognize the table in the picture and restore it as a table document with the same table structure, automatically converting offline data into online data, replacing a tedious manual processing flow, and improving table generation efficiency.
Fig. 4 is a flowchart of a table generation method combining RPA and AI according to an embodiment of the present disclosure. On the basis of the above embodiment, and with reference to fig. 4, the process by which the RPA system generates a blank second table consistent with the first table according to the intersection point set is explained. It includes the following steps:
S401, the RPA system enumerates cells according to the intersection points in the intersection point set, and obtains candidate cells and attribute information of the candidate cells.
Assuming that x horizontal lines and y vertical lines are obtained in the above steps, x × y intersection points are obtained. All possible cells formed from these x × y intersection points are enumerated as candidate cells, where each cell is described by a start row, an end row, a start column, an end column, and the coordinates of four intersection points.
The attribute information of a candidate cell comprises the area of the cell, the coordinates of its four corners, and the start and end points of the horizontal lines and of the vertical lines corresponding to the cell.
The process of cell enumeration is explained by taking the three intersection point pairs in fig. 5 as an example; for convenience of description, the three pairs are labeled a, b, and c. As shown in fig. 5, a and b may form a candidate cell, b and c may form a candidate cell, and a and c may also form a candidate cell.
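The enumeration can be sketched as choosing every pair of rows and every pair of columns from the intersection grid. The following is an illustrative sketch only: the function name enumerate_candidates and the dictionary layout are hypothetical, the grid is assumed to be indexed as points[row][col], and the area is computed with an axis-aligned approximation.

```python
from itertools import combinations


def enumerate_candidates(points):
    """points[r][c] is the (x, y) intersection of horizontal line r and vertical line c."""
    n_rows, n_cols = len(points), len(points[0])
    cells = []
    for r0, r1 in combinations(range(n_rows), 2):      # start row, end row
        for c0, c1 in combinations(range(n_cols), 2):  # start column, end column
            corners = (points[r0][c0], points[r0][c1],
                       points[r1][c0], points[r1][c1])
            (x0, y0), (x1, y1) = corners[0], corners[3]
            cells.append({
                "rows": (r0, r1),
                "cols": (c0, c1),
                "corners": corners,
                # assumption: lines are close to axis-aligned, so width * height is adequate
                "area": abs(x1 - x0) * abs(y1 - y0),
            })
    return cells
```

For x horizontal and y vertical lines this produces C(x, 2) × C(y, 2) candidates, which is why later steps prune candidates as early as possible.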
S402, the RPA system identifies a target cell of the second table for generating blanks from the candidate cells according to the attribute information of the candidate cells.
The RPA system traverses all the candidate cells according to their attribute information and judges, one cell at a time, whether the four edges of each candidate cell exist; when all four edges exist, the candidate cell is taken as a target cell.
S403, the RPA system arranges the target cells by position to generate the blank second table.
The RPA system acquires all the target cells and their attribute information, and arranges the cells according to the four-corner coordinates of the target cells to generate the blank second table.
In the embodiment of the disclosure, the RPA system enumerates cells according to the intersection points in the intersection point set to obtain the candidate cells and their attribute information, identifies from the candidate cells the target cells used for generating the blank second table according to that attribute information, and arranges the target cells by position to generate the blank second table. A blank second table with the same structure as the first table is thus generated from the intersection point set. Detecting and generating the table structure is the most important step in table generation, and it lays a foundation for the subsequent filling of text entries.
Fig. 6 is a flowchart of a table generation method combining RPA and AI according to an embodiment of the present disclosure. On the basis of the above embodiment, and with reference to fig. 6, the process by which the RPA system identifies, from the candidate cells and according to their attribute information, the target cells used for generating the blank second table is explained. It includes the following steps:
S601, the RPA system sorts all enumerated candidate cells by cell area from small to large.
The area of each candidate cell can be calculated from its four-corner coordinates, and all enumerated candidate cells are sorted by area from small to large.
S602, the RPA system traverses the candidate cells in sequence and judges whether each traversed target candidate cell exists.
The candidate cells are traversed in order of increasing area, and the existence of each traversed target candidate cell is judged.
The RPA system acquires the first start and end points of the horizontal lines and the second start and end points of the vertical lines corresponding to the target candidate cell, judges whether the four edges of the target candidate cell exist according to its four-corner coordinates, the first start and end points, and the second start and end points, and determines that the target candidate cell exists when all four edges exist.
S603, each time the RPA system judges that a target candidate cell exists, it deletes the cells that overlap the target candidate cell from the candidate cells not yet traversed, and determines the target candidate cell as a target cell.
Once a target candidate cell is determined to exist, any candidate cell that overlaps it cannot exist. The RPA system therefore deletes the overlapping cells from the candidate cells not yet traversed, which reduces the workload of subsequent traversal.
The RPA system determines the target candidate cell judged to exist as a target cell, which is used for generating the blank second table.
S604, the RPA system continues to traverse, in sequence, the candidate cells that remain after deletion, until the traversal ends and all target cells are obtained.
The RPA system continues to traverse the remaining candidate cells in sequence. Each time a new target cell is obtained, it pauses the traversal, deletes the not-yet-traversed candidate cells that overlap the new target cell, and then resumes; this is repeated until the traversal ends and all target cells are obtained.
In the embodiment of the disclosure, the RPA system sorts all enumerated candidate cells according to the cell area from small to large, the RPA system traverses the candidate cells in sequence, judges the existence of the traversed target candidate cell, and deletes a cell overlapping with the target candidate cell from the candidate cells that are not traversed whenever the RPA system judges that the target candidate cell exists, determines the target candidate cell that is judged to exist as a target cell, and the RPA system continues to traverse the candidate cells that are not traversed after deletion in sequence until the traversal is finished to obtain all target cells. In the embodiment of the disclosure, the target cell for generating the second table is obtained from the candidate cells, the structural composition of the second table is preliminarily determined, and a foundation is laid for the generation of the second table.
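Steps S601 to S604 amount to a smallest-first greedy selection with overlap pruning, which can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: candidates are dictionaries with hypothetical "rows", "cols" and "area" keys (index ranges into the intersection grid), overlap is tested on those index ranges rather than on pixel coordinates, and exists is a caller-supplied predicate standing in for the four-edge check.

```python
def overlaps(a, b):
    """True when two cells, given as row/column index ranges, share interior area."""
    (ar0, ar1), (ac0, ac1) = a["rows"], a["cols"]
    (br0, br1), (bc0, bc1) = b["rows"], b["cols"]
    return ar0 < br1 and br0 < ar1 and ac0 < bc1 and bc0 < ac1


def select_target_cells(candidates, exists):
    """Traverse candidates from small to large area, keeping those that exist."""
    remaining = sorted(candidates, key=lambda c: c["area"])  # S601
    targets = []
    while remaining:                                         # S602 / S604
        cell = remaining.pop(0)
        if exists(cell):
            targets.append(cell)                             # S603: becomes a target cell
            # S603: drop every untraversed candidate that overlaps the new target cell
            remaining = [c for c in remaining if not overlaps(cell, c)]
    return targets
```

Traversing from small to large matters: a larger rectangle built from several real cells also has four existing edges, and it is only excluded because the smaller real cells inside it were accepted first.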
Fig. 7 is a flowchart of a table generation method combining RPA and AI according to an embodiment of the present disclosure. On the basis of the above embodiment, and with reference to fig. 7, the process by which the RPA system traverses the candidate cells in sequence and judges whether each traversed target candidate cell exists is explained. It includes the following steps:
S701, the RPA system acquires the first start and end points of the horizontal lines and the second start and end points of the vertical lines corresponding to the target candidate cell.
By acquiring these start and end points, the RPA system obtains the spatial direction and the length of the corresponding horizontal and vertical lines.
S702, the RPA system judges whether the four edges of the target candidate cell exist according to the four-corner coordinates, the first start and end points, and the second start and end points of the target candidate cell.
The four edges of a candidate cell can be obtained from the coordinates of its four corners. Take the edge formed by the upper left corner and the lower left corner as an example:
the edge is projected onto the Y axis to obtain its position and length on the Y axis, and the detected vertical line corresponding to the edge is projected onto the Y axis to obtain its position and length on the Y axis. Whether the two projections overlap is then judged: if they overlap, the edge exists; if not, the edge does not exist. Whether the edge formed by the upper right corner and the lower right corner exists can be verified in the same way.
Correspondingly, take the edge formed by the upper left corner and the upper right corner as an example:
the edge is projected onto the X axis to obtain its position and length on the X axis, and the detected horizontal line corresponding to the edge is projected onto the X axis to obtain its position and length on the X axis. Whether the two projections overlap is then judged: if they overlap, the edge exists; if not, the edge does not exist. Whether the edge formed by the lower left corner and the lower right corner exists can be verified in the same way.
S703, when judging that all four edges exist, the RPA system determines that the target candidate cell exists.
If all four edges of the candidate cell exist, the target candidate cell is determined to exist and is taken as a target cell.
In the embodiment of the disclosure, the RPA system obtains the first start and end points of the horizontal lines and the second start and end points of the vertical lines corresponding to a target candidate cell, judges whether the four edges of the target candidate cell exist according to its four-corner coordinates, the first start and end points, and the second start and end points, and determines that the target candidate cell exists when all four edges exist. The embodiment of the disclosure thus provides a method for judging whether a target candidate cell exists, which lays a foundation for the RPA system to traverse the candidate cells and obtain all target cells.
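The projection test of S702 and S703 can be sketched as a one-dimensional interval overlap check applied to each of the four edges. The sketch below is illustrative: the function names are hypothetical, each detected line is assumed to be given by its start and end points, and "overlap" is assumed to mean an overlap of positive length, which the text does not state explicitly.

```python
def projections_overlap(a0, a1, b0, b1):
    """True when 1-D intervals [a0, a1] and [b0, b1] share a stretch of positive length."""
    lo = max(min(a0, a1), min(b0, b1))
    hi = min(max(a0, a1), max(b0, b1))
    return hi > lo


def cell_exists(corners, top_line, bottom_line, left_line, right_line):
    """corners = (top_left, top_right, bottom_left, bottom_right).

    Each *_line is ((x_start, y_start), (x_end, y_end)) for the detected line
    that the corresponding edge is supposed to lie on.
    """
    tl, tr, bl, br = corners
    return (
        # horizontal edges: compare projections on the X axis
        projections_overlap(tl[0], tr[0], top_line[0][0], top_line[1][0])
        and projections_overlap(bl[0], br[0], bottom_line[0][0], bottom_line[1][0])
        # vertical edges: compare projections on the Y axis
        and projections_overlap(tl[1], bl[1], left_line[0][1], left_line[1][1])
        and projections_overlap(tr[1], br[1], right_line[0][1], right_line[1][1])
    )
```

A candidate whose right edge would lie only on the extension of a vertical line fails the last check, so it is not accepted as a target cell.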
Fig. 8 is a flowchart of a table generation method combining RPA and AI according to an embodiment of the present disclosure. On the basis of the above embodiment, as shown in fig. 8, before the RPA system identifies the target cells for generating the blank second table from the candidate cells according to the attribute information of the candidate cells, the method further includes:
S801, the RPA system arranges the intersection points in the intersection point set in sequence, and forms a plurality of intersection point pairs from adjacent intersection points in the same direction.
The RPA system arranges the intersection points of each row and each column in the intersection point set in spatial order, and forms a plurality of intersection point pairs from adjacent intersection points in the same direction, which may be the horizontal direction or the vertical direction.
As shown in fig. 9, when intersection point pairs are formed in the horizontal direction, an intersection point forms one pair with the intersection point on its left and another pair with the intersection point on its right; an intersection point is thus not limited to belonging to a single pair.
Accordingly, when intersection point pairs are formed in the vertical direction, an intersection point forms one pair with the intersection point above it and another pair with the intersection point below it.
S802, the RPA system sequentially acquires adjacent intersection point pairs in the same direction, and forms basic cells from the adjacent intersection point pairs.
The RPA system sequentially acquires adjacent intersection point pairs in the horizontal or vertical direction and forms basic cells from them. When adjacent intersection points are combined into pairs in the horizontal direction, adjacent pairs are acquired in the vertical direction; accordingly, when adjacent intersection points are combined into pairs in the vertical direction, adjacent pairs are acquired in the horizontal direction.
As shown in fig. 10, when basic cells are formed by acquiring adjacent intersection point pairs in the horizontal direction, the intersection point pair b forms one basic cell with the pair a on its left and another basic cell with the pair c on its right; an intersection point pair is thus not limited to forming a single basic cell.
Accordingly, when basic cells are formed by acquiring adjacent intersection point pairs in the vertical direction, the intersection point pair e forms one basic cell with the pair d above it and another basic cell with the pair f below it.
S803, the RPA system constructs a Boolean matrix using the basic cells as matrix elements.
The RPA system constructs a two-dimensional Boolean matrix using the basic cells as matrix elements. Assuming that x horizontal lines and y vertical lines are extracted in the above steps, x × y intersection points are obtained, and a Boolean matrix of size (x-1) × (y-1) can be constructed to record whether the cell at each position exists.
In the embodiment of the disclosure, the RPA system arranges the intersection points in the intersection point set in sequence and forms a plurality of intersection point pairs from adjacent intersection points in the same direction; it then sequentially acquires adjacent intersection point pairs in the same direction and forms basic cells from them; finally it constructs a Boolean matrix using the basic cells as matrix elements. The Boolean matrix records whether the cell at each position exists, which helps record the target cells determined during traversal together with their positions and simplifies subsequent operations.
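Steps S801 to S803 can be sketched as follows, assuming the intersection points are already stored in a grid indexed as points[row][col]. The function name build_basic_cells is hypothetical, and Python's False stands in for the first value F.

```python
def build_basic_cells(points):
    """Build the basic cells and the initial Boolean matrix from an intersection grid.

    points[r][c] is the (x, y) intersection of horizontal line r and vertical line c.
    Returns (basic, matrix), both of shape (x - 1) x (y - 1).
    """
    n_rows, n_cols = len(points), len(points[0])
    # Each basic cell joins a horizontal pair of adjacent points with the pair
    # directly below it: (top_left, top_right, bottom_left, bottom_right).
    basic = [[(points[r][c], points[r][c + 1],
               points[r + 1][c], points[r + 1][c + 1])
              for c in range(n_cols - 1)]
             for r in range(n_rows - 1)]
    # Every element starts as the first value (False); nothing is confirmed yet.
    matrix = [[False] * (n_cols - 1) for _ in range(n_rows - 1)]
    return basic, matrix
```

With 3 horizontal and 4 vertical lines, the result is a 2 × 3 matrix, matching the (x-1) × (y-1) size stated above.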
Fig. 11 is a flowchart of a table generating method combining RPA and AI according to an embodiment of the present disclosure. On the basis of the above embodiment, and with further reference to fig. 11, the process by which the RPA system arranges the target cells by position to generate the blank second table is explained, including the following steps:
S1101, the RPA system assigns values to the matrix elements in the Boolean matrix according to the target cells to generate a target Boolean matrix.
All matrix elements in the Boolean matrix are initially F, which is taken as the first value. When a target cell is determined, the RPA system acquires, according to the four-corner coordinates of the target cell, the first basic cells included in the target cell, and updates the matrix elements corresponding to the first basic cells from the first value F to the second value T.
As shown in fig. 12, for example, if the target cell in the second row includes three basic cells, the matrix elements corresponding to the three basic cells need to be updated from the first value F to the second value T.
Values are assigned to the matrix elements in the Boolean matrix in this way for every target cell to generate the target Boolean matrix, in which the matrix elements corresponding to target cells have been updated to the second value and the remaining matrix elements keep the initial first value.
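A minimal sketch of the assignment in S1101 is given below. The helper name `mark_target_cell` and the box layout are hypothetical, and the sketch assumes that the four corners of a target cell lie exactly on extracted grid lines; real coordinates would likely need a tolerance.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom), hypothetical layout


def mark_target_cell(matrix: List[List[bool]],
                     col_xs: List[float],
                     row_ys: List[float],
                     cell: Box) -> None:
    """Set to T (True) every matrix element whose basic cell lies inside `cell`.

    col_xs / row_ys are the sorted grid-line coordinates used to build the
    matrix. list.index raises ValueError if a corner is not on a grid line,
    which is the exact-match assumption stated above.
    """
    left, top, right, bottom = cell
    c0, c1 = col_xs.index(left), col_xs.index(right)
    r0, r1 = row_ys.index(top), row_ys.index(bottom)
    for r in range(r0, r1):
        for c in range(c0, c1):
            matrix[r][c] = True


# Example in the spirit of fig. 12: a second-row target cell spanning three basic cells.
col_xs = [0, 10, 20, 30]
row_ys = [0, 5, 10]
matrix = [[False] * 3 for _ in range(2)]
mark_target_cell(matrix, col_xs, row_ys, (0, 5, 30, 10))
```

After the call, the three elements of the second row are T and the first row is still F.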
S1102, the RPA system identifies whether the target Boolean matrix includes a target row and/or a target column whose elements are all the first value.
The RPA system checks the value of each matrix element in the target Boolean matrix row by row and column by column, and identifies whether the target Boolean matrix includes a target row and/or a target column whose elements are all the first value.
S1103, when a target row and/or a target column exists, the RPA system splits the Boolean matrix according to the target row and/or the target column to generate sub-target Boolean matrices.
If a target row and/or a target column is detected, that row and/or column contains no target cell, which indicates that the image contains two tables arranged one above the other or side by side. The Boolean matrix is split according to the target row and/or the target column to generate sub-target Boolean matrices.
As shown in fig. 13, if a target row is detected, the row contains no target cell, which indicates that the image contains two tables arranged one above the other.
If a target column is detected, the column contains no target cell, which indicates that the image contains two tables arranged side by side.
If both a target row and a target column are detected, neither contains a target cell, which indicates that the image contains four tables arranged in a 2 × 2 layout, like the four quadrants of the Chinese character 田 ("field").
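The detection and splitting in S1102 and S1103 can be sketched as follows. The function names and the half-open index ranges returned are illustrative assumptions. The sketch performs a single pass over rows and columns and discards blocks that contain no T element; whether the disclosed method splits recursively is not stated in the text.

```python
from typing import List, Tuple

Block = Tuple[int, int, int, int]  # (row_start, row_end, col_start, col_end), half-open


def _segments(flags: List[bool]) -> List[Tuple[int, int]]:
    """Return maximal runs of True in `flags` as half-open (start, end) ranges."""
    segs, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            segs.append((start, i))
            start = None
    if start is not None:
        segs.append((start, len(flags)))
    return segs


def split_blocks(matrix: List[List[bool]]) -> List[Block]:
    """Split the target Boolean matrix at rows/columns that are entirely F."""
    row_has_cell = [any(row) for row in matrix]        # False marks a target row
    col_has_cell = [any(col) for col in zip(*matrix)]  # False marks a target column
    blocks = []
    for r0, r1 in _segments(row_has_cell):
        for c0, c1 in _segments(col_has_cell):
            # Keep only blocks that actually contain a target cell.
            if any(matrix[r][c] for r in range(r0, r1) for c in range(c0, c1)):
                blocks.append((r0, r1, c0, c1))
    return blocks
```

A matrix with one all-F row and one all-F column, for example, produces four blocks, corresponding to the 2 × 2 layout described above.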
S1104, the RPA system acquires the target cells corresponding to each sub-target Boolean matrix and arranges them by position to generate the blank second table corresponding to that sub-target Boolean matrix.
For the process of generating the blank second table corresponding to the sub-target Boolean matrix in step S1104, reference may be made to the process of generating the blank second table according to the position arrangement of the target cells in step S403, and details are not repeated here.
In the embodiment of the disclosure, the RPA system assigns values to the matrix elements in the Boolean matrix according to the target cells to generate a target Boolean matrix; identifies whether the target Boolean matrix includes a target row and/or a target column whose elements are all the first value; when a target row and/or a target column exists, splits the Boolean matrix according to the target row and/or the target column to generate sub-target Boolean matrices; and acquires the target cells corresponding to each sub-target Boolean matrix and arranges them by position to generate the corresponding blank second table. In the embodiment of the present disclosure, the RPA system judges, from the values of the matrix elements in the Boolean matrix, whether the image contains tables arranged one above the other or side by side, and thereby further determines the structure of the table.
Fig. 14 is a flowchart of a table generating method combining RPA and AI according to an embodiment of the present disclosure. On the basis of the above embodiment, after the RPA system splits the Boolean matrix according to the target row and/or the target column to generate the sub-target Boolean matrix, as shown in fig. 14, the method further includes:
S1401, the RPA system identifies whether a target matrix element whose value is the first value exists in the sub-target Boolean matrix.
The RPA system traverses the matrix elements in the sub-target Boolean matrix in sequence and judges whether the value of each matrix element is the first value F. If the currently traversed matrix element has the first value, it is marked as a target matrix element. The remaining matrix elements are then traversed until the last one has been visited, yielding all target matrix elements included in the sub-target Boolean matrix.
S1402, the RPA system updates the value of the target matrix element from the first value to the second value.
To ensure the integrity of the table, the value of the target matrix element needs to be updated from the first value to the second value so as to obtain the final second table. However, a target matrix element with the first value may arise in two ways: it may be an independent target cell that was missed, or it may belong to a target cell that has already been identified. In the embodiment of the present disclosure, the target matrix element may be examined to decide between the two, as shown in fig. 15.
The RPA system acquires the target basic cell corresponding to the target matrix element. In this disclosure, each matrix element in the Boolean matrix corresponds to one basic cell. After determining the target matrix element, the RPA system may determine the corresponding basic cell, referred to herein as the target basic cell, based on the position of the target matrix element in the Boolean matrix.
The RPA system judges whether the target basic cell needs to be merged into the target cell corresponding to a matrix element adjacent to the target matrix element. The RPA system can acquire the four-corner coordinates of the target basic cell and judge, based on these coordinates, whether the target basic cell needs to be merged into an existing target cell. If the coverage range of the existing target cell includes the four-corner coordinates of the target basic cell, merging is determined to be needed. If the coverage range of the existing target cell does not include the four-corner coordinates of the target basic cell, merging is determined not to be needed, and the target basic cell is an independent target cell. Here, the existing target cell is the target cell corresponding to a matrix element adjacent to the target matrix element.
If the RPA system judges that merging is needed, the target basic cell is merged into the target cell corresponding to the adjacent matrix element to form a larger target cell.
If the RPA system judges that merging is not needed, the target basic cell is determined to be a missing target cell, the value at that position is set to T so that it is treated as an independent target cell, and the missing target cell is filled in when the target cells are arranged by position.
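The merge-or-complete decision described above can be sketched as a simple containment test. The names `covers` and `resolve_missing`, the box layout, and the returned tuples are hypothetical; the sketch only illustrates the rule that a basic cell whose four corners lie within an adjacent target cell's coverage is merged, and is otherwise kept as an independent cell.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom), hypothetical layout


def covers(target: Box, base: Box) -> bool:
    """True if all four corners of `base` fall within the coverage of `target`."""
    t_left, t_top, t_right, t_bottom = target
    b_left, b_top, b_right, b_bottom = base
    return (t_left <= b_left and t_top <= b_top
            and b_right <= t_right and b_bottom <= t_bottom)


def resolve_missing(base_cell: Box, neighbor_targets: List[Box]):
    """Decide what to do with a basic cell whose matrix element is still F.

    neighbor_targets: target cells belonging to the adjacent matrix elements.
    Returns ("merge", target) or ("standalone", base_cell); in both cases the
    caller would then set the matrix element to T.
    """
    for target in neighbor_targets:
        if covers(target, base_cell):
            return ("merge", target)
    return ("standalone", base_cell)
```

For example, `resolve_missing((10, 10, 20, 20), [(0, 10, 20, 20)])` reports a merge, while a basic cell lying outside every neighbor is returned as a standalone target cell.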
In the embodiment of the present disclosure, the RPA system identifies whether a target matrix element whose value is a first value exists in the sub-target boolean matrix, and the RPA system updates the value of the target matrix element from the first value to a second value. In the embodiment of the present disclosure, the target matrix element whose value is the first value in the sub-target boolean matrix is updated to the second value, thereby completing the missing cells and determining the table structure.
Fig. 16 is a flowchart of a table generating method combining RPA and AI according to an embodiment of the present disclosure. On the basis of the above embodiment, and with further reference to fig. 16, the process by which the RPA system fills the text entries recognized from the image by OCR into the blank second table to obtain the target table is explained, including the following steps:
S1601, the RPA system acquires the four-corner coordinates of the text entry.
The RPA system acquires the text recognition box in which the text entry is located, and takes the four-corner coordinates of the text recognition box as the four-corner coordinates of the text entry.
Alternatively, text entries may be extracted from a text image containing a table by an algorithm that extracts foreground pixels by scanning with a template to form rectangular boxes surrounding the foreground pixels of the image, and then combines the rectangular boxes to form a pattern chain. The patterns are classified using three statistical features, namely the maximum black run, the length, and the width of each pattern, and the characters are extracted.
S1602, the RPA system matches the four-corner coordinates of the text entry with the four-corner coordinates of the target cells to obtain the target text entry corresponding to each target cell.
The RPA system matches the four-corner coordinates of the text entries with the four-corner coordinates of the target cells to judge whether the text entries include a first text entry occupying at least two target cells.
The four-corner coordinates of a text entry define a region. If all four corner coordinates of one cell fall inside the region and two of the four corner coordinates of another cell also fall inside it, the text entry occupies two cells.
Similarly, if the four corner coordinates of two cells fall inside the region and corner coordinates of a third cell also fall inside it, the text entry occupies three cells.
If a first text entry exists, the RPA system segments the first text entry according to the at least two occupied cells to obtain the target text entry of each target cell occupied by the first text entry.
For example, if the first text entry occupies two adjacent cells, the first text entry is segmented along the edge shared by the adjacent cells to obtain the respective text entries of the two occupied target cells.
If a text entry occupies only one target cell, the text entry is directly taken as the target text entry corresponding to that target cell.
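A rough sketch of S1602 follows. Two simplifications are assumptions of this sketch rather than part of the disclosure: occupied cells are found with a plain rectangle-overlap test in place of the corner-counting rule described above, and the text is cut at the shared edge by assuming that characters have roughly uniform width, so the cut position is proportional to the horizontal distance. The function names and the box layout are hypothetical, and only cells in a single row are handled.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom), hypothetical layout


def overlaps(a: Box, b: Box) -> bool:
    """True if the interiors of the two boxes intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def occupied_cells(text_box: Box, cells: List[Box]) -> List[Box]:
    """Target cells touched by the text box, ordered left to right."""
    return sorted((c for c in cells if overlaps(text_box, c)), key=lambda c: c[0])


def split_entry(text: str, text_box: Box, occupied: List[Box]) -> List[str]:
    """Cut `text` at the shared edges of the occupied cells (same-row case only).

    Assumes uniform character width: the character index of a cut is
    proportional to the distance from the left edge of the text box.
    """
    left, _, right, _ = text_box
    width = right - left
    if width <= 0 or len(occupied) <= 1:
        return [text]
    pieces, start = [], 0
    for i, (_, _, cell_right, _) in enumerate(occupied):
        if i == len(occupied) - 1:
            end = len(text)  # last cell takes whatever remains
        else:
            end = round(len(text) * (cell_right - left) / width)
            end = min(len(text), max(start, end))
        pieces.append(text[start:end])
        start = end
    return pieces
```

A text entry that occupies a single cell is returned unchanged as a one-element list, which corresponds to taking the text entry directly as the target text entry.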
S1603, the RPA system fills the target text entries into the corresponding target cells to obtain a target table.
For specific implementation of step S1603, reference may be made to relevant descriptions in embodiments of the present disclosure, and details are not described here.
In the embodiment of the disclosure, the RPA system acquires the four-corner coordinates of a text entry, matches the four-corner coordinates of the text entry with the four-corner coordinates of the target cells to acquire the target text entry corresponding to each target cell, and fills the target text entries into the corresponding target cells to acquire a target table. According to the embodiment of the disclosure, text entries that span cells are split, so that text entries can be accurately filled into the corresponding target cells and a complete table document is generated.
Fig. 17 is a block diagram of a table creation apparatus combining RPA and AI according to an embodiment of the present disclosure, and as shown in fig. 17, a table creation apparatus 1700 combining RPA and AI includes:
an extraction module 1710, configured to extract a horizontal line and a vertical line of the first table from the image based on the artificial intelligence AI;
the intersection point acquisition module 1720 is used for acquiring an intersection point set of the horizontal lines and the vertical lines, wherein the intersection point set comprises a first type of intersection point formed by intersecting the horizontal lines and the vertical lines and a second type of intersection point formed by intersecting extension lines of the horizontal lines and/or extension lines of the vertical lines;
a generating module 1730, configured to generate a blank second table consistent with the first table according to the intersection set;
a filling module 1740, configured to fill the text entry identified based on the OCR in the image into the blank second table, so as to obtain the target table.
In the embodiment of the disclosure, RPA and AI technologies are used to identify the tables in a picture and restore them to table documents with the same table structure, so that offline data is automatically converted into online data, which replaces a tedious manual processing flow and improves the efficiency of table generation.
It should be noted that the foregoing explanation of the embodiment of the table generating method combining RPA and AI also applies to the table generating device combining RPA and AI of this embodiment, and is not repeated here.
Further, in a possible implementation manner of the embodiment of the present disclosure, the generating module 1730 is further configured to: enumerating cells according to the intersection points in the intersection point set to obtain candidate cells and attribute information of the candidate cells; identifying, from the candidate cells and according to the attribute information of the candidate cells, the target cells used to generate the blank second table; and arranging the target cells by position to generate the blank second table.
Further, in a possible implementation manner of the embodiment of the present disclosure, the generating module 1730 is further configured to: sorting all enumerated candidate cells from small to large according to the area of the cells; traversing the candidate cells in sequence, and judging the existence of the traversed target candidate cells; when the target candidate cell is judged to exist, deleting the cell overlapped with the target candidate cell from the candidate cells which are not traversed, and determining the target candidate cell which is judged to exist as a target cell; and continuously traversing the candidate cells which are not traversed after the deletion in sequence until all the target cells are obtained after the traversal is finished.
Further, in a possible implementation manner of the embodiment of the present disclosure, the generating module 1730 is further configured to: acquiring a first starting and ending point of a horizontal line and a second starting and ending point of a vertical line corresponding to the target candidate cell; judging whether four edges of the target candidate cell exist or not according to the four-corner coordinates, the first starting end point and the second starting end point of the target candidate cell; and when judging that the four edges exist, determining that the target candidate cell exists.
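The candidate screening described in the two preceding paragraphs (sort by area, judge existence from the four edges, delete overlapping candidates) can be sketched as below. All names and the segment format are hypothetical. The sketch assumes that each edge is covered by a single extracted line segment and that coordinates match exactly; a practical version would probably allow a tolerance and stitched segments.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom), hypothetical layout
Segment = Tuple[float, float, float]     # (fixed coordinate, start, end), hypothetical layout


def area(b: Box) -> float:
    return (b[2] - b[0]) * (b[3] - b[1])


def overlaps(a: Box, b: Box) -> bool:
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def edge_exists(lines: List[Segment], fixed: float, lo: float, hi: float) -> bool:
    """True if some line at `fixed` spans the whole interval [lo, hi]."""
    return any(f == fixed and s <= lo and hi <= e for f, s, e in lines)


def cell_exists(b: Box, h_lines: List[Segment], v_lines: List[Segment]) -> bool:
    """A candidate exists only if all four of its edges are drawn."""
    left, top, right, bottom = b
    return (edge_exists(h_lines, top, left, right)
            and edge_exists(h_lines, bottom, left, right)
            and edge_exists(v_lines, left, top, bottom)
            and edge_exists(v_lines, right, top, bottom))


def select_target_cells(candidates: List[Box],
                        h_lines: List[Segment],
                        v_lines: List[Segment]) -> List[Box]:
    """Traverse candidates from small to large, keeping those that exist."""
    remaining = sorted(candidates, key=area)
    targets = []
    while remaining:
        cand = remaining.pop(0)
        if cell_exists(cand, h_lines, v_lines):
            targets.append(cand)
            # Drop untraversed candidates that overlap the confirmed cell.
            remaining = [c for c in remaining if not overlaps(cand, c)]
    return targets
```

Traversing from the smallest area upward means that a confirmed small cell removes every larger candidate that contains it, so a merged cell is only accepted when no smaller cell inside it has all four edges.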
Further, in a possible implementation manner of the embodiment of the present disclosure, the generating module 1730 is further configured to: arranging the intersection points in the intersection point set in sequence, and forming a plurality of intersection point pairs by adjacent intersection points in the intersection point set according to the same direction; sequentially acquiring adjacent intersection point pairs in the same direction, and forming a basic cell by the adjacent intersection point pairs; and constructing a Boolean matrix by taking the basic cells as matrix elements.
Further, in a possible implementation manner of the embodiment of the present disclosure, the generating module 1730 is further configured to: assigning values to the matrix elements in the Boolean matrix according to the target cells to generate a target Boolean matrix; identifying whether the target Boolean matrix includes a target row and/or a target column whose elements are all the first value; when a target row and/or a target column exists, splitting the Boolean matrix according to the target row and/or the target column to generate a sub-target Boolean matrix; and acquiring the target cells corresponding to the sub-target Boolean matrix and arranging them by position to generate the blank second table corresponding to the sub-target Boolean matrix.
Further, in a possible implementation manner of the embodiment of the present disclosure, the generating module 1730 is further configured to: acquiring, according to the four-corner coordinates of the target cell, the first basic cells included in the target cell, and updating the matrix elements corresponding to the first basic cells from the first value to the second value.
Further, in a possible implementation manner of the embodiment of the present disclosure, the generating module 1730 is further configured to: identifying whether a target matrix element with a first value exists in the sub-target Boolean matrix; and updating the value of the target matrix element from the first value to the second value.
Further, in a possible implementation manner of the embodiment of the present disclosure, the generating module 1730 is further configured to: acquiring a target basic cell corresponding to a target matrix element; judging whether the target basic cell needs to be merged into a target cell corresponding to an adjacent matrix element adjacent to the target matrix element; and merging the target basic cells into the target cells corresponding to the adjacent matrix elements when the merging is judged to be needed.
Further, in a possible implementation manner of the embodiment of the present disclosure, the generating module 1730 is further configured to: determining the target basic cell as a missing target cell when it is judged that merging is not needed, and filling in the missing target cell when the target cells are arranged by position.
Further, in a possible implementation manner of the embodiment of the present disclosure, the filling module 1740 is further configured to: acquiring the four-corner coordinates of a text entry; matching the four-corner coordinates of the text entry with the four-corner coordinates of the target cells to obtain the target text entry corresponding to each target cell; and filling the target text entries into the corresponding target cells to obtain a target table.
Further, in a possible implementation manner of the embodiment of the present disclosure, the filling module 1740 is further configured to: matching the four-corner coordinates of the text items with the four-corner coordinates of the target cells, and judging whether the text items have first text items occupying at least two target cells; and segmenting the first text entry according to the occupied at least two cells to obtain the target text entry of each target cell occupied by the first text entry.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 18 shows a schematic block diagram of an example electronic device 1800 with which embodiments of the present disclosure may be practiced. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 18, the table generating apparatus includes a memory 181, a processor 182, and a computer program stored in the memory 181 and executable on the processor 182, and when the processor 182 executes the computer program, the table generating method described above is implemented.
In the description of the present invention, it is to be understood that the terms "central," "longitudinal," "lateral," "length," "width," "thickness," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," "clockwise," "counterclockwise," "axial," "radial," "circumferential," and the like are used in the orientations and positional relationships indicated in the drawings for convenience in describing the invention and to simplify the description, and are not intended to indicate or imply that the referenced device or element must have a particular orientation, be constructed and operated in a particular orientation, and are not to be considered limiting of the invention.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
In the description herein, references to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may integrate and combine the various embodiments or examples described in this specification, and the features of different embodiments or examples, provided they do not contradict one another.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (20)

1. A table generation method combining RPA and AI, performed by an RPA system, the method comprising:
the RPA system extracts a horizontal line and a vertical line of a first table from an image based on artificial intelligence AI;
the RPA system acquires an intersection point set of the horizontal line and the vertical line, wherein the intersection point set comprises a first type intersection point formed by intersecting the horizontal line and the vertical line and a second type intersection point formed by intersecting an extension line of the horizontal line and/or an extension line of the vertical line;
the RPA system generates a blank second table consistent with the first table according to the intersection point set;
and the RPA system fills the text entries identified from the image based on optical character recognition (OCR) into the blank second table to obtain a target table.
2. The method of claim 1, wherein the RPA system generates a blank second table consistent with the first table according to the intersection point set, comprising:
the RPA system enumerates cells according to the intersection points in the intersection point set to obtain candidate cells and attribute information of the candidate cells;
the RPA system identifies, from the candidate cells and according to the attribute information of the candidate cells, target cells used to generate the blank second table;
and the RPA system arranges the target cells by position to generate the blank second table.
3. The method of claim 2, wherein the RPA system identifies a target cell of the second table for generating the blank from the candidate cells according to the attribute information of the candidate cells, comprising:
the RPA system sorts all enumerated candidate cells according to the sizes of the cells;
the RPA system sequentially traverses the candidate cells and judges the existence of the traversed target candidate cells;
when the RPA system judges that the target candidate cell exists, deleting the cell overlapped with the target candidate cell from the candidate cells which are not traversed, and determining the target candidate cell which is judged to exist as the target cell;
and the RPA system continuously traverses the candidate cells which are not traversed after deletion in sequence until all the target cells are obtained after the traversal is finished.
4. The method of claim 3, wherein traversing the candidate cells in order by the RPA system, and determining the existence of a traversed target candidate cell comprises:
the RPA system acquires a first starting and ending point of the horizontal line and a second starting and ending point of the vertical line corresponding to the target candidate cell;
the RPA system judges whether four edges of the target candidate cell exist according to the four-corner coordinates of the target candidate cell, the first starting and ending point and the second starting and ending point;
and when judging that the four edges exist, the RPA system determines that the target candidate cell exists.
5. The method according to any of claims 2-4, wherein before the RPA system identifies, from the candidate cells and according to the attribute information of the candidate cells, the target cells used to generate the blank second table, the method further comprises:
the RPA system arranges the intersection points in the intersection point set in sequence, and forms a plurality of intersection point pairs by adjacent intersection points in the intersection point set according to the same direction;
the RPA system sequentially acquires adjacent intersection point pairs according to the same direction, and the adjacent intersection point pairs form a basic cell;
and the RPA system takes the basic cell as a matrix element to construct a Boolean matrix.
6. The method of claim 5, wherein the RPA system arranges the target cells by position to generate the blank second table, comprising:
the RPA system assigns values to each matrix element in the Boolean matrix according to the target cell to generate a target Boolean matrix;
the RPA system identifies whether the target Boolean matrix comprises a target row and/or a target column which are both first values;
when the target row and/or the target column exists, the RPA system splits the Boolean matrix according to the target row and/or the target column to generate a sub-target Boolean matrix;
and the RPA system acquires the target cells corresponding to the sub-target Boolean matrix and arranges the target cells by position to generate the blank second table corresponding to the sub-target Boolean matrix.
7. The method of claim 6, wherein the RPA system assigns values to each of the matrix elements in the boolean matrix according to the target cell to generate a target boolean matrix, comprising:
and the RPA system acquires, according to the four-corner coordinates of the target cell, a first basic cell included in the target cell, and updates a matrix element corresponding to the first basic cell from the first value to a second value.
8. The method of claim 6, wherein after the RPA system, when the target row and/or the target column exists, splits the Boolean matrix according to the target row and/or the target column to generate a sub-target Boolean matrix, the method further comprises:
the RPA system identifies whether a target matrix element with the first value exists in the sub-target Boolean matrix;
and the RPA system updates the value of the target matrix element from the first value to the second value.
9. The method of claim 8, wherein before the RPA system updates the value of the matrix element from the first value to the second value, the method further comprises:
the RPA system acquires a target basic cell corresponding to the target matrix element;
the RPA system judges whether the target basic cell needs to be merged into the target cell corresponding to the adjacent matrix element adjacent to the target matrix element;
and the RPA system merges the target basic cells into the target cells corresponding to the adjacent matrix elements when judging that merging is needed.
10. The method of claim 9, further comprising:
and the RPA system determines the target basic cell as a missing target cell if the RPA system judges that the combination is not needed, and completes the missing target cell when the target cell is arranged according to the position.
11. The method according to any of claims 1-4, wherein the RPA system fills the text entries recognized from the image based on OCR into the blank second table to obtain a target table, comprising:
the RPA system acquires the four-corner coordinates of the text entry;
the RPA system matches the four-corner coordinates of the text entry with the four-corner coordinates of the target cell to obtain a target text entry corresponding to the target cell;
and the RPA system fills the target text entry into the corresponding target cell to obtain the target table.
12. The method of claim 11, wherein the RPA system matches the four-corner coordinates of the text entry with the four-corner coordinates of the target cell to obtain a target text entry corresponding to the target cell, and comprises:
the RPA system matches the four-corner coordinates of the text entries with the four-corner coordinates of the target cells, and judges whether the text entries have a first text entry occupying at least two target cells;
and the RPA system divides the first text entry according to the at least two occupied cells to obtain a target text entry of each target cell occupied by the first text entry.
13. A table creation apparatus that combines RPA and AI, comprising:
the extraction module is used for extracting the horizontal line and the vertical line of the first table from the image based on artificial intelligence AI;
the intersection point acquisition module is used for acquiring an intersection point set of the horizontal lines and the vertical lines, wherein the intersection point set comprises a first type of intersection point formed by intersecting the horizontal lines and the vertical lines and a second type of intersection point formed by intersecting extension lines of the horizontal lines and/or extension lines of the vertical lines;
the generating module is used for generating a blank second table consistent with the first table according to the intersection point set;
and the filling module is used for filling the text entries identified from the image based on the OCR into the blank second table to obtain a target table.
14. The apparatus of claim 13, wherein the generating module is further configured to:
enumerating cells according to the intersection points in the intersection point set to obtain candidate cells and attribute information of the candidate cells;
identifying, from the candidate cells and according to the attribute information of the candidate cells, target cells for generating the blank second table;
and arranging the target cells according to their positions to generate the blank second table.
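The enumeration step of claim 14 can be sketched as follows. The sketch assumes that a candidate cell is any axis-aligned rectangle whose four corners all belong to the intersection point set, and that the attribute information consists of the four-corner coordinates and the area. Both assumptions, and the name `enumerate_candidates`, are illustrative only.

```python
from itertools import combinations


def enumerate_candidates(points):
    """List every rectangle whose four corners are all intersection points."""
    pts = set(points)
    xs = sorted({x for x, _ in pts})
    ys = sorted({y for _, y in pts})
    candidates = []
    for x0, x1 in combinations(xs, 2):       # x0 < x1 because xs is sorted
        for y0, y1 in combinations(ys, 2):   # y0 < y1 because ys is sorted
            corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
            if all(c in pts for c in corners):
                candidates.append({
                    "box": (x0, y0, x1, y1),
                    "corners": corners,
                    "area": (x1 - x0) * (y1 - y0),
                })
    return candidates
```

The number of candidates grows quickly with the number of grid lines, which is why a later filtering step over the candidates is needed.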
15. The apparatus of claim 14, wherein the generating module is further configured to:
sorting all enumerated candidate cells by cell area in ascending order;
traversing the candidate cells in that order, and judging whether the currently traversed target candidate cell exists;
when the target candidate cell is judged to exist, deleting, from the candidate cells not yet traversed, the cells that overlap the target candidate cell, and determining the target candidate cell judged to exist as a target cell;
and continuing to traverse, in order, the candidate cells that remain after the deletion, until the traversal is finished and all target cells are obtained.
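The traversal of claim 15 amounts to a greedy selection, sketched below under stated assumptions: cells are boxes (x0, y0, x1, y1), "overlapped" means sharing a region of positive area, and the existence judgment is passed in as a caller-supplied predicate `exists`. All names are hypothetical.

```python
def overlaps(a, b):
    """True if boxes a and b share a region of positive area."""
    return min(a[2], b[2]) > max(a[0], b[0]) and min(a[3], b[3]) > max(a[1], b[1])


def area(box):
    return (box[2] - box[0]) * (box[3] - box[1])


def select_target_cells(candidates, exists):
    """Greedy selection: smallest candidates first, dropping overlapped ones."""
    remaining = sorted(candidates, key=area)  # ascending by area
    targets = []
    while remaining:
        cell = remaining.pop(0)               # next candidate in traversal order
        if not exists(cell):
            continue                          # not a real cell, keep traversing
        targets.append(cell)
        # Delete every not-yet-traversed candidate that overlaps the accepted cell.
        remaining = [c for c in remaining if not overlaps(cell, c)]
    return targets
```

Visiting the smallest cells first means that a large rectangle spanning several real cells is discarded as soon as one of its constituent cells is accepted, while a genuinely merged cell survives because none of its sub-rectangles passes the existence judgment.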
16. The apparatus of claim 15, wherein the generating module is further configured to:
acquiring a first starting and ending point of the horizontal line and a second starting and ending point of the vertical line corresponding to the target candidate cell;
judging whether the four edges of the target candidate cell exist according to the four-corner coordinates of the target candidate cell, the first starting and ending point and the second starting and ending point;
and when the four edges are judged to exist, determining that the target candidate cell exists.
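One way to read the existence judgment of claim 16 is that each edge of the candidate cell must lie on a drawn line whose start and end points span that edge. The sketch below makes that reading concrete. It assumes horizontal lines given as (y, x_start, x_end), vertical lines as (x, y_start, y_end), a pixel tolerance `tol`, and that a single extracted segment covers each edge. These are illustrative assumptions, and the function names are hypothetical.

```python
def edge_covered(coord, lo, hi, lines, tol=2.0):
    """True if some line at position `coord` spans the interval [lo, hi]."""
    return any(
        abs(c - coord) <= tol and start - tol <= lo and hi <= end + tol
        for c, start, end in lines
    )


def cell_exists(box, h_lines, v_lines, tol=2.0):
    """A candidate cell exists only if all four of its edges are drawn."""
    x0, y0, x1, y1 = box
    return (
        edge_covered(y0, x0, x1, h_lines, tol)      # top edge
        and edge_covered(y1, x0, x1, h_lines, tol)  # bottom edge
        and edge_covered(x0, y0, y1, v_lines, tol)  # left edge
        and edge_covered(x1, y0, y1, v_lines, tol)  # right edge
    )
```

A candidate whose corner is only an extension-line intersection fails this check for at least one edge, so it is rejected even though its four corners are in the intersection point set.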
17. The apparatus of any one of claims 13-16, wherein the filling module is further configured to:
acquiring four-corner coordinates of the text entries;
matching the four-corner coordinates of the text entry with the four-corner coordinates of the target cell to obtain a target text entry corresponding to the target cell;
and filling the target text entries into the corresponding target cells to obtain the target table.
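The matching and filling steps of claim 17 can be sketched as an overlap test between boxes. The sketch assumes that each OCR text entry arrives as a (text, four corner points) pair, that cells are boxes (x0, y0, x1, y1), and that an entry belongs to the cell it overlaps most. The largest-overlap rule and all names are illustrative assumptions.

```python
def bbox(corners):
    """Axis-aligned bounding box of four (x, y) corner points."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return (min(xs), min(ys), max(xs), max(ys))


def overlap_area(a, b):
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0, w) * max(0, h)


def fill_table(cells, entries):
    """Map each target cell to the text of the entries matched to it."""
    table = {cell: "" for cell in cells}
    for text, corners in entries:
        box = bbox(corners)
        best = max(cells, key=lambda c: overlap_area(box, c))
        if overlap_area(box, best) > 0:  # ignore text outside every cell
            table[best] = (table[best] + " " + text).strip()
    return table
```

Entries that straddle two or more cells are not divided here; they are simply assigned to the cell with the larger overlap.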
18. An electronic device comprising a memory, a processor;
wherein the processor reads executable program code stored in the memory and runs a program corresponding to the executable program code, so as to implement the method according to any one of claims 1 to 12.
19. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1-12.
20. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-12.
CN202111026974.3A 2021-09-02 2021-09-02 Table generation method and device combining RPA and AI, electronic equipment and storage medium Pending CN113836878A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111026974.3A CN113836878A (en) 2021-09-02 2021-09-02 Table generation method and device combining RPA and AI, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111026974.3A CN113836878A (en) 2021-09-02 2021-09-02 Table generation method and device combining RPA and AI, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113836878A true CN113836878A (en) 2021-12-24

Family

ID=78962035

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111026974.3A Pending CN113836878A (en) 2021-09-02 2021-09-02 Table generation method and device combining RPA and AI, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113836878A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190294399A1 (en) * 2018-03-26 2019-09-26 Abc Fintech Co., Ltd. Method and device for parsing tables in pdf document
CN110796031A (en) * 2019-10-11 2020-02-14 腾讯科技(深圳)有限公司 Table identification method and device based on artificial intelligence and electronic equipment
CN111640130A (en) * 2020-05-29 2020-09-08 深圳壹账通智能科技有限公司 Table reduction method and device
CN111797685A (en) * 2020-05-27 2020-10-20 贝壳技术有限公司 Identification method and device of table structure
CN111860502A (en) * 2020-07-15 2020-10-30 北京思图场景数据科技服务有限公司 Picture table identification method and device, electronic equipment and storage medium
CN112115774A (en) * 2020-08-07 2020-12-22 北京来也网络科技有限公司 Character recognition method and device combining RPA and AI, electronic equipment and storage medium
CN113065536A (en) * 2021-06-03 2021-07-02 北京欧应信息技术有限公司 Method of processing table, computing device, and computer-readable storage medium

Similar Documents

Publication Publication Date Title
US10824801B2 (en) Interactively predicting fields in a form
US7350142B2 (en) Method and system for creating a table version of a document
CN102117269B (en) Apparatus and method for digitizing documents
US5848186A (en) Feature extraction system for identifying text within a table image
EP2807608B1 (en) Borderless table detection engine
JP2536966B2 (en) Text editing system
US7149967B2 (en) Method and system for creating a table version of a document
CN102194123B (en) Method and device for defining table template
US8824798B2 (en) Information processing device, computer readable medium storing information processing program, and information processing method
US10691936B2 (en) Column inferencer based on generated border pieces and column borders
CN113343740B (en) Table detection method, device, equipment and storage medium
CN104067292A (en) Formula detection engine
CN110633660B (en) Document identification method, device and storage medium
CN105574524A (en) Cartoon image page identification method and system based on dialogue and storyboard united identification
US11908215B2 (en) Information processing apparatus, information processing method, and storage medium
CN101840582B (en) Boundary digitizing method of cadastral plot
CN114419647A (en) Table information extraction method and system
CN115828874A (en) Industry table digital processing method based on image recognition technology
CN112329548A (en) Document chapter segmentation method and device and storage medium
JP2010108208A (en) Document processing apparatus
Das et al. Heuristic based script identification from multilingual text documents
US9798711B2 (en) Method and system for generating a graphical organization of a page
CN102883085B (en) Image processing apparatus and image processing method
CN113836878A (en) Table generation method and device combining RPA and AI, electronic equipment and storage medium
CN115147858A (en) Method, device, equipment and medium for generating image data of handwritten form

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination