CN114283436A - Table identification method, device, equipment and storage medium - Google Patents
Abstract
The embodiment of the invention relates to a table identification method, apparatus, device and storage medium. The table identification method comprises the following steps: determining category information of each table line in a drawing frame and coordinate information of each text block in the drawing frame; projecting the table lines onto corresponding coordinate axes according to the category information to obtain projection areas; determining at least one table area in the drawing frame according to the projection areas; determining the intersection coordinates of the table lines in each table area according to the category information; determining coordinate information of each cell area in each table area according to the intersection coordinates and the category information; and generating a mapping relation between each text block and each cell area according to the coordinate information of each text block and the coordinate information of each cell area in each table area. In this way, the tables in the drawing frame and the text inside them can be identified quickly, improving both the accuracy and the efficiency of identification.
Description
Technical Field
The embodiment of the invention relates to the field of data analysis, in particular to a table identification method, a table identification device, table identification equipment and a storage medium.
Background
When electronic drawings are archived and the table information in them is extracted, the contents, labels, and tables in the drawings must often be identified accurately. This work places high demands on recognition accuracy, yet the conventions for defining and drawing tables and labels vary widely from drawing to drawing, so achieving close-to-100% accurate identification of the contents is a significant challenge.
In the prior art, the assistance of an artificial intelligence model is required; the position and angle characteristics of tables within a drawing frame are not exploited, table lines are not grouped, and both the identification efficiency and the identification accuracy are low.
Disclosure of Invention
In view of this, in order to solve the technical problem of low recognition efficiency and low recognition accuracy, embodiments of the present invention provide a table recognition method, apparatus, device, and storage medium.
In a first aspect, an embodiment of the present invention provides a table identification method, including:
determining category information of each table line in a picture frame and determining coordinate information of each text block in the picture frame;
projecting the table lines to corresponding coordinate axes according to the category information to obtain a projection area;
determining at least one table area in the drawing frame according to the projection area;
determining the intersection point coordinates of the table lines in each table area according to the category information;
determining coordinate information of each cell area in each table area according to the intersection point coordinates and the category information;
and generating a mapping relation between each text block and each cell area according to the coordinate information of each text block and the coordinate information of each cell area in each table area.
In one possible embodiment, the determining of the category information of each table line in the frame and of the coordinate information of each text block in the frame includes:
determining the category information of a table line parallel to a horizontal frame of the picture frame as a first category;
determining the category information of the table line parallel to the vertical frame of the picture frame as a second category;
and determining the coordinates of each text block in the picture frame as the coordinate information of the text block.
In a possible implementation manner, the projecting the table line to a corresponding coordinate axis according to the category information to obtain a projection area includes:
projecting the table lines of the first category to an abscissa axis parallel to a horizontal frame of the picture frame to obtain at least one horizontal projection area;
based on each horizontal projection area, projecting the table lines of the second category whose abscissa lies in the horizontal projection area onto the ordinate axis parallel to the vertical border of the frame to obtain at least one vertical projection area.
In a possible embodiment, the determining at least one table area in the drawing frame according to the projection area includes:
and determining at least one table area in the drawing frame according to each horizontal projection area and at least one vertical projection area corresponding to the horizontal projection area.
In a possible embodiment, the determining coordinates of intersection points of table lines in each table area according to the category information includes:
and determining the intersection point coordinates of the table lines in the same layer in each table area according to the category information.
In one possible embodiment, the determining the coordinate information of each cell area in each table area according to the intersection point coordinate and the category information includes:
dividing the table lines into unit grid lines according to the intersection point coordinates and the category information;
determining the cells corresponding to the cell lines according to a second preset rule;
and determining the coordinate information of each cell area according to the first vertex coordinates and the second vertex coordinates of the cells.
In one possible embodiment, the method further comprises:
determining the text blocks in each cell area according to the mapping relation;
and combining the text blocks according to the first preset rule to obtain the text information.
In one possible embodiment, the method further comprises:
judging whether the distance between the mutually parallel table lines is smaller than a first threshold value; when the judgment result is yes, merging the mutually parallel table lines;
judging whether the distance between the end points of the two table lines is smaller than a second threshold value or not;
and when the judgment result is yes, connecting the end points of the two table lines.
In a second aspect, an embodiment of the present invention provides a table identification apparatus, including:
the determining module is used for determining the category information of each table line in a drawing frame and determining the coordinate information of each text block in the drawing frame;
the projection module is used for projecting the table lines to corresponding coordinate axes according to the category information to obtain a projection area;
the determining module is further configured to determine at least one table area in the drawing frame according to the projection area;
the determining module is further configured to determine intersection coordinates of the table lines in each table region according to the category information;
the determining module is further configured to determine coordinate information of each cell area in each table area according to the intersection point coordinates and the category information;
and the generating module is used for generating the mapping relation between the text blocks and each cell area according to the coordinate information of each text block and the coordinate information of each cell area in each table area.
In a third aspect, an embodiment of the present invention provides an apparatus, including: a processor and a memory, the processor being configured to execute an identification program stored in the memory to implement the identification method of any one of the above first aspects.
In a fourth aspect, an embodiment of the present invention provides a storage medium, where one or more programs are stored, and the one or more programs are executable by one or more processors to implement the identification method according to any one of the first aspects.
According to the identification scheme provided by the embodiment of the invention, the category information of each table line in the drawing frame is determined, and the coordinate information of each text block in the drawing frame is determined; the table lines are projected onto corresponding coordinate axes according to the category information to obtain projection areas; at least one table area in the drawing frame is determined according to the projection areas; the intersection coordinates of the table lines in each table area are determined according to the category information; the coordinate information of each cell area in each table area is determined according to the intersection coordinates and the category information; and the mapping relation between each text block and each cell area is generated according to the coordinate information of each text block and the coordinate information of each cell area in each table area. By determining the vertex coordinates of each cell sub-line from the position information and projection areas of the table lines, the cells of the table can be constructed accurately; the text blocks are then mapped into the cells according to their coordinates and grouped and spliced into accurate text content, which improves the identification accuracy and the identification efficiency of the table.
Drawings
Fig. 1 is a schematic flowchart of a table identification method according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating another table identification method according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a table recognition method according to another embodiment of the present invention;
FIG. 4 is a diagram illustrating a text block combination method according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a table identification apparatus according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of an apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
For the convenience of understanding of the embodiments of the present invention, the following description will be further explained with reference to specific embodiments, which are not to be construed as limiting the embodiments of the present invention.
Fig. 1 is a schematic flowchart of a table identification method according to an embodiment of the present invention, as shown in fig. 1, including:
s11, determining the category information of each table line in the picture frame, and determining the coordinate information of each text block in the picture frame.
The identification method provided by the embodiment of the invention is applied in drawing software or office software, for example CAD drawing software; a typical target is the table content in a dwg drawing, where the tables are vectorized tables. By parsing the content of the electronic drawing, the coordinate information and text of all text blocks can be obtained, as well as the start- and end-point coordinates of all table lines. The identification method of this scheme takes the split text content, coordinate positions, table lines, and related information; accurately constructs the cells of the table; and splices the text blocks into accurate text. Specifically, the angle characteristics and intersection information of the table lines within a drawing frame are used to determine the cell regions and the text belonging to each of them, thereby identifying the table information in the drawing. The table information may include the drawing contents, the drawing label, the table lines, the cells, and the tables themselves.
In this embodiment, the drawing frame is the border that delimits a drawing area on the drawing sheet; it is drawn around the drawing to mark its extent, and one CAD drawing may contain several drawing frames. The table lines are the line segments that make up a table inside the drawing frame. All straight lines and polylines within the drawing frame are collected, and each polyline, i.e. a line with at least one bend point, is broken into straight segments at its bend points to obtain the individual table lines. The category information indicates the category of a table line and may include: horizontal table lines parallel to the horizontal border of the frame, vertical table lines parallel to the vertical border of the frame, and other table lines. The category can be determined from the vertex coordinates of each table line or from its position within the frame: when the two vertices of a table line share the same abscissa, it is classified as a vertical table line; when they share the same ordinate, it is classified as a horizontal table line; and when both the abscissas and the ordinates of the two vertices differ, it is classified as one of the other table lines. The text in the drawing frame is divided into text blocks according to a preset text-splitting rule, which splits the text, following the combination rules of the characters, the reading habits of users, and the recognition rules of the computer, so that each single character or single word group obtained after splitting becomes one text block. The coordinate information indicates the coordinates of each text block within the drawing frame.
Further, identifying table lines in the picture frame, determining the category information of each table line according to a preset classification rule, splitting a text in the picture frame into text picture blocks according to a character splitting rule, and acquiring coordinates of the text picture blocks in the picture frame as coordinate information.
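As an illustrative sketch (not part of the claimed embodiment), the classification rule just described can be expressed in Python; the representation of a table line as a pair of vertices and the function name are assumptions made here for clarity:

```python
def classify_line(line, tol=1e-6):
    """Classify a table line by comparing the coordinates of its two vertices.

    Lines whose vertices share an ordinate are parallel to the horizontal
    border (first category); lines whose vertices share an abscissa are
    parallel to the vertical border (second category); all others are "other".
    """
    (x1, y1), (x2, y2) = line
    if abs(y1 - y2) <= tol:
        return "first"
    if abs(x1 - x2) <= tol:
        return "second"
    return "other"
```

With this sketch, `classify_line(((0, 2), (5, 2)))` yields `"first"`, matching the rule that equal ordinates indicate a horizontal table line.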
And S12, projecting the table lines to corresponding coordinate axes according to the category information to obtain a projection area.
In this embodiment, the coordinate axes are an abscissa axis parallel to the horizontal border of the frame and an ordinate axis parallel to its vertical border. The projection may proceed as follows: the table lines parallel to the abscissa axis are projected onto it, and projections that overlap on the abscissa axis are merged into a single projection area, yielding the horizontal projection areas; likewise, the table lines parallel to the ordinate axis are projected onto it, and overlapping projections are merged, yielding the vertical projection areas.
Further, for table lines that are not parallel to either coordinate axis, the axis onto which they are projected can be chosen according to their horizontal tilt angle: a line closer to horizontal is projected onto the abscissa axis, and a line closer to vertical onto the ordinate axis. In an alternative embodiment, table lines whose tilt angle meets a preset angle threshold may simply be ignored (for example, table lines at more than 20 degrees and less than 80 degrees from horizontal).
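A minimal sketch of the projection-and-merge step in Python, assuming table lines are given as vertex pairs and that steeply tilted lines are dropped per the angle threshold; the function name and the 20-degree default are illustrative:

```python
import math

def project_to_x_axis(lines, max_skew_deg=20.0):
    """Project near-horizontal table lines onto the abscissa axis and merge
    overlapping projections into horizontal projection areas."""
    intervals = []
    for (x1, y1), (x2, y2) in lines:
        # Tilt angle from horizontal; skip lines steeper than the threshold.
        angle = math.degrees(math.atan2(abs(y2 - y1), abs(x2 - x1)))
        if angle > max_skew_deg:
            continue
        intervals.append((min(x1, x2), max(x1, x2)))
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlapping projections are combined into one projection area.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]
```

The vertical projection areas can be obtained symmetrically by swapping the roles of the two coordinates.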
S13, determining at least one table area in the drawing frame according to the projection area; and determining the intersection point coordinates of the table lines in each table area according to the category information.
In this embodiment, the table area is the region occupied by each table in the frame. Rectangular areas formed in the frame are determined from each horizontal projection area and each vertical projection area, and overlapping rectangles may be merged to obtain the table areas. There may be one or more table areas, each containing table lines. The intersections of the horizontal and vertical table lines are determined according to the category information of each table line, and the coordinates of the intersections within each table area are taken as the intersection coordinates.
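As a hedged sketch of how a horizontal projection area and its vertical projection areas bound candidate table regions (the tuple layout `(xmin, ymin, xmax, ymax)` and the function name are assumptions of this illustration):

```python
def table_regions(h_area, v_areas):
    """Combine one horizontal projection area with its corresponding vertical
    projection areas; each pairing bounds one candidate table region."""
    x0, x1 = h_area
    return [(x0, y0, x1, y1) for y0, y1 in v_areas]
```

For example, a horizontal area spanning x = 0..10 with vertical areas y = 0..4 and y = 6..9 yields two stacked candidate table regions.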
And S14, determining the coordinate information of each cell area in each table area according to the intersection point coordinates and the category information.
In this embodiment, the vertical and horizontal table lines on which an intersection lies are divided at the intersection positions into single cell sub-lines. A cell sub-line is one of the line segments of a minimal rectangular cell of the table, which consists of two vertical grid lines and two horizontal grid lines; these four grid lines, joined at four adjacent intersections, form a cell. The region bounded by the coordinates of the four intersections of a cell is taken as the cell region; the cell regions, connected through their corresponding four intersections, generate the corresponding table; and the coordinate information of each cell region is determined from the coordinates of the two diagonally opposite intersections among its four corners.
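A small illustrative sketch of dividing one horizontal table line into cell sub-lines at the intersection points lying on it; the data layout (a line as two endpoints, intersections as coordinate pairs) is an assumption of this example:

```python
def split_at_points(line_start, line_end, points):
    """Split a horizontal table line into cell sub-lines at the intersection
    points whose ordinate matches the line's ordinate."""
    y = line_start[1]
    # Collect the line's own endpoints plus every intersection abscissa on it.
    xs = sorted({line_start[0], line_end[0], *(px for px, py in points if py == y)})
    return [((xs[i], y), (xs[i + 1], y)) for i in range(len(xs) - 1)]
```

Vertical table lines can be split the same way using the ordinates of the intersections instead.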
And S15, generating a mapping relation between each text block and each cell area according to the coordinate information of each text block and the coordinate information of each cell area in each table area.
In this embodiment, the mapping relation associates one or more text blocks with one cell area: it is judged whether the coordinate information of a text block lies within the coordinate range of a cell area, and if so, a mapping relation is established between the text block and that cell. The mapping relation is used to determine the text blocks belonging to each cell area.
Specifically, the text blocks are first grouped by table: text blocks whose coordinate information falls inside the same table area form one group. Then, for each cell of each table area, the text blocks that fall inside that cell are screened out, and a mapping relation is established between each screened text block and the cell containing it. Because the screening only considers the text blocks of the current table area, the screening efficiency is improved.
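A minimal sketch of the coordinate-containment mapping described above, assuming each text block is reduced to a single insertion point and cells are stored by their bounding coordinates; all names are illustrative:

```python
def map_text_to_cells(text_blocks, cells):
    """Map each text block (name -> (x, y) insertion point) to the cell
    region whose coordinate range contains it."""
    mapping = {cid: [] for cid in cells}
    for name, (x, y) in text_blocks.items():
        for cid, (x0, y0, x1, y1) in cells.items():
            if x0 <= x <= x1 and y0 <= y <= y1:
                mapping[cid].append(name)
                break  # a text block belongs to at most one cell
    return mapping
```

In practice one would first restrict `cells` to those of the table area containing the block, which is the screening optimization mentioned above.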
In an alternative scheme of this embodiment, a first preset rule is preset, where the first preset rule is used to combine text blocks in each cell region, and the first preset rule may be to combine and arrange text blocks according to reading habits, character arrangement rules, phrase meanings, and the like, and to combine and arrange text blocks in corresponding cell regions according to the first preset rule to obtain the text information.
According to the identification scheme provided by the embodiment of the invention, the category information of each table line in the picture frame is determined, and the coordinate information of each text picture block in the picture frame is determined; projecting the table lines to corresponding coordinate axes according to the category information to obtain a projection area; determining at least one table area in the drawing frame according to the projection area; determining the intersection point coordinates of the table lines in each table area according to the category information; determining coordinate information of each cell area in each table area according to the intersection point coordinates and the category information; and generating a mapping relation between each text block and each cell area according to the coordinate information of each text block and the coordinate information of each cell area in each table area. The table is divided and recognized according to the category of the table line, each cell and the text information corresponding to the cell are recognized, and the recognition accuracy and the recognition efficiency of the table are improved.
Fig. 2 is a schematic flow chart of another table identification method according to an embodiment of the present invention, as shown in fig. 2, including:
s21, determining the category information of the table line parallel to the horizontal frame of the drawing frame as a first category; determining the category information of the table line parallel to the vertical frame of the picture frame as a second category; and determining the coordinates of each text block in the picture frame as the coordinate information of the text block.
In this embodiment, a drawing may include a plurality of drawing frames, and the table information in each frame of the drawing can be identified. The table lines are the line segments that make up a table inside the frame; all table lines in the frame are obtained, and they may include straight lines and polylines, where a polyline is a line with at least one bend point and is broken into straight segments at its bend points. The category information of all table lines is then determined according to a preset classification rule.
Furthermore, since the drawing frames in a drawing are all rectangular, the outer border of a frame comprises two horizontal borders and two vertical borders. Based on the feature that table lines in a frame are generally parallel to its outer border, the classification rule may be: determine the position of each table line in the frame from its vertex coordinates, classify the table lines parallel to the horizontal border of the frame as the first category, and classify the table lines parallel to the vertical border of the frame as the second category; the first category may be called pseudo-horizontal table lines and the second category pseudo-vertical table lines.
In an alternative embodiment, table lines parallel to the long sides of the frame and table lines parallel to the short sides of the frame are determined according to the long sides and the short sides of the frame, and the table lines in a frame range are divided into two groups, which are respectively parallel to the long sides or the short sides of the rectangular frame.
For example, in the case of a frame in which the long side of the frame is horizontally disposed, the table line parallel to the long side is referred to as a pseudo-horizontal table line, and the table line parallel to the short side is referred to as a pseudo-vertical table line.
In this embodiment, the text in the frame is divided into text blocks according to a preset text splitting rule, where the text splitting rule is used to split the text into single characters or single word groups as the text blocks according to a combination rule of the characters, a reading habit of a user, and an identification rule of a computer. The coordinate information is used to indicate coordinates of the text tile in the frame.
And S22, projecting the table lines of the first category to an abscissa axis parallel to a horizontal frame of the drawing frame to obtain at least one horizontal projection area.
In this embodiment, the abscissa axis is the coordinate axis parallel to a horizontal border of the frame. The table lines of the first category are projected onto the abscissa axis; the horizontal projection lines are the segments projected onto the axis, and mutually overlapping projection lines are merged into a single projection line, yielding at least one horizontal projection area.
And S23, based on each horizontal projection area, projecting the table lines of the second category whose abscissa lies in the horizontal projection area onto the ordinate axis parallel to the vertical border of the frame to obtain at least one vertical projection area.
In this embodiment, the ordinate axis is the coordinate axis parallel to a vertical border of the frame. For the portion of the frame corresponding to each horizontal projection area, a rectangular region is formed with the horizontal projection area as its base and the frame height as its height. The second-category table lines whose abscissas lie in that rectangular region are projected onto the ordinate axis to obtain vertical projection lines, i.e. the segments projected onto the ordinate axis; mutually overlapping vertical projection lines are merged into a single projection line, yielding at least one vertical projection area.
Further, for table lines that are not parallel to either coordinate axis, the axis onto which they are projected can be chosen according to their horizontal tilt angle: a line closer to horizontal is projected onto the abscissa axis, and a line closer to vertical onto the ordinate axis. In an alternative embodiment, table lines whose tilt angle meets a preset angle threshold may be ignored.
S24, determining at least one table area in the drawing frame according to each horizontal projection area and at least one vertical projection area corresponding to the horizontal projection area; and determining the intersection point coordinates of the table lines in the same layer in each table area according to the category information.
In this embodiment, the table area is the region occupied by each table in the frame. A rectangular area is formed in the frame for each horizontal projection area and its corresponding vertical projection area; this rectangle is the outer bounding box of a suspected table, and every table of the frame lies within a corresponding table area. There may be one or more table areas. Within each table area, the intersections of the horizontal and vertical table lines are determined according to the category information of each table line, and the coordinates of those intersections in the frame are taken as the intersection coordinates.
Specifically, all the table lines may be divided into a plurality of groups according to the table area. And determining the table lines in the same table area and the same layer as a same group, gradually solving intersection points of the table lines in the same group according to the first category and the second category, and determining intersection point coordinates according to coordinates of the intersection points in a picture frame.
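The group-wise intersection step can be sketched as follows; the representation of a first-category line as `(y, x_start, x_end)` and a second-category line as `(x, y_start, y_end)` is an assumption of this illustration:

```python
def grid_intersections(h_lines, v_lines, tol=1e-6):
    """Intersect first-category lines (y, x_start, x_end) with second-category
    lines (x, y_start, y_end) belonging to the same table area and layer."""
    points = []
    for y, hx1, hx2 in h_lines:
        for x, vy1, vy2 in v_lines:
            # The two lines cross when each line's constant coordinate
            # falls inside the other line's span.
            if hx1 - tol <= x <= hx2 + tol and vy1 - tol <= y <= vy2 + tol:
                points.append((x, y))
    return points
```

Grouping the lines by table area first keeps each pairwise pass small, which is the efficiency benefit of the grouping described above.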
In an optional embodiment, a first threshold is preset, and it is judged whether the distance between mutually parallel or adjacent table lines in the same layer is smaller than the first threshold. When it is, one of the two table lines is assumed to result from a drawing error, so the two lines are merged, or the duplicate table line is removed, leaving a single line.
A second threshold is also preset, and it is judged whether the distance between the end points of two table lines in the same layer is smaller than the second threshold. When it is, the gap between the two end points is assumed to result from an error, so the end points are connected and the two table lines are fused into one; that is, mistakenly broken table lines are joined back together.
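A hedged sketch of the two threshold-based repairs, assuming parallel lines can be compared by their constant offset and broken lines by their facing endpoints; the function names and data shapes are illustrative:

```python
import math

def dedupe_parallel(offsets, first_threshold):
    """Drop near-duplicate parallel lines whose offsets differ by less than
    the first threshold (assumed to be caused by drawing errors)."""
    kept = []
    for off in sorted(offsets):
        if kept and off - kept[-1] < first_threshold:
            continue  # treat as a duplicate of the previous line
        kept.append(off)
    return kept

def fuse_segments(a, b, second_threshold):
    """Connect two collinear segments whose facing endpoints are closer than
    the second threshold; return the fused segment, or None if too far."""
    gap = math.hypot(b[0][0] - a[1][0], b[0][1] - a[1][1])
    return (a[0], b[1]) if gap < second_threshold else None
```

Both thresholds would in practice be chosen relative to the drawing's line spacing.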
S25, dividing the table lines into cell grid lines according to the intersection point coordinates and the category information; determining the cells corresponding to the cell lines according to a second preset rule; and determining the coordinate information of each cell area according to the first vertex coordinates and the second vertex coordinates of the cells.
In this embodiment, a second preset rule is used to generate the cell corresponding to a set of cell sub-lines. The vertical and horizontal table lines on which an intersection lies are divided at the intersection positions into single cell sub-lines; a cell sub-line is one of the line segments of a minimal rectangular cell of the table, which consists of two vertical grid lines and two horizontal grid lines. Under the second rule, adjacent horizontal and vertical grid lines that are parallel and of equal length within a tolerance range are connected to form a cell, and the first and second vertex coordinates of the cell are recorded; these may be its lower-left and upper-right vertices, or its upper-left and lower-right vertices. The rectangular region enclosed by the first and second vertex coordinates is taken as the cell region, and the cell areas of a table area, connected through their corresponding four intersections, generate the corresponding table.
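A minimal sketch of building cells from the intersection grid, storing each cell by its two diagonal vertices as described above; working from intersection points rather than grid lines is a simplification of this illustration:

```python
def cells_from_grid(points):
    """Build minimal rectangular cells whose four corners are all present in
    the intersection set; each cell is stored by two diagonal vertices."""
    xs = sorted({x for x, _ in points})
    ys = sorted({y for _, y in points})
    pts = set(points)
    cells = []
    for i in range(len(xs) - 1):
        for j in range(len(ys) - 1):
            corners = {(xs[i], ys[j]), (xs[i + 1], ys[j]),
                       (xs[i], ys[j + 1]), (xs[i + 1], ys[j + 1])}
            if corners <= pts:  # all four corners exist -> a valid cell
                cells.append(((xs[i], ys[j]), (xs[i + 1], ys[j + 1])))
    return cells
```

A grid position with a missing corner (e.g. a merged cell spanning two columns) simply produces no minimal cell there in this sketch.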
And S26, generating a mapping relation between each text block and each cell area according to the coordinate information of each text block and the coordinate information of each cell area in each table area.
In this embodiment, this step corresponds to the content of step S15; for the related description, please refer to S15, which is not repeated here.
S27, determining the text blocks in each cell area according to the mapping relation; and combining the text blocks according to the first preset rule to obtain the text information of each table.
In this embodiment, a first preset rule is set in advance for arranging and combining the text blocks in each cell area. The first preset rule may arrange and combine text blocks according to reading habits, character arrangement rules, phrase meanings, the coordinates of the characters, and the like; the text blocks in each cell area are arranged and combined according to this rule to obtain the text information.
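One way such a first preset rule might look in code is sketched below: a hypothetical rule that orders blocks purely by coordinates (top to bottom, then left to right within a line), ignoring the semantic cues such as phrase meaning mentioned above:

```python
def combine_text_blocks(blocks, line_tol=2.0):
    """Combine the text blocks of one cell in reading order.

    Each block is (x, y, text), with y growing upward; blocks whose
    y values differ by less than `line_tol` are treated as one line.
    """
    # sort top-to-bottom (descending y), then left-to-right
    blocks = sorted(blocks, key=lambda b: (-b[1], b[0]))
    lines, current, last_y = [], [], None
    for x, y, text in blocks:
        if last_y is not None and abs(y - last_y) > line_tol:
            lines.append(current)   # y jumped: start a new line
            current = []
        current.append((x, text))
        last_y = y
    if current:
        lines.append(current)
    # within each line, order by x, then join everything
    return " ".join(t for line in lines for _, t in sorted(line))
```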
The identification method provided by the embodiment of the invention projects the table lines onto the corresponding coordinate axes to obtain projection areas, determines the table areas where the tables are located from the projection areas, then obtains the intersection points of the table lines in each table area, determines the cell areas and the tables from the intersection points according to preset rules, and finally determines the text information in each table from the mapping relation between the cell areas and the text blocks according to preset rules. Identification of the table can thus be completed using the vectorized information of the table lines in the drawing frame, without a complex AI recognition model, which simplifies the steps of vectorized table recognition and improves the accuracy and efficiency of table identification.
The following describes an example with reference to the drawing frames in figs. 3 and 4:
FIG. 3 is a diagram illustrating a table recognition method according to another embodiment of the present invention;
fig. 4 is a block diagram illustrating a text tile combination method according to an embodiment of the present invention.
The specific method is as follows. All the table lines of the drawing frame in fig. 3 are determined; as shown in fig. 3, horizontal table lines are parallel to the x axis and vertical table lines are parallel to the y axis. The horizontal table lines are projected onto the x axis, overlapping projection segments are fused to obtain horizontal projection areas, and from these two x-axis partitions are obtained, named X1 and X2 respectively. After x-axis partitioning is finished, the vertical table lines within each partition are projected onto the y axis, overlapping projection segments are fused to obtain vertical projection areas, and the y axis is segmented accordingly. Taking partition X1 as an example, projecting the vertical line segments in X1 onto the y axis yields two partitions, Y1 and Y2. The rectangle formed by the X1 and Y1 segments is then the outer bounding box of the first suspected table, at the upper-left corner, and the rectangle formed by X1 and Y2 is the outer bounding box of the second suspected table. The main purpose of this step is to improve the performance of the recognition algorithm and to obtain an outer bounding box for each suspected table.
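The projection-and-fusion step above amounts to a one-dimensional interval merge; the (start, end) interval representation is an assumption for illustration:

```python
def merge_projections(segments):
    """Fuse overlapping 1-D projection segments into disjoint
    projection areas. `segments` are (start, end) intervals obtained
    by projecting table lines onto one coordinate axis."""
    segments = sorted(segments)
    areas = []
    for start, end in segments:
        if areas and start <= areas[-1][1]:
            # overlapping or touching: fuse into the previous area
            areas[-1][1] = max(areas[-1][1], end)
        else:
            areas.append([start, end])
    return [tuple(a) for a in areas]
```

Applied to the x-axis projections of the horizontal table lines in fig. 3, two disjoint areas would remain, corresponding to the X1 and X2 partitions.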
The coordinate range of each bounding box is obtained, and all the table lines can be divided into several groups according to the outer bounding boxes. It is then judged whether the table lines in the same layer within the same group contain erroneous or repeated table lines; erroneous table lines are removed, and repeated table lines are fused into a single straight line. The intersection points of the horizontal and vertical table lines are then determined in turn.
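Once erroneous and duplicate lines have been cleaned up, determining the intersection points of the horizontal and vertical table lines in a group reduces to a pairwise overlap test — a sketch under an assumed endpoint representation, with a small tolerance for lines that stop just short of each other:

```python
def line_intersections(h_lines, v_lines, tol=0.5):
    """Intersection points between horizontal lines ((x1, y), (x2, y))
    and vertical lines ((x, y1), (x, y2)) of the same group."""
    points = []
    for (hx1, hy), (hx2, _) in h_lines:
        for (vx, vy1), (_, vy2) in v_lines:
            # the vertical line's x must fall on the horizontal span,
            # and the horizontal line's y on the vertical span
            if (min(hx1, hx2) - tol <= vx <= max(hx1, hx2) + tol and
                    min(vy1, vy2) - tol <= hy <= max(vy1, vy2) + tol):
                points.append((vx, hy))
    return points
```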
After all the intersection points in a table are obtained, the table line on which each intersection point lies is divided into cell sub-lines at the intersection points. All the cells are obtained according to the principle that adjacent horizontal or vertical cell sub-lines must be parallel and of equal length within the tolerance range; the lower-left and upper-right vertex coordinates of each cell are recorded, and the cells are connected together to obtain each table.
The coordinate position information and text information of all the text blocks within each outer bounding box are acquired, and the coordinate information of the text blocks is matched with the coordinate information of the cells. The text information corresponding to multiple text blocks in the same cell is then spliced according to a certain rule, and a mapping relation between the text and the closed cell is established. The character splicing rule means that multiple text blocks in the same cell are spliced together in order, according to the reading direction of the characters and their coordinates, to obtain the text information in the table. As shown in fig. 4, the text blocks in the cell are arranged and combined in reading order, from left to right and then from top to bottom, and the spliced text information is: "text A, text B, text C, text D, text E". Finally, the identified table and its text information are obtained.
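Matching a text block's coordinates against the cells is a point-in-rectangle test on the recorded vertex pairs. A minimal sketch — the block anchor points, cell list, and lower-left/upper-right convention are assumptions for illustration:

```python
def map_blocks_to_cells(blocks, cells):
    """Map each text block to the index of the cell containing its
    anchor point. `blocks` is {name: (x, y)}; `cells` is a list of
    ((x1, y1), (x2, y2)) lower-left / upper-right vertex pairs."""
    mapping = {}
    for name, (bx, by) in blocks.items():
        for idx, ((x1, y1), (x2, y2)) in enumerate(cells):
            if x1 <= bx <= x2 and y1 <= by <= y2:
                mapping[name] = idx  # block falls inside this cell
                break
    return mapping
```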
Fig. 5 is a table identifying apparatus according to an embodiment of the present invention, as shown in fig. 5, including:
a determining module 51, configured to determine category information of each table line in a drawing frame, and determine coordinate information of each text tile in the drawing frame;
the projection module 52 is configured to project the table lines to corresponding coordinate axes according to the category information to obtain a projection area;
the determining module 51 is further configured to determine at least one table area in the drawing frame according to the projection area;
the determining module 51 is further configured to determine, according to the category information, intersection coordinates of table lines in each table region;
the determining module 51 is further configured to determine coordinate information of each cell area in each table area according to the intersection point coordinate and the category information;
and a generating module 53, configured to generate a mapping relationship between each text tile and each cell area according to the coordinate information of each text tile and the coordinate information of each cell area in each table area.
In a possible embodiment, the determining module 51 is specifically configured to determine that the category information of the table line parallel to the horizontal frame of the drawing frame is a first category;
determining the category information of the table line parallel to the vertical frame of the picture frame as a second category;
and determining the coordinates of each text block in the picture frame as the coordinate information of the text block.
In a possible embodiment, the projection module 52 is specifically configured to project the table lines of the first category onto an abscissa axis parallel to a horizontal frame of the drawing frame to obtain at least one horizontal projection area;
based on each horizontal projection area, projecting the tabular lines of the second category with the abscissa located in the horizontal projection area to the ordinate axis parallel to the vertical frame of the picture frame to obtain at least one vertical projection area.
In a possible embodiment, the determining module 51 is specifically configured to determine at least one table area in the drawing frame according to each of the horizontal projection areas and at least one vertical projection area corresponding to the horizontal projection area.
In a possible implementation manner, the determining module 51 is specifically configured to determine the intersection coordinates of the table lines in the same layer in each table area according to the category information.
In a possible embodiment, the determining module 51 is specifically configured to divide the table lines into cell grid lines according to the intersection coordinates and the category information;
determining the cells corresponding to the cell lines according to a second preset rule;
and determining the coordinate information of each cell area according to the first vertex coordinates and the second vertex coordinates of the cells.
In a possible embodiment, the determining module 51 is specifically configured to determine a text tile in each cell area according to the mapping relationship;
and combining the text blocks according to the first preset rule to obtain the text information.
In a possible embodiment, the determining module 51 is specifically configured to determine whether a distance between mutually parallel table lines is smaller than a first threshold, and when the judgment result is yes, merge the mutually parallel table lines;
judge whether the distance between the end points of two table lines is smaller than a second threshold;
and when the judgment result is yes, connect the end points of the two table lines.
The table identification apparatus provided in this embodiment may be the apparatus shown in fig. 5, and may perform all the steps of the identification method shown in figs. 1 and 2, so as to achieve the technical effects of that method; for brevity, please refer to the description related to figs. 1 and 2, which is not repeated here.
Fig. 6 is a schematic structural diagram of an apparatus according to an embodiment of the present invention, where the apparatus 600 shown in fig. 6 includes: at least one processor 601, memory 602, at least one network interface 604, and other user interfaces 603. The various components in the device 600 are coupled together by a bus system 605. It is understood that the bus system 605 is used to enable communications among the components. The bus system 605 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 605 in fig. 6.
The user interface 603 may include, among other things, a display, a keyboard, or a pointing device (e.g., a mouse, trackball, touch pad, or touch screen).
It will be appreciated that the memory 602 in embodiments of the invention may be either volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash memory. The volatile memory may be Random Access Memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and Direct Rambus RAM (DRRAM). The memory 602 described herein is intended to comprise, without being limited to, these and any other suitable types of memory.
In some embodiments, memory 602 stores the following elements, executable units or data structures, or a subset thereof, or an expanded set thereof: an operating system 6021 and application programs 6022.
The operating system 6021 includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, and is used for implementing various basic services and processing hardware-based tasks. The application program 6022 includes various application programs such as a Media Player (Media Player), a Browser (Browser), and the like, and is used to implement various application services. A program implementing the method of an embodiment of the invention can be included in the application program 6022.
In the embodiment of the present invention, by calling a program or an instruction stored in the memory 602, specifically, a program or an instruction stored in the application program 6022, the processor 601 is configured to execute the method steps provided by the method embodiments, for example, including:
determining category information of each table line in a picture frame and determining coordinate information of each text block in the picture frame;
projecting the table lines to corresponding coordinate axes according to the category information to obtain a projection area;
determining at least one table area in the drawing frame according to the projection area;
determining the intersection point coordinates of the table lines in each table area according to the category information;
determining coordinate information of each cell area in each table area according to the intersection point coordinates and the category information;
and generating a mapping relation between each text block and each cell area according to the coordinate information of each text block and the coordinate information of each cell area in each table area.
In one possible embodiment, the category information of the table line parallel to the horizontal frame of the drawing frame is determined as a first category;
determining the category information of the table line parallel to the vertical frame of the picture frame as a second category;
and determining the coordinates of each text block in the picture frame as the coordinate information of the text block.
In a possible embodiment, the table lines of the first category are projected to an abscissa axis parallel to a horizontal border of the drawing frame to obtain at least one horizontal projection area;
based on each horizontal projection area, projecting the tabular lines of the second category with the abscissa located in the horizontal projection area to the ordinate axis parallel to the vertical frame of the picture frame to obtain at least one vertical projection area.
In one possible embodiment, at least one table area in the drawing frame is determined according to each of the horizontal projection areas and at least one vertical projection area corresponding to the horizontal projection area.
In a possible implementation manner, the intersection point coordinates of the table lines in the same layer in each table region are determined according to the category information.
In one possible embodiment, the table lines are divided into cell sub-lines according to the intersection point coordinates and the category information;
determining the cells corresponding to the cell lines according to a second preset rule;
and determining the coordinate information of each cell area according to the first vertex coordinates and the second vertex coordinates of the cells.
In one possible implementation, determining a text tile in each cell area according to the mapping relation;
and combining the text blocks according to the first preset rule to obtain the text information.
In one possible embodiment, it is determined whether the distance between mutually parallel table lines is smaller than a first threshold; when the judgment result is yes, the mutually parallel table lines are merged;
it is judged whether the distance between the end points of two table lines is smaller than a second threshold;
and when the judgment result is yes, the end points of the two table lines are connected.
The method disclosed by the above-mentioned embodiment of the present invention can be applied to the processor 601, or implemented by the processor 601. The processor 601 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or instructions in the form of software in the processor 601. The processor 601 may be a general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. The various methods, steps, and logic blocks disclosed in the embodiments of the present invention may be implemented or performed. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present invention may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software elements in the decoding processor. The software elements may be located in RAM, flash memory, ROM, PROM, or EPROM, registers, or other storage media well known in the art. The storage medium is located in the memory 602, and the processor 601 reads the information in the memory 602 and completes the steps of the method in combination with its hardware.
It is to be understood that the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or any combination thereof. For a hardware implementation, the Processing units may be implemented within one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), general purpose processors, controllers, micro-controllers, microprocessors, other electronic units configured to perform the functions described herein, or a combination thereof.
For a software implementation, the techniques described herein may be implemented by means of units performing the functions described herein. The software codes may be stored in a memory and executed by a processor. The memory may be implemented within the processor or external to the processor.
The device provided in this embodiment may be the device shown in fig. 6, and may perform all the steps of the recognition method shown in fig. 1 and 2, so as to achieve the technical effect of the recognition method shown in fig. 1 and 2, which is described with reference to fig. 1 and 2 for brevity, and is not described herein again.
The embodiment of the invention also provides a storage medium (computer readable storage medium). The storage medium herein stores one or more programs. Among others, the storage medium may include volatile memory, such as random access memory; the memory may also include non-volatile memory, such as read-only memory, flash memory, a hard disk, or a solid state disk; the memory may also comprise a combination of memories of the kind described above.
One or more programs in the storage medium are executable by one or more processors to implement the above-described identification method performed on the device side.
The processor is configured to execute the identification program stored in the memory to implement the following steps of the identification method performed on the device side:
determining category information of each table line in a picture frame and determining coordinate information of each text block in the picture frame;
projecting the table lines to corresponding coordinate axes according to the category information to obtain a projection area;
determining at least one table area in the drawing frame according to the projection area;
determining the intersection point coordinates of the table lines in each table area according to the category information;
determining coordinate information of each cell area in each table area according to the intersection point coordinates and the category information;
and generating a mapping relation between each text block and each cell area according to the coordinate information of each text block and the coordinate information of each cell area in each table area.
In one possible embodiment, the category information of the table line parallel to the horizontal frame of the drawing frame is determined as a first category;
determining the category information of the table line parallel to the vertical frame of the picture frame as a second category;
and determining the coordinates of each text block in the picture frame as the coordinate information of the text block.
In a possible embodiment, the table lines of the first category are projected to an abscissa axis parallel to a horizontal border of the drawing frame to obtain at least one horizontal projection area;
based on each horizontal projection area, projecting the tabular lines of the second category with the abscissa located in the horizontal projection area to the ordinate axis parallel to the vertical frame of the picture frame to obtain at least one vertical projection area.
In one possible embodiment, at least one table area in the drawing frame is determined according to each of the horizontal projection areas and at least one vertical projection area corresponding to the horizontal projection area.
In a possible implementation manner, the intersection point coordinates of the table lines in the same layer in each table region are determined according to the category information.
In one possible embodiment, the table lines are divided into cell sub-lines according to the intersection point coordinates and the category information;
determining the cells corresponding to the cell lines according to a second preset rule;
and determining the coordinate information of each cell area according to the first vertex coordinates and the second vertex coordinates of the cells.
In one possible implementation, determining a text tile in each cell area according to the mapping relation;
and combining the text blocks according to the first preset rule to obtain the text information.
In one possible embodiment, it is determined whether the distance between mutually parallel table lines is smaller than a first threshold; when the judgment result is yes, the mutually parallel table lines are merged;
it is judged whether the distance between the end points of two table lines is smaller than a second threshold;
and when the judgment result is yes, the end points of the two table lines are connected.
Those of skill would further appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, the various illustrative components and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware, a software module executed by a processor, or a combination of the two. A software module may reside in Random Access Memory (RAM), Read-Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (11)
1. A method for identifying a form, comprising:
determining category information of each table line in a picture frame and determining coordinate information of each text block in the picture frame;
projecting the table lines to corresponding coordinate axes according to the category information to obtain a projection area;
determining at least one table area in the drawing frame according to the projection area;
determining the intersection point coordinates of the table lines in each table area according to the category information;
determining coordinate information of each cell area in each table area according to the intersection point coordinates and the category information;
and generating a mapping relation between each text block and each cell area according to the coordinate information of each text block and the coordinate information of each cell area in each table area.
2. The method of claim 1, wherein determining category information for each table line in a frame and determining coordinate information for each text tile in the frame comprises:
determining the category information of a table line parallel to a horizontal frame of the picture frame as a first category;
determining the category information of the table line parallel to the vertical frame of the picture frame as a second category;
and determining the coordinates of each text block in the picture frame as the coordinate information of the text block.
3. The method according to claim 2, wherein the projecting the table lines to the corresponding coordinate axes according to the category information to obtain a projection area comprises:
projecting the table lines of the first category to an abscissa axis parallel to a horizontal frame of the picture frame to obtain at least one horizontal projection area;
based on each horizontal projection area, projecting the tabular lines of the second category with the abscissa located in the horizontal projection area to the ordinate axis parallel to the vertical frame of the picture frame to obtain at least one vertical projection area.
4. The method according to claim 3, wherein said determining at least one table region in said frame according to said projection region comprises:
and determining at least one table area in the drawing frame according to each horizontal projection area and at least one vertical projection area corresponding to the horizontal projection area.
5. The method of claim 1, wherein determining coordinates of intersection points of table lines in each table region according to the category information comprises:
and determining the intersection point coordinates of the table lines in the same layer in each table area according to the category information.
6. The method of claim 5, wherein determining coordinate information for each cell area within each of the table areas based on the intersection coordinates and the category information comprises:
dividing the table lines into unit grid lines according to the intersection point coordinates and the category information;
determining the cells corresponding to the cell lines according to a second preset rule;
and determining the coordinate information of each cell area according to the first vertex coordinates and the second vertex coordinates of the cells.
7. The method of claim 1, further comprising:
determining a text image block in each unit cell area according to the mapping relation;
and combining the text blocks according to the first preset rule to obtain the text information.
8. The method of claim 1, further comprising:
judging whether the distance between the mutually parallel table lines is smaller than a first threshold value; when the judgment result is yes, merging the mutually parallel table lines;
judging whether the distance between the end points of the two table lines is smaller than a second threshold value or not;
and when the judgment result is yes, connecting the end points of the two table lines.
9. A form recognition apparatus, comprising:
the determining module is used for determining the category information of each table line in a picture frame and determining the coordinate information of each text picture block in the picture frame;
the projection module is used for projecting the table lines to corresponding coordinate axes according to the category information to obtain a projection area;
the determining module is further configured to determine at least one table area in the drawing frame according to the projection area;
the determining module is further configured to determine intersection coordinates of the table lines in each table region according to the category information;
the determining module is further configured to determine coordinate information of each cell area in each table area according to the intersection point coordinates and the category information;
and the generating module is used for generating the mapping relation between the text image blocks and each cell area according to the coordinate information of each text image block and the coordinate information of each cell area in each table area.
10. An apparatus, comprising: a processor and a memory, the processor being configured to execute an identification program stored in the memory to implement the identification method of any one of claims 1 to 8.
11. A storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the identification method of any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111577374.6A CN114283436A (en) | 2021-12-20 | 2021-12-20 | Table identification method, device, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111577374.6A CN114283436A (en) | 2021-12-20 | 2021-12-20 | Table identification method, device, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114283436A true CN114283436A (en) | 2022-04-05 |
Family
ID=80873908
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111577374.6A Pending CN114283436A (en) | 2021-12-20 | 2021-12-20 | Table identification method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114283436A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115331013A (en) * | 2022-10-17 | 2022-11-11 | 杭州恒生聚源信息技术有限公司 | Data extraction method and processing equipment for line graph |
CN115512385A (en) * | 2022-09-30 | 2022-12-23 | 中交第二航务工程局有限公司 | Bitmap-format reinforcing steel drawing data extraction method and system |
CN118097697A (en) * | 2024-03-26 | 2024-05-28 | 内蒙古电力勘测设计院有限责任公司 | Processing method, device and equipment for form image |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114283436A (en) | Table identification method, device, equipment and storage medium | |
US8549399B2 (en) | Identifying a selection of content in a structured document | |
CN111860502B (en) | Picture form identification method and device, electronic equipment and storage medium | |
CN110866930B (en) | Semantic segmentation auxiliary labeling method and device | |
JP2005316946A (en) | Layout-rule generation system, layout system, layout-rule generation program, layout program, storage medium, method of generating layout rule and method of layout | |
CN113343740B (en) | Table detection method, device, equipment and storage medium | |
CN107229439B (en) | Method and device for displaying picture | |
CN112348836A (en) | Method and device for automatically extracting building outline | |
CN111310254B (en) | CAD legend identification method and device, storage medium and electronic equipment | |
US20220043961A1 (en) | Facilitating dynamic document layout by determining reading order using document content stream cues | |
CN111857704A (en) | Code generation method and device for layout relationship | |
CN112926421B (en) | Image processing method and device, electronic equipment and storage medium | |
CN114529773A (en) | Form identification method, system, terminal and medium based on structural unit | |
CN114565703A (en) | Method, device and equipment for adjusting centralized labeling and readable storage medium | |
CN111523531A (en) | Word processing method and device, electronic equipment and computer readable storage medium | |
US11055526B2 (en) | Method, system and apparatus for processing a page of a document | |
TWM623309U (en) | English font image recognition system | |
JPH077456B2 (en) | Recognition device of figure by degree of polymerization | |
CN113538623A (en) | Method and device for determining target image, electronic equipment and storage medium | |
US10679049B2 (en) | Identifying hand drawn tables | |
CN113934875B (en) | Electrophoresis data identification method and system, computer storage medium and electronic equipment | |
CN114444185A (en) | In-situ labeling identification method and device and electronic equipment | |
CN111783180B (en) | Drawing splitting method and related device | |
US9373193B2 (en) | Method and apparatus for detecting and avoiding conflicts of space entity element annotations | |
CN114445840A (en) | Processing method and device of form text, electronic equipment and readable medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||