CN116311300A - Table generation method, apparatus, electronic device and storage medium - Google Patents

Table generation method, apparatus, electronic device and storage medium

Publication number
CN116311300A
Authority
CN
China
Prior art keywords
information
cell
text
target
obtaining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310172183.4A
Other languages
Chinese (zh)
Inventor
韩光耀
许海洋
岳洪达
王艺
苏磊
陈禹燊
段博坤
章良杰
李治平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202310172183.4A
Publication of CN116311300A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/412 Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • G06F40/106 Display of layout of documents; Previewing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/177 Editing, e.g. inserting or deleting of tables; using ruled lines
    • G06F40/18 Editing, e.g. inserting or deleting of tables; using ruled lines of spreadsheets
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/414 Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text

Abstract

The disclosure provides a table generation method, a table generation apparatus, an electronic device, and a storage medium, and relates to the technical field of computers, in particular to the technical field of optical character recognition. The specific implementation scheme is as follows: text content information and layout information of a target page are acquired, wherein the layout information of the target page comprises page parameter information, cell layout information, and text layout information; cell coordinate information is obtained according to the page parameter information, the cell layout information, and the text layout information; text coordinate information is obtained according to the text layout information and the cell coordinate information; and a target table and annotation data corresponding to the target table are generated according to the cell coordinate information, the text coordinate information, and the text content information.

Description

Table generation method, apparatus, electronic device and storage medium
Technical Field
The disclosure relates to the technical field of computers, in particular to the technical field of optical character recognition, and specifically relates to a table generation method, a table generation device, electronic equipment and a storage medium.
Background
The optical character recognition technology refers to the process of analyzing and processing an image file after scanning text data to acquire text and layout information.
With the development of optical character recognition technology, text data in a table image can be recognized and extracted by using a trained table structuring model. Training a table structuring model requires a large number of sample table images together with annotation data for those images.
Disclosure of Invention
The disclosure provides a method, a device, an electronic device and a storage medium for generating a table.
According to an aspect of the present disclosure, there is provided a table generating method including:
acquiring text content information and layout information of a target page, wherein the layout information of the target page comprises page parameter information, cell layout information, and text layout information; obtaining cell coordinate information according to the page parameter information, the cell layout information, and the text layout information; obtaining text coordinate information according to the text layout information and the cell coordinate information; and generating a target table and annotation data corresponding to the target table according to the cell coordinate information, the text coordinate information, and the text content information.
According to another aspect of the present disclosure, there is provided a table generating apparatus including an acquisition module, a first obtaining module, a second obtaining module, and a generating module. The acquisition module is used for acquiring text content information and layout information of a target page, wherein the layout information of the target page comprises page parameter information, cell layout information, and text layout information. The first obtaining module is used for obtaining cell coordinate information according to the page parameter information, the cell layout information, and the text layout information. The second obtaining module is used for obtaining text coordinate information according to the text layout information and the cell coordinate information. The generating module is used for generating a target table and annotation data corresponding to the target table according to the cell coordinate information, the text coordinate information, and the text content information.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described above.
According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform the method as described above.
According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements a method as described above.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 schematically illustrates an exemplary system architecture to which the table generation method and apparatus may be applied, according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flowchart of a table generation method according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates the generation of cell coordinate information according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates cells in a table according to an embodiment of the present disclosure;
FIG. 5 schematically illustrates the display effect of text in a cell as determined by the text arrangement, according to an embodiment of the present disclosure;
FIG. 6 schematically illustrates a table according to some embodiments of the present disclosure;
FIG. 7 schematically illustrates a table with border lines according to some embodiments of the present disclosure;
FIG. 8 schematically illustrates a table with stamps according to some embodiments of the present disclosure;
FIG. 9 schematically illustrates a table after filter processing according to some embodiments of the present disclosure;
FIG. 10 schematically illustrates a block diagram of a table generating apparatus according to an embodiment of the present disclosure; and
FIG. 11 schematically illustrates a block diagram of an electronic device adapted to implement a table generation method according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
When information is recognized and extracted from a table image using optical character recognition technology, both the text in the table image and the table structure must be recognized, and the recognition result must be structured according to its row and column information.
Therefore, before training a model that recognizes the text and table structure of a table image, a large number of sample table images must be acquired, and the text and table structure in those images must be labeled to obtain annotation data. The model is then trained using the sample table images and the annotation data.
However, in the related art, the annotation data is obtained by manual labeling, which is inefficient and yields results of low accuracy. This is especially true for table images in the financial field, such as balance sheets, cash flow statements, and profit statements, whose structure is complex and whose data is therefore very difficult to label manually. This directly affects the effect of model training.
In view of this, an embodiment of the present disclosure provides a table generating method, including:
acquiring text content information and layout information of a target page, wherein the layout information of the target page comprises page parameter information, cell layout information, and text layout information; obtaining cell coordinate information according to the page parameter information, the cell layout information, and the text layout information; obtaining text coordinate information according to the text layout information and the cell coordinate information; and generating a target table and annotation data corresponding to the target table according to the cell coordinate information, the text coordinate information, and the text content information. The method can generate structurally complex tables of the kind used in the financial field, and automatically generates the annotation data corresponding to a table while generating the table itself, without manual labeling.
Fig. 1 schematically illustrates an exemplary system architecture to which the table generation method and apparatus may be applied according to an embodiment of the present disclosure.
It should be noted that FIG. 1 is only an example of a system architecture to which embodiments of the present disclosure may be applied, intended to assist those skilled in the art in understanding the technical content of the present disclosure; it does not mean that embodiments of the present disclosure cannot be used in other devices, systems, environments, or scenarios. For example, in another embodiment, an exemplary system architecture to which the table generation method and apparatus may be applied may include a terminal device, and the terminal device may implement the table generation method and apparatus provided by the embodiments of the present disclosure without interacting with a server.
As shown in fig. 1, a system architecture 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired and/or wireless communication links, and the like.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as a knowledge reading class application, a web browser application, a search class application, an instant messaging tool, a mailbox client and/or social platform software, etc. (as examples only).
The terminal devices 101, 102, 103 may be a variety of electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (by way of example only) providing support for content browsed by the user using the terminal devices 101, 102, 103. The background management server may analyze and process the received data such as the user request, and feed back the processing result (e.g., the web page, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that, the table generating method provided by the embodiment of the present disclosure may be generally performed by the terminal device 101, 102, or 103. Accordingly, the table generating apparatus provided by the embodiment of the present disclosure may also be provided in the terminal device 101, 102, or 103.
Alternatively, the table generation method provided by the embodiments of the present disclosure may be generally performed by the server 105. Accordingly, the table generating apparatus provided by the embodiments of the present disclosure may be generally provided in the server 105. The table generation method provided by the embodiments of the present disclosure may also be performed by a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the table generating apparatus provided by the embodiments of the present disclosure may also be provided in a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.
For example, when the user transmits a table generation request online, the terminal device 101, 102, 103 may acquire text content information and layout information of a target page selected by the user from the database, and then transmit the acquired text content information and layout information of the target page to the server 105. The server 105 obtains the cell coordinate information according to the page parameter information, the cell layout information, and the text layout information; obtains the text coordinate information according to the text layout information and the cell coordinate information; and generates a target table and annotation data corresponding to the target table according to the cell coordinate information, the text coordinate information, and the text content information. Alternatively, this processing may be performed by a server or server cluster capable of communicating with the terminal devices 101, 102, 103 and/or the server 105, ultimately enabling the extraction of content of interest to the user.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Fig. 2 schematically shows a flowchart of a table generating method according to an embodiment of the present disclosure.
As shown in fig. 2, the method includes operations S210 to S240.
In operation S210, text content information and layout information of a target page are acquired, the layout information of the target page including page parameter information, cell layout information, and text layout information.
In operation S220, cell coordinate information is obtained according to the page parameter information, the cell layout information, and the text layout information.
In operation S230, text coordinate information is obtained according to the text layout information and the cell coordinate information.
In operation S240, a target table and annotation data corresponding to the target table are generated according to the cell coordinate information, the text coordinate information, and the text content information.
According to an embodiment of the present disclosure, the text content information may be table content information for presentation in the cells of a table. For example: in an "XX transaction list" table, the table content information may include: account information, family name information, inquiry start date information, inquiry expiration date information, inquiry time information, inquiry teller information, currency information, transaction date information, billing date information, transaction location information, transaction type information, debit status information, transaction amount information, and the like.
According to the embodiment of the disclosure, the text content information may be configured according to the content information of real tables, and the configured text content information is stored as a table content candidate set in a header dictionary, so that the required text content information can be obtained by traversing the header dictionary.
According to an embodiment of the present disclosure, the page parameter information may include page width information, page length information, page line number information, page column number information.
According to embodiments of the present disclosure, the cell layout information may include column width ratio information and relative position information between cells. The column width ratio information is generally set for the row with the largest number of columns in the table, and expresses the width of each other cell in that row as a multiple of the width of the smallest cell in the row. For example: a row may include 3 cells, where the column width ratio of the 1st cell to the smallest cell in the row is "2", the column width ratio of the 2nd cell to the smallest cell in the row is "1.7", and the 3rd cell is the smallest cell in the row.
For example: the page width is 100, the page length is 300, the number of page rows is 10, and the row with the largest number of columns comprises 4 cells. From left to right, the column width ratios of the cells in that row may be "1", "1.5", "1", and "1.5", so the widths of the cells in that row are, in order: 20, 30, 20, 30.
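The arithmetic in this example can be sketched in a few lines of Python. This is a minimal illustration only: the function name `column_widths` is hypothetical, and it assumes the cells of the reference row exactly fill the page width, which is what the numbers in the example imply.

```python
def column_widths(page_width, ratios):
    """Split page_width across one row in proportion to the column width ratios.

    Assumption: the cells of the row exactly fill the page width, so a cell
    with ratio 1 (the smallest cell) has width page_width / sum(ratios).
    """
    unit = page_width / sum(ratios)  # width of the smallest cell
    return [unit * r for r in ratios]


# The four-column example from the text: ratios 1, 1.5, 1, 1.5 on a page of width 100.
widths = column_widths(100, [1, 1.5, 1, 1.5])
print(widths)
```

Under the same assumption, a row with ratios 6, 1, 3 on a page of width 100 gives a smallest cell of width 10.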
According to the embodiment of the disclosure, the height information of the smallest cell in a given column can be obtained according to the page length and the row height ratio information. The heights of the other cells are then scaled according to the row height ratio to obtain the height information of all cells in that column; the details are analogous to the width case and are not repeated here.
According to embodiments of the present disclosure, the relative position information between cells may represent the relative position of the cells in one row with respect to the cells in other rows. For example: the first row may include 4 cells, namely CellA1, CellA2, CellA3, and CellA4. The second row may include 7 cells, namely CellB1, CellB2, CellB3, CellB4, CellB5, CellB6, and CellB7. The relative positional relationship between the cells can be expressed as: CellA1 (1, 1), CellA2 (2, 3), CellA3 (4, 5), CellA4 (6, 7).
According to the embodiment of the disclosure, it follows from the relative positional relationship between the cells that CellA1 and CellB1 are vertically aligned, i.e. the width of CellA1 is the same as the width of CellB1. CellA2 spans CellB2 and CellB3, i.e. the width of CellA2 is equal to the sum of the widths of CellB2 and CellB3. The remaining cells follow the same pattern and are not described in detail here.
For example: the widths of the cells in the second row may be, in order: CellB1 "10", CellB2 "15", CellB3 "10", CellB4 "5", CellB5 "20", CellB6 "20", and CellB7 "20". According to the relative positional relationship between the cells, the width of CellA1 is "10", the width of CellA2 is "25", the width of CellA3 is "25", and the width of CellA4 is "40".
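The span-based width calculation above can be sketched as follows. The function name `widths_from_spans` and the dictionary layout are illustrative assumptions; the text only specifies that each first-row cell is described by the first and last second-row cell it spans.

```python
def widths_from_spans(base_widths, spans):
    """Derive cell widths from the cells they span in a reference row.

    base_widths: widths of the reference-row cells, left to right.
    spans: {cell_name: (first, last)}, 1-based inclusive indices into the
           reference row, mirroring the (first, last) notation in the text.
    """
    return {
        name: sum(base_widths[first - 1:last])
        for name, (first, last) in spans.items()
    }


row_b = [10, 15, 10, 5, 20, 20, 20]  # widths of CellB1..CellB7
spans_a = {"CellA1": (1, 1), "CellA2": (2, 3), "CellA3": (4, 5), "CellA4": (6, 7)}
row_a = widths_from_spans(row_b, spans_a)
print(row_a)
```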
According to an embodiment of the present disclosure, the text layout information may include the arrangement of the text in the cell and font size information of the text. The arrangement of the text in the cell may include top-aligned, center-aligned, bottom-aligned arrangement, and the like. The font size information of the text may include height information of the text, width information of the text, and the like.
According to the embodiment of the disclosure, taking as an example text that is top-aligned and arranged in a single row in the cell, the height of the cell can be determined to be equal to the height of the text.
For example: the width of cell CellA1 in the first row is "10" and the text height may be "5", so the coordinate information of CellA1 may be determined as: "(0, 0), (10, 0), (0, 5), (10, 5)".
According to the embodiment of the disclosure, again take as an example text that is top-aligned and arranged in a single row in the cell. For example: the width of each character may be "1", and the text content information in the 1st cell of the first row, CellA1, may be the two characters of "account number". The coordinates of the "account" character may be "(0, 0), (1, 0), (0, 5), (1, 5)", and the coordinates of the "number" character may be "(1, 0), (2, 0), (1, 5), (2, 5)".
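A minimal sketch of the character-coordinate calculation for the single-row, top-aligned case is given below. The function name `char_boxes` is hypothetical, and the two characters of the example are represented by the placeholder tokens `"account"` and `"number"`, since the translated text gives only those glosses.

```python
def char_boxes(cell_x, cell_y, chars, char_width, char_height):
    """Lay characters out left to right from the cell's top-left corner.

    Returns (char, corners) pairs, with corners ordered as in the text:
    top-left, top-right, bottom-left, bottom-right.
    Only the single-row, top-aligned case is handled.
    """
    boxes = []
    for i, ch in enumerate(chars):
        left = cell_x + i * char_width
        right = left + char_width
        top, bottom = cell_y, cell_y + char_height
        boxes.append((ch, [(left, top), (right, top), (left, bottom), (right, bottom)]))
    return boxes


# CellA1 starts at (0, 0); each character is 1 wide and 5 high.
boxes = char_boxes(0, 0, ["account", "number"], char_width=1, char_height=5)
for ch, corners in boxes:
    print(ch, corners)
```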
According to the embodiment of the disclosure, text contents can be filled into corresponding cells according to the text coordinate information, and then the target table can be obtained.
According to an embodiment of the present disclosure, the annotation data corresponding to the target table is structured annotation data that combines the text content information with the coordinate information of the cell in which the text is located. For example: the annotation data of CellA1 may include the coordinate information of CellA1 and the text content information in CellA1.
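One possible shape for such a structured annotation record is sketched below. The field names (`cell`, `cell_box`, `text`, `chars`) and the JSON serialization are assumptions made for illustration; the disclosure does not specify a storage format.

```python
import json


def make_annotation(cell_name, cell_box, text, char_items):
    """Bundle one cell's geometry and content into a structured record.

    All field names here are hypothetical; tuples are converted to lists so
    the record serializes cleanly to JSON.
    """
    return {
        "cell": cell_name,
        "cell_box": [list(p) for p in cell_box],
        "text": text,
        "chars": [{"char": c, "box": [list(p) for p in box]} for c, box in char_items],
    }


annotation = make_annotation(
    "CellA1",
    [(0, 0), (10, 0), (0, 5), (10, 5)],
    "account number",
    [
        ("account", [(0, 0), (1, 0), (0, 5), (1, 5)]),
        ("number", [(1, 0), (2, 0), (1, 5), (2, 5)]),
    ],
)
print(json.dumps(annotation, indent=2))
```

Because the record is produced from the same coordinates used to draw the table, no separate labeling pass is needed.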
According to the embodiment of the disclosure, in the process of generating the target table, header information and random number information corresponding to the header may be included in the text content information. For example: the header information may be "account number", and the random number information corresponding to the header may be "000XXX1111".
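As a rough sketch of how a header and its random value might be paired, the snippet below draws a random digit string for a header taken from a small header dictionary. The dictionary contents, the digit counts, and the function name `random_cell_text` are all hypothetical; the disclosure only states that headers come from a header dictionary and are accompanied by random number information.

```python
import random
import string

# Hypothetical header dictionary: header text -> number of random digits to generate.
HEADER_DICT = {"account number": 10, "transaction amount": 6}


def random_cell_text(header, rng):
    """Return (header, value), where value is a random digit string for that header."""
    length = HEADER_DICT[header]
    value = "".join(rng.choice(string.digits) for _ in range(length))
    return header, value


rng = random.Random(0)  # seeded only so the sketch is repeatable
for header in HEADER_DICT:  # traverse the header dictionary
    print(random_cell_text(header, rng))
```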
According to the embodiment of the disclosure, target tables with different cell layouts and text layouts can be flexibly generated by acquiring the text content information and the layout information of the target page, so that target tables for model training and the annotation data corresponding to them can be generated quickly. Because the annotation data consists of the cell coordinate information, text coordinate information, and text content information obtained directly in the process of generating the target table, both the accuracy of the annotation data and the efficiency of producing it are improved.
The method shown in fig. 2 is further described below with reference to fig. 3-9 in conjunction with the exemplary embodiment.
Fig. 3 schematically illustrates a schematic diagram of generating cell coordinate information according to an embodiment of the present disclosure.
As shown in FIG. 3, in 300, minimum cell size information 3203 is obtained from page parameter information 3201 and cell size ratio information 3202. Then, cell size information 3205 is obtained from the minimum cell size information 3203 and the relative positional relationship information 3204. The arrangement position information 3206 of the cells on the target page is obtained according to the cell size information 3205 and the page parameter information 3201. A plurality of target cells 3207 located in the same row may be determined according to the arrangement position information 3206 of the cells on the target page. The line number information 3209 of the text arranged in a cell can be obtained according to the line feed identification information 3208. The height information 3210 of the plurality of target cells may be obtained according to the line number information 3209 of the text arranged in the cells. Cell coordinate information 3211 may be obtained from the plurality of target cells 3207 located in the same row and the height information 3210 of the plurality of target cells.
According to an embodiment of the present disclosure, the size ratio information of the unit cell may include column width ratio information and row height ratio information. The process of determining the cell width information will be described in detail below by taking the column width ratio information as an example.
According to the embodiment of the present disclosure, the column width ratio information of the cells is generally set with reference to the row with the largest number of columns in the target table. For example: the page width is 100, and the row with the largest number of columns may comprise 3 cells whose column width ratios are "6" for the 1st cell, "1" for the 2nd cell, and "3" for the 3rd cell; the minimum cell width is then 10.
According to the embodiment of the disclosure, the relative positional relationship between cells can represent column width span information between the column in which a cell is located and the columns in which other cells are located. For example: CellA1 may be a cell located in the first row and the first column, and CellB1 and CellB2 may be cells located in the first column and the second column of the second row, respectively. If the column of CellA1 spans the columns in which CellB1 and CellB2 are located, the relative positional relationship between the cells can be expressed as: CellA1 (CellB1, CellB2). That is, the width of CellA1 is equal to the sum of the widths of CellB1 and CellB2.
According to the minimum cell width and the relative positional relationship between the cells, the cell width information can be obtained from formula (1):
cell width = minimum cell width × sum [ column width ratio (column width span information) ]    (1)
For example: the minimum cell width is 10, the column width ratio of CellB1 is 2, the column width ratio of CellB2 is 1.5, and the column width span information is CellA1 (CellB1, CellB2). The width of CellA1 is then 10 × (2 + 1.5) = 35.
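Formula (1) translates directly into code. In this sketch the function name `cell_width` and the use of a dictionary keyed by cell name are illustrative choices rather than anything specified in the disclosure.

```python
def cell_width(min_width, ratios, spanned):
    """Formula (1): minimum cell width times the summed ratios of the spanned cells.

    ratios: {cell_name: column width ratio} for the reference row.
    spanned: names of the reference-row cells covered by the cell of interest.
    """
    return min_width * sum(ratios[name] for name in spanned)


ratios = {"CellB1": 2, "CellB2": 1.5}
width_a1 = cell_width(10, ratios, ["CellB1", "CellB2"])
print(width_a1)
```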
According to an embodiment of the present disclosure, the text layout information includes line feed identification information of text in a cell, and according to cell width information, page parameter information, and text layout information, cell coordinate information is obtained, which may include the following operations:
identifying the line feed identification information to obtain the line number information of the text arranged in the cell; obtaining the arrangement position information of the cells on the target page according to the cell width information and the page parameter information; and obtaining the cell coordinate information according to the arrangement position information and the line number information.
According to the embodiment of the disclosure, in a real table the text in a cell may wrap across several lines, so the number of lines occupied by the text in a cell can be obtained by placing line feed identification information in the text of that cell.
For example: in CellA1, the text content information may be "account number/card number:\nAccount/card. No:", where "\n" represents the line feed identification information.
According to the embodiment of the disclosure, by identifying the line feed identification information, it can be determined that the text content information in CellA1 is arranged in 2 lines.
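Counting lines from the line feed marker can be sketched as below. The function name `line_count` is hypothetical, and the sketch assumes the marker is stored as an actual newline character in the text content; a literal two-character marker could be passed through the `marker` parameter instead.

```python
def line_count(text, marker="\n"):
    """Number of lines the text occupies: one more than the number of markers."""
    return text.count(marker) + 1


cell_text = "account number/card number:\nAccount/card. No:"
lines_in_a1 = line_count(cell_text)
print(lines_in_a1)
```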
According to the embodiment of the present disclosure, the minimum cell width is set in the row with the largest number of columns, and the widths of the other cells are obtained from the column width span information between those cells and the cells of that row. Therefore, the arrangement position information of the cells on the target page can be obtained according to the cell width information and the page parameter information.
For example: the row with the largest column number can be the 6 th row, and the column width span information of other cells can be determined according to the relative position relation among the cells of the 6 th row. For example: the column width span information of the 1 st cell of the 5 th row may be (1, 2), and the column width of the 1 st cell representing the 5 th row is equal to the sum of the widths of the 1 st cell and the 2 nd cell of the 6 th row. The column width span information of the 1 st cell of the 4 th row may be (1, 2, 3), and the column width of the 1 st cell representing the 4 th row is equal to the sum of the widths of the 1 st cell to the 3 rd cell of the 6 th row.
According to the embodiment of the present disclosure, the height of each cell so far takes only the page parameter information and the row proportion into account, whereas in a real table image all cells in the same row should have the same height. Therefore, once the relative arrangement positions of the cells on the target page are determined, the heights of the cells in the same row can be unified to determine the final height information of the cells in that row.
According to the embodiment of the disclosure, obtaining the cell coordinate information according to the arrangement position information and the arrangement line number information may include the following operations:
Obtaining a plurality of target cells located in the same row according to the arrangement position information. Obtaining the height information of the plurality of target cells according to the arrangement line number information. Obtaining the cell coordinate information according to the arrangement position information and the height information.
For example: from the arrangement position information, a plurality of target cells located in the same row may be determined. For example: the target cells may include 3 cells, namely cell M1, cell M2, and cell M3. The height information of each cell can be obtained according to the arrangement line number of the text in each cell. For example: the arrangement line number of the text in cell M1 is 2, the arrangement line number of the text in cell M2 is 3, and the arrangement line number of the text in cell M3 is 2.
According to an embodiment of the present disclosure, obtaining height information of a plurality of target cells according to the arrangement line number information may include the following operations:
Sorting the arrangement line number information of the plurality of target cells to obtain a sorting result. Obtaining the height information according to the sorting result.
For example: the 3 cells are sorted according to the arrangement line number information, and the obtained sorting result may be: cell M2, cell M1, cell M3. The height of the cell with the largest arrangement line number in each row may be determined as the final height of all cells in that row. For example: the height of cell M2 may be 5, in which case the heights of cell M1 and cell M3 are also 5.
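The line counting and height unification described above can be sketched as follows. This is only an illustration under stated assumptions: the cell height is assumed to be proportional to the arrangement line number through a hypothetical `height_per_line` parameter, and the function names are not from the disclosure.

```python
from typing import Dict, List, Tuple

LINE_FEED = "\n"  # line feed identification information


def line_count(text: str) -> int:
    """Arrangement line number: one more than the number of line feeds."""
    return text.count(LINE_FEED) + 1


def unify_row_height(row: Dict[str, str], height_per_line: float) -> Tuple[List[str], float]:
    """Rank the cells of one row by line count and return (ranking, shared height)."""
    counts = {name: line_count(text) for name, text in row.items()}
    # Descending sort; cells with equal counts keep their original order.
    ranking = sorted(counts, key=lambda name: counts[name], reverse=True)
    # Assumption: the tallest cell needs height_per_line for each of its lines.
    shared_height = counts[ranking[0]] * height_per_line
    return ranking, shared_height


row = {
    "M1": "line 1\nline 2",
    "M2": "line 1\nline 2\nline 3",
    "M3": "line 1\nline 2",
}
ranking, shared_height = unify_row_height(row, height_per_line=2.0)
```

With the sample row, cell M2 has the most lines, so its height becomes the height of every cell in the row.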
Fig. 4 schematically illustrates a cell schematic in a table according to an embodiment of the disclosure.
As shown in fig. 4, in 400, cells A through P may each represent a cell in which a cross-column header is located. The row with the most columns may be the row in which cell Q is located, and this row represents the table header. The column width of cell A spans the widths of the 6 cells from cell Q to cell V. The column widths of cells B, F, J, and N are each the same as the column width of cell Q. The column widths of cells C, G, K, and O are each equal to the sum of the column widths of cells R and S. The column widths of cells D, H, and L are each equal to the sum of the column widths of cells T and U. The column width of cell P is equal to the sum of the column widths of cells T, U, and V. The column widths of cells E, I, and M are each equal to the column width of cell V. From top to bottom, the rows from the row of cell a1 to the row of cell an all consist of table body cells, and the structure of the table body cells is the same as that of the table header cells.
According to the embodiment of the disclosure, the text layout information comprises arrangement mode information of the text in the cell, text size information and line feed identification information of the text in the cell. The above operation S230 may include the following operations:
Determining a target strategy according to the arrangement mode information. Obtaining the arrangement line number information of the text in the cell according to the line feed identification information. Obtaining, based on the target strategy, the text coordinate information according to the cell coordinate information, the text size information, the arrangement mode information and the arrangement line number information.
According to an embodiment of the present disclosure, the arrangement of the text may include an arrangement of the text in a horizontal direction and an arrangement of the text in a vertical direction in the cell. The arrangement mode in the horizontal direction can comprise: left side alignment, center alignment, right side alignment. The arrangement mode in the vertical direction can comprise: top alignment, center alignment, and bottom alignment. The arrangement mode of the characters in the unit cell can comprise the following 9 modes: left side alignment + top alignment, left side alignment + center alignment, left side alignment + bottom alignment, center alignment + top alignment, center alignment + center alignment, center alignment + bottom alignment, right side alignment + top alignment, right side alignment + center alignment, right side alignment + bottom alignment.
Fig. 5 schematically illustrates a schematic diagram of a display effect of text in a cell according to a text arrangement manner according to an embodiment of the present disclosure.
As shown in fig. 5, the display effect of left side alignment + top alignment is shown as 5321, where the text is arranged starting from the upper left corner of the cell. The display effect of center alignment + center alignment is shown as 532i, where the text is arranged from the middle of the cell. The display effect of right side alignment + bottom alignment is shown as 532I, where the text is arranged along the bottom of the cell toward the right side, and the last character is located at the lower right corner of the cell.
According to embodiments of the present disclosure, the target strategy may characterize the text coordinate calculation strategy configured for each arrangement mode.
According to an embodiment of the present disclosure, the text size information includes text height information and text width information. Based on the target strategy, according to the cell coordinate information, the character size information, the arrangement mode information and the arrangement line number information, the character coordinate information is obtained, which can comprise the following operations:
Obtaining the text abscissa information according to the cell coordinate information, the text width information and the arrangement mode information. Obtaining, based on the target strategy, the text ordinate information according to the cell coordinate information, the text height information and the arrangement line number information. Obtaining the text coordinate information according to the text abscissa information and the text ordinate information.
According to the embodiment of the disclosure, the self-defined offset parameter threshold range may be configured according to the arrangement mode of the characters in the horizontal direction in the cell, for example: [0,1]. When the arrangement mode of the characters in the horizontal direction in the unit cell is that the left side is aligned, the offset parameter of the characters in the horizontal direction can be determined to be 0. When the arrangement mode of the characters in the horizontal direction in the unit cells is centered and aligned, the offset parameter of the characters in the horizontal direction can be determined to be 0.5. When the arrangement mode of the characters in the horizontal direction in the cell is that the right side is aligned, the offset parameter of the characters in the horizontal direction can be determined to be 1.
According to the embodiment of the disclosure, in a real table the border lines of the table occupy a certain width. To prevent text adjacent to a cell border line from being partially covered, the above offset parameter threshold range may be changed according to actual needs. For example: it may be changed to [0.2, 0.8].
According to an embodiment of the present disclosure, obtaining text abscissa information according to offset parameter information, text width information, and cell coordinate information may include the following operations:
Obtaining the cell width information according to the left vertex coordinate information and the right vertex coordinate information. In the case that the arrangement mode information is determined to be center alignment, obtaining the text abscissa information according to the left vertex coordinate information, the cell width information and the text width information. In the case that the arrangement mode information is determined to be left side alignment, obtaining the text abscissa information according to the left vertex coordinate information and the offset parameter information. In the case that the arrangement mode information is determined to be right side alignment, obtaining the text abscissa information according to the right vertex coordinate information, the text width information and the offset parameter information.
According to the embodiment of the disclosure, the text width information can represent the total width of all the text content in the cell when rendered at the preset text size.
According to the embodiment of the disclosure, in the case of determining that the arrangement mode information is centered alignment, the text abscissa information can be obtained according to the formula (2):
text abscissa=left vertex coordinates+cell width/2-text width/2 (2)
For example: the left vertex abscissa of a cell may be 0, the cell width may be 5, the text width may be 3, and the starting abscissa of the text may be determined to be 1.
According to the embodiment of the disclosure, in the case of determining that the arrangement mode information is left-side alignment, the text abscissa information can be obtained according to the formula (3):
text abscissa=left vertex coordinates+offset parameter (3)
For example: the left vertex abscissa of the cell may be 0, the offset parameter corresponding to the left aligned arrangement may be 0.2, and the starting abscissa of the text may be determined to be 0.2.
According to the embodiment of the disclosure, in the case of determining that the arrangement mode information is right-side alignment, the text abscissa information may be obtained according to the formula (4):
text abscissa=right vertex coordinates-text width-offset parameter (4)
For example: the abscissa of the right vertex of the cell may be 5, the text width may be 3, the offset parameter corresponding to the right side alignment may be 0.8, and the starting abscissa of the text may be determined to be 1.2.
According to the embodiment of the disclosure, once the starting coordinates of the text in the cell are determined, the abscissa of each following character can be obtained by shifting the abscissa of the preceding character to the right along the horizontal direction by the width of one character.
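Formulas (2) to (4) and the per-character shift can be sketched as follows. This is a minimal sketch; the function names, the mode strings "left", "center" and "right", and the assumption that every character has the same width are illustrative choices, not part of the disclosure.

```python
from typing import List


def text_abscissa(left_x: float, right_x: float, text_width: float,
                  mode: str, offset: float = 0.0) -> float:
    """Starting abscissa of the text for one horizontal arrangement mode."""
    cell_width = right_x - left_x
    if mode == "center":   # formula (2)
        return left_x + cell_width / 2 - text_width / 2
    if mode == "left":     # formula (3)
        return left_x + offset
    if mode == "right":    # formula (4)
        return right_x - text_width - offset
    raise ValueError(f"unknown horizontal arrangement mode: {mode}")


def char_abscissas(start_x: float, char_width: float, count: int) -> List[float]:
    """Abscissa of each character, assuming a fixed width per character."""
    return [start_x + i * char_width for i in range(count)]


# The three worked examples from the text.
centered = text_abscissa(0, 5, 3, "center")
left_aligned = text_abscissa(0, 5, 3, "left", offset=0.2)
right_aligned = text_abscissa(0, 5, 3, "right", offset=0.8)
```

The three calls reproduce the starting abscissas 1, 0.2 and 1.2 given in the examples above.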
According to an embodiment of the present disclosure, the cell coordinate information may include upper left vertex coordinate information of a cell, and obtaining text ordinate information according to the cell coordinate information, text height information, and arrangement line number information based on a target policy may include the following operations:
Obtaining the minimum cell height information according to the cell coordinate information. Obtaining the text ordinate information according to the upper left vertex coordinate information, the minimum cell height information, the text height information and the arrangement line number information.
For example: once the coordinate information of each cell is determined, the height information of the smallest cell on the target page may be obtained. For example: the minimum cell height may be 5 and the text height may be 1. The arrangement line number information may include the arrangement line number information of the text in a cell and the maximum arrangement line number information of the text in a plurality of cells in the same row. For example: the arrangement line number of the text in the cell may be 1, and the maximum arrangement line number of the text in the cells of the same row may be 2. The text ordinate information can then be obtained using the calculation strategy that corresponds to the arrangement mode of the text in the vertical direction of the cell.
According to the embodiment of the disclosure, in the case of determining that the arrangement mode information is top alignment, text ordinate information can be obtained according to the formula (5):
text ordinate = ordinate of upper left vertex + minimum cell height × maximum arrangement line number / 2 - text height × maximum arrangement line number / 2 (5)
For example: the ordinate of the top left vertex can be 0, the minimum cell height is 5, the maximum arrangement line number is 2, the text height is 1, and the text ordinate can be determined to be 4.
According to the embodiment of the disclosure, in the case of determining that the arrangement mode information is aligned centrally, the text ordinate information may be obtained according to the formula (6):
text ordinate = ordinate of upper left vertex + minimum cell height × maximum arrangement line number / 2 - text height × arrangement line number of the text in the cell / 2 (6)
For example: the ordinate of the top left vertex can be 0, the height of the minimum cell is 5, the maximum arrangement line number is 2, the text height is 1, the text arrangement line number in the cell is 1, and the ordinate of the text can be determined to be 4.5.
According to the embodiment of the disclosure, in the case of determining that the arrangement mode information is bottom alignment, the text ordinate information can be obtained according to the formula (7):
text ordinate = ordinate of upper left vertex + minimum cell height × maximum arrangement line number / 2 - text height × maximum arrangement line number / 2 + minimum cell height × (maximum arrangement line number - arrangement line number of the text in the cell) (7)
For example: the ordinate of the top left vertex can be 0, the height of the minimum cell is 5, the maximum arrangement line number is 2, the text height is 1, the text arrangement line number in the cell is 1, and the ordinate of the text can be determined to be 9.
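Formulas (5) to (7) can be transcribed directly into code, which makes the worked examples easy to check. This is a sketch only; the function name and the mode strings "top", "center" and "bottom" are hypothetical.

```python
def text_ordinate(top_y: float, min_cell_height: float, text_height: float,
                  max_lines: int, cell_lines: int, mode: str) -> float:
    """Text ordinate for one vertical arrangement mode, following formulas (5)-(7)."""
    half_row = min_cell_height * max_lines / 2
    if mode == "top":      # formula (5)
        return top_y + half_row - text_height * max_lines / 2
    if mode == "center":   # formula (6)
        return top_y + half_row - text_height * cell_lines / 2
    if mode == "bottom":   # formula (7)
        return (top_y + half_row - text_height * max_lines / 2
                + min_cell_height * (max_lines - cell_lines))
    raise ValueError(f"unknown vertical arrangement mode: {mode}")


# Shared example values: upper left ordinate 0, minimum cell height 5,
# text height 1, maximum arrangement line number 2, 1 line in this cell.
top_aligned = text_ordinate(0, 5, 1, 2, 1, "top")
center_aligned = text_ordinate(0, 5, 1, 2, 1, "center")
bottom_aligned = text_ordinate(0, 5, 1, 2, 1, "bottom")
```

The three results match the ordinates 4, 4.5 and 9 given in the examples for top, center and bottom alignment.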
According to an embodiment of the present disclosure, the above operation S240 may include the following operations:
Determining the target cell according to the text content information. Filling the text content information into the target cell according to the text coordinate information. Arranging the target cells on the target page according to the cell coordinate information to generate a target table. Processing the cell coordinate information, the text coordinate information and the text content information according to a predetermined data format to obtain annotation data.
According to embodiments of the present disclosure, table content information and table structure information may be included in the annotation data. For example: the text content information may be "account number", and the target cell may be determined to be the 1st cell of the 2nd row. According to the text coordinate information, the first character of the text may be filled into the target cell at starting abscissa 0.2 and starting ordinate 1, and the second character may be filled into the target cell at starting abscissa 0.5 and starting ordinate 1. The target cells are then arranged on the target page according to the cell coordinate information to obtain the target table.
It should be noted that, in the embodiment of the present disclosure, the coordinate information of the unit cell and the coordinate information of the text may be pixel coordinate information.
According to an embodiment of the present disclosure, the cell coordinate information, the text coordinate information, and the text content information may be stored according to a one-to-one mapping relationship and processed according to a predetermined data format, for example the JSON format, to obtain the annotation data.
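One possible shape for such annotation data is sketched below. The disclosure does not specify a schema, so the field names `cell_box`, `text` and `char_positions` and the function `build_annotation` are assumptions made purely for illustration.

```python
import json
from typing import Any, Dict, List


def build_annotation(cells: List[Dict[str, Any]]) -> str:
    """Serialize one record per cell so box, text and text coordinates stay paired."""
    records = [
        {
            "cell_box": cell["cell_box"],              # [left, top, right, bottom]
            "text": cell["text"],                      # text content information
            "char_positions": cell["char_positions"],  # [x, y] start coordinates
        }
        for cell in cells
    ]
    return json.dumps({"cells": records}, ensure_ascii=False, indent=2)


# Hypothetical cell reusing the coordinates from the example in the text.
sample_cells = [
    {
        "cell_box": [0, 0, 5, 10],
        "text": "account number",
        "char_positions": [[0.2, 1], [0.5, 1]],
    }
]
annotation_json = build_annotation(sample_cells)
restored = json.loads(annotation_json)
```

Keeping all three pieces of information in a single record per cell is one way to preserve the one-to-one mapping described above.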
Fig. 6 schematically illustrates a table schematic according to some embodiments of the present disclosure.
As shown in fig. 6, in 600, in the 1 st row of cells, the text is arranged in the cells in a manner of a centered alignment in the horizontal direction and a centered alignment in the vertical direction. In each cell after row 2, the text is arranged in the cell in a manner of left side alignment in the horizontal direction and center alignment in the vertical direction. In the cells of the 2 nd row, the 4 th row and the 5 th row, the number of the rows of the characters arranged in the cells is 2. In the 3 rd row and the 1 st cell, the number of rows of the characters arranged in the cell is 3.
Note that, since the table generated in the embodiment of the present disclosure is used as a sample table for model training, text values corresponding to "account number", "card number", "transaction amount", "transaction date" and the like in the table content are random numbers.
To bring the target table closer to an acquired real table image, different table border lines may be added to the target table.
According to an embodiment of the present disclosure, the above table generating method may further include the following operations:
Obtaining a table border line type template. Constructing border lines on the target table according to the table border line type template and the cell coordinate information to obtain a first table image.
According to an embodiment of the present disclosure, the target page may be a blank page constructed using the Pillow image processing library. Gray levels can also be randomly added to the blank page to obtain first table images with different gray scales.
According to embodiments of the present disclosure, a variety of line types may be included in the table border line type template, such as: full border line type, non-full border line type, dashed line type, etc. The line type in the table border line type template can be flexibly selected according to actual needs, and border lines are constructed for the cells in the target table, so that a first table image is obtained.
Fig. 7 schematically illustrates a table schematic with border lines according to some embodiments of the present disclosure.
As shown in fig. 7, in 700, the cells in the 1st to 6th rows all use the full border line type. The cells from the 7th row to the last row use a plurality of border line types. For example: in the 1st cell of the 7th row, the left border and the top border use the solid line type, and the right border and the bottom border use the borderless line type. In the 2nd cell of the 7th row, the left border and the bottom border use the borderless line type, the top border uses the solid line type, and the right border uses the dashed line type.
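The per-side border line types in this example can be turned into drawable line segments as sketched below. This is an illustrative sketch only: the cell boxes, the style strings "solid", "dashed" and "none", and the function `border_segments` are assumptions, and the actual drawing step is left out.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point, str]


def border_segments(box: Tuple[float, float, float, float],
                    styles: Dict[str, str]) -> List[Segment]:
    """Return (start, end, style) for each side of the cell that has a visible border."""
    x0, y0, x1, y1 = box
    sides = {
        "top": ((x0, y0), (x1, y0)),
        "bottom": ((x0, y1), (x1, y1)),
        "left": ((x0, y0), (x0, y1)),
        "right": ((x1, y0), (x1, y1)),
    }
    # Sides marked "none" (borderless) or missing from styles are skipped.
    return [
        (start, end, styles[side])
        for side, (start, end) in sides.items()
        if styles.get(side, "none") != "none"
    ]


# 1st cell of the 7th row: solid left and top borders, no right or bottom border.
cell_1 = border_segments(
    (0, 0, 10, 5),
    {"left": "solid", "top": "solid", "right": "none", "bottom": "none"},
)
# 2nd cell of the 7th row: solid top border, dashed right border.
cell_2 = border_segments(
    (10, 0, 20, 5),
    {"left": "none", "top": "solid", "right": "dashed", "bottom": "none"},
)
```

Each returned segment can then be drawn on the target page with whichever image library is used to render the table.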
According to an embodiment of the present disclosure, the above table generating method may further include the following operations:
Acquiring a seal material image set. Randomly selecting a target seal image from the seal material image set. Processing the target seal image and the first table image to obtain a second table image.
According to embodiments of the present disclosure, the seal material image set may include seal template images that simulate the seals appearing in real table images in different scenes. The target seal image T can be randomly selected from the seal template images, and fusion processing can be carried out on the target seal image T and the first table image to obtain a second table image.
Fig. 8 schematically illustrates a tabular diagram with stamps in accordance with some embodiments of the present disclosure.
As shown in fig. 8, in 1100, the target stamp image T is a stamp of XX company, and the position of the target stamp image T may be random, in other words, the relative position of the target stamp image T and the first table image is changed, and a plurality of second table images may be obtained as sample table images for model training.
In an acquired real table image, the seal may be incomplete. For example: for a cross-page seal, only part of the seal is displayed on the table image of each page. Therefore, to make the second table image closer to a real table image, the target seal image may be processed first and then fused with the first table image.
According to an embodiment of the present disclosure, processing the target stamp image and the first form image to obtain a second form image may include the following operations:
Cropping the target seal image to obtain a local seal image. Deleting the background color in the local seal image to obtain the target local seal image. Processing the target local seal image and the first table image to obtain a second table image.
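The cropping, background removal and fusion operations above can be sketched with the Pillow library as follows. This is a sketch under stated assumptions: the seal background is assumed to be near-white, the threshold 240 is arbitrary, and the function names are hypothetical.

```python
from typing import Tuple

from PIL import Image


def crop_seal(seal: Image.Image, box: Tuple[int, int, int, int]) -> Image.Image:
    """Keep only part of the seal, for example one half of a cross-page seal."""
    return seal.crop(box)


def foreground_mask(seal: Image.Image, white_threshold: int = 240) -> Image.Image:
    """Mask that is 255 on seal ink and 0 on the near-white background."""
    return seal.convert("L").point(lambda v: 0 if v >= white_threshold else 255)


def fuse(table: Image.Image, seal: Image.Image, position: Tuple[int, int]) -> Image.Image:
    """Paste only the seal ink onto a copy of the table image."""
    fused = table.copy()
    fused.paste(seal, position, foreground_mask(seal))
    return fused


# Synthetic stand-ins: a light gray table page and a white seal image with a red square.
table_image = Image.new("RGB", (100, 60), (200, 200, 200))
seal_image = Image.new("RGB", (40, 40), (255, 255, 255))
seal_image.paste((255, 0, 0), (10, 10, 30, 30))

local_seal = crop_seal(seal_image, (0, 0, 20, 40))  # left half of the seal
second_table_image = fuse(table_image, local_seal, (50, 10))
```

Because the mask is 0 on the background, the table pixels under the white part of the seal are left unchanged, which is the effect of deleting the background color before fusion.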
Real table images may be acquired with image acquisition devices under different light conditions, in different acquisition modes and with different acquisition devices, so the table images acquired for the same table may differ. To simulate the table images acquired under different acquisition environments, the processing of the target local seal image and the first table image to obtain a second table image may include the following operations:
Carrying out fusion processing on the target local seal image and the first table image to obtain a table image with the target local seal. Filtering the table image with the target local seal to obtain a second table image.
According to an embodiment of the present disclosure, the filtering process may include at least one of: Gaussian filtering, contour filtering, detail filtering, edge enhancement filtering, smoothing filtering, deep smoothing filtering, unsharp mask filtering, etc. The table image with the target local seal may be filtered one or more times using one or more filtering modes to obtain second table images that simulate different acquisition environments.
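A filtering step of this kind can be sketched with Pillow's `ImageFilter` module. The mapping from the filter names above to Pillow filters is an assumption, as are the function name `apply_random_filters`, the blur radius and the limit of two passes.

```python
import random
from typing import List, Tuple

from PIL import Image, ImageFilter

# Assumed correspondence between the filtering modes in the text and Pillow filters.
FILTERS = {
    "gaussian": ImageFilter.GaussianBlur(radius=1),
    "contour": ImageFilter.CONTOUR,
    "detail": ImageFilter.DETAIL,
    "edge_enhance": ImageFilter.EDGE_ENHANCE,
    "smooth": ImageFilter.SMOOTH,
    "smooth_more": ImageFilter.SMOOTH_MORE,
    "unsharp_mask": ImageFilter.UnsharpMask(),
}


def apply_random_filters(image: Image.Image, rng: random.Random,
                         max_passes: int = 2) -> Tuple[Image.Image, List[str]]:
    """Apply between 1 and max_passes randomly chosen filters, one after another."""
    names = rng.sample(sorted(FILTERS), k=rng.randint(1, max_passes))
    for name in names:
        image = image.filter(FILTERS[name])
    return image, names


# Synthetic grayscale page with a dark block standing in for printed text.
page = Image.new("L", (32, 32), 255)
page.paste(0, (8, 8, 24, 24))

filtered_page, used_filters = apply_random_filters(page, random.Random(0))
```

Passing a seeded `random.Random` makes the choice of filters reproducible, which can help when regenerating the same set of sample images.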
Fig. 9 schematically illustrates a table schematic of a filtered process according to some embodiments of the present disclosure.
As shown in fig. 9, in 900, a table image with a target seal may be processed using Gaussian filtering to simulate a table image in a real acquisition environment. As can be seen from 900, the text in the table image appears blurred, and training the table recognition model with such filtered table images can effectively improve the accuracy of the trained model.
Fig. 10 schematically shows a block diagram of a table generating apparatus according to an embodiment of the present disclosure.
As shown in fig. 10, the table generating apparatus 1000 of this embodiment may include a first acquisition module 1010, a first obtaining module 1020, a second obtaining module 1030, and a generating module 1040.
The first acquisition module 1010 is configured to acquire text content information and layout information of a target page, where the layout information of the target page includes page parameter information, cell layout information, and text layout information. In some embodiments, the first acquisition module 1010 may be configured to implement the operation S210 described above, which is not repeated herein.
The first obtaining module 1020 is configured to obtain cell coordinate information according to the page parameter information, the cell layout information, and the text layout information. In some embodiments, the first obtaining module 1020 may be configured to implement the operation S220 described above, which is not described herein.
The second obtaining module 1030 is configured to obtain text coordinate information according to the text layout information and the cell coordinate information. In some embodiments, the second obtaining module 1030 may be used to implement the operation S230 described above, which is not described herein.
The generating module 1040 is configured to generate a target table and annotation data corresponding to the target table according to the cell coordinate information, the text coordinate information, and the text content information. In some embodiments, the generating module 1040 may be configured to implement the operation S240 described above, which is not repeated herein.
According to an embodiment of the present disclosure, the cell layout information includes size ratio information of cells and relative positional relationship information between cells. The first obtaining module may include a first obtaining sub-module, a second obtaining sub-module and a third obtaining sub-module. The first obtaining sub-module is used for obtaining minimum cell size information according to the page parameter information and the cell size ratio information. The second obtaining sub-module is used for obtaining the cell size information according to the minimum cell size information and the relative positional relationship information. The third obtaining sub-module is used for obtaining the cell coordinate information according to the cell size information, the page parameter information and the text layout information.
According to an embodiment of the present disclosure, the text layout information includes line feed identification information of text within a cell, and the third obtaining sub-module may include an identification unit, a first obtaining unit and a second obtaining unit. The identification unit is used for identifying the line feed identification information to obtain the arrangement line number information of the text in the cell. The first obtaining unit is used for obtaining the arrangement position information of the cells on the target page according to the cell size information and the page parameter information. The second obtaining unit is used for obtaining the cell coordinate information according to the arrangement position information and the arrangement line number information.
According to an embodiment of the present disclosure, the second obtaining unit may include: a first obtaining subunit, a second obtaining subunit, and a third obtaining subunit. The first obtaining subunit is configured to obtain, according to the arrangement position information, a plurality of target cells located in the same row. And the second obtaining subunit is used for obtaining the height information of the plurality of target cells according to the arrangement line number information. And the third obtaining subunit is used for obtaining the coordinate information of the cell according to the arrangement position information and the height information.
According to the embodiment of the disclosure, the second obtaining subunit is configured to sort the arrangement line number information of the plurality of target cells to obtain a sorting result, and to obtain the height information according to the sorting result.
According to an embodiment of the present disclosure, the text layout information includes arrangement mode information of text in a cell, text size information, and line feed identification information of text in the cell, and the second obtaining module may include: the first determination sub-module, the fourth obtaining sub-module, and the fifth obtaining sub-module. The first determining submodule is used for determining a target strategy according to the arrangement mode information. And a fourth obtaining sub-module, configured to obtain information of the number of lines of the text arranged in the unit cell according to the line feed identification information. And a fifth obtaining sub-module, configured to obtain text coordinate information according to the unit cell coordinate information, the text size information, the arrangement mode information and the arrangement line number information based on the target policy.
According to an embodiment of the present disclosure, the text size information includes text height information and text width information, and the fifth obtaining sub-module may include: a third obtaining unit, a fourth obtaining unit, and a fifth obtaining unit. The third obtaining unit is used for obtaining the horizontal coordinate information of the characters according to the coordinate information of the cells, the character width information and the arrangement mode information. And the fourth obtaining unit is used for obtaining the ordinate information of the characters according to the coordinate information of the cells, the character height information and the arrangement line number information based on the target strategy. And a fifth obtaining unit, configured to obtain text coordinate information according to the text abscissa information and the text ordinate information.
According to an embodiment of the present disclosure, the third obtaining unit may include a determining subunit and a fourth obtaining subunit. The determining subunit is used for determining offset parameter information of the characters in the horizontal direction according to the arrangement mode information. And a fourth obtaining subunit, configured to obtain the text abscissa information according to the offset parameter information, the text width information, and the cell coordinate information.
According to an embodiment of the present disclosure, the cell coordinate information includes left vertex coordinate information and right vertex coordinate information, and the fourth obtaining subunit is configured to: and obtaining the cell width information according to the left vertex coordinate information and the right vertex coordinate information. And under the condition that the arrangement mode information is determined to be aligned in the middle, acquiring the text abscissa information according to the left vertex coordinate information, the cell width information and the text width information. And under the condition that the arrangement mode information is determined to be left-side alignment, acquiring the text abscissa information according to the left vertex coordinate information and the offset parameter information. And under the condition that the arrangement mode information is right side alignment, acquiring the text abscissa information according to the right vertex coordinate information, the text width information and the offset parameter information.
According to an embodiment of the present disclosure, the fourth obtaining unit may include a fifth obtaining subunit, a sixth obtaining subunit. The fifth obtaining subunit is configured to obtain the minimum cell height information according to the cell coordinate information. And a sixth obtaining subunit, configured to obtain the ordinate information of the text according to the coordinate information of the top left vertex, the height information of the minimum cell, the text height information and the number of rows of arrangement.
According to an embodiment of the present disclosure, the arrangement line number information includes arrangement line number information of the text in a cell and maximum arrangement line number information of the text in a plurality of cells in the same row. The sixth obtaining subunit is configured to: in the case where the arrangement mode information is determined to be top alignment, obtain the text ordinate information according to the upper left vertex coordinate information, the minimum cell height information, the text height information, and the maximum arrangement line number information; and in the case where the arrangement mode information is determined to be center alignment or bottom alignment, obtain the text ordinate information according to the upper left vertex coordinate information, the minimum cell height information, the text height information, the arrangement line number information of the text in the cell, and the maximum arrangement line number information.
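One way the vertical cases might be computed is sketched below. The disclosure lists the inputs for each case but not the formulas, so everything here is an assumption made for illustration: the name `text_ordinate`, the row height being the minimum cell height times the maximum line count, a symmetric vertical margin, and a y axis that grows downward.

```python
def text_ordinate(top_y, min_cell_height, text_height, lines, max_lines, alignment):
    """Return the y coordinate of the top edge of a cell's text block.

    Hypothetical helper; all formulas below are assumed, not disclosed.
    """
    # Assumed: the row is as tall as its cell with the most text lines.
    row_height = min_cell_height * max_lines
    # Assumed: margin left above and below the tallest text block in the row.
    padding = (row_height - max_lines * text_height) / 2
    # Height of this cell's own text block.
    block_height = lines * text_height
    if alignment == "top":
        # Uses only the top vertex, minimum height, text height, max lines.
        return top_y + padding
    if alignment == "middle":
        # Also needs this cell's own line count to center the block.
        return top_y + (row_height - block_height) / 2
    if alignment == "bottom":
        return top_y + row_height - padding - block_height
    raise ValueError(f"unknown alignment: {alignment!r}")
```

Under these assumptions, top alignment needs only the maximum line count of the row, while center and bottom alignment also need the line count of the cell itself, which matches the input sets named in the text.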
According to an embodiment of the present disclosure, the generating module may include a second determining sub-module, a filling sub-module, a generating sub-module, and a sixth obtaining sub-module. The second determining sub-module is configured to determine a target cell according to the text content information. The filling sub-module is configured to fill the text content information into the target cell according to the text coordinate information. The generating sub-module is configured to arrange the target cells on the target page according to the cell coordinate information to generate a target table. The sixth obtaining sub-module is configured to process the cell coordinate information, the text coordinate information, and the text content information according to a predetermined data format to obtain the labeling data.
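The labeling-data step can be illustrated with a short sketch. The disclosure does not specify the predetermined data format, so JSON is assumed here, and the `Cell` class, the `build_annotation` function, and the field names `cell_box`, `text_xy`, and `text` are all hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class Cell:
    box: tuple      # (left, top, right, bottom) cell coordinates
    text: str       # text content filled into the cell
    text_xy: tuple  # (x, y) text coordinates


def build_annotation(cells):
    """Serialize cell coordinates, text coordinates, and text content.

    Assumed format: one JSON object with a "cells" list, one record per cell.
    """
    records = [
        {"cell_box": list(c.box), "text_xy": list(c.text_xy), "text": c.text}
        for c in cells
    ]
    # ensure_ascii=False keeps non-ASCII text (e.g. Chinese) readable.
    return json.dumps({"cells": records}, ensure_ascii=False)
```

Because the table and its labels are derived from the same coordinates, the annotation needs no manual correction in this sketch; it is simply a serialization of the values already used for rendering.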
According to an embodiment of the present disclosure, the table generating apparatus may further include a second acquisition module and a third obtaining module. The second acquisition module is configured to acquire a table frame line type template. The third obtaining module is configured to construct frame lines on the target table according to the table frame line type template and the cell coordinate information to obtain a first table image.
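As a rough illustration of how frame lines could be derived from cell coordinates and a line type template, consider the sketch below. The template names `"full"` and `"horizontal_only"` and the function `frame_segments` are hypothetical; the disclosure does not enumerate the available templates.

```python
def frame_segments(cell_boxes, template="full"):
    """Return the unique line segments to draw for a set of cells.

    cell_boxes: iterable of (left, top, right, bottom) tuples.
    template:   assumed line type template name.
    """
    if template not in ("full", "horizontal_only"):
        raise ValueError(f"unknown template: {template!r}")
    segments = set()  # a set removes edges shared by adjacent cells
    for left, top, right, bottom in cell_boxes:
        segments.add(((left, top), (right, top)))        # top edge
        segments.add(((left, bottom), (right, bottom)))  # bottom edge
        if template == "full":
            segments.add(((left, top), (left, bottom)))    # left edge
            segments.add(((right, top), (right, bottom)))  # right edge
    return sorted(segments)
```

Two horizontally adjacent cells share one vertical edge, so the `"full"` template yields seven segments for them rather than eight, while `"horizontal_only"` yields four.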
According to an embodiment of the present disclosure, the table generating apparatus may further include a third acquisition module, a selecting module, and a fourth obtaining module. The third acquisition module is configured to acquire a seal material image set. The selecting module is configured to randomly select a target seal image from the seal material image set. The fourth obtaining module is configured to process the target seal image and the first table image to obtain a second table image.
According to an embodiment of the present disclosure, the fourth obtaining module may include a seventh obtaining sub-module, an eighth obtaining sub-module, and a ninth obtaining sub-module. The seventh obtaining sub-module is configured to crop the target seal image to obtain a local seal image. The eighth obtaining sub-module is configured to remove the background color from the local seal image to obtain a target local seal image. The ninth obtaining sub-module is configured to process the target local seal image and the first table image to obtain a second table image.
According to an embodiment of the present disclosure, the ninth obtaining sub-module may include a sixth obtaining unit and a seventh obtaining unit. The sixth obtaining unit is configured to fuse the target local seal image with the first table image to obtain a table image bearing the target local seal. The seventh obtaining unit is configured to filter the table image bearing the target local seal to obtain a second table image.
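The seal pipeline described above (random selection, cropping, background removal, fusion, filtering) can be sketched in plain Python on images stored as nested lists of RGB tuples. Every concrete choice here is an assumption for illustration: a white background color, keeping the upper half of the seal as the local seal, alpha blending as the fusion step, and a 3x3 mean filter as the filtering step. All function names are hypothetical.

```python
import random

WHITE = (255, 255, 255)  # assumed background color of seal material images


def crop(img, top, left, height, width):
    """Cut a rectangular region out of an image (list of rows)."""
    return [row[left:left + width] for row in img[top:top + height]]


def remove_background(img, background=WHITE):
    """Mark background pixels as transparent (None)."""
    return [[None if px == background else px for px in row] for row in img]


def fuse(table, seal, top, left, alpha=0.5):
    """Alpha-blend non-transparent seal pixels onto a copy of the table."""
    out = [list(row) for row in table]
    for dy, row in enumerate(seal):
        for dx, px in enumerate(row):
            y, x = top + dy, left + dx
            if px is None or not (0 <= y < len(out) and 0 <= x < len(out[0])):
                continue
            out[y][x] = tuple(
                round(alpha * s + (1 - alpha) * t) for s, t in zip(px, out[y][x])
            )
    return out


def box_filter(img):
    """3x3 mean filter, using only the neighbors that exist at the borders."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            neigh = [
                img[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            row.append(tuple(sum(p[c] for p in neigh) // len(neigh) for c in range(3)))
        out.append(row)
    return out


def add_seal(table, seal_images, rng=random):
    """Pick a seal at random, keep its upper half, and stamp it at (0, 0)."""
    seal = rng.choice(seal_images)
    part = crop(seal, 0, 0, max(1, len(seal) // 2), len(seal[0]))
    part = remove_background(part)
    return box_filter(fuse(table, part, 0, 0))
```

Using a partial, semi-transparent seal followed by smoothing is meant to imitate a stamp that overlaps table content on a scanned document; a production pipeline would more likely use an image library than nested lists.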
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
According to an embodiment of the present disclosure, an electronic device includes: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executable by the at least one processor to enable the at least one processor to perform the method as described above.
According to an embodiment of the present disclosure, a non-transitory computer-readable storage medium stores computer instructions for causing a computer to perform the method as described above.
According to an embodiment of the present disclosure, a computer program product comprises a computer program which, when executed by a processor, implements the method as described above.
Fig. 11 illustrates a schematic block diagram of an example electronic device 1100 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in Fig. 11, the device 1100 includes a computing unit 1101 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 1102 or a computer program loaded from a storage unit 1108 into a Random Access Memory (RAM) 1103. The RAM 1103 can also store various programs and data required for the operation of the device 1100. The computing unit 1101, the ROM 1102, and the RAM 1103 are connected to each other by a bus 1104. An input/output (I/O) interface 1105 is also connected to the bus 1104.
Various components in device 1100 are connected to I/O interface 1105, including: an input unit 1106 such as a keyboard, a mouse, etc.; an output unit 1107 such as various types of displays, speakers, and the like; a storage unit 1108, such as a magnetic disk, optical disk, etc.; and a communication unit 1109 such as a network card, modem, wireless communication transceiver, or the like. The communication unit 1109 allows the device 1100 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
The computing unit 1101 may be any of a variety of general purpose and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 1101 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 1101 performs the respective methods and processes described above, for example, the table generation method. For example, in some embodiments, the table generation method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 1108. In some embodiments, some or all of the computer program may be loaded and/or installed onto the device 1100 via the ROM 1102 and/or the communication unit 1109. When the computer program is loaded into the RAM 1103 and executed by the computing unit 1101, one or more steps of the table generation method described above may be performed. Alternatively, in other embodiments, the computing unit 1101 may be configured to perform the table generation method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, and which may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of the flows shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the present disclosure can be achieved; no limitation is imposed herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (20)

1. A table generation method, comprising:
acquiring text content information and layout information of a target page, wherein the layout information of the target page comprises page parameter information, cell layout information and text layout information;
obtaining cell coordinate information according to the page parameter information, the cell layout information and the text layout information;
obtaining text coordinate information according to the text layout information and the cell coordinate information; and
generating a target table and labeling data corresponding to the target table according to the cell coordinate information, the text coordinate information and the text content information.
2. The method of claim 1, wherein the cell layout information includes size proportion information of cells and relative positional relationship information between cells; and the obtaining cell coordinate information according to the page parameter information, the cell layout information and the text layout information includes:
obtaining minimum cell size information according to the page parameter information and the size proportion information of the cells;
obtaining cell size information according to the minimum cell size information and the relative positional relationship information; and
obtaining the cell coordinate information according to the cell size information, the page parameter information and the text layout information.
3. The method of claim 2, wherein the text layout information includes line feed identification information of text within a cell, and the obtaining the cell coordinate information according to the cell size information, the page parameter information, and the text layout information includes:
identifying the line feed identification information to obtain arrangement line number information of the text within the cell;
obtaining arrangement position information of the cells on the target page according to the cell size information and the page parameter information; and
obtaining the cell coordinate information according to the arrangement position information and the arrangement line number information.
4. The method of claim 3, wherein the obtaining the cell coordinate information according to the arrangement position information and the arrangement line number information includes:
obtaining a plurality of target cells positioned in the same row according to the arrangement position information;
obtaining the height information of the plurality of target cells according to the arrangement line number information; and
obtaining the cell coordinate information according to the arrangement position information and the height information.
5. The method of claim 4, wherein the obtaining the height information of the plurality of target cells according to the arrangement line number information includes:
sorting the arrangement line number information of the plurality of target cells to obtain a sorting result; and
obtaining the height information according to the sorting result.
6. The method of claim 1, wherein the text layout information includes arrangement mode information of text in a cell, text size information, and line feed identification information of text in a cell, and the obtaining text coordinate information according to the text layout information and the cell coordinate information includes:
determining a target strategy according to the arrangement mode information;
obtaining arrangement line number information of the text in the cell according to the line feed identification information; and
obtaining, based on the target strategy, the text coordinate information according to the cell coordinate information, the text size information, the arrangement mode information and the arrangement line number information.
7. The method of claim 6, wherein the text size information includes text height information and text width information, and the obtaining, based on the target strategy, the text coordinate information according to the cell coordinate information, the text size information, the arrangement mode information and the arrangement line number information includes:
obtaining text abscissa information according to the cell coordinate information, the text width information and the arrangement mode information;
obtaining, based on the target strategy, text ordinate information according to the cell coordinate information, the text height information and the arrangement line number information; and
obtaining the text coordinate information according to the text abscissa information and the text ordinate information.
8. The method of claim 7, wherein the obtaining text abscissa information according to the cell coordinate information, the text width information and the arrangement mode information includes:
determining offset parameter information of the text in the horizontal direction according to the arrangement mode information; and
obtaining the text abscissa information according to the offset parameter information, the text width information and the cell coordinate information.
9. The method of claim 8, wherein the cell coordinate information includes left vertex coordinate information and right vertex coordinate information, and the obtaining the text abscissa information according to the offset parameter information, the text width information and the cell coordinate information includes:
obtaining cell width information according to the left vertex coordinate information and the right vertex coordinate information;
in the case where the arrangement mode information is determined to be center alignment, obtaining the text abscissa information according to the left vertex coordinate information, the cell width information and the text width information;
in the case where the arrangement mode information is determined to be left alignment, obtaining the text abscissa information according to the left vertex coordinate information and the offset parameter information; and
in the case where the arrangement mode information is determined to be right alignment, obtaining the text abscissa information according to the right vertex coordinate information, the text width information and the offset parameter information.
10. The method of claim 7, wherein the cell coordinate information includes upper left vertex coordinate information of a cell, and the obtaining, based on the target strategy, text ordinate information according to the cell coordinate information, the text height information and the arrangement line number information includes:
obtaining minimum cell height information according to the cell coordinate information; and
obtaining the text ordinate information according to the upper left vertex coordinate information, the minimum cell height information, the text height information and the arrangement line number information.
11. The method of claim 10, wherein the arrangement line number information includes arrangement line number information of the text in a cell and maximum arrangement line number information of the text in a plurality of cells in the same row; and the obtaining the text ordinate information according to the upper left vertex coordinate information, the minimum cell height information, the text height information and the arrangement line number information includes:
in the case where the arrangement mode information is determined to be top alignment, obtaining the text ordinate information according to the upper left vertex coordinate information, the minimum cell height information, the text height information and the maximum arrangement line number information; and
in the case where the arrangement mode information is determined to be center alignment or bottom alignment, obtaining the text ordinate information according to the upper left vertex coordinate information, the minimum cell height information, the text height information, the arrangement line number information of the text in the cell and the maximum arrangement line number information.
12. The method of claim 1, wherein the generating a target table and labeling data corresponding to the target table according to the cell coordinate information, the text coordinate information and the text content information includes:
determining a target cell according to the text content information;
filling the text content information into the target cell according to the text coordinate information;
arranging the target cells on the target page according to the cell coordinate information to generate the target table; and
processing the cell coordinate information, the text coordinate information and the text content information according to a predetermined data format to obtain the labeling data.
13. The method of claim 1, further comprising:
acquiring a table frame line type template; and
constructing frame lines on the target table according to the table frame line type template and the cell coordinate information to obtain a first table image.
14. The method of claim 13, further comprising:
acquiring a seal material image set;
randomly selecting a target seal image from the seal material image set; and
processing the target seal image and the first table image to obtain a second table image.
15. The method of claim 14, wherein the processing the target seal image and the first table image to obtain a second table image includes:
cropping the target seal image to obtain a local seal image;
removing the background color from the local seal image to obtain a target local seal image; and
processing the target local seal image and the first table image to obtain the second table image.
16. The method of claim 15, wherein the processing the target local seal image and the first table image to obtain the second table image includes:
fusing the target local seal image with the first table image to obtain a table image bearing the target local seal; and
filtering the table image bearing the target local seal to obtain the second table image.
17. A table generation apparatus, comprising:
a first acquisition module, configured to acquire text content information and layout information of a target page, wherein the layout information of the target page comprises page parameter information, cell layout information and text layout information;
a first obtaining module, configured to obtain cell coordinate information according to the page parameter information, the cell layout information and the text layout information;
a second obtaining module, configured to obtain text coordinate information according to the text layout information and the cell coordinate information; and
a generating module, configured to generate a target table and labeling data corresponding to the target table according to the cell coordinate information, the text coordinate information and the text content information.
18. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-16.
19. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-16.
20. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any of claims 1-16.
CN202310172183.4A 2023-02-16 2023-02-16 Table generation method, apparatus, electronic device and storage medium Pending CN116311300A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310172183.4A CN116311300A (en) 2023-02-16 2023-02-16 Table generation method, apparatus, electronic device and storage medium

Publications (1)

Publication Number Publication Date
CN116311300A true CN116311300A (en) 2023-06-23

Family

ID=86814312

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310172183.4A Pending CN116311300A (en) 2023-02-16 2023-02-16 Table generation method, apparatus, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN116311300A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116757886A (en) * 2023-08-16 2023-09-15 南京尘与土信息技术有限公司 Data analysis method and analysis device
CN116757886B (en) * 2023-08-16 2023-11-28 南京尘与土信息技术有限公司 Data analysis method and analysis device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination