CN115082946A - Table image processing method and device, electronic equipment and computer storage medium
Table image processing method and device, electronic equipment and computer storage medium
- Publication number
- CN115082946A CN202210770530.9A
- Authority
- CN
- China
- Prior art keywords
- line
- lines
- vertical
- cell
- repairing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/414—Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/20—Drawing from basic elements, e.g. lines or circles
- G06T11/206—Drawing of charts or graphs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/40—Filling a planar surface by adding surface attributes, e.g. colour or texture
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/15—Cutting or merging image elements, e.g. region growing, watershed or clustering-based techniques
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Computer Graphics (AREA)
- Geometry (AREA)
- Image Analysis (AREA)
Abstract
The embodiment of the application provides a table image processing method and device, electronic equipment and a computer storage medium, and relates to the field of artificial intelligence. The method comprises the following steps: acquiring the table transverse lines and the table vertical lines of a table image to be processed; then, repairing the table transverse lines and/or the table vertical lines, wherein the repairing comprises at least one of broken line repairing, intersecting line repairing and repeated line filtering; then, drawing a first table straight line segmentation graph of the table image to be processed based on the repaired table transverse lines and/or the repaired table vertical lines; and then, acquiring the cell information of each cell of the table image to be processed, and generating a new table image according to the cell information of each cell and the first table straight line segmentation graph. The method and the device can improve the table recognition accuracy, decouple the table line detection model from the table character recognition model, and allow the two models to be combined freely, thereby realizing an end-to-end table recognition capability.
Description
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a form image processing method and apparatus, an electronic device, and a computer storage medium.
Background
With the increasing degree of office electronization, document data originally stored on paper is gradually converted into images by electronic means such as scanners. OCR (Optical Character Recognition) refers to a process in which an electronic device (e.g., a scanner or a digital camera) examines characters printed on paper, determines their shapes by detecting patterns of dark and light, and then translates the shapes into computer text by a character recognition method. OCR is widely used in everyday life for a variety of needs and scenarios, common scenarios including, for example: identification cards, business licenses, value-added tax invoices, train tickets, taxi tickets, and the like. Furthermore, as application requirements continue to be mined, the need for recognizing table content also increases.
Table recognition is considered part of table understanding, and a table image refers to an image containing a table. Table OCR (Optical Character Recognition) extracts the structure information of the table in a picture, extracts the character information in the image in combination with OCR, and restores the information in the table. However, the diversity of scenes makes table images complex, resulting in low recognition accuracy for the whole table.
Disclosure of Invention
The embodiment of the application provides a form image processing method and device, electronic equipment and a computer storage medium, and can solve the problem of low form identification accuracy. The technical scheme is as follows:
according to an aspect of an embodiment of the present application, there is provided a form image processing method including:
acquiring a table transverse line and a table vertical line of a table image to be processed;
repairing the table horizontal lines and/or the table vertical lines, wherein the repairing comprises at least one of broken line repairing, crossed line repairing and repeated line filtering;
drawing a first table straight line segmentation graph of the table image to be processed based on the repaired table horizontal line and/or the repaired table vertical line;
and acquiring cell information of each cell of the table image to be processed, and generating a new table image according to the cell information of each cell and the first table straight line segmentation graph.
In a possible implementation manner, obtaining a table horizontal line and a table vertical line of a table image to be processed includes:
generating a second table line segmentation graph of the table image to be processed based on the Unet image segmentation model;
and extracting the table transverse lines and the table vertical lines in the second table line segmentation graph according to the inclination of the table lines to obtain the table transverse lines and the table vertical lines of the table image to be processed.
In one possible implementation manner, performing the broken line repairing process on the table transverse lines includes:
determining a continuous transverse line set belonging to the same table transverse line according to a straight line clustering method;
determining the transverse line broken lines belonging to the same table transverse line according to a first preset threshold value, and repairing the transverse line broken lines; and/or,
the method for repairing the broken line of the vertical line of the table comprises the following steps:
determining a continuous vertical line set belonging to the vertical lines of the same table according to a linear clustering method;
and determining the vertical line broken lines belonging to the same table vertical line according to a second preset threshold, and repairing the vertical line broken lines.
In one possible implementation, the processing of filtering out the repeated lines of the table horizontal lines and/or the table vertical lines includes:
screening repeated transverse lines from a continuous transverse line set belonging to the same table transverse line according to a repeated line distinguishing method, and filtering the repeated transverse lines; and/or,
and screening repeated vertical lines from a continuous vertical line set belonging to the same table vertical line according to a repeated line distinguishing method, and filtering the repeated vertical lines.
In one possible implementation, the cross line repairing process is performed on the table horizontal lines and/or the table vertical lines, and includes:
determining, for each table transverse line, a target table vertical line which should intersect with the table transverse line but does not, based on a cross line segment discrimination method and a third preset threshold, and repairing the target table vertical line so that the table transverse line intersects with the target table vertical line; and/or,
determining, for each table vertical line, a target table transverse line which should intersect with the table vertical line but does not, based on a cross line segment discrimination method and a fourth preset threshold, and repairing the target table transverse line so that the table vertical line intersects with the target table transverse line.
In one possible implementation, the cell information of each cell includes location information and content information of each cell;
acquiring cell information of each cell of a to-be-processed form image, wherein the cell information comprises the following steps:
acquiring the position information of each cell of a to-be-processed form image by a preset image processing method;
identifying the content information of each cell through a Convolutional Recurrent Neural Network (CRNN) model;
generating a new table image according to the cell information of each cell and the first table straight line segmentation graph comprises: generating a new table image according to the position information of each cell and the first table straight line segmentation graph, and filling the content information into each corresponding cell.
According to another aspect of embodiments of the present application, there is provided a form image processing apparatus including:
the acquisition module is used for acquiring a table transverse line and a table vertical line of a table image to be processed;
the first processing module is used for repairing the table horizontal lines and/or the table vertical lines, and the repairing comprises at least one of broken line repairing, intersecting line repairing and repeated line filtering;
the second processing module is used for drawing a first table straight line segmentation graph of the table image to be processed based on the repaired table transverse line and/or the repaired table vertical line;
and the third processing module is used for acquiring the cell information of each cell of the table image to be processed and generating a new table image according to the cell information of each cell and the first table straight line segmentation graph.
In one possible implementation manner, the obtaining module is configured to:
generating a second table line segmentation graph of the table image to be processed based on the Unet image segmentation model;
and extracting the table transverse lines and the table vertical lines in the second table line segmentation graph according to the inclination of the table lines to obtain the table transverse lines and the table vertical lines of the table image to be processed.
In a possible implementation manner, when performing the broken line repairing process on the table transverse lines, the first processing module is configured to:
determining a continuous transverse line set belonging to the same table transverse line according to a straight line clustering method;
determining the transverse line broken lines belonging to the same table transverse line according to a first preset threshold value, and repairing the transverse line broken lines; and/or,
when the first processing module carries out broken line repairing processing on the vertical lines of the table, the first processing module is used for:
determining a continuous vertical line set belonging to the vertical lines of the same table according to a linear clustering method;
and determining the vertical line broken lines belonging to the same table vertical line according to a second preset threshold, and repairing the vertical line broken lines.
In one possible implementation manner, the first processing module, when performing repeated line filtering processing on the table horizontal lines and/or the table vertical lines, is configured to:
screening repeated transverse lines from a continuous transverse line set belonging to the same table transverse line according to a repeated line distinguishing method, and filtering the repeated transverse lines; and/or,
screening repeated vertical lines from a continuous vertical line set belonging to the same table vertical line according to a repeated line distinguishing method, and filtering the repeated vertical lines.
In one possible implementation manner, the first processing module, when performing the intersecting line repairing process on the table horizontal line and/or the table vertical line, is configured to:
determining, for each table transverse line, a target table vertical line which should intersect with the table transverse line but does not, based on a cross line segment discrimination method and a third preset threshold, and repairing the target table vertical line so that the table transverse line intersects with the target table vertical line; and/or,
determining, for each table vertical line, a target table transverse line which should intersect with the table vertical line but does not, based on a cross line segment discrimination method and a fourth preset threshold, and repairing the target table transverse line so that the table vertical line intersects with the target table transverse line.
In one possible implementation, the cell information of each cell includes location information and content information of each cell; the third processing module is used for, when acquiring the cell information of each cell of the to-be-processed form image:
acquiring the position information of each cell of a to-be-processed form image by a preset image processing method;
identifying the content information of each cell through a Convolutional Recurrent Neural Network (CRNN) model;
when generating a new table image according to the cell information of each cell and the first table straight line segmentation graph, the third processing module is used for:
and generating a new table image according to the position information of each cell and the straight line segmentation graph of the first table, and filling content information into each corresponding cell.
According to another aspect of an embodiment of the present application, there is provided an electronic device, including a memory, a processor and a computer program stored on the memory; the processor executes the computer program to realize the steps of the above table image processing method.
According to still another aspect of embodiments of the present application, there is provided a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above table image processing method.
According to an aspect of an embodiment of the present application, there is provided a computer program product, which when executed by a processor implements the steps of the form image processing method described above.
The technical scheme provided by the embodiment of the application has the following beneficial effects: missing table lines are repaired or completed by repairing and completing the broken lines and missing lines among the table transverse lines and/or the table vertical lines, and redundant table lines are removed by filtering the repeated lines in the table transverse lines and/or the table vertical lines, so that the common phenomena of missing table lines, broken lines and repeated lines can be effectively handled, a fault-tolerant mechanism in table decomposition is ensured, and the reconstruction of cells and tables is not affected by these factors. This not only improves the table recognition accuracy, but also decouples the table line detection model from the table character recognition model, so that a user can combine the two models freely and realize an end-to-end table recognition capability.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments of the present application will be briefly described below.
Fig. 1 is a schematic flowchart of a form image processing method according to an embodiment of the present application;
FIG. 2 is a schematic view of a table line segmentation map according to an embodiment of the present application;
fig. 3 is a schematic diagram of a continuous horizontal line set and a continuous vertical line set obtained by using a straight line clustering method according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a repaired form horizontal line and a repaired form vertical line according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a form image processing apparatus according to an embodiment of the present application;
fig. 6 is a structural schematic diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described below in conjunction with the drawings in the present application. It should be understood that the embodiments set forth below in connection with the drawings are exemplary descriptions for explaining technical solutions of the embodiments of the present application, and do not limit the technical solutions of the embodiments of the present application.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should be further understood that the terms "comprises" and/or "comprising," when used in this specification in connection with embodiments of the present application, specify the presence of stated features, information, data, steps, operations, elements, and/or components, but do not preclude the presence or addition of other features, information, data, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein indicates at least one of the items defined by the term, e.g., "A and/or B" indicates an implementation as "A", an implementation as "B", or an implementation as "A and B".
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
The terms referred to in this application will be introduced and explained as follows:
the Unet model: a U-shaped network structure model is greatly different in the image segmentation field in 2015, and Unnet is widely applied to the segmentation field. The Unet model belongs to a convolutional neural network, is developed for biomedical image segmentation by the computer science department of the university of Freuberg, Germany, and is based on a full convolutional network, the architecture of the full convolutional network is modified and expanded, fewer training images can be used, and more accurate segmentation can be generated for deep learning image segmentation.
CRNN (Convolutional Recurrent Neural Network): a popular image-text recognition model that can recognize relatively long text sequences. It is mainly used to recognize text sequences of indefinite length end to end, converting text recognition into a time-series-dependent sequence learning problem, i.e. image-based sequence recognition, without cutting out single characters. The whole CRNN network structure comprises three parts, from bottom to top: the CNN (convolutional layers), which extracts features from the input image using a deep CNN to obtain a feature map; the RNN (recurrent layers), which predicts the feature sequence using a bidirectional RNN (BLSTM), learns each feature vector in the sequence, and outputs a distribution over the prediction labels; and CTC loss (transcription layer), which converts the series of label distributions obtained from the recurrent layers into the final label sequence.
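As an illustration of this three-part structure, the following is a minimal PyTorch-style sketch of a CRNN; the layer sizes and module names are assumptions for illustration only and do not reproduce the model used in the embodiments.

```python
import torch
import torch.nn as nn

class SimpleCRNN(nn.Module):
    """Minimal CRNN sketch: CNN feature extractor -> bidirectional LSTM -> per-step logits for CTC."""
    def __init__(self, num_classes, img_height=32):
        super().__init__()
        # Convolutional layers: extract a feature map from the grayscale input image
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
            nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d((2, 1), (2, 1)),  # pool height only, keep horizontal resolution for the sequence
        )
        feat_h = img_height // 8
        # Recurrent layers: model the left-to-right feature sequence with a bidirectional LSTM
        self.rnn = nn.LSTM(256 * feat_h, 256, num_layers=2,
                           bidirectional=True, batch_first=True)
        # Transcription layer: per-timestep class logits, trained with nn.CTCLoss
        self.fc = nn.Linear(512, num_classes)

    def forward(self, x):                      # x: (B, 1, H, W)
        f = self.cnn(x)                        # (B, C, H', W')
        b, c, h, w = f.shape
        f = f.permute(0, 3, 1, 2).reshape(b, w, c * h)   # (B, W', C*H'): one feature vector per column
        seq, _ = self.rnn(f)
        return self.fc(seq)                    # (B, W', num_classes); log_softmax + transpose before CTC loss
```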
Form recognition is considered to be part of the form understanding, and the current common form recognition technology route generally includes the following steps:
1) Table detection: identify which portion of the file is a table. The position of the table in the picture (such as the coordinates of the upper-left and lower-right vertices of the table) is located using a target detection algorithm (such as the YOLO deep learning algorithm);
2) Table structure decomposition: the task of this step is to identify the components of the original table, i.e. the cells. For example: find the table lines by a table line detection algorithm (a traditional CV image processing algorithm or an image segmentation deep learning algorithm); classify the table lines into table transverse lines and table vertical lines according to certain conditions; then find the intersection points of the transverse lines and the vertical lines, and obtain the cells through the position information of the intersection points.
3) Table content recognition, such as performing text recognition on each cell (using a CRNN deep learning algorithm).
4) Table reconstruction, for example the correct identification of header elements, the structure of rows and columns, and the correct allocation of data cells.
From this technical route, the general ideas of table recognition are similar, and a drop in accuracy at any step directly reduces the recognition accuracy of the whole table. In the table recognition technical route mentioned above, the diversity of scenes makes the pictures complex, and the table recognition rate drops when any link in the route is affected. For example, in the table structure decomposition of the second step, table line detection on table images in natural scenes often suffers from missing lines, overlapping lines, wrong lines, and the like. When a missing line is carried into the fourth step of table reconstruction, cells are merged; when a wrong line is carried into the fourth step of table reconstruction, one cell is divided into a plurality of cells, and so on.
The technical problem to be solved by the application is mainly to construct complete table components, including table transverse lines, table vertical lines, table cells and the like, in the second step of table structure decomposition, and to handle the common phenomena of missing table lines, repeated lines and wrong lines. A fault-tolerant mechanism in table decomposition is ensured through effective algorithms, so that cell and table reconstruction are not affected by these factors.
In the application, a complete table straight line segmentation graph is extracted through a Unet image segmentation model together with a table line optimization technique, the cells are extracted through analysis of the table straight lines, the content of each cell is recognized through a CRNN recognition model, and the recognized cell content is finally filled into the table.
The technical solutions of the embodiments of the present application and the technical effects produced by the technical solutions of the present application will be described below through descriptions of several exemplary embodiments. It should be noted that the following embodiments may be referred to, referred to or combined with each other, and the description of the same terms, similar features, similar implementation steps and the like in different embodiments is not repeated.
Fig. 1 is a schematic flowchart of a form image processing method according to an embodiment of the present application, and as shown in fig. 1, the method includes: step S110, obtaining a form transverse line and a form vertical line of a form image to be processed; step S120, repairing the table horizontal lines and/or the table vertical lines, wherein the repairing comprises at least one of broken line repairing, intersecting line repairing and repeated line filtering; step S130, drawing a first table straight line segmentation graph of the table image to be processed based on the repaired table horizontal line and/or the repaired table vertical line; step S140, obtaining the cell information of each cell of the table image to be processed, and generating a new table image according to the cell information of each cell and the first table straight line segmentation chart.
Specifically, after the table horizontal lines and the table vertical lines of the table image to be processed are obtained, in the process of performing the repair processing on the table horizontal lines and/or the table vertical lines, at least one of the line breakage repair processing, the intersecting line repair processing, and the filtering repeated line processing may be performed on the table horizontal lines and/or the table vertical lines. Equivalently, when the table horizontal line and/or the table vertical line are/is broken, the broken line repairing processing is carried out on the table horizontal line and/or the table vertical line; when the cross lines of the table horizontal lines and/or the table vertical lines are/is missing, performing cross line repairing treatment on the table horizontal lines and/or the table vertical lines; when the table horizontal lines and/or the table vertical lines have the condition of repeated lines, filtering the repeated lines of the table horizontal lines and/or the table vertical lines; when the table horizontal lines and/or the table vertical lines have the line breakage condition and the intersection line loss condition, performing line breakage repairing treatment and intersection line repairing treatment on the table horizontal lines and/or the table vertical lines; when the table horizontal lines and/or the table vertical lines have the broken line condition and the repeated line condition, performing broken line repairing treatment and filtering repeated line treatment on the table horizontal lines and/or the table vertical lines; similarly, when the table horizontal line and/or the table vertical line have both the broken line condition and the intersecting line missing condition, and also have the repeated line condition, the broken line repairing process, the intersecting line repairing process and the filtering repeated line process are respectively carried out on the table horizontal line and/or the table vertical line.
Specifically, if there is a broken line in the table horizontal line, the broken line repair process may be performed on the table horizontal line, for example, the broken line may be repaired or supplemented, and if there is a broken line in the table vertical line, the broken line repair process may be performed on the table vertical line. In other words, if there is a broken line in the table horizontal line and there is no broken line in the table vertical line, the broken line repairing process can be performed only on the table horizontal line, and there is no need to perform the repairing process on the table vertical line; if the vertical lines of the table are broken and the horizontal lines of the table are not broken, only the vertical lines of the table can be subjected to broken line repairing treatment, and the horizontal lines of the table do not need to be subjected to repairing treatment; if the table horizontal lines and the table vertical lines are broken, broken line repairing processing can be respectively carried out on the table horizontal lines and the table vertical lines.
Similarly, in the case of the intersection line repairing process, if a table horizontal line should intersect with a table vertical line corresponding thereto without intersecting, the intersection line repairing process is required to be performed on the table horizontal line, and/or if a table vertical line should intersect with a table horizontal line corresponding thereto without intersecting, the intersection line repairing process is required to be performed on the table vertical line.
Similarly, for the case of filtering the repeated lines, if there are repeated lines in the horizontal lines of the table, the repeated lines of the horizontal lines of the table may be filtered, for example, the repeated horizontal lines of the table may be filtered, and if there are repeated lines in the vertical lines of the table, the repeated lines of the vertical lines of the table may be filtered.
Based on the above description, in the process of drawing the first table straight line segmentation drawing of the table image to be processed based on the repaired table horizontal line and/or the repaired table vertical line: when only the table horizontal lines are subjected to repairing processing, a first table straight line segmentation graph of the table image to be processed can be drawn based on the repaired table horizontal lines and the unrepaired table vertical lines; when only the table vertical lines are subjected to repairing processing, a first table straight line segmentation graph of the table image to be processed can be drawn based on the repaired table vertical lines and the unrepaired table horizontal lines; when the table horizontal lines and the table vertical lines are both repaired, the first table straight line segmentation graph of the table image to be processed can be drawn based on the repaired table horizontal lines and the repaired table vertical lines.
Because the table image includes one or more table cells (hereinafter referred to as cells) composed of table horizontal lines and table vertical lines in addition to the table straight lines and the table horizontal lines, and each table cell has individual cell information, after the first table straight line segmentation drawing is drawn and before a new table image is generated, it is necessary to acquire the cell information of each cell of the table image to be processed, and then, a new table image is generated based on the cell information of each cell and the first table straight line segmentation drawing, thereby reconstructing the table.
According to the method provided by the embodiment of the application, the missing table lines are repaired by repairing or complementing the broken lines and the missing lines of the table transverse lines and/or the table vertical lines, the redundant table lines are removed by filtering the repeated lines in the table transverse lines and/or the table vertical lines, the phenomena of common table line missing, broken lines and repeated lines can be effectively solved, a fault-tolerant mechanism in table decomposition is ensured, the reconstruction of the cells and the tables is not influenced by the factors, the table identification accuracy can be improved, the table line detection model and the table character identification model can be decoupled, a user can freely combine the two models, and the end-to-end table identification capability is realized.
In a possible implementation manner of the embodiment of the application, when a table horizontal line and a table vertical line of a table image to be processed are obtained, a second table line segmentation graph of the table image to be processed can be generated based on a Unet image segmentation model; and then, extracting the table transverse lines and the table vertical lines in the second table line segmentation graph according to the inclination of the table lines to obtain the table transverse lines and the table vertical lines of the table image to be processed.
In practical applications, a straight line segmentation map (referred to as a second table line segmentation map) of the table image to be processed can be segmented by an Unet image segmentation model, where the Unet image segmentation model is one of the table line detection models. The table lines of the table image to be processed can be identified through the Unet image segmentation model, then the table lines are classified into table horizontal lines and table vertical lines through some conditions (such as the gradient of the table lines), then the intersections of the table horizontal lines and the table vertical lines can be found, and the cells of the table image to be processed are obtained through the position information of the intersections.
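As an illustration of this extraction step, the sketch below classifies the segments found in a binary table-line mask (such as the output of a Unet-style segmentation model) into table transverse lines and table vertical lines according to their inclination angle; this is a minimal sketch under assumed OpenCV parameters (Hough thresholds, angle tolerance), not the exact procedure of the embodiments.

```python
import cv2
import numpy as np

def extract_table_lines(line_mask, angle_tol_deg=10):
    """Split segments in a binary table-line mask into horizontal and vertical lines by inclination.
    line_mask: uint8 image, 255 on table-line pixels (e.g. the segmentation model output)."""
    segments = cv2.HoughLinesP(line_mask, 1, np.pi / 180, threshold=50,
                               minLineLength=30, maxLineGap=5)
    horizontal, vertical = [], []
    if segments is None:
        return horizontal, vertical
    for x1, y1, x2, y2 in segments[:, 0]:
        angle = abs(np.degrees(np.arctan2(y2 - y1, x2 - x1)))
        if angle <= angle_tol_deg or angle >= 180 - angle_tol_deg:
            horizontal.append((x1, y1, x2, y2))      # near-horizontal: table transverse line
        elif abs(angle - 90) <= angle_tol_deg:
            vertical.append((x1, y1, x2, y2))        # near-vertical: table vertical line
    return horizontal, vertical
```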
In an example, the form image to be processed may be a form image of a ticket of a train, a coach or a taxi, may also be a form image of a value-added tax invoice, and may also be a form image of other possible tickets, which is not limited in the embodiment of the present application. When the form image to be processed is a form image of a value-added tax invoice, a second form line segmentation map (or a straight line segmentation map) of the form image to be processed identified by the Unet image segmentation model may be as shown in fig. 2, that is, fig. 2 shows a straight line segmentation map of the form image to be processed.
In a possible implementation manner, when performing broken line repairing processing on table transverse lines, a continuous transverse line set belonging to the same table transverse line may be determined according to a straight line clustering method, then transverse line broken lines belonging to the same table transverse line are determined according to a first preset threshold, and the broken lines are subjected to repairing processing.
In one example, when the broken line repairing process needs to be performed on the table transverse line, a continuous transverse line set belonging to the same table transverse line may be calculated by using a straight line clustering method, as shown in fig. 3, then the broken line belonging to the same table transverse line is found according to a preset threshold (i.e. the first preset threshold mentioned above), and then the broken line is subjected to a repairing process (e.g. repairing, completing, etc.), wherein the repaired transverse line broken line is shown as a white dashed line in fig. 4, and the white dashed line in fig. 4 is a broken line repairing effect.
Similarly, when the broken line repairing treatment is carried out on the vertical lines of the table, a continuous vertical line set belonging to the same vertical line of the table can be determined according to a linear clustering method; and then according to a second preset threshold value, determining the vertical line broken line belonging to the same table vertical line, and repairing the vertical line broken line.
In one example, when the broken line repairing process needs to be performed on the vertical lines of the table, a continuous vertical line set belonging to the same vertical line of the table may be calculated by using a straight line clustering method, as shown in fig. 3, then, according to a preset threshold (i.e., the second preset threshold mentioned above), the broken line belonging to the same vertical line of the table is found, and then the broken line is subjected to a repairing process (e.g., repairing, completing, etc.), where the repaired broken line of the vertical line is shown as a white dashed line in fig. 4, and the white dashed line in fig. 4 is a broken line repairing effect.
It should be noted that, the values of the first preset threshold and the second preset threshold may be set according to actual needs, and the values of the first preset threshold and the second preset threshold may be the same or different. In addition, each row in fig. 3 represents a cluster (or set) and a cluster represents a set of line segments belonging to the same straight line.
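A simplified sketch of this clustering-and-repair idea for table transverse lines is given below: segments whose vertical positions agree within a tolerance are grouped into one cluster (one table transverse line), and horizontal gaps smaller than a preset threshold are closed by merging the segments. The tolerance and gap threshold values and the function name are illustrative assumptions; table vertical lines can be handled symmetrically with x and y swapped.

```python
def repair_horizontal_breaks(h_segments, row_tol=5, gap_thresh=20):
    """Cluster horizontal segments that belong to the same table transverse line
    and merge segments whose horizontal gap is below gap_thresh (broken line repair).
    h_segments: list of (x1, y1, x2, y2) with x1 <= x2, roughly horizontal."""
    # Cluster by the segments' vertical midpoint (simplified straight line clustering)
    clusters = []
    for seg in sorted(h_segments, key=lambda s: (s[1] + s[3]) / 2):
        y_mid = (seg[1] + seg[3]) / 2
        if clusters and abs(y_mid - clusters[-1]["y"]) <= row_tol:
            clusters[-1]["segs"].append(seg)
        else:
            clusters.append({"y": y_mid, "segs": [seg]})

    repaired = []
    for cluster in clusters:
        segs = sorted(cluster["segs"], key=lambda s: s[0])
        cur = list(segs[0])
        for x1, y1, x2, y2 in segs[1:]:
            if x1 - cur[2] <= gap_thresh:        # small gap: same table line, close the break
                cur[2] = max(cur[2], x2)
            else:                                # large gap: treat as a separate line
                repaired.append(tuple(cur))
                cur = [x1, y1, x2, y2]
        repaired.append(tuple(cur))
    return repaired
```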
In a possible implementation manner, when the repeated line filtering processing is performed on the table transverse lines and/or the table vertical lines, the repeated transverse lines can be screened from the continuous transverse line set belonging to the same table transverse line according to a repeated line distinguishing method, and the repeated transverse lines are filtered; and/or screening repeated vertical lines from a continuous vertical line set belonging to the same table vertical line according to a repeated line discrimination method and filtering the repeated vertical lines.
In one example, when processing repeated lines of the table transverse lines and/or the table vertical lines, a repeated line discrimination method may be adopted. When two table transverse lines satisfy the above-mentioned clustering rule and also satisfy a preset repeated line condition, the two table transverse line segments may be determined to be repeated lines, and at this time one of the two segments needs to be filtered out to remove the redundant table line. Similarly, when two table vertical lines satisfy the clustering rule and also satisfy the preset repeated line condition, the two table vertical line segments may be determined to be repeated lines, and one of the two segments needs to be filtered out to remove the redundant table line.
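The duplicate-filtering step can be sketched as follows for transverse lines: within a cluster of segments that already satisfy the clustering rule, a segment whose extent is covered by a longer segment of the same cluster is treated as a repeated line and dropped. The containment test mirrors the repeated-line condition given in the examples below, and the helper name is an illustrative assumption.

```python
def filter_repeated_horizontal(cluster_segments):
    """Remove repeated horizontal segments inside one cluster (segments of the same table transverse line).
    A segment whose x-range is covered by an already-kept, longer segment is a repeated line."""
    kept = []
    # Longer segments first, so shorter duplicates are tested against them
    for seg in sorted(cluster_segments, key=lambda s: s[2] - s[0], reverse=True):
        x1, _, x2, _ = seg
        is_repeated = any(kx1 <= x1 <= x2 <= kx2 for kx1, _, kx2, _ in kept)
        if not is_repeated:
            kept.append(seg)
    return kept
```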
In a possible implementation manner, when the cross line repairing processing is performed on the table horizontal lines, for each table horizontal line, a target table vertical line which should be intersected with each table horizontal line but is not intersected with each table horizontal line is determined according to a third preset threshold value based on a cross line segment discrimination method, and the target table vertical line is repaired, so that each table horizontal line is intersected with the target table vertical line. When the intersecting line repairing processing is performed on the vertical table lines, a target horizontal table line which is to intersect with each vertical table line but is not intersected with each vertical table line is determined according to a fourth preset threshold value based on an intersecting line segment distinguishing method for each vertical table line, and the target horizontal table line is repaired, so that each vertical table line is intersected with the target horizontal table line.
The values of the third preset threshold and the fourth preset threshold may be set according to actual needs, and the values of the third preset threshold and the fourth preset threshold may be the same or different.
In an example, intersecting lines of the following two cases may be repaired, (1) for each table horizontal line, a table vertical line (i.e., a target table vertical line) that should intersect but does not intersect with the table horizontal line is found according to a preset threshold (i.e., the third preset threshold described above), and the table vertical line is subjected to a repairing process (e.g., a bright white vertical line in fig. 4) so that the table horizontal line can intersect with the table vertical line that should intersect but does not intersect with the table horizontal line, or the target table vertical line is subjected to a repairing process so that the target table vertical line can intersect with the table horizontal line that should intersect but does not intersect with the table horizontal line. (2) For each table vertical line, a table horizontal line (i.e., a target table horizontal line) that should intersect but does not intersect with each table vertical line is found according to a preset threshold (i.e., the fourth preset threshold described above), and the table horizontal line is subjected to a repair process (e.g., a bright white horizontal line in fig. 4) so that the each table vertical line can intersect with the table horizontal line that should intersect but does not intersect with the table vertical line, or the target table horizontal line is subjected to a repair process so that the target table horizontal line can intersect with the table vertical line that should intersect but does not intersect with the table vertical line.
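One possible sketch of the cross line repair is given below: for each table transverse line, a table vertical line whose x position lies on the transverse line but whose nearest end stops short of it within the preset threshold is extended until the two lines meet. The threshold value and data layout are illustrative assumptions, and the case of extending a table transverse line towards a table vertical line is symmetric.

```python
def repair_cross_lines(h_lines, v_lines, cross_thresh=15):
    """Extend vertical lines that should intersect a horizontal line but stop short of it.
    h_lines: list of (x1, y, x2, y); v_lines: list of [x, y1, x, y2] with y1 <= y2 (mutated in place)."""
    for hx1, hy, hx2, _ in h_lines:
        for v in v_lines:
            vx, vy1, _, vy2 = v
            if not (hx1 <= vx <= hx2):
                continue                       # vertical line does not pass over this horizontal line's span
            if vy1 <= hy <= vy2:
                continue                       # already intersects, nothing to repair
            # Stops short of the horizontal line within the threshold: extend it to the crossing point
            if 0 < vy1 - hy <= cross_thresh:
                v[1] = hy                      # extend the top end upwards
            elif 0 < hy - vy2 <= cross_thresh:
                v[3] = hy                      # extend the bottom end downwards
    return v_lines
```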
In one possible implementation, the cell information of each cell may include the position information and the content information of each cell. When the cell information of each cell of the table image to be processed is obtained, the position information of each cell can be obtained through a preset image processing method, and the content information of each cell can be recognized through a Convolutional Recurrent Neural Network (CRNN) model. Based on this, when generating a new table image from the cell information of each cell and the first table straight line segmentation graph, the new table image may be generated from the position information of each cell and the first table straight line segmentation graph, and the content information may be filled into each corresponding cell.
In an example, not only the position information of each cell of the to-be-processed form image may be extracted through analysis of the straight line of the form, but also the content information of each cell may be identified through the CRNN model, where the content information may be possible information such as text information, picture information, or graphic image information in each cell, and may also be content information in other possible forms, which is not limited in the embodiment of the present application.
After the first table straight-line segmentation graph of the table image to be processed is generated, and after the position information and the content information of each cell of the table image to be processed are obtained, a new table image can be generated according to the position information and the first table straight-line segmentation graph of each cell, and after the new table image is generated, the content information of each cell can be filled into each corresponding cell in the new table image, so that table reconstruction is realized, and the reconstructed table image is obtained.
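Putting the pieces together, the sketch below shows one possible end-to-end flow: the repaired lines are drawn onto a blank canvas to form the first table straight line segmentation graph, cell positions are recovered from that graph with standard contour analysis, each cell crop is passed to a text-recognition callback (a CRNN model in the embodiments), and the recognized content is attached to the cell position. The recognize_text callback, the area filters and the drawing parameters are assumptions for illustration.

```python
import cv2
import numpy as np

def rebuild_table(image_shape, h_lines, v_lines, source_image, recognize_text):
    """Draw repaired table lines, locate cells, and fill each cell with recognized content.
    recognize_text(crop) -> str is any text recognizer (e.g. a CRNN model)."""
    h, w = image_shape[:2]
    # 1. First table straight line segmentation graph: repaired lines on a blank canvas
    line_map = np.zeros((h, w), dtype=np.uint8)
    for x1, y1, x2, y2 in list(h_lines) + list(v_lines):
        cv2.line(line_map, (int(x1), int(y1)), (int(x2), int(y2)), 255, 2)

    # 2. Cell positions: regions enclosed by the drawn lines
    inverted = cv2.bitwise_not(line_map)
    contours, _ = cv2.findContours(inverted, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    cells = []
    for cnt in contours:
        x, y, cw, ch = cv2.boundingRect(cnt)
        if cw * ch < 0.9 * w * h and cw > 10 and ch > 10:   # skip the outer region and small noise
            cells.append((x, y, cw, ch))

    # 3. Content of each cell, recognized from the original image crop
    table = []
    for x, y, cw, ch in sorted(cells, key=lambda c: (c[1], c[0])):  # roughly row-major order
        crop = source_image[y:y + ch, x:x + cw]
        table.append({"position": (x, y, cw, ch), "content": recognize_text(crop)})
    return line_map, table
```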
The following describes possible ways of handling line breaks, repeated lines, and intersecting lines in the embodiments of the present application by specific examples:
(1) When processing broken lines, a straight line clustering rule is adopted.
Suppose line segment 1 consists of the two points (x11, y11) and (x12, y12), and its straight line formula is:
a1·x + b1·y + c1 = 0
Meanwhile, suppose line segment 2 consists of the two points (x21, y21) and (x22, y22), and its straight line formula is:
a2·x + b2·y + c2 = 0
The criterion for judging that line segment 1 and line segment 2 are a broken line is that any one of the following conditions is satisfied:
Here thresh1 is the first preset threshold and thresh2 is the second preset threshold, or thresh1 is the second preset threshold and thresh2 is the first preset threshold. The values of thresh1 and thresh2 are set according to actual needs, and they may be the same or different.
(2) When processing repeated lines, a repeated line discrimination method is adopted: when two line segments (line segment 1 and line segment 2) satisfy, in addition to the clustering rule, any one of the following conditions, line segment 1 and line segment 2 are regarded as repeated line segments:
x11 < x21 < x22 < x12
y11 < y21 < y22 < y12
(3) When the intersection between a table horizontal line (for example, the above-mentioned line segment 1) and a table vertical line (for example, the above-mentioned line segment 2) is missing, the intersection point of the table horizontal line and the table vertical line is first obtained; assume that the intersection point is (x, y).
Then a cross line segment discrimination method is adopted: if the two line segments (such as line segment 1 and line segment 2) meet the following conditions, the two line segments are judged to be intersecting lines:
Here thresh3 is the third preset threshold or the fourth preset threshold, and its value is set according to actual needs.
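Expressed as code, the repeated-line condition above and the general shape of the other two checks look like the sketch below. The broken-line and cross-line inequalities involving thresh1, thresh2 and thresh3 are not reproduced in the text above, so the offset, gap and distance measures used here are assumptions rather than the exact criteria of the embodiments.

```python
def is_repeated(seg1, seg2):
    """Repeated-line condition from the example: segment 2 lies strictly inside segment 1."""
    (x11, y11), (x12, y12) = seg1
    (x21, y21), (x22, y22) = seg2
    return (x11 < x21 < x22 < x12) or (y11 < y21 < y22 < y12)

def is_broken_pair(seg1, seg2, thresh1, thresh2):
    """Assumed broken-line check for horizontal segments: nearly collinear (midline offset
    below thresh1) and the gap between nearest endpoints below thresh2."""
    (x11, y11), (x12, y12) = seg1
    (x21, y21), (x22, y22) = seg2
    offset = abs(((y11 + y12) - (y21 + y22)) / 2)   # difference of the two midline y positions
    gap = min(abs(x21 - x12), abs(x11 - x22))
    return offset < thresh1 and gap < thresh2

def is_cross_pair(h_seg, v_seg, thresh3):
    """Assumed cross-line check: the intersection point (x, y) of the two carrier lines lies
    within thresh3 of both segments' endpoint ranges (axis-aligned simplification)."""
    (hx1, hy1), (hx2, hy2) = h_seg
    (vx1, vy1), (vx2, vy2) = v_seg
    x, y = vx1, hy1
    near_h = min(hx1, hx2) - thresh3 <= x <= max(hx1, hx2) + thresh3
    near_v = min(vy1, vy2) - thresh3 <= y <= max(vy1, vy2) + thresh3
    return near_h and near_v
```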
Therefore, in the embodiment of the application, the problem of broken lines in the table transverse lines and/or the table vertical lines is handled by a clustering method, redundant table transverse lines and/or table vertical lines are removed by a repeated line discrimination method, and the problem of missing intersections between the table transverse lines and the table vertical lines is handled by a cross line segment discrimination method. In addition, in the embodiment of the present application, the table line segmentation graph of the table image to be processed is generated by the table line detection model, namely the Unet image segmentation model, and the content information of each cell of the table image to be processed is recognized by the table content recognition model, namely the CRNN model; these are two mutually independent processes. That is, the embodiment of the present application decouples the table line detection model from the table content recognition model (such as the CRNN model), so that the two models can be combined arbitrarily, realizing an end-to-end table recognition function.
An embodiment of the present application provides a form image processing apparatus, and as shown in fig. 5, the form image processing apparatus 500 may include: an acquisition module 501, a first processing module 502, a second processing module 503, and a third processing module 504, wherein,
the acquiring module 501 is configured to acquire a table horizontal line and a table vertical line of a table image to be processed;
the first processing module 502 is configured to perform repair processing on table horizontal lines and/or table vertical lines, where the repair processing includes at least one of broken line repair processing, intersecting line repair processing, and filtering repeated line processing;
the second processing module 503 is configured to draw a first table straight line segmentation drawing of the table image to be processed based on the repaired table horizontal line and/or the repaired table vertical line;
the third processing module 504 is configured to obtain cell information of each cell of the to-be-processed table image, and generate a new table image according to the cell information of each cell and the first table straight-line segmentation map.
The table image processing device repairs or completes missing table lines by repairing and completing the broken lines and missing lines among the table transverse lines and/or the table vertical lines, and removes redundant table lines by filtering the repeated lines in the table transverse lines and/or the table vertical lines, so that the common phenomena of missing table lines, broken lines and repeated lines can be effectively handled, a fault-tolerant mechanism in table decomposition is ensured, and the reconstruction of cells and tables is not affected by these factors. This not only improves the table recognition accuracy, but also decouples the table line detection model from the table character recognition model, so that a user can combine the two models at will and realize an end-to-end table recognition capability.
In one possible implementation manner, the obtaining module is configured to:
generating a second table line segmentation graph of the table image to be processed based on the Unet image segmentation model;
and extracting the table transverse lines and the table vertical lines in the second table line segmentation graph according to the inclination of the table lines to obtain the table transverse lines and the table vertical lines of the table image to be processed.
In a possible implementation manner, the first processing module, when performing the broken line repairing process on the table transverse lines, is configured to:
determining a continuous transverse line set belonging to the same table transverse line according to a straight line clustering method;
determining the transverse line broken lines belonging to the same table transverse line according to a first preset threshold value, and repairing the transverse line broken lines; and/or,
when the first processing module carries out broken line repairing processing on the vertical lines of the table, the first processing module is used for:
determining a continuous vertical line set belonging to the vertical lines of the same table according to a linear clustering method;
and determining the vertical line broken lines belonging to the same table vertical line according to a second preset threshold, and repairing the vertical line broken lines.
In one possible implementation manner, the first processing module, when performing the repeated line filtering processing on the table horizontal lines and/or the table vertical lines, is configured to:
screening repeated transverse lines from a continuous transverse line set belonging to the same table transverse line according to a repeated line distinguishing method, and filtering the repeated transverse lines; and/or,
screening repeated vertical lines from a continuous vertical line set belonging to the same table vertical line according to a repeated line distinguishing method, and filtering the repeated vertical lines.
In one possible implementation manner, the first processing module, when performing the intersecting line repairing process on the table horizontal line and/or the table vertical line, is configured to:
determining, for each table transverse line, a target table vertical line which should intersect with the table transverse line but does not, based on a cross line segment discrimination method and a third preset threshold, and repairing the target table vertical line so that the table transverse line intersects with the target table vertical line; and/or,
determining, for each table vertical line, a target table transverse line which should intersect with the table vertical line but does not, based on a cross line segment discrimination method and a fourth preset threshold, and repairing the target table transverse line so that the table vertical line intersects with the target table transverse line.
In one possible implementation, the cell information of each cell includes location information and content information of each cell; the third processing module is used for, when acquiring the cell information of each cell of the to-be-processed form image:
acquiring the position information of each cell of a to-be-processed form image by a preset image processing method;
identifying the content information of each cell through a Convolutional Recurrent Neural Network (CRNN) model;
when generating a new table image according to the cell information of each cell and the first table straight line segmentation graph, the third processing module is used for:
and generating a new table image according to the position information of each cell and the straight line segmentation graph of the first table, and filling content information into each corresponding cell.
The table image processing apparatus according to the embodiment of the present application can execute the table image processing method shown in the above-mentioned embodiment of the present application, and the implementation principle is similar, the actions executed by the modules in the apparatus according to the embodiments of the present application correspond to the steps in the method according to the embodiments of the present application, and for the detailed functional description of the modules in the apparatus, reference may be specifically made to the description in the corresponding method shown in the foregoing, and details are not repeated here.
The embodiment of the application provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory, wherein the processor executes the computer program to realize the steps of the above table image processing method. Compared with the prior art, the following can be realized: missing table lines are repaired or completed by repairing and completing the broken lines and missing lines among the table transverse lines and/or the table vertical lines, and redundant table lines are removed by filtering the repeated lines in the table transverse lines and/or the table vertical lines, so that the common phenomena of missing table lines, broken lines and repeated lines can be effectively handled, a fault-tolerant mechanism in table decomposition is ensured, and the reconstruction of cells and tables is not affected by these factors. This not only improves the table recognition accuracy, but also decouples the table line detection model from the table character recognition model, so that a user can combine the two models freely and realize an end-to-end table recognition capability.
In an alternative embodiment, an electronic device is provided, as shown in fig. 6, the electronic device 4000 shown in fig. 6 comprising: a processor 4001 and a memory 4003. Processor 4001 is coupled to memory 4003, such as via bus 4002. Optionally, the electronic device 4000 may further include a transceiver 4004, and the transceiver 4004 may be used for data interaction between the electronic device and other electronic devices, such as transmission of data and/or reception of data. In addition, the transceiver 4004 is not limited to one in practical applications, and the structure of the electronic device 4000 is not limited to the embodiment of the present application.
The Processor 4001 may be a CPU (Central Processing Unit), a general-purpose Processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other Programmable logic device, a transistor logic device, a hardware component, or any combination thereof. Which may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure. The processor 4001 may also be a combination that performs a computational function, including, for example, a combination of one or more microprocessors, a combination of a DSP and a microprocessor, or the like.
The Memory 4003 may be a ROM (Read Only Memory) or other types of static storage devices that can store static information and instructions, a RAM (Random Access Memory) or other types of dynamic storage devices that can store information and instructions, an EEPROM (Electrically Erasable Programmable Read Only Memory), a CD-ROM (Compact Disc Read Only Memory) or other optical Disc storage, optical Disc storage (including Compact Disc, laser Disc, optical Disc, digital versatile Disc, blu-ray Disc, etc.), a magnetic Disc storage medium, other magnetic storage devices, or any other medium that can be used to carry or store a computer program and that can be Read by a computer, without limitation.
The memory 4003 is used for storing the computer program for executing the embodiments of the present application, and its execution is controlled by the processor 4001. The processor 4001 is configured to execute the computer program stored in the memory 4003 to implement the steps shown in the foregoing method embodiments.
Embodiments of the present application provide a computer-readable storage medium having a computer program stored thereon; when executed by a processor, the computer program can implement the steps and corresponding contents of the foregoing method embodiments.
Embodiments of the present application further provide a computer program product, which includes a computer program, and when the computer program is executed by a processor, the steps and corresponding contents of the foregoing method embodiments can be implemented.
It should be understood that, although each operation step is indicated by an arrow in the flowchart of the embodiment of the present application, the implementation order of the steps is not limited to the order indicated by the arrow. In some implementation scenarios of the embodiments of the present application, the implementation steps in the flowcharts may be performed in other sequences as desired, unless explicitly stated otherwise herein. In addition, some or all of the steps in each flowchart may include multiple sub-steps or multiple stages based on an actual implementation scenario. Some or all of these sub-steps or stages may be performed at the same time, or each of these sub-steps or stages may be performed at different times, respectively. In a scenario where execution times are different, an execution sequence of the sub-steps or the phases may be flexibly configured according to requirements, which is not limited in the embodiment of the present application.
The foregoing describes only optional implementations of some implementation scenarios of this application. It should be noted that, for those skilled in the art, other similar implementation means based on the technical idea of this application, without departing from that technical idea, also fall within the protection scope of the embodiments of this application.
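As a further illustration of how repaired table lines can be turned into a table straight line segmentation graph and cell position information (see the drawing and cell acquisition steps recited in the claims below), the following runnable Python sketch renders a line map with numpy and derives per-cell bounding boxes from the grid crossings. It assumes a regular grid without merged cells, and every name in it is a hypothetical placeholder rather than an API defined by this application; recognizing cell content with a CRNN-style model and filling it into the new table image is omitted.

```python
import numpy as np

# Illustrative only: given repaired line coordinates, draw a table line map
# and derive cell bounding boxes from the grid crossings. The coordinate
# format and helper names are assumptions, not definitions from this application.

def draw_line_map(height, width, h_ys, v_xs, thickness=1):
    """Render repaired horizontal/vertical lines into a binary segmentation map."""
    line_map = np.zeros((height, width), dtype=np.uint8)
    for y in h_ys:
        line_map[max(0, y - thickness): y + thickness + 1, :] = 255
    for x in v_xs:
        line_map[:, max(0, x - thickness): x + thickness + 1] = 255
    return line_map


def cells_from_grid(h_ys, v_xs):
    """Derive per-cell bounding boxes (x1, y1, x2, y2) from sorted grid lines."""
    h_ys, v_xs = sorted(h_ys), sorted(v_xs)
    boxes = []
    for y1, y2 in zip(h_ys, h_ys[1:]):
        for x1, x2 in zip(v_xs, v_xs[1:]):
            boxes.append((x1, y1, x2, y2))
    return boxes


if __name__ == "__main__":
    # A 3x2 grid: 4 horizontal lines and 3 vertical lines -> 6 cells.
    h_ys = [10, 60, 110, 160]
    v_xs = [20, 220, 420]
    line_map = draw_line_map(200, 440, h_ys, v_xs)
    print(line_map.shape, len(cells_from_grid(h_ys, v_xs)))  # (200, 440) 6
```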
Claims (10)
1. A table image processing method, comprising:
acquiring a table transverse line and a table vertical line of a table image to be processed;
repairing the table transverse lines and/or the table vertical lines, wherein the repairing comprises at least one of broken line repairing, intersecting line repairing and repeated line filtering;
drawing a first table straight line segmentation graph of the table image to be processed based on the repaired table transverse line and/or the repaired table vertical line;
and acquiring the cell information of each cell of the table image to be processed, and generating a new table image according to the cell information of each cell and the first table straight line segmentation graph.
2. The method according to claim 1, wherein the acquiring of the table transverse line and the table vertical line of the table image to be processed comprises:
generating a second table line segmentation graph of the table image to be processed based on a Unet image segmentation model;
and extracting the table transverse lines and the table vertical lines in the second table line segmentation graph according to the inclination of the table lines to obtain the table transverse lines and the table vertical lines of the table image to be processed.
3. The method according to claim 1, wherein performing the broken line repairing processing on the table transverse lines comprises:
determining a continuous transverse line set belonging to a same table transverse line according to a linear clustering method;
determining transverse line broken lines belonging to the same table transverse line according to a first preset threshold value, and repairing the transverse line broken lines; and/or,
performing the broken line repairing processing on the table vertical lines, comprising:
determining a continuous vertical line set belonging to a same table vertical line according to a linear clustering method;
and determining the vertical line broken lines belonging to the same table vertical line according to a second preset threshold value, and repairing the vertical line broken lines.
4. The method according to claim 3, wherein performing the repeated line filtering processing on the table transverse lines and/or the table vertical lines comprises:
screening repeated transverse lines from the continuous transverse line set belonging to the same table transverse line according to a repeated line discrimination method, and filtering out the repeated transverse lines; and/or,
and screening repeated vertical lines from the continuous vertical line set belonging to the same table vertical line according to a repeated line discrimination method, and filtering the repeated vertical lines.
5. The method according to any one of claims 1 to 4, wherein performing the intersecting line repairing processing on the table transverse lines and/or the table vertical lines comprises:
determining, for each table transverse line, a target table vertical line which should intersect the table transverse line but does not, based on an intersecting line segment discrimination method and a third preset threshold value, and repairing the target table vertical line so that the table transverse line intersects the target table vertical line; and/or,
determining, for each table vertical line, a target table transverse line which should intersect the table vertical line but does not, based on an intersecting line segment discrimination method and a fourth preset threshold value, and repairing the target table transverse line so that the table vertical line intersects the target table transverse line.
6. The method according to any one of claims 1 to 4, wherein the cell information of each cell includes position information and content information of the cell;
the acquiring of the cell information of each cell of the table image to be processed comprises:
acquiring the position information of each cell of the table image to be processed by a preset image processing method;
identifying the content information of each cell through a Convolutional Recurrent Neural Network (CRNN) model;
generating a new table image according to the cell information of each cell and the first table straight line segmentation graph, wherein the generating comprises the following steps:
and generating a new table image according to the position information of each cell and the first table straight line segmentation graph, and filling the content information into each corresponding cell.
7. A table image processing apparatus, characterized by comprising:
the acquisition module is used for acquiring a table transverse line and a table vertical line of a table image to be processed;
the first processing module is used for repairing the table transverse line and/or the table vertical line, and the repairing process comprises at least one of broken line repairing process, intersecting line repairing process and repeated line filtering process;
the second processing module is used for drawing a first table straight line segmentation graph of the table image to be processed based on the repaired table transverse line and/or the repaired table vertical line;
and the third processing module is used for acquiring the cell information of each cell of the table image to be processed and generating a new table image according to the cell information of each cell and the first table straight line segmentation graph.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory, characterized in that the processor executes the computer program to implement the steps of the method of any of claims 1-6.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
10. A computer program product comprising a computer program, characterized in that the computer program realizes the steps of the method of any one of claims 1-6 when executed by a processor.
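Purely as a non-authoritative illustration of the intersecting line repairing processing recited in claim 5, the following Python sketch extends a table vertical line that stops just short of a table transverse line it should cross, so that the two actually intersect. The coordinate convention, the tolerance near_tol, and the function name are assumptions for this sketch only.

```python
# Illustrative sketch of intersecting line repairing (cf. claim 5): a vertical
# line that stops just short of a horizontal line it should cross is extended
# so that the two actually intersect. Coordinates, the threshold, and the
# function name are assumptions for illustration only.

def repair_vertical_to_horizontal(v_line, h_line, near_tol=10):
    """Extend a vertical line to meet a horizontal line it nearly intersects.

    v_line: (x, y_top, y_bottom) with y_top <= y_bottom
    h_line: (x_left, y, x_right) with x_left <= x_right
    """
    x, y_top, y_bottom = v_line
    x_left, y, x_right = h_line
    # Only consider repair if the vertical line lies within the horizontal
    # line's x-range, i.e. the two lines "should" intersect.
    if not (x_left <= x <= x_right):
        return v_line
    if y_top - near_tol <= y <= y_top:        # horizontal line just above the top end
        return (x, y, y_bottom)
    if y_bottom <= y <= y_bottom + near_tol:  # horizontal line just below the bottom end
        return (x, y_top, y)
    return v_line


if __name__ == "__main__":
    # Vertical line stops at y=58, horizontal line sits at y=60: extend it.
    print(repair_vertical_to_horizontal((100, 5, 58), (0, 60, 400)))  # (100, 5, 60)
```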
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210770530.9A CN115082946A (en) | 2022-06-30 | 2022-06-30 | Table image processing method and device, electronic equipment and computer storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210770530.9A CN115082946A (en) | 2022-06-30 | 2022-06-30 | Table image processing method and device, electronic equipment and computer storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115082946A (en) | 2022-09-20
Family
ID=83257609
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210770530.9A Pending CN115082946A (en) | 2022-06-30 | 2022-06-30 | Table image processing method and device, electronic equipment and computer storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115082946A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118097697A (en) * | 2024-03-26 | 2024-05-28 | 内蒙古电力勘测设计院有限责任公司 | Processing method, device and equipment for form image |
Similar Documents
Publication | Title
---|---
CN109902622B (en) | Character detection and identification method for boarding check information verification
CN110738207B (en) | Character detection method for fusing character area edge information in character image
CN111401371B (en) | Text detection and identification method and system and computer equipment
US10817741B2 (en) | Word segmentation system, method and device
CN111681273B (en) | Image segmentation method and device, electronic equipment and readable storage medium
CN105868758B (en) | Method and device for detecting text area in image and electronic equipment
CN111460927B (en) | Method for extracting structured information of house property evidence image
CN113158808B (en) | Method, medium and equipment for Chinese ancient book character recognition, paragraph grouping and layout reconstruction
CN110866529B (en) | Character recognition method, device, electronic equipment and storage medium
CN110334709B (en) | License plate detection method based on end-to-end multi-task deep learning
CN112507876B (en) | Wired form picture analysis method and device based on semantic segmentation
CN112949455B (en) | Value-added tax invoice recognition system and method
CN114004204B (en) | Table structure reconstruction and text extraction method and system based on computer vision
CN113591719A (en) | Method and device for detecting text with any shape in natural scene and training method
CN116311310A (en) | Universal form identification method and device combining semantic segmentation and sequence prediction
CN115131797A (en) | Scene text detection method based on feature enhancement pyramid network
CN114463767A (en) | Credit card identification method, device, computer equipment and storage medium
CN112686265A (en) | Hierarchic contour extraction-based pictograph segmentation method
CN114581928A (en) | Form identification method and system
CN112528903B (en) | Face image acquisition method and device, electronic equipment and medium
CN113837015A (en) | Face detection method and system based on feature pyramid
CN111680691B (en) | Text detection method, text detection device, electronic equipment and computer readable storage medium
CN116912872A (en) | Drawing identification method, device, equipment and readable storage medium
CN115082946A (en) | Table image processing method and device, electronic equipment and computer storage medium
CN114783042A (en) | Face recognition method, device, equipment and storage medium based on multiple moving targets
Legal Events
Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination