CN111881883A - Table document extraction method based on convolution feature extraction and morphological processing

Info

Publication number: CN111881883A
Application number: CN202010792746.6A
Authority: CN
Inventors: 李进文; 罗宝娟; 严京旗; 周审章; 卞志强; 张成栋
Original assignee: Jingpu Shanghai Artificial Intelligence Technology Co Ltd
Current assignee: Jingpu Shanghai Artificial Intelligence Technology Co Ltd
Priority date: 2020-08-10
Filing date: 2020-08-10
Publication date: 2020-11-03

Abstract

The invention relates to the technical field of target detection and image segmentation, and in particular to a table document extraction method based on convolution feature extraction and morphological processing. The method comprises the following steps: step 110, acquiring document image data; step 120, preprocessing the image; step 130, loading a network weight file; step 140, performing morphological processing on the image to obtain the table distribution of the current image; step 150, performing semantic segmentation on the image to obtain the table distribution of the current image; step 160, correcting the table distribution information obtained in steps 140 and 150; step 170, end. By design, the method uses image morphology to detect horizontal and vertical line segments directly in the binary image, which makes it broadly applicable and fast to compute. The semantic segmentation network yields the image table as single-pixel straight lines directly, which reduces the later manual search for single-pixel-wide lines, and combining the semantic segmentation network with morphology further improves detection accuracy.

Description

Table document extraction method based on convolution feature extraction and morphological processing

Technical Field

The invention relates to the technical field of target detection and image segmentation, in particular to a table document extraction method based on convolution feature extraction and morphological processing.

Background

Form documents such as medical device registration certificates, customs reports, annual reports of listed companies, audit reports and electronic receipts are typically published as public PDF files. These PDF documents are rich in information and consist of tables, text and pictures, so finding target information in them takes considerable manpower and time. Two kinds of table extraction technology are currently in use. The first searches for horizontal and vertical line segments with a traditional image binarization method. The second uses a convolutional neural network (CNN, consisting of convolutional layers, pooling layers and fully connected layers, with parameters trained by gradient descent), which extracts table features from the image through multiple convolution filters and segments the image to extract the table. The binarization method suffers from false detections caused by noise and from the difficulty of choosing the threshold and window size; CNN-based feature extraction and image segmentation has difficulty judging line segment continuity and is limited in generalization.

In summary, the present invention solves the existing problems by designing a table document extraction method based on convolution feature extraction and morphological processing.

Disclosure of Invention

The invention aims to provide a table document extraction method based on convolution feature extraction and morphological processing, which quickly extracts the position information of a table from a document image by combining image morphology with the semantic segmentation of a convolutional neural network, so that the document content can subsequently be processed into structured form.

In order to achieve the purpose, the invention provides the following technical scheme:

a table document extraction method based on convolution feature extraction and morphological processing specifically comprises the following steps:

step 110, acquiring document image data;

step 120, preprocessing an image;

step 130, loading a network weight file;

step 140, performing morphological processing on the image to obtain the table distribution of the current image;

step 150, performing semantic segmentation on the image to obtain table distribution of the current image;

step 160, correcting the table distribution information in steps 140 and 150;

step 170, end.

Further, the method for acquiring the image data in the document in step 110 includes one of the following four methods:

the first method is to capture the image data with a digital camera;

the second method is to capture the image data with a mobile phone;

the third method is to acquire the image data with a scanner;

the fourth method is to open an existing file containing image data, read the data in the file and decompress the image data with a standard algorithm.

Further, the step 120 of preprocessing the image includes the following steps:

step 210, rotating and correcting;

step 220, brightness equalization;

step 230, size normalization (setting various aspect ratios);

and step 240, binarization.
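Steps 220 and 240 can be illustrated with a minimal pure-Python sketch. The source does not say which equalization or binarization algorithm is used; histogram equalization and Otsu thresholding are assumptions chosen here only for illustration, and the names `equalize_histogram`, `otsu_threshold` and `binarize` are hypothetical.

```python
from typing import List

Gray = List[List[int]]  # rows of grey levels in 0..255


def _histogram(img: Gray) -> List[int]:
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    return hist


def equalize_histogram(img: Gray) -> Gray:
    """Assumed form of step 220: spread grey levels over 0..255 via the CDF."""
    hist = _histogram(img)
    total = sum(hist)
    cdf, running = [0] * 256, 0
    for level in range(256):
        running += hist[level]
        cdf[level] = running
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:  # flat image: nothing to equalize
        return [row[:] for row in img]
    scale = 255.0 / (total - cdf_min)
    return [[round((cdf[v] - cdf_min) * scale) for v in row] for row in img]


def otsu_threshold(img: Gray) -> int:
    """Return the grey level that maximizes the between-class variance."""
    hist = _histogram(img)
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(img: Gray) -> Gray:
    """Assumed form of step 240: dark pixels (ink, ruling lines) become 1."""
    t = otsu_threshold(img)
    return [[1 if v <= t else 0 for v in row] for row in img]
```

Marking dark pixels as foreground is a design choice made so that the ruling lines of a table are the objects on which the later opening operations act.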

Further, the step 130 of loading the network weight file includes the following steps:

step 310, loading network configuration;

step 320, load the weight file.

Further, the step 140 performs morphological processing on the image to obtain the table distribution of the current image, and includes the following steps:

step 410, performing an opening operation on the image processed in step 120 in the horizontal direction;

step 420, performing an opening operation on the image processed in step 120 in the vertical direction;

step 430, merging the results of step 410 and step 420.
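Steps 410 to 430 can be sketched as follows. Opening a binary image with a flat 1 x k horizontal structuring element keeps exactly the foreground pixels that lie in horizontal runs of at least k pixels, so the sketch implements the opening as a run-length filter. The element length and the function names are illustrative assumptions, not values from the source, and step 430 is taken here to be a pixel-wise OR.

```python
from typing import List

Binary = List[List[int]]  # 1 = foreground (ink), 0 = background


def open_horizontal(img: Binary, length: int) -> Binary:
    """Opening with a 1 x length element: keep horizontal runs >= length."""
    out = [[0] * len(row) for row in img]
    for y, row in enumerate(img):
        x = 0
        while x < len(row):
            if row[x] == 1:
                start = x
                while x < len(row) and row[x] == 1:
                    x += 1
                if x - start >= length:  # the element fits inside this run
                    for i in range(start, x):
                        out[y][i] = 1
            else:
                x += 1
    return out


def transpose(img: Binary) -> Binary:
    return [list(col) for col in zip(*img)]


def open_vertical(img: Binary, length: int) -> Binary:
    """Opening with a length x 1 element, done on the transposed image."""
    return transpose(open_horizontal(transpose(img), length))


def merge(a: Binary, b: Binary) -> Binary:
    """Pixel-wise OR of the horizontal and vertical results."""
    return [[pa | pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


page = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0],
    [1, 0, 1, 1, 0],
    [1, 0, 0, 0, 0],
]
grid = merge(open_horizontal(page, 4), open_vertical(page, 4))
```

In this toy page only the top row and the left column survive; the short strokes, which stand in for text, are removed because no run of four pixels fits inside them.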

Further, the step 150 performs semantic segmentation on the image to obtain the table distribution of the current image, and includes the following steps:

step 510, sending the image to a network for prediction, and analyzing to obtain line segment information (line segments in horizontal and vertical directions);

step 520, merging the information processed in step 510.
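The parsing and merging in steps 510 and 520 might look like the sketch below. The source does not describe the network or its output format, so the sketch assumes, purely for illustration, that the network returns two per-pixel probability maps (one for horizontal lines, one for vertical lines); the threshold, minimum length and all names are hypothetical.

```python
from typing import List, NamedTuple

ProbMap = List[List[float]]  # assumed network output: per-pixel line probability


class Segment(NamedTuple):
    x0: int
    y0: int
    x1: int
    y1: int


def horizontal_segments(prob: ProbMap, threshold: float = 0.5,
                        min_len: int = 2) -> List[Segment]:
    """Turn each row's runs of confident pixels into (x0, y, x1, y) segments."""
    segments = []
    for y, row in enumerate(prob):
        start = None
        for x, p in enumerate(row + [0.0]):  # sentinel closes a run at row end
            if p >= threshold and start is None:
                start = x
            elif p < threshold and start is not None:
                if x - start >= min_len:
                    segments.append(Segment(start, y, x - 1, y))
                start = None
    return segments


def vertical_segments(prob: ProbMap, threshold: float = 0.5,
                      min_len: int = 2) -> List[Segment]:
    """Same scan on the transposed map, with coordinates swapped back."""
    columns = [list(col) for col in zip(*prob)]
    return [Segment(s.y0, s.x0, s.y1, s.x1)
            for s in horizontal_segments(columns, threshold, min_len)]


def merge_segments(h: List[Segment], v: List[Segment]) -> List[Segment]:
    """Assumed form of step 520: one combined list of both directions."""
    return h + v


h_prob = [[0.9, 0.8, 0.7, 0.1],
          [0.0, 0.0, 0.0, 0.0],
          [0.2, 0.9, 0.1, 0.9]]
v_prob = [[0.9, 0.0, 0.0, 0.0],
          [0.8, 0.0, 0.0, 0.0],
          [0.7, 0.0, 0.0, 0.0]]
lines = merge_segments(horizontal_segments(h_prob), vertical_segments(v_prob))
```

Isolated confident pixels, such as those in the last row of `h_prob`, are dropped by `min_len`, which is one simple way to suppress spurious detections.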

Further, the step 160 corrects the table distribution information in steps 140 and 150, and includes the following steps:

step 610, aligning the images processed in steps 140 and 150;

and step 620, filtering out the oblique line segments processed in step 140, combining the images processed in step 610, and adjusting the extension of the line segments processed in step 150, so that horizontal line segments keep a height of a single pixel and vertical line segments keep a width of a single pixel.
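Two parts of step 620 can be sketched in pure Python: discarding oblique segments and reducing a line mask to single-pixel thickness. The angle tolerance, the choice of the middle row of each band and the function names are assumptions made for illustration; the source does not specify how the filtering or thinning is carried out.

```python
import math
from typing import List, Tuple

Segment = Tuple[int, int, int, int]  # (x0, y0, x1, y1)
Binary = List[List[int]]             # 1 = line pixel, 0 = background


def is_axis_aligned(seg: Segment, tolerance_deg: float = 2.0) -> bool:
    """True if the segment is within tolerance of horizontal or vertical."""
    x0, y0, x1, y1 = seg
    angle = abs(math.degrees(math.atan2(y1 - y0, x1 - x0))) % 180.0
    to_horizontal = min(angle, 180.0 - angle)
    to_vertical = abs(angle - 90.0)
    return min(to_horizontal, to_vertical) <= tolerance_deg


def filter_oblique(segments: List[Segment]) -> List[Segment]:
    return [s for s in segments if is_axis_aligned(s)]


def thin_horizontal(mask: Binary) -> Binary:
    """Collapse each vertical run of line pixels to its middle row."""
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for x in range(width):
        y = 0
        while y < height:
            if mask[y][x] == 1:
                start = y
                while y < height and mask[y][x] == 1:
                    y += 1
                out[(start + y - 1) // 2][x] = 1  # keep one pixel per run
            else:
                y += 1
    return out


def thin_vertical(mask: Binary) -> Binary:
    """Same operation on the transposed mask: single-pixel width."""
    transposed = [list(col) for col in zip(*mask)]
    return [list(col) for col in zip(*thin_horizontal(transposed))]


band = [[0, 0, 0, 0],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1],
        [0, 0, 0, 0]]
thin = thin_horizontal(band)
kept = filter_oblique([(0, 0, 100, 1), (0, 0, 100, 50), (5, 0, 5, 80)])
```

Here `thin_horizontal` is meant for a mask containing only horizontal lines and `thin_vertical` for a mask containing only vertical lines; applying either to a mixed mask would break up the lines of the other direction.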

Compared with the prior art, the invention has the beneficial effects that:

1. In the invention, horizontal and vertical line segments in the binary image are detected directly using image morphology, so the method is broadly applicable and fast to compute.

2. In the invention, the image table is obtained using the semantic segmentation network, so single-pixel straight lines are obtained directly, which reduces the later manual search for single-pixel-wide lines.

3. In the invention, combining the semantic segmentation network with morphology further improves detection accuracy.

Drawings

FIG. 1 is a flow chart of the table document extraction method of the present invention;

FIG. 2 is a flow chart of image pre-processing;

FIG. 3 is a flow chart of morphological processing of an image;

FIG. 4 is a flow chart of semantic segmentation of an image;

FIG. 5 is a flow chart of merging and correcting the results of FIGS. 3 and 4;

FIG. 6 is a schematic diagram of an input outpatient invoice from Yiwu hospital in Zhejiang province;

FIG. 7 is a schematic representation of the image of FIG. 6 after being processed by the table extraction method of the present invention;

FIG. 8 is a schematic illustration of a page of an input national medical device import registration document;

FIG. 9 is a schematic representation of the image of FIG. 8 after being processed by the table extraction method of the present invention;

FIG. 10 is a schematic illustration of an input Beijing medical hospitalization billing invoice;

FIG. 11 is a schematic representation of the image of FIG. 10 after being processed by the table extraction method of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention; all other embodiments obtained by a person of ordinary skill in the art without creative effort on the basis of these embodiments fall within the protection scope of the present invention.

Referring to fig. 1-11, the present invention provides a technical solution:

step 110, acquiring document image data;

step 120, preprocessing an image;

step 130, loading a network weight file;

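step 140, performing morphological processing on the image to obtain the table distribution of the current image;

step 150, performing semantic segmentation on the image to obtain the table distribution of the current image;

step 160, correcting the table distribution information in steps 140 and 150;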

step 170, end.

The specific implementation case is as follows:

Example 1:

as shown in fig. 1, the present invention provides a table document extraction method based on convolution feature extraction and morphological processing, the method includes the following steps:

step 110, acquiring document image data;

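the method for acquiring image data comprises one of the following four methods:

the first method is to capture the image data with a digital camera;

the second method is to capture the image data with a mobile phone;

the third method is to acquire the image data with a scanner;

the fourth method is to open an existing file containing image data, read the data in the file and decompress the image data with a standard algorithm.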

Step 120, preprocessing the image;

referring to fig. 2, step 120 specifically includes the following steps:

step 210, rotating and correcting;

step 220, brightness equalization;

step 230, size normalization (setting various aspect ratios);

and step 240, binarization.

Step 130, loading a network weight file, comprising the following steps:

step 310, loading network configuration;

step 320, load the weight file.

Step 140, performing morphological processing on the image to obtain table distribution of the current image;

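referring to fig. 3, step 140 includes the following steps:

step 410, performing an opening operation on the image processed in step 120 in the horizontal direction;

step 420, performing an opening operation on the image processed in step 120 in the vertical direction;

step 430, merging the results of step 410 and step 420.

Step 150, performing semantic segmentation on the image to obtain the table distribution of the current image;

referring to fig. 4, step 150 includes the following steps:

step 510, sending the image to the network for prediction, and parsing the output to obtain line segment information (line segments in the horizontal and vertical directions);

step 520, merging the information processed in step 510.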

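Step 160, correcting the table distribution information in steps 140 and 150;

referring to fig. 5, step 160 includes the following steps:

step 610, aligning the images processed in steps 140 and 150;

and step 620, filtering out the oblique line segments processed in step 140, combining the images processed in step 610, and adjusting the extension of the line segments processed in step 150, so that horizontal line segments keep a height of a single pixel and vertical line segments keep a width of a single pixel.

Step 170, end.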

In summary, the table document extraction method based on convolution feature extraction and morphological processing provided by the invention draws on techniques from image processing, target detection and deep learning. It uses image morphology to detect horizontal and vertical line segments directly in the binary image, which makes it broadly applicable and fast to compute, thereby improving the sorting accuracy. It uses a semantic segmentation network to obtain the image table as single-pixel straight lines directly, which reduces the later manual search for single-pixel-wide lines. Combining the semantic segmentation network with morphology further improves detection accuracy.

Example 2:

Fig. 6 shows a scanned outpatient invoice from a hospital in Zhejiang province (any private information in the figure has been erased), and fig. 7 shows the same invoice after processing with the method of the present invention.

Example 3:

Fig. 8 shows a page of a national medical device registration document scanned on a general-purpose computer (any private information in the figure has been erased), and fig. 9 shows the page after processing with the method of the present invention.

Example 4:

Fig. 10 shows a scanned medical hospitalization bill from Beijing (any private information in the figure has been erased), and fig. 11 shows the bill after processing with the method of the present invention.

Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims

1. A table document extraction method based on convolution feature extraction and morphological processing specifically comprises the following steps:

step 110, acquiring document image data;

step 120, preprocessing an image;

step 130, loading a network weight file;

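step 140, performing morphological processing on the image to obtain the table distribution of the current image;

step 150, performing semantic segmentation on the image to obtain the table distribution of the current image;

step 160, correcting the table distribution information in steps 140 and 150;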

step 170, end.

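2. The method of claim 1, wherein the method for acquiring the document image data in step 110 includes one of the following four methods:

the first method is to capture the image data with a digital camera;

the second method is to capture the image data with a mobile phone;

the third method is to acquire the image data with a scanner;

the fourth method is to open an existing file containing image data, read the data in the file and decompress the image data with a standard algorithm.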

3. The method of claim 1, wherein the step 120 of preprocessing the image includes the following steps:

step 210, rotating and correcting;

step 220, brightness equalization;

step 230, size normalization (setting various aspect ratios);

and step 240, binarization.

4. The method of claim 1, wherein the step 130 of loading the network weight file includes the following steps:

step 310, loading network configuration;

step 320, load the weight file.

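5. The method of claim 1, wherein the step 140 of performing morphological processing on the image to obtain the table distribution of the current image includes the following steps:

step 410, performing an opening operation on the image processed in step 120 in the horizontal direction;

step 420, performing an opening operation on the image processed in step 120 in the vertical direction;

step 430, merging the results of step 410 and step 420.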

6. The method of claim 1, wherein the step 150 of performing semantic segmentation on the image to obtain the table distribution of the current image includes the following steps:

step 510, sending the image to the network for prediction, and parsing the output to obtain line segment information (line segments in the horizontal and vertical directions);

step 520, merging the information processed in step 510.

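7. The method of claim 1, wherein the step 160 of correcting the table distribution information in steps 140 and 150 includes the following steps:

step 610, aligning the images processed in steps 140 and 150;

and step 620, filtering out the oblique line segments processed in step 140, combining the images processed in step 610, and adjusting the extension of the line segments processed in step 150, so that horizontal line segments keep a height of a single pixel and vertical line segments keep a width of a single pixel.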