CN111881883A - Form document extraction method based on convolution feature extraction and morphological processing - Google Patents

Form document extraction method based on convolution feature extraction and morphological processing

Publication number
CN111881883A
Authority
CN
China
Prior art keywords
image
steps
following
image data
processed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010792746.6A
Other languages
Chinese (zh)
Inventor
李进文
罗宝娟
严京旗
周审章
卞志强
张成栋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingpu Shanghai Artificial Intelligence Technology Co Ltd
Original Assignee
Jingpu Shanghai Artificial Intelligence Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-08-10
Filing date
2020-08-10
Publication date
2020-11-03
Application filed by Jingpu Shanghai Artificial Intelligence Technology Co Ltd
Priority to CN202010792746.6A
Publication of CN111881883A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/413 Classification of content, e.g. text, photographs or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/24 Aligning, centring, orientation detection or correction of the image
    • G06V10/243 Aligning, centring, orientation detection or correction of the image by compensating for image skew or non-uniform image deformations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of target detection and image segmentation, and in particular to a table document extraction method based on convolution feature extraction and morphological processing, which specifically comprises the following steps: step 110, acquiring document image data; step 120, preprocessing the image; step 130, loading a network weight file; step 140, performing morphological processing on the image to obtain the table distribution of the current image; step 150, performing semantic segmentation on the image to obtain the table distribution of the current image; step 160, correcting the table distribution information obtained in steps 140 and 150; step 170, end. By design, the method uses image morphology to directly detect horizontal and vertical line segments in the binary image, which makes it broadly applicable and fast to compute; the table in the image is also obtained with a semantic segmentation network, which directly yields single-pixel-wide straight lines and reduces the later manual work of searching for single-pixel-width lines; and combining the semantic segmentation network with morphology further improves detection accuracy.

Description

Form document extraction method based on convolution feature extraction and morphological processing
Technical Field
The invention relates to the technical field of target detection and image segmentation, in particular to a table document extraction method based on convolution feature extraction and morphological processing.
Background
Typically, form documents such as medical device registration certificates, customs reports, annual reports of listed companies, audit reports and electronic receipts are published as public PDF electronic documents. These PDF documents contain rich information and consist of tables, text and pictures, so searching them for target information takes considerable manpower and time. Two kinds of table extraction technology are currently in use. The first searches for horizontal and vertical line segments using a traditional image binarization method. The second uses a convolutional neural network (CNN, consisting of convolutional layers, pooling layers and fully connected layers, with parameters trained by gradient descent), which extracts table features from the image through multiple convolution filters and performs image segmentation to extract the table. The binarization method suffers from false detections caused by noise and from the difficulty of choosing the threshold and window size; CNN-based feature extraction and image segmentation has difficulty judging line segment continuity and is limited in generalization.
In summary, the present invention addresses these problems by providing a table document extraction method based on convolution feature extraction and morphological processing.
Disclosure of Invention
The invention aims to provide a table document extraction method based on convolution feature extraction and morphological processing, which quickly extracts the position information of tables from a document image by combining image morphology with semantic segmentation by a convolutional neural network, so that the document content can subsequently be processed into structured form.
In order to achieve the purpose, the invention provides the following technical scheme:
a table document extraction method based on convolution feature extraction and morphological processing specifically comprises the following steps:
step 110, acquiring document image data;
step 120, preprocessing an image;
step 130, loading a network weight file;
step 140, performing morphological processing on the image to obtain the table distribution of the current image;
step 150, performing semantic segmentation on the image to obtain table distribution of the current image;
step 160, correcting the table distribution information in steps 140 and 150;
step 170, end.
Further, the document image data in step 110 is acquired by one of the following four methods:
the first method is to photograph the document with a digital camera;
the second method is to photograph the document with a mobile phone;
the third method is to scan the document with a scanner;
the fourth method is to open an existing file containing image data, read the data in the file, and decompress the image data according to a standard algorithm.
Further, the step 120 of preprocessing the image includes the following steps:
step 210, rotation correction;
step 220, brightness equalization;
step 230, size normalization (setting various aspect ratios);
and step 240, binarization.
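The text does not say which binarization method step 240 uses. As one possible illustration, the following minimal sketch applies Otsu's global threshold in plain Python. The function names, the list-of-rows image representation, and the convention that dark ink maps to 1 are all assumptions made for this example, not details taken from the patent.

```python
def otsu_threshold(gray):
    """Return the threshold in 0..255 that maximises between-class variance.

    `gray` is a list of rows of integer intensities in 0..255; pixels with
    value <= threshold form the dark class.
    """
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    dark_count = 0   # pixels at or below the candidate threshold
    dark_sum = 0     # sum of their intensities
    for t in range(256):
        dark_count += hist[t]
        dark_sum += t * hist[t]
        light_count = total - dark_count
        if dark_count == 0:
            continue      # no dark class yet
        if light_count == 0:
            break         # every pixel is already in the dark class
        mean_dark = dark_sum / dark_count
        mean_light = (sum_all - dark_sum) / light_count
        between_var = dark_count * light_count * (mean_dark - mean_light) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(gray, threshold=None):
    """Map dark (ink) pixels to 1 and light (paper) pixels to 0 (assumed convention)."""
    if threshold is None:
        threshold = otsu_threshold(gray)
    return [[1 if value <= threshold else 0 for value in row] for row in gray]


page = [
    [250, 250, 250, 250],
    [20, 20, 20, 20],      # a dark ruling line
    [250, 250, 30, 250],   # one dark text pixel
]
binary = binarize(page)
```

A global threshold is only one choice; unevenly lit photographs may call for a locally adaptive threshold instead, which is where the window-size problem mentioned in the Background arises.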
Further, the step 130 of loading the network weight file includes the following steps:
step 310, loading network configuration;
step 320, load the weight file.
Further, the step 140 performs morphological processing on the image to obtain the table distribution of the current image, and includes the following steps:
step 410, performing an opening operation on the image processed in step 120 in the horizontal direction;
step 420, performing an opening operation on the image processed in step 120 in the vertical direction;
step 430, merging the results of step 410 and step 420.
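Steps 410 to 430 can be sketched as follows, assuming a binary image stored as a list of rows with line pixels equal to 1. An opening (erosion followed by dilation) with a 1 x k horizontal structuring element keeps only horizontal runs at least k pixels long; the vertical case is handled by transposing. The structuring-element length k and all function names are assumptions for illustration; the patent does not specify them.

```python
def erode_row(row, k):
    """Mark position i when the k-pixel window starting at i is entirely foreground."""
    n = len(row)
    return [1 if i + k <= n and all(row[i:i + k]) else 0 for i in range(n)]


def dilate_row(row, k):
    """Paint the k-pixel window starting at every marked position."""
    n = len(row)
    out = [0] * n
    for i, value in enumerate(row):
        if value:
            for j in range(i, min(i + k, n)):
                out[j] = 1
    return out


def open_horizontal(img, k):
    """Opening with a 1 x k structuring element (step 410)."""
    return [dilate_row(erode_row(row, k), k) for row in img]


def transpose(img):
    return [list(column) for column in zip(*img)]


def open_vertical(img, k):
    """Opening with a k x 1 structuring element (step 420)."""
    return transpose(open_horizontal(transpose(img), k))


def extract_table_lines(img, k):
    """Union of the horizontal and vertical openings (step 430)."""
    horizontal = open_horizontal(img, k)
    vertical = open_vertical(img, k)
    return [[h | v for h, v in zip(row_h, row_v)]
            for row_h, row_v in zip(horizontal, vertical)]


image = [
    [0, 0, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],   # horizontal ruling line
    [0, 0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 1, 1],   # two stray text pixels at the right
    [0, 0, 1, 0, 0, 0, 0],
]
lines = extract_table_lines(image, 4)
```

In this example the two stray pixels are shorter than k in both directions and are removed, while the full-width row and the full-height column survive. Choosing k is a trade-off: too small keeps text strokes, too large drops short table borders.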
Further, the step 150 performs semantic segmentation on the image to obtain the table distribution of the current image, and includes the following steps:
step 510, feeding the image to the network for prediction and parsing the output to obtain line segment information (line segments in the horizontal and vertical directions);
step 520, merging the information processed in step 510.
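The patent does not describe the network or the format of its output. The sketch below therefore starts from a hypothetical per-pixel probability map (a list of rows of floats) and shows one way the parsing in step 510 and the merging in step 520 could work: runs of confident pixels become segments, and segments on the same row separated by a small gap are joined. Reading "merging" as gap-joining is an interpretation, and the threshold, minimum length and gap values are illustrative assumptions.

```python
def horizontal_segments(prob, threshold=0.5, min_len=3):
    """Return (row, x_start, x_end) for each run of confident pixels; x_end is inclusive."""
    segments = []
    for y, row in enumerate(prob):
        start = None
        # A trailing 0.0 sentinel closes any run that reaches the end of the row.
        for x, p in enumerate(list(row) + [0.0]):
            if p >= threshold and start is None:
                start = x
            elif p < threshold and start is not None:
                if x - start >= min_len:
                    segments.append((y, start, x - 1))
                start = None
    return segments


def vertical_segments(prob, threshold=0.5, min_len=3):
    """Same as horizontal_segments on the transposed map; returns (column, y_start, y_end)."""
    transposed = [list(column) for column in zip(*prob)]
    return horizontal_segments(transposed, threshold, min_len)


def merge_collinear(segments, max_gap=2):
    """Join segments on the same row or column whose gap is at most max_gap pixels."""
    merged = []
    for line, start, end in sorted(segments):
        if merged and merged[-1][0] == line and start - merged[-1][2] - 1 <= max_gap:
            previous = merged[-1]
            merged[-1] = (line, previous[1], max(previous[2], end))
        else:
            merged.append((line, start, end))
    return merged


prediction = [
    [0.9, 0.8, 0.9, 0.1, 0.2, 0.9, 0.9, 0.9, 0.7, 0.0],   # one line broken by a 2-pixel gap
    [0.1] * 10,
]
raw = horizontal_segments(prediction)
joined = merge_collinear(raw)
```

Here the two fragments on row 0 are joined into a single segment spanning columns 0 to 8, which is one way to address the line-continuity problem the Background attributes to CNN-based segmentation.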
Further, the step 160 corrects the table distribution information in the steps 140 and 150, and includes the following steps:
step 610, aligning the images processed in the steps 140 and 150;
and step 620, filtering out the oblique line segments produced in step 140, combining the images aligned in step 610, adjusting the extension of the line segments obtained in step 150, and keeping horizontal line segments a single pixel high and vertical line segments a single pixel wide.
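Part of step 620 can be illustrated with a short sketch, assuming line segments are represented by their endpoints (x0, y0, x1, y1). Segments whose angle is not close to 0 or 90 degrees are treated as oblique and dropped; the rest are snapped so that a horizontal segment occupies a single row and a vertical segment a single column. The endpoint representation, the 2-degree tolerance and the function names are assumptions, and the extension adjustment mentioned in the step is not covered here.

```python
import math


def classify(segment, tol_deg=2.0):
    """Label a segment 'horizontal', 'vertical' or 'oblique' by its angle."""
    x0, y0, x1, y1 = segment
    angle = abs(math.degrees(math.atan2(y1 - y0, x1 - x0))) % 180.0
    if angle <= tol_deg or angle >= 180.0 - tol_deg:
        return "horizontal"
    if abs(angle - 90.0) <= tol_deg:
        return "vertical"
    return "oblique"


def snap(segment, kind):
    """Collapse a near-axis-aligned segment onto a single row or column."""
    x0, y0, x1, y1 = segment
    if kind == "horizontal":
        y = (y0 + y1) // 2          # single-pixel height
        return (x0, y, x1, y)
    x = (x0 + x1) // 2              # single-pixel width
    return (x, y0, x, y1)


def correct_segments(segments, tol_deg=2.0):
    """Drop oblique segments and snap the rest to the pixel grid axes."""
    corrected = []
    for segment in segments:
        kind = classify(segment, tol_deg)
        if kind != "oblique":
            corrected.append(snap(segment, kind))
    return corrected


detected = [
    (0, 10, 100, 11),   # almost horizontal, one pixel of drift
    (50, 0, 50, 80),    # vertical
    (0, 0, 40, 40),     # 45-degree stroke, e.g. a slash or signature
]
result = correct_segments(detected)
```

The tolerance should stay small if step 210 has already corrected page rotation; a larger value would let slanted strokes from text or stamps through.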
Compared with the prior art, the invention has the beneficial effects that:
1. In the invention, horizontal and vertical line segments in the binary image are detected directly using image morphology, so the method is broadly applicable and has a short computation time.
2. In the invention, the table in the image is obtained using the semantic segmentation network, so single-pixel-wide straight lines are obtained directly, reducing the later manual work of searching for single-pixel-width lines.
3. In the invention, combining the semantic segmentation network with morphology further improves detection accuracy.
Drawings
FIG. 1 is a flow chart of a document form extraction method of the present invention;
FIG. 2 is a flow chart of image pre-processing;
FIG. 3 is a flow chart of morphological processing of an image;
FIG. 4 is a flow chart of semantic segmentation of an image;
FIG. 5 is a flow chart of merging and correcting the results of FIGS. 3 and 4;
FIG. 6 is a schematic diagram of an input outpatient invoice from a hospital in Yiwu, Zhejiang Province;
FIG. 7 is a schematic representation of the image of FIG. 6 after being processed by the table extraction method of the present invention;
FIG. 8 is a schematic illustration of a page of an input national medical device import registration document;
FIG. 9 is a schematic representation of the image of FIG. 8 after being processed by the table extraction method of the present invention;
FIG. 10 is a schematic illustration of an input Beijing medical inpatient billing invoice;
FIG. 11 is a schematic representation of the image of FIG. 10 after being processed by the table extraction method of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention; all other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
Referring to fig. 1-11, the present invention provides a technical solution:
a table document extraction method based on convolution feature extraction and morphological processing specifically comprises the following steps:
step 110, acquiring document image data;
step 120, preprocessing an image;
step 130, loading a network weight file;
step 140, performing morphological processing on the image to obtain the table distribution of the current image;
step 150, performing semantic segmentation on the image to obtain table distribution of the current image;
step 160, correcting the table distribution information in steps 140 and 150;
step 170, end.
Specific embodiments are as follows:
Example 1:
As shown in FIG. 1, the present invention provides a table document extraction method based on convolution feature extraction and morphological processing, the method including the following steps:
step 110, acquiring document image data;
the image data is acquired by one of the following four methods:
the first method is to photograph the document with a digital camera;
the second method is to photograph the document with a mobile phone;
the third method is to scan the document with a scanner;
the fourth method is to open an existing file containing image data, read the data in the file, and decompress the image data according to a standard algorithm.
Step 120, preprocessing the image;
referring to fig. 2, step 120 specifically includes the following steps:
step 210, rotation correction;
step 220, brightness equalization;
step 230, size normalization (setting various aspect ratios);
and step 240, binarization.
Step 130, loading a network weight file, comprising the following steps:
step 310, loading network configuration;
step 320, load the weight file.
Step 140, performing morphological processing on the image to obtain table distribution of the current image;
referring to fig. 3, the method includes the following steps:
step 410, performing an opening operation on the image processed in step 120 in the horizontal direction;
step 420, performing an opening operation on the image processed in step 120 in the vertical direction;
step 430, merging the results of step 410 and step 420.
Step 150, performing semantic segmentation on the image to obtain table distribution of the current image;
referring to fig. 4, the method includes the following steps:
step 510, feeding the image to the network for prediction and parsing the output to obtain line segment information (line segments in the horizontal and vertical directions);
step 520, merging the information processed in step 510.
Step 160, correcting the table distribution information from steps 140 and 150;
referring to fig. 5, the method includes the following steps:
step 610, aligning the images processed in the steps 140 and 150;
and step 620, filtering out the oblique line segments produced in step 140, combining the images aligned in step 610, adjusting the extension of the line segments obtained in step 150, and keeping horizontal line segments a single pixel high and vertical line segments a single pixel wide.
Step 170, end.
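The alignment in step 610 is not detailed in the text. One plausible reading is that the segmentation output and the morphological output may have different resolutions (for example because of the size normalization in step 230) and must be brought onto the same pixel grid before they are combined. The following minimal sketch makes that assumption, using nearest-neighbour resampling and a pixel-wise union; both choices and all names are illustrative only.

```python
def resize_nearest(mask, out_h, out_w):
    """Resample a binary mask to out_h x out_w by nearest-neighbour lookup."""
    in_h, in_w = len(mask), len(mask[0])
    return [
        [mask[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def combine(morph_mask, seg_mask):
    """Align seg_mask to the grid of morph_mask and return the pixel-wise union."""
    height, width = len(morph_mask), len(morph_mask[0])
    aligned = resize_nearest(seg_mask, height, width)
    return [[a | b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(morph_mask, aligned)]


morph = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
seg = [
    [1, 1],
    [0, 1],
]
merged = combine(morph, seg)
```

Upsampling thickens lines (each pixel of the 2 x 2 mask becomes a 2 x 2 block here), which is one reason a later step that restores single-pixel height and width, as step 620 describes, is useful.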
In summary, the table document extraction method based on convolution feature extraction and morphological processing provided by the invention draws on techniques from image processing, target detection, deep learning and related fields. It uses image morphology to directly detect horizontal and vertical line segments in the binary image, which is broadly applicable and fast to compute, thereby improving sorting accuracy; it uses a semantic segmentation network to obtain the table in the image, which directly yields single-pixel-wide straight lines and reduces the later manual work of searching for single-pixel-width lines; and combining the semantic segmentation network with morphology further improves detection accuracy.
Example 2:
FIG. 6 shows a scanned outpatient invoice from a hospital in Yiwu, Zhejiang Province (any private information in the figure has been erased), and FIG. 7 shows the invoice after the method of the present invention is applied.
Example 3:
FIG. 8 shows a page of a scanned national medical device import registration document (any private information in the figure has been erased), and FIG. 9 shows the page after the method of the present invention is applied.
Example 4:
FIG. 10 shows a scanned Beijing medical inpatient billing invoice (any private information in the figure has been erased), and FIG. 11 shows the invoice after the method of the present invention is applied.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (7)

1. A table document extraction method based on convolution feature extraction and morphological processing specifically comprises the following steps:
step 110, acquiring document image data;
step 120, preprocessing an image;
step 130, loading a network weight file;
step 140, performing morphological processing on the image to obtain the table distribution of the current image;
step 150, performing semantic segmentation on the image to obtain table distribution of the current image;
step 160, correcting the table distribution information in steps 140 and 150;
step 170, end.
2. The method of claim 1, wherein the document image data in step 110 is acquired by one of the following four methods:
the first method is to photograph the document with a digital camera;
the second method is to photograph the document with a mobile phone;
the third method is to scan the document with a scanner;
the fourth method is to open an existing file containing image data, read the data in the file, and decompress the image data according to a standard algorithm.
3. The method of claim 1, wherein preprocessing the image in step 120 includes the following steps:
step 210, rotation correction;
step 220, brightness equalization;
step 230, size normalization (setting various aspect ratios);
and step 240, binarization.
4. The method of claim 1, wherein loading the network weight file in step 130 includes the following steps:
step 310, loading the network configuration;
step 320, loading the weight file.
5. The method of claim 1, wherein performing morphological processing on the image in step 140 to obtain the table distribution of the current image includes the following steps:
step 410, performing an opening operation on the image processed in step 120 in the horizontal direction;
step 420, performing an opening operation on the image processed in step 120 in the vertical direction;
step 430, merging the results of step 410 and step 420.
6. The method of claim 1, wherein performing semantic segmentation on the image in step 150 to obtain the table distribution of the current image includes the following steps:
step 510, feeding the image to the network for prediction and parsing the output to obtain line segment information (line segments in the horizontal and vertical directions);
step 520, merging the information processed in step 510.
7. The method of claim 1, wherein correcting the table distribution information from steps 140 and 150 in step 160 includes the following steps:
step 610, aligning the images processed in steps 140 and 150;
and step 620, filtering out the oblique line segments produced in step 140, combining the images aligned in step 610, adjusting the extension of the line segments obtained in step 150, and keeping horizontal line segments a single pixel high and vertical line segments a single pixel wide.
CN202010792746.6A 2020-08-10 2020-08-10 Form document extraction method based on convolution feature extraction and morphological processing Pending CN111881883A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010792746.6A CN111881883A (en) 2020-08-10 2020-08-10 Form document extraction method based on convolution feature extraction and morphological processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010792746.6A CN111881883A (en) 2020-08-10 2020-08-10 Form document extraction method based on convolution feature extraction and morphological processing

Publications (1)

Publication Number Publication Date
CN111881883A true CN111881883A (en) 2020-11-03

Family

ID=73211362

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010792746.6A Pending CN111881883A (en) 2020-08-10 2020-08-10 Form document extraction method based on convolution feature extraction and morphological processing

Country Status (1)

Country Link
CN (1) CN111881883A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109086714A (en) * 2018-07-31 2018-12-25 国科赛思(北京)科技有限公司 Table recognition method, identifying system and computer installation
CN110032989A (en) * 2019-04-23 2019-07-19 福州大学 A kind of form document image classification method based on wire feature and pixel distribution
CN110033471A (en) * 2019-04-19 2019-07-19 福州大学 A kind of wire detection method based on connected domain analysis and morphological operation
CN110136154A (en) * 2019-05-16 2019-08-16 西安电子科技大学 Remote sensing images semantic segmentation method based on full convolutional network and Morphological scale-space
CN110163198A (en) * 2018-09-27 2019-08-23 腾讯科技(深圳)有限公司 A kind of Table recognition method for reconstructing, device and storage medium
US20190303663A1 (en) * 2018-03-30 2019-10-03 Wipro Limited Method and system for detecting and extracting a tabular data from a document
CN110399875A (en) * 2019-07-31 2019-11-01 山东浪潮人工智能研究院有限公司 A kind of form of general use information extracting method based on deep learning and pixel projection
WO2020140698A1 (en) * 2019-01-04 2020-07-09 阿里巴巴集团控股有限公司 Table data acquisition method and apparatus, and server
CN111460921A (en) * 2020-03-13 2020-07-28 华南理工大学 Lane line detection method based on multitask semantic segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20201103)