CN109543525B - Table extraction method for general table image - Google Patents


Info

Publication number
CN109543525B
CN109543525B
Authority
CN
China
Prior art keywords
image, array, weight, intersection point, dimensional array
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811217691.5A
Other languages
Chinese (zh)
Other versions
CN109543525A (en)
Inventor
边赟
李天易
罗嘉礼
巫浩
李腾飞
倪浩原
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Zhongke Information Technology Co ltd
Original Assignee
Chengdu Zhongke Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Zhongke Information Technology Co ltd filed Critical Chengdu Zhongke Information Technology Co ltd
Priority to CN201811217691.5A priority Critical patent/CN109543525B/en
Publication of CN109543525A publication Critical patent/CN109543525A/en
Application granted granted Critical
Publication of CN109543525B publication Critical patent/CN109543525B/en
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40: Document-oriented image-based pattern recognition
    • G06V30/41: Analysis of document content
    • G06V30/412: Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • G06V30/413: Classification of content, e.g. text, photographs or tables
    • G06V30/414: Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the field of office automation and discloses a table extraction method for a general table image, comprising the following steps: S1, preprocess the general table image; S2, filter the characters out of the preprocessed image using morphological operations; S3, perform a reconstruction operation on the filtered image; S4, redraw the table to complete table extraction. The invention solves the problems of slow and inaccurate table extraction in the prior art.

Description

Table extraction method for general table image
Technical Field
The invention belongs to the field of office automation, and particularly relates to a table extraction method for a general table image.
Background
A form, also called a table, is both a mode of visual communication and a means of organizing and arranging data. People widely use tables of different shapes and styles in communication, scientific research, and data analysis. Tables of various kinds appear in print media, handwritten records, computer software, architectural decoration, traffic signs, and many other places. The exact conventions and terms used to describe tables vary with context. In addition, tables may differ in kind, structure, flexibility, labeling, expression, and use. In books and technical articles, tables are usually placed in a floating area with a number and a title, distinguishing them from the body of the text.
As a highly refined and concentrated form of information presentation, forms are widely used across industries. With the popularization of computers and growing enterprise informatization, more and more forms are produced on computers. In practice, because industries and application fields differ, the contents and formats of forms vary greatly, and a few fixed form styles cannot satisfy the many application requirements. Meanwhile, people increasingly use electronic documents instead of paper ones; for example, a user can photograph a paper document with a smartphone and send the image to others to transmit the information, but electronic documents obtained by photographing or scanning are stored in a picture format.
In summary, the diversity of tables and the storage of table images in picture formats make tables difficult to extract.
Disclosure of Invention
In order to solve the above problems in the prior art, the present invention aims to provide a fast and accurate table extraction method for a general table image that extracts the table from the table image, improving the efficiency of document management in office automation and overcoming the slow speed and low accuracy of table extraction in the prior art.
The technical scheme adopted by the invention is as follows:
A table extraction method for a general table image comprises the following steps:
S1: preprocess the general table image;
S2: filter the characters out of the preprocessed image using morphological operations;
S3: perform a reconstruction operation on the filtered image, comprising the following steps:
S3-1: perform open-operation image reconstruction, comprising the following steps:
A-1: initialize the input filtered image, i.e., the opened image, as the original marker image;
A-2: create a second structuring element;
A-3: perform the reconstruction operation using the previous marker image and the second structuring element, according to:
h_{k+1} = (h_k ⊕ B) ∩ G
where h_{k+1} is the current marker image; h_k is the previous marker image; k is the iteration counter; B is the second structuring element; ⊕ is the dilation operator; G is the mask image; ∩ denotes intersection;
A-4: judge whether the current marker image equals the previous marker image; if so, reconstruction is complete and the current marker image is output; otherwise, update the iteration counter and return to step A-3;
S3-2: extract the line feature coordinates of the obtained marker image, comprising the following steps:
B-1: represent the binary image as a two-dimensional array;
B-2: take the first element of the two-dimensional array as the projection origin, project the other elements in the horizontal and vertical directions, and obtain the accumulated value for each projection direction;
B-3: traverse the accumulated values and screen out all feature coordinates meeting the condition, namely those whose accumulated values exceed the minimum spacing between coordinate lines in the two projection directions;
B-4: further analyze the feature coordinates, and when more than one feature coordinate meets the condition, extract the middle value;
B-5: traverse the remaining elements of the two-dimensional array, repeating steps B-2 to B-4, to obtain the feature coordinate array;
S3-3: classify the image intersection points according to the feature coordinate array, and compute the weight array of the table in the marker image, comprising the following steps:
C-1: divide the table into different types according to how the table's ruling-line segments connect to the grid points;
C-2: traverse each intersection point according to the obtained feature coordinate array, i.e., the horizontal- and vertical-line coordinate arrays;
the intersection points are the crossings of the horizontal and vertical lines in the table;
C-3: taking the current intersection point as the starting point, extract image blocks in each of four directions in turn and judge whether a straight line is present; output 1 if a line is present and 0 otherwise;
the four directions are: transverse positive, transverse negative, longitudinal positive, and longitudinal negative;
C-4: classify the intersection points according to the output results, store the results in the corresponding pre-built three-dimensional array, and modify and check all entries of the three-dimensional array according to the modification rule;
the modification rule is: if a transverse-positive line exists at an intersection point, then a transverse-negative line must exist at the intersection point to its right, and likewise for the other directions;
C-5: obtain the weight sum of each intersection point from the weights of the corresponding directions in the three-dimensional array, and store it in the weight array, according to:
p(x,y) = q1*p(x,y,0) + q2*p(x,y,1) + q3*p(x,y,2) + q4*p(x,y,3)
where p(x,y) is the weight array; q1*p(x,y,0) is the product of the transverse-positive weight and the corresponding three-dimensional array entry; q2*p(x,y,1) is the product of the transverse-negative weight and the corresponding entry; q3*p(x,y,2) is the product of the longitudinal-positive weight and the corresponding entry; q4*p(x,y,3) is the product of the longitudinal-negative weight and the corresponding entry;
S4: redraw the table to complete table extraction;
traverse each row and each column according to the intersection-point types and the weight array, find the corresponding start and end endpoint coordinates, connect them, and redraw the table, thereby extracting the table.
Further, in step S1, the preprocessing comprises reduction, grayscale conversion, and binarization, performed in that order.
Further, in step S1, the preprocessing is computed as follows:
the grayscale conversion uses a weighted method:
F(i,j)=0.3R(i,j)+0.59G(i,j)+0.11B(i,j)
where F(i,j) is the gray value of the pixel; R(i,j), G(i,j), and B(i,j) are the red, green, and blue luminance values of the pixel; (i,j) are the pixel coordinates;
the binarization is computed as follows:
g(x,y) = 1 if f(x,y) ≥ T, and g(x,y) = 0 if f(x,y) < T
where g(x,y) is the binarized image; f(x,y) is the input grayscale image; T is the threshold.
Further, the gray value maximizing the between-class variance is used as the optimal threshold for binarization:
T* = arg max[g(t)]
where T* is the optimal threshold; g(t) is the between-class variance; t is the candidate gray value, t ∈ {0, ..., M−1}, where M is the number of gray levels in the image; arg max(·) selects the argument that maximizes.
Further, in step S2, the morphological operation performed on the preprocessed image, i.e., the binarized image, is an opening operation, i.e., an erosion followed by a dilation, computed as:
A ∘ SE = (A ⊖ SE) ⊕ SE
where ∘ is the opening operator; SE is the first structuring element; A is the input binary image; ⊕ is the dilation operator; ⊖ is the erosion operator.
The invention has the beneficial effects that:
1) the method is based on digital image processing technology and extracts the tables in general table images, improving the speed and practicality of the extraction method;
2) the method filters the characters in the general table image using morphological operations; the opening operation removes the characters together with fine noise, denoising the target, smoothing the contour of the target object, and eliminating fine protrusions, thereby improving the accuracy of table extraction;
3) the invention improves the efficiency of document management in office automation and solves the problems of low speed and low accuracy of table extraction in the prior art.
Drawings
FIG. 1 is a flow diagram of a table extraction method for a generic table image;
FIG. 2 is a flow chart of a method of a reconstruction operation;
FIG. 3 is a flow chart of a method of performing open-operation image reconstruction;
FIG. 4 is a flow chart of a method of extracting line feature coordinates of an acquired marker image;
FIG. 5 is a flowchart of a method of classifying image intersections and calculating a weight array for tables in a labeled image;
FIG. 6 is a generic form image;
FIG. 7 is a filtered image;
FIG. 8 is a redrawn table.
Detailed Description
The invention is further explained below with reference to the drawings and the specific embodiments.
Example 1:
as shown in fig. 1, the present embodiment provides a table extraction method for a general table image, comprising the following steps:
S1: preprocess the general table image shown in fig. 6, where the preprocessing comprises reduction, grayscale conversion, and binarization, performed in that order;
the grayscale conversion uses a weighted method, computed as:
F(i,j)=0.3R(i,j)+0.59G(i,j)+0.11B(i,j)
where F(i,j) is the gray value of the pixel; R(i,j), G(i,j), and B(i,j) are the red, green, and blue luminance values of the pixel; (i,j) are the pixel coordinates;
the binarization is computed as follows:
g(x,y) = 1 if f(x,y) ≥ T, and g(x,y) = 0 if f(x,y) < T
where g(x,y) is the binarized image; f(x,y) is the input grayscale image; T is the threshold;
the gray value maximizing the between-class variance is used as the optimal threshold for binarization:
T* = arg max[g(t)]
where T* is the optimal threshold; g(t) is the between-class variance; t is the candidate gray value, t ∈ {0, ..., M−1}, where M is the number of gray levels in the image, and arg max(·) selects the argument that maximizes;
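For illustration only, step S1 can be sketched in Python with OpenCV as below. The function name preprocess and the reduction scale of 0.5 are assumptions for the example; cv2.threshold with the THRESH_OTSU flag performs the between-class-variance threshold selection described above.

import cv2
import numpy as np

def preprocess(image_path, scale=0.5):
    # Reduction, grayscale conversion, and binarization, in that order (step S1).
    img = cv2.imread(image_path)                     # BGR color image
    img = cv2.resize(img, None, fx=scale, fy=scale)  # reduction (scale is an assumed value)
    b, g, r = cv2.split(img.astype(np.float64))
    gray = (0.3 * r + 0.59 * g + 0.11 * b).astype(np.uint8)  # F = 0.3R + 0.59G + 0.11B
    # Otsu's method: choose the threshold T* maximizing the between-class variance
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary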
S2: filter the characters out of the preprocessed image using morphological operations;
the morphological operation performed on the preprocessed image, i.e., the binarized image, is an opening operation, i.e., an erosion followed by a dilation;
Dilation is "domain expansion": the highlighted regions of the image grow, so the result has a larger highlighted area than the original. Mathematically, dilation is defined as a set operation; the dilation of A by SE is defined as:
A ⊕ SE = {z | (SE)_z ∩ A ≠ ∅}
that is, the dilation of A by SE is the set of points z at which the structuring element, translated to z, overlaps A in at least one pixel. Dilation merges background points adjacent to the target region into the object and fills small holes inside objects in the image;
Erosion is "domain contraction": the highlighted regions of the original image shrink, so the result has a smaller highlighted area than the original. Mathematically, both dilation and erosion can be viewed as a convolution of the image (or part of it) with a kernel; the erosion of A by SE is defined as:
A ⊖ SE = {z | (SE)_z ⊆ A}
that is, the erosion of A by SE is the set of points z at which the structuring element, translated to z, is entirely contained in A. Erosion removes burrs and protrusions in the image that are smaller than the structuring element;
In practice, when processing a binary image, dilation and erosion are combined: the opening operation first erodes A by SE and then dilates the result. The opening of A by SE is computed as:
A ∘ SE = (A ⊖ SE) ⊕ SE
where ∘ is the opening operator; SE is the first structuring element; A is the input binary image; ⊕ is the dilation operator; ⊖ is the erosion operator.
The opening operation eliminates fine noise, thereby denoising the target, smoothing the contour of the target object, and eliminating fine protrusions;
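As a non-limiting sketch of step S2 in Python with OpenCV: opening with long, thin structuring elements is one common way to keep the ruling lines while removing characters. The kernel lengths and the function name filter_characters are illustrative assumptions; the binarized image is inverted first so that the lines become foreground.

import cv2

def filter_characters(binary):
    # Invert so table lines and characters become white foreground pixels.
    inverted = cv2.bitwise_not(binary)
    # Opening = erosion then dilation (A ∘ SE = (A ⊖ SE) ⊕ SE).
    # A wide, 1-pixel-high kernel preserves horizontal lines and erodes characters away.
    h_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (40, 1))
    h_lines = cv2.morphologyEx(inverted, cv2.MORPH_OPEN, h_kernel)
    # A tall, 1-pixel-wide kernel preserves vertical lines.
    v_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 40))
    v_lines = cv2.morphologyEx(inverted, cv2.MORPH_OPEN, v_kernel)
    return cv2.bitwise_or(h_lines, v_lines)  # characters filtered out, lines kept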
s3: performing a reconstruction operation on the filtered image as shown in fig. 7;
the reconstruction operation, as shown in fig. 2, includes the following steps:
S3-1: perform open-operation image reconstruction. This is a morphological transformation involving two images and a structuring element: one image is the marker image, the starting point of the transformation, and the other is the mask image, which constrains the transformation. As shown in fig. 3, obtaining the marker image by the reconstruction operation comprises the following steps:
A-1: initialize the input filtered image, i.e., the opened image, as the original marker image;
A-2: create a second structuring element;
A-3: perform the reconstruction operation using the previous marker image and the second structuring element, according to:
h_{k+1} = (h_k ⊕ B) ∩ G
where h_{k+1} is the current marker image; h_k is the previous marker image; k is the iteration counter; B is the second structuring element; ⊕ is the dilation operator; G is the mask image; ∩ denotes intersection;
A-4: judge whether the current marker image equals the previous marker image; if so, reconstruction is complete and the current marker image is output; otherwise, update the iteration counter and return to step A-3;
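A minimal sketch of steps A-1 to A-4 in Python with OpenCV, assuming the filtered line image serves as the marker and the binarized image as the mask G; the 3×3 rectangular structuring element B and the function name reconstruct are assumed choices.

import cv2
import numpy as np

def reconstruct(marker, mask):
    # Morphological reconstruction by dilation: iterate h_{k+1} = (h_k ⊕ B) ∩ G
    # until the marker image stops changing (step A-4).
    B = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))  # second structuring element
    h = marker.copy()                                      # A-1: original marker image
    while True:
        h_next = cv2.bitwise_and(cv2.dilate(h, B), mask)   # A-3: dilate, then intersect with G
        if np.array_equal(h_next, h):                      # A-4: converged, reconstruction done
            return h_next
        h = h_next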
S3-2: extract the line feature coordinates of the obtained marker image; as shown in fig. 4, a binary image can be represented by a two-dimensional array, and the extraction comprises the following steps:
B-1: represent the binary image as a two-dimensional array;
B-2: take the first element of the two-dimensional array as the projection origin, project the other elements in the horizontal and vertical directions, and obtain the accumulated value for each projection direction;
B-3: traverse the accumulated values and screen out all feature coordinates meeting the condition, namely those whose accumulated values exceed the minimum spacing between coordinate lines in the two projection directions;
B-4: further analyze the feature coordinates, and when more than one feature coordinate meets the condition, extract the middle value;
B-5: traverse the remaining elements of the two-dimensional array, repeating steps B-2 to B-4, to obtain the feature coordinate array;
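The projection steps can be sketched as follows in Python; min_line_len, the threshold on the accumulated value, is an illustrative assumption standing in for the minimum spacing named in step B-3, and the function name line_feature_coords is hypothetical.

import numpy as np

def line_feature_coords(lines, min_line_len=50):
    binary = (lines > 0).astype(np.int32)   # B-1: binary image as a two-dimensional array
    row_sums = binary.sum(axis=1)           # B-2: horizontal projection (accumulated values)
    col_sums = binary.sum(axis=0)           # B-2: vertical projection

    def screen(sums):
        # B-3: keep coordinates whose accumulated value exceeds the threshold.
        cands = np.where(sums > min_line_len)[0]
        coords, run = [], []
        for c in cands:
            if run and c != run[-1] + 1:
                coords.append(run[len(run) // 2])  # B-4: middle value of an adjacent run
                run = []
            run.append(int(c))
        if run:
            coords.append(run[len(run) // 2])
        return coords

    # B-5: feature coordinate arrays for rows (horizontal lines) and columns (vertical lines)
    return screen(row_sums), screen(col_sums)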
S3-3: classify the image intersection points according to the feature coordinate array, and compute the weight array of the table in the marker image; as shown in fig. 5, this comprises the following steps:
C-1: divide the table into different types according to how the table's ruling-line segments connect to the grid points;
C-2: traverse each intersection point according to the obtained feature coordinate array, i.e., the horizontal- and vertical-line coordinate arrays;
the intersection points are the crossings of the horizontal and vertical lines in the table;
C-3: taking the current intersection point as the starting point, extract image blocks in each of four directions in turn and judge whether a straight line is present; output 1 if a line is present and 0 otherwise;
the four directions are: transverse positive, transverse negative, longitudinal positive, and longitudinal negative;
C-4: classify the intersection points according to the output results, store the results in the corresponding pre-built three-dimensional array, and modify and check all entries of the three-dimensional array according to the modification rule;
the modification rule is: if a transverse-positive line exists at an intersection point, then a transverse-negative line must exist at the intersection point to its right, and likewise for the other directions;
C-5: obtain the weight sum of each intersection point from the weights of the corresponding directions in the three-dimensional array, and store it in the weight array, according to:
p(x,y) = q1*p(x,y,0) + q2*p(x,y,1) + q3*p(x,y,2) + q4*p(x,y,3)
where p(x,y) is the weight array; q1*p(x,y,0) is the product of the transverse-positive weight and the corresponding three-dimensional array entry; q2*p(x,y,1) is the product of the transverse-negative weight and the corresponding entry; q3*p(x,y,2) is the product of the longitudinal-positive weight and the corresponding entry; q4*p(x,y,3) is the product of the longitudinal-negative weight and the corresponding entry;
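A sketch of steps C-2 through C-5 in Python, assuming lines is the reconstructed binary line image and rows, cols are the feature coordinate arrays from the previous sketch; the probe length, the weights q1..q4, and the function name weight_array are illustrative assumptions, and the C-4 consistency check is omitted for brevity.

import numpy as np

def weight_array(lines, rows, cols, probe=10, q=(1, 2, 4, 8)):
    # C-2: traverse every intersection of a horizontal and a vertical line coordinate.
    p3 = np.zeros((len(rows), len(cols), 4), dtype=np.int32)  # three-dimensional array
    for i, y in enumerate(rows):
        for j, x in enumerate(cols):
            # C-3: probe an image block in each of the four directions; 1 if a line exists.
            p3[i, j, 0] = int(lines[y, x:x + probe].any())           # transverse positive
            p3[i, j, 1] = int(lines[y, max(x - probe, 0):x].any())   # transverse negative
            p3[i, j, 2] = int(lines[y:y + probe, x].any())           # longitudinal positive
            p3[i, j, 3] = int(lines[max(y - probe, 0):y, x].any())   # longitudinal negative
    # C-5: weight sum p(x,y) = q1*p(x,y,0) + q2*p(x,y,1) + q3*p(x,y,2) + q4*p(x,y,3)
    p = sum(q[d] * p3[:, :, d] for d in range(4))
    return p3, p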
S4: redraw the table to extract it; as shown in fig. 8, traverse each row and each column according to the intersection-point types and the weight array, find the corresponding start and end endpoint coordinates, connect them to finish redrawing the table, and thereby extract the table.
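Finally, step S4 can be sketched as follows, reusing p3 from the previous sketch: walk each row of intersections, open a segment at a start endpoint (a transverse-positive flag), close it at an end endpoint (a transverse-negative flag with no continuation), and draw the connected segment; columns are handled symmetrically. This is a simplified reading of the weight-based traversal, not the patent's exact procedure, and the function name redraw is hypothetical.

import cv2
import numpy as np

def redraw(shape, rows, cols, p3):
    canvas = np.zeros(shape, dtype=np.uint8)
    for i, y in enumerate(rows):            # horizontal segments along each row
        start = None
        for j, x in enumerate(cols):
            if start is None and p3[i, j, 0]:                 # start endpoint
                start = x
            elif start is not None and p3[i, j, 1] and not p3[i, j, 0]:
                cv2.line(canvas, (start, y), (x, y), 255, 1)  # connect and draw
                start = None
    for j, x in enumerate(cols):            # vertical segments along each column
        start = None
        for i, y in enumerate(rows):
            if start is None and p3[i, j, 2]:
                start = y
            elif start is not None and p3[i, j, 3] and not p3[i, j, 2]:
                cv2.line(canvas, (x, start), (x, y), 255, 1)
                start = None
    return canvas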
The invention aims to provide a fast and accurate table extraction method for a general table image that extracts the table from the table image, improving the efficiency of document management in office automation and overcoming the slow speed and low accuracy of table extraction in the prior art.
The present invention is not limited to the above-described alternative embodiments; anyone may derive other products in various forms in light of the present invention. The above detailed description should not be taken as limiting the scope of the invention, which is defined by the claims; the description should be interpreted accordingly.

Claims (5)

1. A table extraction method for a general table image, characterized by comprising the following steps:
S1: preprocess the general table image;
S2: filter the characters out of the preprocessed image using morphological operations;
S3: perform a reconstruction operation on the filtered image, comprising the following steps:
S3-1: perform open-operation image reconstruction, comprising the following steps:
A-1: initialize the input filtered image, i.e., the opened image, as the original marker image;
A-2: create a second structuring element;
A-3: perform the reconstruction operation using the previous marker image and the second structuring element, according to:
h_{k+1} = (h_k ⊕ B) ∩ G
where h_{k+1} is the current marker image; h_k is the previous marker image; k is the iteration counter; B is the second structuring element; ⊕ is the dilation operator; G is the mask image; ∩ denotes intersection;
A-4: judge whether the current marker image equals the previous marker image; if so, reconstruction is complete and the current marker image is output; otherwise, update the iteration counter and return to step A-3;
S3-2: extract the line feature coordinates of the obtained marker image, comprising the following steps:
B-1: represent the binary image as a two-dimensional array;
B-2: take the first element of the two-dimensional array as the projection origin, project the other elements in the horizontal and vertical directions, and obtain the accumulated value for each projection direction;
B-3: traverse the accumulated values and screen out all feature coordinates meeting the condition, namely those whose accumulated values exceed the minimum spacing between coordinate lines in the two projection directions;
B-4: further analyze the feature coordinates, and when more than one feature coordinate meets the condition, extract the middle value;
B-5: traverse the remaining elements of the two-dimensional array, repeating steps B-2 to B-4, to obtain the feature coordinate array;
S3-3: classify the image intersection points according to the feature coordinate array, and compute the weight array of the table in the marker image, comprising the following steps:
C-1: divide the table into different types according to how the table's ruling-line segments connect to the grid points;
C-2: traverse each intersection point according to the obtained feature coordinate array, i.e., the horizontal- and vertical-line coordinate arrays;
the intersection points are the crossings of the horizontal and vertical lines in the table;
C-3: taking the current intersection point as the starting point, extract image blocks in each of four directions in turn and judge whether a straight line is present; output 1 if a line is present and 0 otherwise;
the four directions are: transverse positive, transverse negative, longitudinal positive, and longitudinal negative;
C-4: classify the intersection points according to the output results, store the results in the corresponding pre-built three-dimensional array, and modify and check all entries of the three-dimensional array according to the modification rule;
the modification rule is: if a transverse-positive line exists at an intersection point, then a transverse-negative line must exist at the intersection point to its right, and likewise for the other directions;
C-5: obtain the weight sum of each intersection point from the weights of the corresponding directions in the three-dimensional array, and store it in the weight array, according to:
p(x,y) = q1*p(x,y,0) + q2*p(x,y,1) + q3*p(x,y,2) + q4*p(x,y,3)
where p(x,y) is the weight array; q1*p(x,y,0) is the product of the transverse-positive weight and the corresponding three-dimensional array entry; q2*p(x,y,1) is the product of the transverse-negative weight and the corresponding entry; q3*p(x,y,2) is the product of the longitudinal-positive weight and the corresponding entry; q4*p(x,y,3) is the product of the longitudinal-negative weight and the corresponding entry;
S4: redraw the table to complete table extraction;
traverse each row and each column according to the intersection-point types and the weight array, find the corresponding start and end endpoint coordinates, connect them, and redraw the table, thereby extracting the table.
2. The table extraction method for a general table image according to claim 1, characterized in that: in step S1, the preprocessing comprises reduction, grayscale conversion, and binarization, performed in that order.
3. The table extraction method for a general table image according to claim 2, characterized in that: in step S1, the preprocessing is computed as follows:
the grayscale conversion uses a weighted method:
F(i,j)=0.3R(i,j)+0.59G(i,j)+0.11B(i,j)
where F(i,j) is the gray value of the pixel; R(i,j), G(i,j), and B(i,j) are the red, green, and blue luminance values of the pixel; (i,j) are the pixel coordinates;
the binarization is computed as follows:
g(x,y) = 1 if f(x,y) ≥ T, and g(x,y) = 0 if f(x,y) < T
where g(x,y) is the binarized image; f(x,y) is the input grayscale image; T is the threshold.
4. The table extraction method for a general table image according to claim 3, characterized in that: the gray value maximizing the between-class variance is used as the optimal threshold for binarization:
T* = arg max[g(t)]
where T* is the optimal threshold; g(t) is the between-class variance; t is the candidate gray value, t ∈ {0, ..., M−1}, where M is the number of gray levels in the image; arg max(·) selects the argument that maximizes.
5. The table extraction method for a general table image according to claim 4, characterized in that: in step S2, the morphological operation performed on the preprocessed image, i.e., the binarized image, is an opening operation, i.e., an erosion followed by a dilation, computed as:
A ∘ SE = (A ⊖ SE) ⊕ SE
where ∘ is the opening operator; SE is the first structuring element; A is the input binary image; ⊕ is the dilation operator; ⊖ is the erosion operator.
CN201811217691.5A 2018-10-18 2018-10-18 Table extraction method for general table image Active CN109543525B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811217691.5A CN109543525B (en) 2018-10-18 2018-10-18 Table extraction method for general table image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811217691.5A CN109543525B (en) 2018-10-18 2018-10-18 Table extraction method for general table image

Publications (2)

Publication Number Publication Date
CN109543525A CN109543525A (en) 2019-03-29
CN109543525B true CN109543525B (en) 2020-12-11

Family

ID=65844215

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811217691.5A Active CN109543525B (en) 2018-10-18 2018-10-18 Table extraction method for general table image

Country Status (1)

Country Link
CN (1) CN109543525B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111859874B (en) * 2019-04-17 2023-06-13 百度在线网络技术(北京)有限公司 Form generation method and system, video playing device and computer readable medium
CN112115111A (en) * 2019-06-20 2020-12-22 上海怀若智能科技有限公司 OCR-based document version management method and system
CN111539312A (en) * 2020-04-21 2020-08-14 罗嘉杰 Method for extracting table from image

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103258198A (en) * 2013-04-26 2013-08-21 四川大学 Extraction method for characters in form document image
CN106156761A (en) * 2016-08-10 2016-11-23 北京交通大学 The image form detection of facing moving terminal shooting and recognition methods
CN106446881A (en) * 2016-07-29 2017-02-22 北京交通大学 Method for extracting lab test result from medical lab sheet image
CN106897690A (en) * 2017-02-22 2017-06-27 南京述酷信息技术有限公司 PDF table extracting methods
CN107958201A (en) * 2017-10-13 2018-04-24 上海眼控科技股份有限公司 A kind of intelligent checking system and method for vehicle annual test insurance policy form
CN108491788A (en) * 2018-03-20 2018-09-04 上海眼控科技股份有限公司 A kind of intelligent extract method and device for financial statement cell

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103258198A (en) * 2013-04-26 2013-08-21 四川大学 Extraction method for characters in form document image
CN106446881A (en) * 2016-07-29 2017-02-22 北京交通大学 Method for extracting lab test result from medical lab sheet image
CN106156761A (en) * 2016-08-10 2016-11-23 北京交通大学 The image form detection of facing moving terminal shooting and recognition methods
CN106897690A (en) * 2017-02-22 2017-06-27 南京述酷信息技术有限公司 PDF table extracting methods
CN107958201A (en) * 2017-10-13 2018-04-24 上海眼控科技股份有限公司 A kind of intelligent checking system and method for vehicle annual test insurance policy form
CN108491788A (en) * 2018-03-20 2018-09-04 上海眼控科技股份有限公司 A kind of intelligent extract method and device for financial statement cell

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"Designing a Flexible Framework for a Table Abstraction"; H. Conrad Cunningham et al.; Springer US; Dec. 31, 2009; pp. 1-38 *
"A Morphological Algorithm for Removing Table Frame Lines Based on Adaptive Structuring Elements"; Liu Yanshun et al.; Journal of Guizhou University (Natural Science Edition); Jul. 31, 2008; vol. 25, no. 4; pp. 350-353, 361 *
"Research on Printed Table Recognition"; Liu Yu; China Masters' Theses Full-text Database, Information Science and Technology; Apr. 15, 2014; no. 4; pp. I138-979 *

Also Published As

Publication number Publication date
CN109543525A (en) 2019-03-29

Similar Documents

Publication Publication Date Title
CN104751142B (en) A kind of natural scene Method for text detection based on stroke feature
Singh et al. A study of image segmentation algorithms for different types of images
US20110052062A1 (en) System and method for identifying pictures in documents
CN109543525B (en) Table extraction method for general table image
CN104408449B (en) Intelligent mobile terminal scene literal processing method
CN102800094A (en) Fast color image segmentation method
CN101930461A (en) Digital image visualized management and retrieval for communication network
DE102018003475A1 (en) Form-based graphic search
Al Abodi et al. An effective approach to offline Arabic handwriting recognition
CN112712860B (en) Grain finite element model modeling method based on real metallographic structure
CN109409211B (en) Processing method, processing device and storage medium for Chinese character skeleton stroke segments
JP2011248702A (en) Image processing device, image processing method, image processing program, and program storage medium
US9323726B1 (en) Optimizing a glyph-based file
CN104636309B (en) Matrix recognition methods based on machine vision
CN108805139A (en) A kind of image similarity computational methods based on frequency-domain visual significance analysis
CN109271882B (en) Method for extracting color-distinguished handwritten Chinese characters
Dong et al. A parallel thinning algorithm based on stroke continuity detection
CN104809721B (en) A kind of caricature dividing method and device
CN107358244B (en) A kind of quick local invariant feature extracts and description method
CN113628113A (en) Image splicing method and related equipment thereof
CN103927533B (en) The intelligent processing method of graph text information in a kind of scanned document for earlier patents
CN106056575B (en) A kind of image matching method based on like physical property proposed algorithm
CN110490210B (en) Color texture classification method based on t sampling difference between compact channels
Takahashi et al. Region graph based text extraction from outdoor images
CN107818579B (en) Color texture feature extraction method based on quaternion Gabor filtering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40003778

Country of ref document: HK

GR01 Patent grant