CN114821611A - Method for protecting private data in archive form image - Google Patents
- Publication number
- CN114821611A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/412—Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/16—Image preprocessing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/18—Extraction of features or characteristics of the image
- G06V30/1801—Detecting partial patterns, e.g. edges or contours, or configurations, e.g. loops, corners, strokes or intersections
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/24—Character recognition characterised by the processing or recognition method
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/42—Document-oriented image-based pattern recognition based on the type of document
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Artificial Intelligence (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a method for protecting private data in an archive form image. The method uses the RGB color differences of the color image to fade content such as color seals; scales the image to a specific size and obtains an edge intensity map by convolution; obtains candidate horizontal and vertical line equations with the Hough transform; tracks the lines, backtracking to search other possible tracks where interference is met; and obtains from the archive user the coordinates of the row to be used, then automatically blurs the private parts of the archive image while keeping the part to be used. The method is fast: recognition and privacy blurring complete within 1 second on an ordinary office computer. It is robust to interference: aged archives, tables bearing color seals or original correction marks, and tables with very faint lines are recognized accurately. It is also broadly compatible: tables with different aspect ratios, row and column counts, and header styles are recognized and processed effectively.
Description
Technical Field
The invention relates to the fields of archive utilization and image processing, and in particular to an algorithm for recognizing and processing archive form images.
Background
To preserve the original record and the authenticity of an archive's contents, archive management departments scan and store the original documents. When issuing materials for use, such as archive certificates, they mask or blur the information in the image that belongs to other people and is unrelated to the archive user.
In the existing masking method, an operator opens the original archive image in image processing software such as Paint or Photoshop, manually selects the parts that need privacy processing, processes the image, and then saves, prints, signs, seals and delivers it to the archive user. The manual selection step is cumbersome.
Table recognition in document images has long been studied in computer vision. Traditional techniques include image skew correction, image binarization, horizontal and vertical projection of table lines, Hough-transform voting to obtain line equations, and subsequent line-segment tracking. These methods work well when the image background is clean and the table is simple and regular. Many archive tables, however, have complex headers, aged and darkened paper, curved lines, and handwriting or signatures covering the table, so these methods struggle to recover the table frame.
The other branch of computer vision is machine learning based on deep neural networks. Using multiple convolutional layers, pooling layers, activation functions, loss functions and similar components, it builds a mechanism that learns from a large number of samples, progressing from low-level features such as edges to high-level structural features. Its disadvantages are that many training samples must be collected and labeled, and that training and inference need substantial computing power. Collecting and labeling a large set of archive form images is a heavy workload up front, and even once that is done, the resulting inference model runs with noticeable delay on the ordinary office computers of a typical archive utilization department, which makes it inconvenient to apply.
Disclosure of Invention
The invention aims to provide a method for protecting private data in an archive form image. The method is fast, robust to interference, and broadly compatible: it accurately recognizes aged archives, tables bearing color seals or original correction marks, and tables with very faint lines, and it effectively recognizes and processes tables with different aspect ratios, row and column counts, and header styles.
The technical scheme of the invention is as follows. A method for protecting private data in an archive form image comprises the following steps:
1) first, fading content such as color seals in the form image by using the RGB color differences of the color image;
2) scaling the image to a specific size, designing a horizontal edge detection operator that works well at that image size, and obtaining an edge intensity map of the image by convolution;
3) obtaining candidate horizontal line equations and candidate vertical line equations by using the Hough transform;
4) tracking each line according to its line equation, storing the coordinates of no more than 15 line pixels in a circular queue, and judging the bending of the line in real time; treating excessive bending as interference, and backtracking with the coordinates stored in the queue to search other possible tracks;
5) sorting the horizontal lines that run through the table by Y coordinate, and identifying the table header from the fact that its height differs markedly from that of an ordinary row;
6) obtaining from the archive user, through a graphical interactive interface, the coordinates of the row to be used and, combining them with the table information identified in the preceding steps, automatically blurring the private parts of the archive image while keeping the part to be used.
The detection operator in step 2) is an edge detection operator matrix H with n rows and m columns, in which the middle rows are negative, the rows at the two ends are positive, the elements within each row are equal, all positive elements sum to 1, all negative elements sum to -1, n is an integer not greater than 7, and m is an integer not greater than 5.
Obtaining the candidate horizontal and vertical line equations with the Hough transform in step 3) comprises the following two steps:
Performing the Hough transform on the image, called voting for short: the parameter space of the vote is a two-dimensional space in which the row coordinate represents the line intercept and the column coordinate represents the line inclination angle, and whose height equals the height of the document image. The voting threshold is an edge intensity of +5: every edge point whose intensity reaches +5 gets exactly 1 vote, so that weak and strong edges of the same length receive the same number of votes, which makes table lines of equal length easy to pick out in the voting result. Edge intensities greater than 0 and lower than +5 are treated as line artifacts produced by the paper and the scanner.
Obtaining the equation parameters of the candidate lines: in the parameter space map of the Hough transform, the maximum value is taken as the table line width maxValH; the map is then traversed and every local maximum reaching seventy percent of maxValH is examined, so that broken or partly missing table lines are detected as far as possible.
The line-segment tracking algorithm in step 4), which tolerates moderate bending, breaks and heavy stroke interference, tracks each line under the guidance of its line equation. The inclination angle and intercept of the line are obtained from its parametric equation; the intercept is the Y coordinate of the intersection of the line with X = 0. In the edge intensity map, tracking starts from the point X = 0, Y = intercept and advances to the right at the inclination angle of the line. With t = 20 as the threshold, the point with the maximum value among the upper, middle and lower neighbors of the current point gives the tracking direction; when the intensity of that point is greater than t it is regarded as a line point, otherwise as a non-line point. The coordinates of previously tracked points are recorded in a circular queue. If the line is longer than 7 points, the bending angle formed by the most recent 7 points is calculated; when the angle is greater than 3 degrees, the tracking is considered wrong and is rolled back. All traced lines are placed in the set S.
Sorting the horizontal lines that run through the table by Y coordinate and identifying the table header in step 5) proceeds as follows. Votes are counted over the x coordinates of the left and right end points of all lines in S; the x with the most votes among the left end points is taken as the left boundary L of the table, and the x with the most votes among the right end points as the right boundary R. All lines in S are traversed, and only those whose left and right end points lie near L and R respectively are kept as horizontal lines that run through the table. These lines are sorted by y coordinate, and consecutive lines are treated as adjacent table lines. In the upper part of the table, going from top to bottom, when the heights of two adjacent rows differ by more than 20%, the upper row is regarded as the header.
Automatically blurring the private parts of the archive image while keeping the part to be used, in step 6), means obtaining the row selected by the archive user, keeping that row and the header sharp, and applying conventional image blurring to the other rows to obtain the final image for printing and output.
The beneficial effects of the invention are: 1. High speed: recognition and privacy blurring complete within 1 second on an ordinary office computer. 2. Strong robustness to interference: aged archives, tables bearing color seals or original correction marks, and tables with very faint lines are recognized accurately. 3. Good compatibility: tables with different aspect ratios, row and column counts, and header styles are recognized and processed effectively.
Drawings
FIG. 1 is a color scanned document (names masked) with various interference features;
FIG. 2 compares ordinary color-to-grayscale conversion with the graying method used in this patent;
FIG. 3 is the horizontal edge map;
FIG. 4 shows the Hough transform parameter space (left) and the horizontal line tracking result (right);
FIG. 5 shows the table with only the horizontal lines that run through it retained;
FIG. 6 is a schematic diagram of the line tracking algorithm, tracing from left to right: yellow marks no line, green marks correct tracking, and red marks an interference line;
FIG. 7 is the privacy-protected image generated automatically after the user selects the row to be used in the graphical interactive interface.
Detailed Description
The present invention is further illustrated by the following examples, which are not to be construed as limiting the invention.
Example 1: method for protecting private data in archive form image
The first step: unify the image size and fade color seals and signatures
The width and height of the original image are denoted W and H, and the scaling factor is zoom = min(1024/W, 1024/H). The original image is scaled by the factor zoom.
In the scaled image, line widths fall within 0.5 to 2 pixels, which simplifies subsequent detection. The color image is split into three channel images R, G and B. Taking the per-pixel maximum, Gray = max(R, G, B), gives the color-faded image Gray, which greatly reduces interference from color seals, red signatures and the like, as shown in the comparison in fig. 2.
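The first step can be sketched in a few lines of Python with NumPy. This is a minimal illustration rather than the patent's implementation: the function names `zoom_factor` and `fade_colors` are hypothetical, the image is assumed to be a height x width x 3 array in R, G, B order, and the resampling itself is left to whatever image library is in use.

```python
import numpy as np

def zoom_factor(width: int, height: int, target: int = 1024) -> float:
    """Scaling factor zoom = min(target / W, target / H)."""
    return min(target / width, target / height)

def fade_colors(rgb: np.ndarray) -> np.ndarray:
    """Gray = max(R, G, B), taken per pixel over the channel axis."""
    return rgb.max(axis=2)

# One red seal pixel, one black ink pixel and one white paper pixel.
pixels = np.array([[[220, 30, 30], [20, 20, 20], [255, 255, 255]]], dtype=np.uint8)
gray = fade_colors(pixels)
zoom = zoom_factor(2048, 3000)
print(gray, zoom)
```

A saturated red seal pixel has a high R value, so its maximum channel ends up close to that of white paper, while black ink stays dark in all three channels and survives the conversion.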
The second step: construct a horizontal edge detection operator and perform horizontal edge detection on the image
Construct an edge detection operator matrix H with n rows and m columns; the suggested values of n and m are 5 and 5. The middle rows of the matrix are negative, the rows at the two ends are positive, the elements within each row are equal, all positive elements sum to 1, and all negative elements sum to -1. The suggested value of H is:
H=[+0.100,+0.100,+0.100,+0.100,+0.100;
   -0.025,-0.025,-0.025,-0.025,-0.025;
   -0.150,-0.150,-0.150,-0.150,-0.150;
   -0.025,-0.025,-0.025,-0.025,-0.025;
   +0.100,+0.100,+0.100,+0.100,+0.100]
A 2-dimensional convolution of Gray with H gives the horizontal edge intensity map E = conv2d(Gray, H); the result is shown in fig. 3.
Here conv2d is the same computation as the 2-D convolution used in convolutional neural networks.
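The properties of the suggested operator and its response to a thin dark line can be checked numerically. The sketch below assumes NumPy; `conv2d_valid` is a hypothetical helper that computes the CNN-style sliding-window sum without padding, which for this particular operator equals a true convolution because H is unchanged by flipping.

```python
import numpy as np

# Suggested 5 x 5 operator: positive outer rows, negative middle rows.
H = np.array([[+0.100] * 5,
              [-0.025] * 5,
              [-0.150] * 5,
              [-0.025] * 5,
              [+0.100] * 5])

def conv2d_valid(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Sliding-window sum of products without padding ('valid' mode)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * image[dy:dy + out_h, dx:dx + out_w]
    return out

# White page (255) with a one-pixel dark horizontal line in row 4.
gray = np.full((9, 12), 255.0)
gray[4, :] = 0.0
E = conv2d_valid(gray, H)
print(E[:, 0])
```

Because the positive and negative elements cancel, a uniform background gives zero response, and the strongest positive response appears where the dark line sits under the middle row of the operator.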
The third step: extract the table lines
Privacy-related data in archives is generally organized by row, so the description here uses the tracking of horizontal lines throughout.
Voting. The Hough transform (voting) is performed on the image, with a two-dimensional parameter space: the row coordinate represents the line intercept, and the column coordinate represents the line inclination angle (angular resolution 0.1°, angular range ±3°, 61 columns in total). The angular range can be expanded automatically if needed. The height of the parameter space equals the height of the document image. The voting threshold is an edge intensity of +5: every edge point whose intensity reaches +5 gets exactly 1 vote, so that weak and strong edges of the same length receive the same number of votes, which makes table lines of equal length easy to pick out in the voting result. Edge intensities greater than 0 and less than +5 are treated as line artifacts produced by the paper and the scanner.
Equation parameters of the candidate lines are then obtained. In the parameter space map of the Hough transform (the black part on the left side of fig. 4), the maximum value is taken as the table line width maxValH. The map is traversed and every local maximum reaching seventy percent of maxValH is examined, so that broken or partly missing table lines are detected as far as possible.
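The voting and peak selection can be sketched as follows. The function names, the slope-intercept form y = b + x·tan(angle), the rounding of intercepts to whole rows, and the 5 x 5 neighborhood used to decide what counts as a local maximum are assumptions made for this illustration; the text itself fixes only the +5 threshold, the one-vote rule, the 0.1° resolution over ±3°, and the seventy-percent criterion.

```python
import numpy as np

def hough_votes(E, threshold=5.0, max_angle=3.0, step=0.1):
    """Accumulate one vote per edge pixel with E >= threshold.

    Rows of the accumulator are intercepts (0 .. image height - 1),
    columns are inclination angles from -max_angle to +max_angle.
    """
    height = E.shape[0]
    n_angles = int(round(2 * max_angle / step)) + 1      # 61 columns by default
    angles = np.linspace(-max_angle, max_angle, n_angles)
    acc = np.zeros((height, n_angles), dtype=np.int64)
    ys, xs = np.nonzero(E >= threshold)
    for k, slope in enumerate(np.tan(np.radians(angles))):
        b = np.rint(ys - xs * slope).astype(int)         # intercept at X = 0
        ok = (b >= 0) & (b < height)
        acc[:, k] += np.bincount(b[ok], minlength=height)
    return acc, angles

def candidate_lines(acc, angles, ratio=0.7):
    """Local maxima whose vote count reaches ratio * maxValH."""
    max_val_h = acc.max()
    peaks = []
    for b, k in zip(*np.nonzero(acc >= ratio * max_val_h)):
        window = acc[max(b - 2, 0):b + 3, max(k - 2, 0):k + 3]
        if acc[b, k] == window.max():
            peaks.append((int(b), float(angles[k]), int(acc[b, k])))
    return peaks

# Synthetic edge map: a strong full-width line, a weak shorter line,
# and a faint artifact below the +5 threshold.
E = np.zeros((40, 1024))
E[10, :] = 30.0
E[25, :800] = 8.0
E[33, :] = 3.0
acc, angles = hough_votes(E)
peaks = candidate_lines(acc, angles)
print(peaks)
```

In this example the weak line is kept because its 800 votes exceed seventy percent of the 1024 votes of the strongest line, while the faint artifact never votes at all. Equal-valued neighboring cells would each be reported as a peak, so a practical version would likely need some form of duplicate suppression.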
The fourth step: track each line under the guidance of its line equation (shown in fig. 6). The inclination angle and the intercept (the Y coordinate of the intersection with the line X = 0) are obtained from the parametric equation of the line. In the edge intensity map, tracking starts from the point X = 0, Y = intercept and advances to the right at the inclination angle of the line. With t = 20 as the threshold, the point with the maximum value among the upper, middle and lower neighbors of the current point gives the tracking direction. When the intensity of that point is greater than t, it is regarded as a line point; otherwise it is regarded as a non-line point.
The coordinates of previously tracked points are recorded in a circular queue (a data structure). If the line is longer than 7 points, the bending angle formed by the most recent 7 points is calculated. When the angle is greater than 3 degrees, the tracking is judged to be wrong and is rolled back. All traced lines are placed in the set S.
The reason for using a circular queue is that bending is judged only over the 7 most recently tracked points, so the storage used for the coordinates of older points can be recycled.
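A simplified sketch of the tracking loop is given below. Several details are assumptions, because the text does not spell them out: the bending angle is taken as the angle of the chord between the oldest and newest of the last 7 points, measured relative to the guide line; rollback is implemented by discarding the most recent points and re-reading them on the guide row; and `track_line` and its parameter names are hypothetical. The circular queue is modelled with `collections.deque(maxlen=15)`.

```python
import math
from collections import deque

import numpy as np

def track_line(E, intercept, angle_deg, t=20.0, window=7,
               max_bend_deg=3.0, queue_len=15):
    """Follow one line from X = 0 to the right; return [(x, y, is_line), ...]."""
    height, width = E.shape
    slope = math.tan(math.radians(angle_deg))
    # Row predicted by the line equation for every column, clamped to the image.
    guide = [min(max(int(round(intercept + x * slope)), 0), height - 1)
             for x in range(width)]
    recent = deque(maxlen=queue_len)   # circular queue of the latest coordinates
    track = []
    y = guide[0]
    for x in range(width):
        # Middle, upper and lower candidates; ties keep the current row.
        rows = [r for r in (y, y - 1, y + 1) if 0 <= r < height]
        y = max(rows, key=lambda r: E[r, x])
        recent.append((x, y))
        track.append((x, y, bool(E[y, x] > t)))
        if len(recent) >= window:
            (x0, y0), (x1, y1) = recent[-window], recent[-1]
            drift = (y1 - guide[x1]) - (y0 - guide[x0])
            bend = math.degrees(math.atan2(abs(drift), x1 - x0))
            if bend > max_bend_deg:
                # Roll back: drop the newest points and redo them on the guide row.
                for _ in range(window - 1):
                    recent.pop()
                    track.pop()
                for xr in range(x0 + 1, x1 + 1):
                    recent.append((xr, guide[xr]))
                    track.append((xr, guide[xr], bool(E[guide[xr], xr] > t)))
                y = guide[x1]
    return track

# Horizontal line in row 10 with a gap, crossed by a stronger diagonal stroke.
E = np.zeros((21, 60))
E[10, :] = 50.0
E[10, 30:35] = 0.0
E[9, 15], E[8, 16], E[7, 17] = 90.0, 90.0, 90.0
track = track_line(E, intercept=10, angle_deg=0.0)
print([x for x, _, on_line in track if not on_line])
```

With integer pixel coordinates, a single one-pixel step inside a 7-point window already corresponds to roughly 9.5 degrees, so under these assumptions the default values roll back on any step away from the guide row. Tolerating gentle bends would need sub-pixel coordinates, a longer window, or a different definition of the bending angle.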
The fifth step: find the horizontal through-lines and the header of the table
Votes are counted over the x coordinates of the left and right end points of all lines in S; the x with the most votes among the left end points is taken as the left boundary L of the table, and the x with the most votes among the right end points as the right boundary R.
All lines in S are traversed, and only those whose left and right end points lie near L and R respectively are kept as horizontal lines that run through the table. The result is shown in fig. 5.
These horizontal lines are sorted by y coordinate, and consecutive lines are treated as adjacent table lines. In the upper part of the table, going from top to bottom, when the heights of two adjacent rows differ by more than 20%, the upper row is regarded as the header.
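The boundary vote and the header test can be sketched as below. The representation of a line as an `(x_left, x_right, y)` tuple, the tolerance `tol` that defines "near", and measuring the 20% difference relative to the lower row are assumptions for the illustration; the function names are hypothetical.

```python
from collections import Counter

def through_lines(lines, tol=5):
    """Vote for the table boundaries and keep lines spanning both of them.

    lines: iterable of (x_left, x_right, y) tuples from the set S.
    """
    L = Counter(line[0] for line in lines).most_common(1)[0][0]
    R = Counter(line[1] for line in lines).most_common(1)[0][0]
    kept = [line for line in lines
            if abs(line[0] - L) <= tol and abs(line[1] - R) <= tol]
    return L, R, sorted(kept, key=lambda line: line[2])

def header_row(ys, rel_diff=0.20):
    """Index of the first row, from the top, whose height differs from the
    next row by more than rel_diff; None if all rows are alike."""
    heights = [lower - upper for upper, lower in zip(ys, ys[1:])]
    for i in range(len(heights) - 1):
        if abs(heights[i] - heights[i + 1]) > rel_diff * heights[i + 1]:
            return i
    return None

# Four through-lines plus two partial lines that should be discarded.
S = [(50, 950, 100), (50, 950, 180), (51, 949, 220),
     (50, 950, 260), (50, 400, 240), (300, 950, 300)]
L, R, kept = through_lines(S)
ys = [y for _, _, y in kept]
print(L, R, ys, header_row(ys))
```

Here the first row is 80 pixels tall and the following rows are 40 pixels tall, so the first row is reported as the header. Voting on exact integer x values is the simplest choice; binning nearby coordinates before counting might be more robust on noisy scans.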
The sixth step: apply privacy protection
The row selected by the archive user is obtained from the software's interactive interface; that row and the header are kept sharp, and conventional image blurring is applied to the other rows to obtain the final image for printing and output. The result is shown in fig. 7.
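The final step can be sketched with a plain box blur, chosen here only as one example of "conventional image blurring". The function names, the `row_bounds` list of sorted through-line y coordinates, and the blur radius are assumptions for the illustration, and the image is assumed to be a 2-D grayscale array.

```python
import numpy as np

def box_blur(img: np.ndarray, radius: int = 4) -> np.ndarray:
    """Mean filter over a (2 * radius + 1) square window with replicated edges."""
    k = 2 * radius + 1
    padded = np.pad(img.astype(float), radius, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def protect(gray, row_bounds, header, selected, radius=4):
    """Keep the header row and the selected row sharp; blur all other rows."""
    blurred = box_blur(gray, radius)
    out = gray.astype(float).copy()
    for i, (top, bottom) in enumerate(zip(row_bounds, row_bounds[1:])):
        if i not in (header, selected):
            out[top:bottom, :] = blurred[top:bottom, :]
    return out

# Checkerboard test image with four table rows of 10 pixels each.
gray = (np.indices((40, 60)).sum(axis=0) % 2) * 255.0
row_bounds = [0, 10, 20, 30, 40]
out = protect(gray, row_bounds, header=0, selected=2)
print(out[5, :3], out[15, :3])
```

Rows 0 and 2 are returned unchanged, while rows 1 and 3 lose their fine detail. A stronger blur or a larger radius may be needed in practice so that the blurred text cannot be read.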
Claims (6)
1. A method for protecting private data in an archive form image, characterized by comprising the following steps:
1) first, fading content such as color seals in the form image by using the RGB color differences of the color image;
2) scaling the image to a specific size, designing a horizontal edge detection operator that works well at that image size, and obtaining an edge intensity map of the image by convolution;
3) obtaining candidate horizontal line equations and candidate vertical line equations by using the Hough transform;
4) tracking each line according to its line equation, storing the coordinates of no more than 15 line pixels in a circular queue, and judging the bending of the line in real time; treating excessive bending as interference, and backtracking with the coordinates stored in the queue to search other possible tracks;
5) sorting the horizontal lines that run through the table by Y coordinate, and identifying the table header from the fact that its height differs markedly from that of an ordinary row;
6) obtaining from the archive user, through a graphical interactive interface, the coordinates of the row to be used and, combining them with the table information identified in the preceding steps, automatically blurring the private parts of the archive image while keeping the part to be used.
2. The method of protecting private data in an archive form image as in claim 1, wherein: the detection operator in step 2) is an edge detection operator matrix H with n rows and m columns, in which the middle rows are negative, the rows at the two ends are positive, the elements within each row are equal, all positive elements sum to 1, all negative elements sum to -1, n is an integer not greater than 7, and m is an integer not greater than 5.
3. The method of protecting private data in an archive form image as in claim 1, wherein: obtaining the candidate horizontal and vertical line equations with the Hough transform in step 3) comprises the following two steps:
performing the Hough transform on the image, called voting for short: the parameter space of the vote is a two-dimensional space in which the row coordinate represents the line intercept and the column coordinate represents the line inclination angle, and whose height equals the height of the document image; the voting threshold is an edge intensity of +5, every edge point whose intensity reaches +5 getting exactly 1 vote, so that weak and strong edges of the same length receive the same number of votes and table lines of equal length are easy to pick out in the voting result; edge intensities greater than 0 and lower than +5 are treated as line artifacts produced by the paper and the scanner;
obtaining the equation parameters of the candidate lines: in the parameter space map of the Hough transform, the maximum value is taken as the table line width maxValH; the map is then traversed and every local maximum reaching seventy percent of maxValH is examined, so that broken or partly missing table lines are detected as far as possible.
4. The method of protecting private data in an archive form image as in claim 1, wherein: the line-segment tracking algorithm in step 4), which tolerates moderate bending, breaks and heavy stroke interference, tracks each line under the guidance of its line equation; the inclination angle and intercept of the line are obtained from its parametric equation, the intercept being the Y coordinate of the intersection of the line with X = 0; in the edge intensity map, tracking starts from the point X = 0, Y = intercept and advances to the right at the inclination angle of the line; with t = 20 as the threshold, the point with the maximum value among the upper, middle and lower neighbors of the current point gives the tracking direction; when the intensity of that point is greater than t it is regarded as a line point, otherwise as a non-line point; the coordinates of previously tracked points are recorded in a circular queue; if the line is longer than 7 points, the bending angle formed by the most recent 7 points is calculated; when the angle is greater than 3 degrees, the tracking is considered wrong and is rolled back; all traced lines are placed in the set S.
5. The method of protecting private data in an archive form image as in claim 1, wherein: sorting the horizontal lines that run through the table by Y coordinate and identifying the table header in step 5) means counting votes over the x coordinates of the left and right end points of all lines in S, taking the x with the most votes among the left end points as the left boundary L of the table and the x with the most votes among the right end points as the right boundary R; traversing all lines in S and keeping only those whose left and right end points lie near L and R respectively as horizontal lines that run through the table; sorting these lines by y coordinate and treating consecutive lines as adjacent table lines; and, in the upper part of the table, going from top to bottom, regarding the upper row as the header when the heights of two adjacent rows differ by more than 20%.
6. The method of protecting private data in an archive form image as in claim 1, wherein: automatically blurring the private parts of the archive image while keeping the part to be used, in step 6), means obtaining the row selected by the archive user, keeping that row and the header sharp, and applying conventional image blurring to the other rows to obtain the final image for printing and output.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210558787.8A CN114821611A (en) | 2022-05-20 | 2022-05-20 | Method for protecting private data in archive form image |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114821611A true CN114821611A (en) | 2022-07-29 |
Family
ID=82516430
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210558787.8A Pending CN114821611A (en) | 2022-05-20 | 2022-05-20 | Method for protecting private data in archive form image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114821611A (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5579414A (en) * | 1992-10-19 | 1996-11-26 | Fast; Bruce B. | OCR image preprocessing method for image enhancement of scanned documents by reversing invert text |
US5737442A (en) * | 1995-10-20 | 1998-04-07 | Bcl Computers | Processor based method for extracting tables from printed documents |
US20120008874A1 (en) * | 2009-04-07 | 2012-01-12 | Murata Machinery, Ltd. | Image processing apparatus, image processing method, image processing program, and storage medium |
CN103020609A (en) * | 2012-12-30 | 2013-04-03 | 上海师范大学 | Complex fiber image recognition method |
CN103258198A (en) * | 2013-04-26 | 2013-08-21 | 四川大学 | Extraction method for characters in form document image |
US20150294523A1 (en) * | 2013-08-26 | 2015-10-15 | Vertifi Software, LLC | Document image capturing and processing |
CN109766749A (en) * | 2018-11-27 | 2019-05-17 | 上海眼控科技股份有限公司 | A kind of detection method of the bending table line for financial statement |
CN113516103A (en) * | 2021-08-07 | 2021-10-19 | 山东微明信息技术有限公司 | Table image inclination angle determining method based on support vector machine |
Non-Patent Citations (1)
Title |
---|
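Li Yunhua; Duan Huichuan: "Table extraction and skew correction of image archives based on the Hough transform" (基于Hough变换的图像档案的表格提取与倾斜校正), 信息技术与信息化 (Information Technology and Informatization), no. 06, 15 December 2007 (2007-12-15) * |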
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||