CN110263739A - Photo table recognition methods based on OCR technique - Google Patents
- Publication number
- CN110263739A (application number CN201910558402.6A)
- Authority
- CN
- China
- Prior art keywords
- text
- line
- ocr technique
- row cutting
- cutting
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
- G06V20/63—Scene text, e.g. street names
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/412—Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Character Discrimination (AREA)
- Character Input (AREA)
Abstract
The present invention relates to a photo table recognition method based on OCR technology, comprising: A. performing row segmentation on the picture content by OCR technology and judging whether the picture content contains a table; if so, continuing, otherwise ending; B. performing the dilation and erosion operations of the OCR technology on the row-segment images to obtain the row lines and column lines of the table, and calculating the intersection coordinates of the row lines and column lines; C. cutting the picture according to the intersection coordinates to obtain a cell set; D. iterating over the cells in the cell set and performing row segmentation on each cell image to obtain the text-line images in each cell; E. recognizing the character text in all text-line images by OCR technology, and combining the corresponding character text into complete structured text according to the position of each text line. The invention is stated to achieve 100% table recognition accuracy, does not require form templates to be established in advance, is applicable to a wider range of application fields, and consumes few resources.
Description
Technical field
The present invention relates to image recognition methods, and in particular to a photo table recognition method based on OCR technology.
Background art
In the field of image processing, considerable progress has been made in research on recognizing documents that contain tables. Before a table can be recognized, layout analysis must first be performed on the document to extract the tables in it; each table is then located, and finally the text in the table is recognized according to the positioning result. For object detection and localization, common techniques include edge detection (Canny edge detection) and detectors such as R-CNN, Faster R-CNN, YOLO and SSD. For OCR (optical character recognition), the main techniques include text classification based on supervised learning, CNNs (convolutional neural networks) and CRNN+CTC. At present, tables are located mainly by the following methods:
(1) Table locating method based on rule templates:
This method collects tables of many kinds and extracts a different rule template from each category of table. When a new table is parsed, it is first assigned to a category and then parsed using the rule template of that category.
(2) Learning-based locating method:
This method locates tables by machine learning. It is divided into a training process and a prediction process. In the training stage, a data set is first built, and a machine learning model then learns a fixed pattern from this training set for use in the prediction stage. In the prediction stage, the new table to be parsed is fed into the model, which parses the table automatically and outputs the recognition result.
Defects of the current schemes:
(1) Table locating method based on rule templates:
The idea of this method is enumeration. If a new table does not belong to any existing category, locating fails. In addition, as the number of table categories grows, system efficiency gradually decreases.
(2) Learning-based locating method:
This method is a strategy based on machine learning. One of the biggest drawbacks of machine learning is that it cannot be 100% correct. Some application scenarios place very high demands on the recognition result, and even a 0.1% error rate can cause heavy losses, so a system based on this strategy is clearly not an ideal choice. Moreover, the system must be trained in advance and must perform feature extraction, both of which consume resources. Finally, if the sample set lacks representativeness and generality, the system fails to learn certain patterns, and the model ultimately fails to locate certain tables.
Summary of the invention
The present invention provides a photo table recognition method based on OCR technology that does not require form templates to be established in advance and can reach 100% accuracy.
The photo table recognition method based on OCR technology of the present invention comprises:
A. performing row segmentation on the picture content by OCR (optical character recognition) technology and judging whether the picture content contains a table; if so, continuing, otherwise ending;
B. performing the dilation and erosion operations of the OCR technology on the row-segment images formed by the row segmentation to obtain the row lines and column lines of the table, and calculating the intersection coordinates of the row lines and column lines;
C. cutting the picture according to the intersection coordinates to obtain a cell set;
D. iterating over the cells in the cell set and performing row segmentation on each cell image to obtain the text-line images in each cell;
E. recognizing the character text in all text-line images by OCR technology, and combining the corresponding character text into complete structured text according to the position of each text line.
Compared with existing methods, the invention is more flexible: when a new picture is input, no image segmentation expert is needed to establish corresponding table recognition rules in advance, which greatly reduces cost. According to testing, the recognition method of the invention can achieve 100% table recognition accuracy, is applicable to high-precision application fields, and consumes few resources.
Specifically, step A includes:
A1. projecting the picture onto the vertical axis pixel row by pixel row, forming a projection bar corresponding to each pixel row;
A2. cutting, by OCR technology, at the pixel rows whose projection bar has zero length, forming at least one row-segment image;
A3. iterating over all row-segment images and judging whether each row-segment image contains a table.
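The projection and cutting of steps A1 and A2 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the names `row_projection` and `split_rows` are hypothetical, the image is assumed to be already binarized as a list of rows with background pixels equal to 1 and object pixels equal to 0 (the convention stated later for step B3), and the length of a projection bar is assumed to be the number of object pixels in that pixel row.

```python
from typing import List, Tuple

Image = List[List[int]]  # binarized picture: 1 = background, 0 = object (assumed)


def row_projection(img: Image) -> List[int]:
    """Projection bar length of each pixel row: its number of object pixels."""
    return [sum(1 for px in row if px == 0) for row in img]


def split_rows(img: Image) -> List[Tuple[int, int]]:
    """Return (start, end) pixel-row ranges, end exclusive, cut at empty rows."""
    segments = []
    start = None
    for y, length in enumerate(row_projection(img)):
        if length > 0 and start is None:
            start = y                      # a new row segment begins
        elif length == 0 and start is not None:
            segments.append((start, y))    # an empty row closes the segment
            start = None
    if start is not None:                  # segment touching the bottom edge
        segments.append((start, len(img)))
    return segments


page = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
print(split_rows(page))  # [(1, 3), (4, 5)]
```

Each returned range can then be sliced out of the picture (`page[start:end]`) to give one row-segment image.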
Further, step A3 includes: iterating over all row-segment images and performing the erosion operation of the OCR technology on each of them; judging from the result of the erosion operation whether the current row-segment image contains a rectangular frame; if it does, the picture contains a table; if none of the row-segment images contains a rectangular frame, the picture does not contain a table.
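The text does not spell out how a rectangular frame is detected in the eroded image. As a rough stand-in, the sketch below uses a run-length heuristic rather than erosion: a row-segment image is taken to contain a frame if it has at least two long horizontal runs and two long vertical runs of object pixels. The function names, the 2/3 length threshold and the "two plus two" rule are all assumptions made for illustration.

```python
from typing import List

Image = List[List[int]]  # binarized picture: 1 = background, 0 = object (assumed)


def longest_run(values: List[int]) -> int:
    """Length of the longest run of consecutive object pixels (value 0)."""
    best = cur = 0
    for v in values:
        cur = cur + 1 if v == 0 else 0
        best = max(best, cur)
    return best


def count_lines(runs: List[int], min_len: int) -> int:
    """Count groups of adjacent scanlines whose longest run reaches min_len."""
    count, inside = 0, False
    for r in runs:
        hit = r >= min_len
        if hit and not inside:
            count += 1
        inside = hit
    return count


def has_rectangle(img: Image) -> bool:
    """Heuristic: at least two long horizontal and two long vertical lines."""
    h, w = len(img), len(img[0])
    row_runs = [longest_run(row) for row in img]
    col_runs = [longest_run([img[y][x] for y in range(h)]) for x in range(w)]
    horizontal = count_lines(row_runs, max(1, (2 * w) // 3))
    vertical = count_lines(col_runs, max(1, (2 * h) // 3))
    return horizontal >= 2 and vertical >= 2


# A 6 x 6 frame drawn with object pixels, and a segment holding only "text".
frame = [[0 if y in (0, 5) or x in (0, 5) else 1 for x in range(6)] for y in range(6)]
text_only = [[1, 0, 1, 0, 1, 1], [1, 1, 1, 1, 1, 1]]
print(has_rectangle(frame), has_rectangle(text_only))  # True False
```

A picture would then be treated as containing a table as soon as any of its row-segment images passes this check.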
Specifically, step B includes:
B1. sliding windows of size 1 x (2/3)w and (2/3)h x 1 over the row-segment image, and performing a dilation operation by OCR technology on the part of the row-segment image covered by the window, the purpose being to strengthen blurred table borders for the subsequent analysis, where w is the width of the picture and h is the height of the picture;
B2. sliding windows of size 1 x (2/3)w and (2/3)h x 1 over the row-segment image, and performing an erosion operation on the row-segment image that has already been dilated, which eliminates the text inside the table and yields a row-segment image containing only row lines and column lines.
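Steps B1 and B2 can be illustrated with plain maximum and minimum filters over a sliding window. The sketch below is written under stated assumptions: the image is binarized with background 1 and object 0, "dilation" is read as a maximum filter and "erosion" as a minimum filter on those values, and the window length is rounded down to an odd number so the window is centred. Under that reading, the maximum filter is the step that wipes out short strokes such as text, and the minimum filter restores the surviving lines to their full length. All function names are hypothetical; in practice a library such as OpenCV (`cv2.dilate`, `cv2.erode`) would normally be used instead of these loops.

```python
from typing import Callable, List

Image = List[List[int]]  # binarized picture: 1 = background, 0 = object (assumed)


def odd_window(n: int) -> int:
    """Window length of about 2/3 of n, rounded down to an odd number."""
    k = max(1, (2 * n) // 3)
    return k if k % 2 else k - 1


def filter_rows(img: Image, k: int, op: Callable) -> Image:
    """Slide a 1 x k window along each row and apply op (max or min)."""
    half = k // 2
    out = []
    for row in img:
        padded = [1] * half + row + [1] * half  # pad with background
        out.append([op(padded[x:x + k]) for x in range(len(row))])
    return out


def transpose(img: Image) -> Image:
    return [list(col) for col in zip(*img)]


def extract_lines(img: Image, k: int) -> Image:
    """Dilation (max filter) followed by erosion (min filter), window 1 x k."""
    return filter_rows(filter_rows(img, k, max), k, min)


def row_and_column_lines(img: Image):
    h, w = len(img), len(img[0])
    row_lines = extract_lines(img, odd_window(w))
    # Vertical windows are applied by transposing, filtering, transposing back.
    col_lines = transpose(extract_lines(transpose(img), odd_window(h)))
    return row_lines, col_lines


sample = [
    [1, 0, 0, 1, 0, 1, 1, 1, 1],  # two "text" pixels plus the vertical line
    [1, 1, 1, 1, 0, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 0, 0, 0, 1],  # horizontal line
    [1, 1, 1, 1, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1, 1, 1, 1],
]
row_lines, col_lines = row_and_column_lines(sample)
for row in row_lines:
    print(row)
```

In the printed result only the long horizontal line in the third row remains as object pixels; `col_lines` likewise keeps only the vertical line in the fifth column.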
Specifically, calculating the intersection coordinates of the row lines and column lines in step B includes:
B3. using the row lines and column lines obtained for the table, adding the row-line image to the column-line image and finding every all-zero region, that is, every region containing only object pixels and no background pixels (after binarization, the background pixel value in the image is "1" and the object pixel value is "0");
B4. calculating the centre coordinate of each all-zero region, each centre coordinate being the intersection coordinate of one row line and one column line.
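Steps B3 and B4 amount to adding two binary images, labelling the connected regions where the sum is zero, and averaging the pixel positions of each region. A minimal sketch follows; the function name `intersections`, the use of 4-connectivity and the (x, y) ordering of the returned centres are assumptions made for illustration.

```python
from collections import deque
from typing import List, Tuple

Image = List[List[int]]  # binarized picture: 1 = background, 0 = object


def intersections(row_lines: Image, col_lines: Image) -> List[Tuple[float, float]]:
    """Centres (x, y) of the regions where both line images are object pixels."""
    h, w = len(row_lines), len(row_lines[0])
    # The sum is 0 only where a row line and a column line overlap.
    total = [[row_lines[y][x] + col_lines[y][x] for x in range(w)] for y in range(h)]
    seen = [[False] * w for _ in range(h)]
    centres = []
    for y in range(h):
        for x in range(w):
            if total[y][x] != 0 or seen[y][x]:
                continue
            # Breadth-first search over one 4-connected all-zero region.
            queue = deque([(y, x)])
            seen[y][x] = True
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if 0 <= ny < h and 0 <= nx < w and total[ny][nx] == 0 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            mean_y = sum(p[0] for p in pixels) / len(pixels)
            mean_x = sum(p[1] for p in pixels) / len(pixels)
            centres.append((mean_x, mean_y))
    return centres


# Two-pixel-thick lines on a 6 x 6 image: rows 0-1 and 4-5, columns 0-1 and 4-5.
thick = (0, 1, 4, 5)
row_img = [[0] * 6 if y in thick else [1] * 6 for y in range(6)]
col_img = [[0 if x in thick else 1 for x in range(6)] for _ in range(6)]
print(intersections(row_img, col_img))
# [(0.5, 0.5), (4.5, 0.5), (0.5, 4.5), (4.5, 4.5)]
```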
Specifically, step C includes:
C1. aligning the obtained intersection coordinates by abscissa and by ordinate, so that all intersections in the same row but different columns share the same ordinate, and all intersections in the same column but different rows share the same abscissa;
C2. cutting the picture according to each group of four adjacent coordinates to obtain cell images, all of the cell images together forming the cell set.
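One way to realise the alignment of C1 and the cutting of C2 is to group nearly equal coordinates, replace each group by its mean, and then take every pair of adjacent column and row coordinates as one cell. The sketch below assumes a regular grid with no merged cells; the names `snap`, `cell_boxes` and `crop` and the pixel tolerance `tol` are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def snap(values: Sequence[float], tol: float = 3) -> List[int]:
    """Group coordinates lying within tol of their neighbour; return group means."""
    groups: List[List[float]] = []
    for v in sorted(values):
        if groups and v - groups[-1][-1] <= tol:
            groups[-1].append(v)
        else:
            groups.append([v])
    return [round(sum(g) / len(g)) for g in groups]


def cell_boxes(points: Sequence[Tuple[float, float]], tol: float = 3) -> List[Box]:
    """Build one box per cell from aligned intersection coordinates (x, y)."""
    xs = snap([p[0] for p in points], tol)
    ys = snap([p[1] for p in points], tol)
    boxes = []
    for top, bottom in zip(ys, ys[1:]):
        for left, right in zip(xs, xs[1:]):
            boxes.append((left, top, right, bottom))
    return boxes


def crop(img: List[List[int]], box: Box) -> List[List[int]]:
    """Cut one cell image out of the picture."""
    left, top, right, bottom = box
    return [row[left:right] for row in img[top:bottom]]


# Slightly jittered intersections of a grid with 3 row lines and 3 column lines.
points = [
    (10, 4), (50, 5), (100, 6),
    (11, 39), (49, 40), (101, 41),
    (12, 79), (51, 80), (99, 81),
]
print(cell_boxes(points))
# [(11, 5, 50, 40), (50, 5, 100, 40), (11, 40, 50, 80), (50, 40, 100, 80)]
```

Tables with merged cells would need the cutting to check which intersections actually exist, which this sketch does not attempt.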
Further, step E includes:
E1. recognizing the character text in all text-line images by OCR technology;
E2. concatenating the corresponding character text into character strings according to the position of each text line;
E3. restoring the table structure according to the segmentation process of steps A to D, and filling each character string into the corresponding cell of the table;
E4. saving the structured table text, for example in JSON format or XML format.
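Steps E2 to E4 can be sketched as follows, assuming the character recognition of E1 has already produced a string for every text-line image. The input layout (a dictionary per cell with `row`, `col` and a list of `(y, text)` lines), the function name `build_table` and the separator `sep` used between text lines are illustrative assumptions, not details taken from the text.

```python
import json
from typing import Dict, List


def build_table(cells: List[Dict], n_rows: int, n_cols: int, sep: str = "") -> List[List[str]]:
    """Fill recognized strings into a row-by-column grid (steps E2 and E3)."""
    table = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for cell in cells:
        # Order the text lines of the cell from top to bottom, then join them.
        lines = sorted(cell["lines"], key=lambda item: item[0])
        table[cell["row"]][cell["col"]] = sep.join(text for _, text in lines)
    return table


# Hypothetical recognition results: each line is (vertical offset, text).
recognized = [
    {"row": 0, "col": 0, "lines": [(3, "Name")]},
    {"row": 0, "col": 1, "lines": [(3, "Remark")]},
    {"row": 1, "col": 0, "lines": [(2, "Item A")]},
    {"row": 1, "col": 1, "lines": [(20, "line two"), (2, "line one, ")]},
]
table = build_table(recognized, n_rows=2, n_cols=2)

# Step E4: save the structured table text, here as JSON.
payload = json.dumps({"rows": table}, ensure_ascii=False, indent=2)
print(payload)
```

Writing `payload` to a file completes the save; an XML serializer could be substituted in the same place.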
The photo table recognition method based on OCR technology of the present invention does not require form templates to be established in advance, adapts to a wider range of scenarios, significantly reduces the work of preparing form templates, and can reach 100% accuracy, greatly improving the accuracy of table recognition.
The above content of the invention is described in further detail below with reference to specific embodiments. This should not be interpreted as limiting the scope of the above subject matter of the invention to the following examples. Any substitution or modification made according to ordinary technical knowledge and customary means in the art, without departing from the above technical idea of the invention, shall fall within the scope of the invention.
Brief description of the drawings
Fig. 1 is a flow chart of the photo table recognition method based on OCR technology of the present invention.
Fig. 2 is the binarized image of an original picture.
Fig. 3 is the projection of Fig. 2 onto the vertical axis.
Fig. 4 is an original picture containing a table and table text.
Fig. 5 is the row-segment image of Fig. 4 after the erosion operation, containing only row lines and column lines.
Fig. 6 is an original picture containing a table.
Fig. 7 is the table segmented from Fig. 6.
Fig. 8 is one cell segmented from the table of Fig. 7.
Fig. 9 is the first line of text segmented from the cell of Fig. 8.
Detailed description of the embodiments
As shown in Fig. 1, the photo table recognition method based on OCR technology of the present invention comprises:
A. performing row segmentation on the picture content by OCR (optical character recognition) technology and judging whether the picture content contains a table; if so, continuing, otherwise ending. Specifically:
A1. as shown in Figs. 2 and 3, projecting the picture onto the vertical axis pixel row by pixel row, forming a projection bar corresponding to each pixel row.
A2. cutting, by OCR technology, at the pixel rows whose projection bar has zero length, forming several row-segment images.
A3. iterating over all row-segment images and performing the erosion operation of the OCR technology on each of them; judging from the result of the erosion operation whether the current row-segment image contains a rectangular frame; if it does, the picture contains a table; if none of the row-segment images contains a rectangular frame, the picture does not contain a table.
B. performing the dilation and erosion operations of the OCR technology on the row-segment images formed by the row segmentation to obtain the row lines and column lines of the table, and calculating the intersection coordinates of the row lines and column lines. Specifically:
B1. as shown in Fig. 4, sliding windows of size 1 x (2/3)w and (2/3)h x 1 over the row-segment image, and performing a dilation operation by OCR technology on the part of the row-segment image covered by the window, the purpose being to strengthen blurred table borders for the subsequent analysis, where w is the width of the picture and h is the height of the picture;
B2. sliding windows of size 1 x (2/3)w and (2/3)h x 1 over the row-segment image, and performing an erosion operation on the row-segment image that has already been dilated, which eliminates the text inside the table and yields the row-segment image shown in Fig. 5, containing only row lines and column lines.
B3. using the row lines and column lines obtained for the table, adding the row-line image to the column-line image and finding every all-zero region, that is, every region containing only object pixels and no background pixels (after binarization, the background pixel value in the image is "1" and the object pixel value is "0");
B4. calculating the centre coordinate of each all-zero region, each centre coordinate being the intersection coordinate of one row line and one column line.
C. cutting the picture according to the intersection coordinates to obtain a cell set. Specifically:
C1. aligning the obtained intersection coordinates by abscissa and by ordinate, so that all intersections in the same row but different columns share the same ordinate, and all intersections in the same column but different rows share the same abscissa;
C2. cutting the picture according to each group of four adjacent coordinates to obtain cell images, all of the cell images together forming the cell set.
D. following the same principle as step A, iterating over the cells in the cell set and performing row segmentation on each cell image to obtain the text-line images in each cell;
Figs. 6 to 9 respectively show the original picture, the table segmented from it, one cell segmented from the table, and the first line of text segmented from that cell.
E. recognizing the character text in the picture and combining it into structured text. Specifically:
E1. recognizing the character text in all text-line images by OCR technology;
E2. concatenating the corresponding character text into character strings according to the position of each text line;
E3. restoring the table structure according to the segmentation process of steps A to D, and filling each character string into the corresponding cell of the table;
E4. saving the structured table text, for example in JSON format or XML format.
According to testing, the recognition method of the invention can achieve 100% table recognition accuracy, is applicable to high-precision application fields, and consumes few resources.
Claims (7)
1. A photo table recognition method based on OCR technology, characterized by comprising:
A. performing row segmentation on the picture content by OCR technology and judging whether the picture content contains a table; if so, continuing, otherwise ending;
B. performing the dilation and erosion operations of the OCR technology on the row-segment images formed by the row segmentation to obtain the row lines and column lines of the table, and calculating the intersection coordinates of the row lines and column lines;
C. cutting the picture according to the intersection coordinates to obtain a cell set;
D. iterating over the cells in the cell set and performing row segmentation on each cell image to obtain the text-line images in each cell;
E. recognizing the character text in all text-line images by OCR technology, and combining the corresponding character text into complete structured text according to the position of each text line.
2. The photo table recognition method based on OCR technology according to claim 1, characterized in that step A includes:
A1. projecting the picture onto the vertical axis pixel row by pixel row, forming a projection bar corresponding to each pixel row;
A2. cutting, by OCR technology, at the pixel rows whose projection bar has zero length, forming at least one row-segment image;
A3. iterating over all row-segment images and judging whether each row-segment image contains a table.
3. The photo table recognition method based on OCR technology according to claim 2, characterized in that step A3 includes: iterating over all row-segment images and performing the erosion operation of the OCR technology on each of them; judging from the result of the erosion operation whether the current row-segment image contains a rectangular frame; if it does, the picture contains a table; if none of the row-segment images contains a rectangular frame, the picture does not contain a table.
4. The photo table recognition method based on OCR technology according to claim 1, characterized in that step B includes:
B1. sliding windows of size 1 x (2/3)w and (2/3)h x 1 over the row-segment image, and performing a dilation operation by OCR technology on the part of the row-segment image covered by the window, where w is the width of the picture and h is the height of the picture;
B2. sliding windows of size 1 x (2/3)w and (2/3)h x 1 over the row-segment image, and performing an erosion operation on the row-segment image that has already been dilated, which eliminates the text inside the table and yields a row-segment image containing only row lines and column lines.
5. The photo table recognition method based on OCR technology according to claim 1, characterized in that calculating the intersection coordinates of the row lines and column lines in step B includes:
B3. using the row lines and column lines obtained for the table, adding the row-line image to the column-line image and finding every all-zero region, that is, every region containing only object pixels and no background pixels;
B4. calculating the centre coordinate of each all-zero region, each centre coordinate being the intersection coordinate of one row line and one column line.
6. The photo table recognition method based on OCR technology according to claim 1, characterized in that step C includes:
C1. aligning the obtained intersection coordinates by abscissa and by ordinate, so that all intersections in the same row but different columns share the same ordinate, and all intersections in the same column but different rows share the same abscissa;
C2. cutting the picture according to each group of four adjacent coordinates to obtain cell images, all of the cell images together forming the cell set.
7. The photo table recognition method based on OCR technology according to claim 1, characterized in that step E includes:
E1. recognizing the character text in all text-line images by OCR technology;
E2. concatenating the corresponding character text into character strings according to the position of each text line;
E3. restoring the table structure according to the segmentation process of steps A to D, and filling each character string into the corresponding cell of the table;
E4. saving the structured table text.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910558402.6A CN110263739A (en) | 2019-06-26 | 2019-06-26 | Photo table recognition methods based on OCR technique |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910558402.6A CN110263739A (en) | 2019-06-26 | 2019-06-26 | Photo table recognition methods based on OCR technique |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110263739A true CN110263739A (en) | 2019-09-20 |
Family
ID=67921629
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910558402.6A Pending CN110263739A (en) | 2019-06-26 | 2019-06-26 | Photo table recognition methods based on OCR technique |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110263739A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110956087A (en) * | 2019-10-25 | 2020-04-03 | 天津幸福生命科技有限公司 | Method and device for identifying table in picture, readable medium and electronic equipment |
CN111223109A (en) * | 2020-01-03 | 2020-06-02 | 四川新网银行股份有限公司 | Complex form image analysis method |
CN112528832A (en) * | 2020-12-07 | 2021-03-19 | 国网青海省电力公司电力科学研究院 | Method and system for processing PDF-format relay protection fixed value list |
CN113989822A (en) * | 2021-12-24 | 2022-01-28 | 中奥智能工业研究院(南京)有限公司 | Picture table content extraction method based on computer vision and natural language processing |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103258198A (en) * | 2013-04-26 | 2013-08-21 | 四川大学 | Extraction method for characters in form document image |
CN107315989A (en) * | 2017-05-03 | 2017-11-03 | 天方创新(北京)信息技术有限公司 | For the text recognition method and device of medical information picture |
CN109614923A (en) * | 2018-12-07 | 2019-04-12 | 上海智臻智能网络科技股份有限公司 | The recognition methods of OCR document and its device |
CN109685052A (en) * | 2018-12-06 | 2019-04-26 | 泰康保险集团股份有限公司 | Method for processing text images, device, electronic equipment and computer-readable medium |
Non-Patent Citations (1)
Title |
---|
柴功博 等: "基于视窗的航图导航数据提取技术研究", 《民航学报》 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110956087A (en) * | 2019-10-25 | 2020-04-03 | 天津幸福生命科技有限公司 | Method and device for identifying table in picture, readable medium and electronic equipment |
CN110956087B (en) * | 2019-10-25 | 2024-04-19 | 北京懿医云科技有限公司 | Method and device for identifying table in picture, readable medium and electronic equipment |
CN111223109A (en) * | 2020-01-03 | 2020-06-02 | 四川新网银行股份有限公司 | Complex form image analysis method |
CN111223109B (en) * | 2020-01-03 | 2023-06-06 | 四川新网银行股份有限公司 | Complex form image analysis method |
CN112528832A (en) * | 2020-12-07 | 2021-03-19 | 国网青海省电力公司电力科学研究院 | Method and system for processing PDF-format relay protection fixed value list |
CN113989822A (en) * | 2021-12-24 | 2022-01-28 | 中奥智能工业研究院(南京)有限公司 | Picture table content extraction method based on computer vision and natural language processing |
CN113989822B (en) * | 2021-12-24 | 2022-03-08 | 中奥智能工业研究院(南京)有限公司 | Picture table content extraction method based on computer vision and natural language processing |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110263739A (en) | Photo table recognition methods based on OCR technique | |
CN109308476B (en) | Billing information processing method, system and computer readable storage medium | |
US10339428B2 (en) | Intelligent scoring method and system for text objective question | |
CN110363252B (en) | End-to-end trend scene character detection and identification method and system | |
US10817741B2 (en) | Word segmentation system, method and device | |
CN111402209B (en) | U-Net-based high-speed railway steel rail damage detection method | |
CN105426856A (en) | Image table character identification method | |
CN107633239A (en) | Bill classification and bill field extracting method based on deep learning and OCR | |
CN109858372A (en) | A kind of lane class precision automatic Pilot structured data analysis method | |
CN103439348B (en) | Remote controller key defect detection method based on difference image method | |
CN111242024A (en) | Method and system for recognizing legends and characters in drawings based on machine learning | |
CN104820835A (en) | Automatic examination paper marking method for examination papers | |
CN109544564A (en) | A kind of medical image segmentation method | |
CN106934455B (en) | Remote sensing image optics adapter structure choosing method and system based on CNN | |
CN101777060A (en) | Automatic evaluation method and system of webpage visual quality | |
CN108509988B (en) | Test paper score automatic statistical method and device, electronic equipment and storage medium | |
CN102750534B (en) | A kind of method and apparatus of character cutting | |
CN102024138B (en) | Character identification method and character identification device | |
CN107818321A (en) | A kind of watermark date recognition method for vehicle annual test | |
CN106203296B (en) | The video actions recognition methods of one attribute auxiliary | |
CN112883926B (en) | Identification method and device for form medical images | |
CN113537227A (en) | Structured text recognition method and system | |
CN107622271A (en) | Handwriting text lines extracting method and system | |
CN107784321A (en) | Numeral paints this method for quickly identifying, system and computer-readable recording medium | |
CN110334709A (en) | Detection method of license plate based on end-to-end multitask deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190920 |