CN110263739A - Photo table recognition methods based on OCR technique - Google Patents

Photo table recognition methods based on OCR technique

Info

Publication number
CN110263739A
Authority
CN
China
Prior art keywords
text
line
ocr technique
row cutting
cutting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910558402.6A
Other languages
Chinese (zh)
Inventor
吴信朝
李开宇
翟恩荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan XW Bank Co Ltd
Original Assignee
Sichuan XW Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan XW Bank Co Ltd
Priority to CN201910558402.6A
Publication of CN110263739A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/63 Scene text, e.g. street names
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/412 Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Character Discrimination (AREA)
  • Character Input (AREA)

Abstract

The present invention relates to a method for recognizing tables in pictures based on OCR technology, comprising: A. performing row segmentation on the picture content by OCR technology and judging whether the picture contains a table; if so, continuing, otherwise ending; B. applying the dilation and erosion operations of OCR technology to the row-segmented images to obtain the row lines and column lines of the table, and calculating the coordinates of the intersections of the row lines and column lines; C. cutting the image according to the intersection coordinates to obtain a set of cells; D. iterating over the cells in the set and performing row segmentation on each cell image to obtain the text-line images in each cell; E. recognizing the character text in all text-line images by OCR technology and combining the recognized text into complete structured text according to the position of each text line. The invention can achieve 100% table recognition accuracy, does not require table templates to be established in advance, is applicable to a wider range of fields, and consumes few resources.

Description

Photo table recognition methods based on OCR technique
Technical field
The present invention relates to image recognition methods, and in particular to a method for recognizing tables in pictures based on OCR technology.
Background art
In the field of image processing, great progress has been made in research on recognizing documents that contain tables. Before a table can be recognized, layout analysis must first be performed on the document to extract the table; the table is then located, and finally the text in the table is recognized according to the localization result. For object detection and localization, common techniques include edge detection (Canny edge detection) and R-CNN, Faster R-CNN, YOLO and SSD. For OCR (optical character recognition), the main techniques include text classification based on supervised learning, CNNs (convolutional neural networks) and CRNN+CTC. At present, tables are located mainly by the following methods:
(1) Table localization based on rule templates:
This method collects a wide variety of tables and extracts a different rule template from each category of table. When a new table is parsed, it is first assigned to a category and then parsed with that category's rule template.
(2) Learning-based localization:
This method locates tables by machine learning and is divided into a training stage and a prediction stage. In the training stage, a data set is first constructed, and a machine learning model then learns a fixed pattern from the training set for use in the prediction stage. In the prediction stage, the new table to be parsed is fed into the model, which parses the table automatically and outputs the recognition result.
Defects of the current schemes:
(1) Table localization based on rule templates:
The idea of this method is enumeration. If a new table does not fall into an existing category, localization fails. Moreover, as the number of table categories grows, system efficiency gradually decreases.
(2) Learning-based localization:
This method is a machine-learning strategy. One of the biggest drawbacks of machine learning is that it cannot be 100% correct. Some application scenarios place very high demands on the recognition result, and an error rate of even 0.1% can cause great losses, so a system based on this strategy is clearly not an ideal choice. In addition, the system must be trained in advance and must perform feature extraction, both of which consume resources. Finally, if the sample set lacks representativeness and generality, the system fails to learn certain patterns, and the model eventually fails to locate certain tables.
Summary of the invention
The present invention provides a method for recognizing tables in pictures based on OCR technology that does not require table templates to be established in advance and can reach 100% accuracy.
The OCR-based method of the invention for recognizing tables in pictures comprises:
A. performing row segmentation on the picture content by OCR (optical character recognition) technology and judging whether the picture contains a table; if so, continuing, otherwise ending;
B. applying the dilation and erosion operations of OCR technology to the row-segmented images formed by the row segmentation to obtain the row lines and column lines of the table, and calculating the coordinates of the intersections of the row lines and column lines;
C. cutting the image according to the intersection coordinates to obtain a set of cells;
D. iterating over the cells in the set and performing row segmentation on each cell image to obtain the text-line images in each cell;
E. recognizing the character text in all text-line images by OCR technology and combining the recognized text into complete structured text according to the position of each text line.
Compared with existing methods, the invention is more flexible: when a new picture is input, no image-segmentation expert needs to formulate corresponding table recognition rules in advance, which greatly reduces cost. In testing, the recognition method of the invention achieved 100% table recognition accuracy; it is suitable for high-precision applications and consumes few resources.
Specifically, step A includes:
A1. projecting the picture onto the vertical axis pixel row by pixel row, forming a projection bar corresponding to each pixel row;
A2. cutting the picture, by OCR technology, at the pixel rows whose projection bar has zero length, forming at least one row-segmented image;
A3. iterating over all row-segmented images and judging whether each row-segmented image contains a table.
Further, step A3 includes: iterating over all row-segmented images and applying the erosion operation of OCR technology to each one; judging from the result of the erosion whether the current row-segmented image contains a rectangular frame; if it does, the picture contains a table; if none of the row-segmented images contains a rectangular frame, the picture does not contain a table.
Specifically, step B includes:
B1. sliding windows of size "1 × 2/3 w" and "2/3 h × 1" over the row-segmented image and applying, by OCR technology, the dilation operation to the part of the image covered by the window, in order to strengthen blurred table borders and facilitate later analysis, where w is the width of the picture and h is the height of the picture;
B2. sliding small windows of size "1 × 2/3 w" and "2/3 h × 1" over the row-segmented image and applying the erosion operation to the dilated image, which removes the text inside the table and yields a row-segmented image containing only row lines and column lines.
Specifically, calculating the intersection coordinates of the row lines and column lines in step B includes:
B3. adding the row-line image and the column-line image obtained for the table, and finding every all-zero region, that is, every region that contains only object pixels and no background pixels (after binarization, background pixels have the value "1" and object pixels have the value "0");
B4. calculating the center coordinate of each all-zero region; each center coordinate corresponds to the intersection of one row line and one column line.
Specifically, step C includes:
C1. aligning the obtained intersection coordinates by abscissa and by ordinate, so that all points in the same row but different columns share the same ordinate, and all points in the same column but different rows share the same abscissa;
C2. cutting the picture according to each group of four adjacent coordinates to obtain cell images; all cell images together form the cell set.
Further, step E includes:
E1. recognizing the character text in all text-line images by OCR technology;
E2. joining the recognized character text into character strings according to the position of each text line;
E3. restoring the table structure according to the segmentation process of steps A to D and filling each character string into the corresponding cell of the table;
E4. saving the table-structured text, for example in JSON or XML format.
The OCR-based method of the invention for recognizing tables in pictures does not require table templates to be established in advance, adapts to a wider range of scenarios, significantly reduces the workload of preparing table templates, and can reach 100% accuracy, greatly improving the accuracy of table recognition.
The above content of the invention is described in further detail below with reference to specific embodiments. This should not be construed as limiting the scope of the above subject matter of the invention to the following examples. Any replacement or modification made according to ordinary technical knowledge and customary means in the art, without departing from the above technical idea of the invention, shall fall within the scope of the invention.
Brief description of the drawings
Fig. 1 is a flow chart of the OCR-based method of the invention for recognizing tables in pictures.
Fig. 2 is the binary image of an original picture.
Fig. 3 is the projection of Fig. 2 onto the vertical axis.
Fig. 4 is an original picture containing a table and table text.
Fig. 5 is the row-segmented image of Fig. 4 after the erosion operation, containing only row lines and column lines.
Fig. 6 is an original picture containing a table.
Fig. 7 is the table cut out of Fig. 6.
Fig. 8 is one cell cut out of the table in Fig. 7.
Fig. 9 is the first line of text cut out of the cell in Fig. 8.
Detailed description of the embodiments
As shown in Fig. 1, the OCR-based method of the invention for recognizing tables in pictures comprises:
A. Row segmentation is performed on the picture content by OCR (optical character recognition) technology, and it is judged whether the picture contains a table; if so, the method continues, otherwise it ends. Specifically:
A1. As shown in Fig. 2 and Fig. 3, the picture is projected onto the vertical axis pixel row by pixel row, forming a projection bar corresponding to each pixel row.
A2. The picture is cut, by OCR technology, at the pixel rows whose projection bar has zero length, forming several row-segmented images.
A3. All row-segmented images are iterated over, and the erosion operation of OCR technology is applied to each one. Whether the current row-segmented image contains a rectangular frame is judged from the result of the erosion; if it does, the picture contains a table; if none of the row-segmented images contains a rectangular frame, the picture does not contain a table.
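Steps A1 and A2 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: it assumes the picture has already been binarized with the convention given in step B3 (background 1, object 0), represents it as plain Python lists, and takes the length of a projection bar to be the number of object pixels in that pixel row. The function names are hypothetical.

```python
def horizontal_projection(binary):
    """Count the object pixels (value 0) in every pixel row."""
    return [sum(1 for px in row if px == 0) for row in binary]


def split_rows(binary):
    """Return (top, bottom) row ranges, bottom exclusive, separated by empty rows."""
    segments, start = [], None
    for y, count in enumerate(horizontal_projection(binary)):
        if count > 0 and start is None:
            start = y                      # a new segment begins
        elif count == 0 and start is not None:
            segments.append((start, y))    # a zero-length bar ends the segment
            start = None
    if start is not None:                  # segment touching the bottom edge
        segments.append((start, len(binary)))
    return segments


image = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
print(horizontal_projection(image))  # [0, 2, 1, 0, 4, 0]
print(split_rows(image))             # [(1, 3), (4, 5)]
```

Each returned range can then be cropped out as one row-segmented image; step D can reuse the same routine on a single cell image.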
B. The dilation and erosion operations of OCR technology are applied to the row-segmented images formed by the row segmentation to obtain the row lines and column lines of the table, and the coordinates of the intersections of the row lines and column lines are calculated. Specifically:
B1. As shown in Fig. 4, windows of size "1 × 2/3 w" and "2/3 h × 1" are slid over the row-segmented image, and the dilation operation is applied by OCR technology to the part of the image covered by the window, in order to strengthen blurred table borders and facilitate later analysis, where w is the width of the picture and h is the height of the picture;
B2. Small windows of size "1 × 2/3 w" and "2/3 h × 1" are slid over the row-segmented image, and the erosion operation is applied to the dilated image, which removes the text inside the table and yields the row-segmented image shown in Fig. 5, containing only row lines and column lines.
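Steps B1 and B2 can be sketched as follows, under stated assumptions. The image uses the convention of step B3 (background 1, object 0), so dilation is taken to be a sliding maximum and erosion a sliding minimum; under that reading, a dilation followed by an erosion with a 1 × k window keeps only horizontal runs of object pixels at least k long, which removes text and keeps long lines. The function names, the rule that pixels outside the image count as background, and the use of plain Python lists are choices made for this illustration and are not taken from the patent.

```python
def _filter_1d(seq, k, op, reflect):
    """Slide a k-pixel window over seq; pixels outside count as background (1)."""
    a = k // 2
    # The erosion uses the mirrored window so that both passes line up for even k.
    lo, hi = (-(k - 1 - a), a) if reflect else (-a, k - 1 - a)
    padded = [1] * k + list(seq) + [1] * k
    return [op(padded[i + k + lo:i + k + hi + 1]) for i in range(len(seq))]


def keep_long_runs(seq, k):
    """Dilation (max) then erosion (min): object runs shorter than k vanish."""
    dilated = _filter_1d(seq, k, max, reflect=False)
    return _filter_1d(dilated, k, min, reflect=True)


def extract_lines(binary):
    """Return (row_line_image, column_line_image) for a 0/1 image."""
    h, w = len(binary), len(binary[0])
    kw, kh = max(1, 2 * w // 3), max(1, 2 * h // 3)   # "1 x 2/3 w" and "2/3 h x 1"
    row_lines = [keep_long_runs(row, kw) for row in binary]
    columns = [keep_long_runs(col, kh) for col in zip(*binary)]
    col_lines = [list(r) for r in zip(*columns)]       # transpose back
    return row_lines, col_lines


# A 9 x 9 grid with lines at every fourth pixel and two "text" pixels in one cell.
size = 9
image = [[0 if (x % 4 == 0 or y % 4 == 0) else 1 for x in range(size)]
         for y in range(size)]
image[2][2] = image[2][3] = 0
row_lines, col_lines = extract_lines(image)
for row in row_lines:
    print("".join(str(px) for px in row))
```

In the printed row-line image only rows 0, 4 and 8 remain as object pixels; the text pixels and the column lines are gone. A production system would more likely call an image-processing library for these operations.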
B3. The row-line image and the column-line image obtained for the table are added together, and every all-zero region is found, that is, every region that contains only object pixels and no background pixels (after binarization, background pixels have the value "1" and object pixels have the value "0");
B4. The center coordinate of each all-zero region is calculated; each center coordinate corresponds to the intersection of one row line and one column line.
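Steps B3 and B4 can be sketched as follows. Because background is 1 and object is 0, the pixel-wise sum of the two line images is 0 only where a row line and a column line overlap. The sketch groups those pixels into regions with a 4-connected flood fill and takes the mean pixel position as the center; the connectivity choice, the function name and the list-based image format are assumptions of this example.

```python
def intersections(row_lines, col_lines):
    """Return the centre (x, y) of every region where both line images are 0."""
    h, w = len(row_lines), len(row_lines[0])
    total = [[row_lines[y][x] + col_lines[y][x] for x in range(w)]
             for y in range(h)]
    seen = [[False] * w for _ in range(h)]
    centres = []
    for y0 in range(h):
        for x0 in range(w):
            if total[y0][x0] != 0 or seen[y0][x0]:
                continue
            # Flood-fill one all-zero region.
            stack, pixels = [(x0, y0)], []
            seen[y0][x0] = True
            while stack:
                x, y = stack.pop()
                pixels.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    inside = 0 <= nx < w and 0 <= ny < h
                    if inside and total[ny][nx] == 0 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((nx, ny))
            cx = sum(p[0] for p in pixels) / len(pixels)
            cy = sum(p[1] for p in pixels) / len(pixels)
            centres.append((cx, cy))
    return centres


# Row lines on rows 1-2 and 5, column lines on columns 1-2 and 5 (7 x 7 image).
row_lines = [[0 if y in (1, 2, 5) else 1 for x in range(7)] for y in range(7)]
col_lines = [[0 if x in (1, 2, 5) else 1 for x in range(7)] for y in range(7)]
print(intersections(row_lines, col_lines))
# [(1.5, 1.5), (5.0, 1.5), (1.5, 5.0), (5.0, 5.0)]
```

Lines that are several pixels thick produce a small block of zero pixels at each crossing, which is why the center of the region, rather than any single pixel, is used as the intersection coordinate.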
C. The image is cut according to the intersection coordinates to obtain the cell set. Specifically:
C1. The obtained intersection coordinates are aligned by abscissa and by ordinate, so that all points in the same row but different columns share the same ordinate, and all points in the same column but different rows share the same abscissa;
C2. The picture is cut according to each group of four adjacent coordinates to obtain cell images; all cell images together form the cell set.
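Steps C1 and C2 can be sketched as follows. The patent does not say how the coordinates are aligned; this example assumes a simple tolerance-based grouping in which coordinates no more than `tol` pixels apart are snapped to their rounded mean. The names `snap`, `cell_boxes`, `crop` and the parameter `tol` are hypothetical, and merged cells are not handled.

```python
def snap(values, tol):
    """Map each value to the rounded mean of its cluster (gaps <= tol)."""
    if not values:
        return {}
    ordered = sorted(set(values))
    clusters = [[ordered[0]]]
    for v in ordered[1:]:
        if v - clusters[-1][-1] <= tol:
            clusters[-1].append(v)
        else:
            clusters.append([v])
    return {v: round(sum(c) / len(c)) for c in clusters for v in c}


def cell_boxes(points, tol=2):
    """Return {(row, col): (left, top, right, bottom)} from intersection points."""
    xmap = snap([p[0] for p in points], tol)
    ymap = snap([p[1] for p in points], tol)
    aligned = {(xmap[x], ymap[y]) for x, y in points}
    xs = sorted({x for x, _ in aligned})
    ys = sorted({y for _, y in aligned})
    boxes = {}
    for r in range(len(ys) - 1):
        for c in range(len(xs) - 1):
            corners = {(xs[c], ys[r]), (xs[c + 1], ys[r]),
                       (xs[c], ys[r + 1]), (xs[c + 1], ys[r + 1])}
            if corners <= aligned:         # all four corners were detected
                boxes[(r, c)] = (xs[c], ys[r], xs[c + 1], ys[r + 1])
    return boxes


def crop(image, box):
    """Cut one cell image out of a list-of-lists image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


# Slightly misaligned intersections of a table with one row and two columns.
points = [(0, 0), (10, 2), (20, 0), (2, 12), (10, 10), (22, 12)]
print(cell_boxes(points))
# {(0, 0): (1, 1, 10, 11), (0, 1): (10, 1, 21, 11)}
```

Keeping the (row, col) index with each box preserves the position information that step E3 needs when the table structure is restored.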
D. Following the same principle as step A, the cells in the cell set are iterated over, and row segmentation is performed on each cell image to obtain the text-line images in each cell;
Fig. 6 to Fig. 9 show, respectively, the original picture, the table cut out of it, one cell cut out of the table, and the first line of text cut out of that cell.
E. The character text in the picture is recognized and combined into structured text. Specifically:
E1. The character text in all text-line images is recognized by OCR technology;
E2. The recognized character text is joined into character strings according to the position of each text line;
E3. The table structure is restored according to the segmentation process of steps A to D, and each character string is filled into the corresponding cell of the table;
E4. The table-structured text is saved, for example in JSON or XML format.
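Steps E2 to E4 can be sketched as follows, assuming the character recognition of step E1 has already produced the text lines of every cell. The input layout (a dictionary keyed by row and column index), the function name `build_table`, the `sep` parameter and the JSON field name `rows` are all choices made for this illustration; the patent only states that the result is saved in a format such as JSON or XML.

```python
import json


def build_table(cell_lines, sep=""):
    """Join each cell's recognised lines and place them in a 2-D table.

    cell_lines maps (row, col) to the text lines of that cell, top to bottom,
    and is assumed to be non-empty.
    """
    n_rows = max(r for r, _ in cell_lines) + 1
    n_cols = max(c for _, c in cell_lines) + 1
    table = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for (r, c), lines in cell_lines.items():
        table[r][c] = sep.join(lines)      # E2: one string per cell
    return table                           # E3: restored table structure


recognised = {
    (0, 0): ["Name"], (0, 1): ["Amount"],
    (1, 0): ["Alice"], (1, 1): ["12", "34"],
}
table = build_table(recognised)
print(json.dumps({"rows": table}, ensure_ascii=False))   # E4
# {"rows": [["Name", "Amount"], ["Alice", "1234"]]}
```

Passing `ensure_ascii=False` keeps Chinese characters readable in the saved file instead of escaping them.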
In testing, the recognition method of the invention achieved 100% table recognition accuracy; it is suitable for high-precision applications and consumes few resources.

Claims (7)

1. A method for recognizing tables in pictures based on OCR technology, characterized by comprising:
A. performing row segmentation on the picture content by OCR technology and judging whether the picture contains a table; if so, continuing, otherwise ending;
B. applying the dilation and erosion operations of OCR technology to the row-segmented images formed by the row segmentation to obtain the row lines and column lines of the table, and calculating the coordinates of the intersections of the row lines and column lines;
C. cutting the image according to the intersection coordinates to obtain a set of cells;
D. iterating over the cells in the set and performing row segmentation on each cell image to obtain the text-line images in each cell;
E. recognizing the character text in all text-line images by OCR technology and combining the recognized text into complete structured text according to the position of each text line.
2. The method for recognizing tables in pictures based on OCR technology according to claim 1, characterized in that step A includes:
A1. projecting the picture onto the vertical axis pixel row by pixel row, forming a projection bar corresponding to each pixel row;
A2. cutting the picture, by OCR technology, at the pixel rows whose projection bar has zero length, forming at least one row-segmented image;
A3. iterating over all row-segmented images and judging whether each row-segmented image contains a table.
3. The method for recognizing tables in pictures based on OCR technology according to claim 2, characterized in that step A3 includes: iterating over all row-segmented images and applying the erosion operation of OCR technology to each one; judging from the result of the erosion whether the current row-segmented image contains a rectangular frame; if it does, the picture contains a table; if none of the row-segmented images contains a rectangular frame, the picture does not contain a table.
4. The method for recognizing tables in pictures based on OCR technology according to claim 1, characterized in that step B includes:
B1. sliding windows of size "1 × 2/3 w" and "2/3 h × 1" over the row-segmented image and applying, by OCR technology, the dilation operation to the part of the image covered by the window, where w is the width of the picture and h is the height of the picture;
B2. sliding small windows of size "1 × 2/3 w" and "2/3 h × 1" over the row-segmented image and applying the erosion operation to the dilated image, which removes the text inside the table and yields a row-segmented image containing only row lines and column lines.
5. The method for recognizing tables in pictures based on OCR technology according to claim 1, characterized in that calculating the intersection coordinates of the row lines and column lines in step B includes:
B3. adding the row-line image and the column-line image obtained for the table, and finding every all-zero region, that is, every region that contains only object pixels and no background pixels;
B4. calculating the center coordinate of each all-zero region; each center coordinate corresponds to the intersection of one row line and one column line.
6. The method for recognizing tables in pictures based on OCR technology according to claim 1, characterized in that step C includes:
C1. aligning the obtained intersection coordinates by abscissa and by ordinate, so that all points in the same row but different columns share the same ordinate, and all points in the same column but different rows share the same abscissa;
C2. cutting the picture according to each group of four adjacent coordinates to obtain cell images; all cell images together form the cell set.
7. The method for recognizing tables in pictures based on OCR technology according to claim 1, characterized in that step E includes:
E1. recognizing the character text in all text-line images by OCR technology;
E2. joining the recognized character text into character strings according to the position of each text line;
E3. restoring the table structure according to the segmentation process of steps A to D and filling each character string into the corresponding cell of the table;
E4. saving the table-structured text.
CN201910558402.6A 2019-06-26 2019-06-26 Photo table recognition methods based on OCR technique Pending CN110263739A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910558402.6A CN110263739A (en) 2019-06-26 2019-06-26 Photo table recognition methods based on OCR technique

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910558402.6A CN110263739A (en) 2019-06-26 2019-06-26 Photo table recognition methods based on OCR technique

Publications (1)

Publication Number Publication Date
CN110263739A true CN110263739A (en) 2019-09-20

Family

ID=67921629

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910558402.6A Pending CN110263739A (en) 2019-06-26 2019-06-26 Photo table recognition methods based on OCR technique

Country Status (1)

Country Link
CN (1) CN110263739A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103258198A (en) * 2013-04-26 2013-08-21 四川大学 Extraction method for characters in form document image
CN107315989A (en) * 2017-05-03 2017-11-03 天方创新(北京)信息技术有限公司 For the text recognition method and device of medical information picture
CN109614923A (en) * 2018-12-07 2019-04-12 上海智臻智能网络科技股份有限公司 The recognition methods of OCR document and its device
CN109685052A (en) * 2018-12-06 2019-04-26 泰康保险集团股份有限公司 Method for processing text images, device, electronic equipment and computer-readable medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
柴功博 等: "基于视窗的航图导航数据提取技术研究", 《民航学报》 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110956087A (en) * 2019-10-25 2020-04-03 天津幸福生命科技有限公司 Method and device for identifying table in picture, readable medium and electronic equipment
CN110956087B (en) * 2019-10-25 2024-04-19 北京懿医云科技有限公司 Method and device for identifying table in picture, readable medium and electronic equipment
CN111223109A (en) * 2020-01-03 2020-06-02 四川新网银行股份有限公司 Complex form image analysis method
CN111223109B (en) * 2020-01-03 2023-06-06 四川新网银行股份有限公司 Complex form image analysis method
CN112528832A (en) * 2020-12-07 2021-03-19 国网青海省电力公司电力科学研究院 Method and system for processing PDF-format relay protection fixed value list
CN113989822A (en) * 2021-12-24 2022-01-28 中奥智能工业研究院(南京)有限公司 Picture table content extraction method based on computer vision and natural language processing
CN113989822B (en) * 2021-12-24 2022-03-08 中奥智能工业研究院(南京)有限公司 Picture table content extraction method based on computer vision and natural language processing

Similar Documents

Publication Publication Date Title
CN110263739A (en) Photo table recognition methods based on OCR technique
CN109308476B (en) Billing information processing method, system and computer readable storage medium
US10339428B2 (en) Intelligent scoring method and system for text objective question
CN110363252B (en) End-to-end trend scene character detection and identification method and system
US10817741B2 (en) Word segmentation system, method and device
CN111402209B (en) U-Net-based high-speed railway steel rail damage detection method
CN105426856A (en) Image table character identification method
CN107633239A (en) Bill classification and bill field extracting method based on deep learning and OCR
CN109858372A (en) A kind of lane class precision automatic Pilot structured data analysis method
CN103439348B (en) Remote controller key defect detection method based on difference image method
CN111242024A (en) Method and system for recognizing legends and characters in drawings based on machine learning
CN104820835A (en) Automatic examination paper marking method for examination papers
CN109544564A (en) A kind of medical image segmentation method
CN106934455B (en) Remote sensing image optics adapter structure choosing method and system based on CNN
CN101777060A (en) Automatic evaluation method and system of webpage visual quality
CN108509988B (en) Test paper score automatic statistical method and device, electronic equipment and storage medium
CN102750534B (en) A kind of method and apparatus of character cutting
CN102024138B (en) Character identification method and character identification device
CN107818321A (en) A kind of watermark date recognition method for vehicle annual test
CN106203296B (en) The video actions recognition methods of one attribute auxiliary
CN112883926B (en) Identification method and device for form medical images
CN113537227A (en) Structured text recognition method and system
CN107622271A (en) Handwriting text lines extracting method and system
CN107784321A (en) Numeral paints this method for quickly identifying, system and computer-readable recording medium
CN110334709A (en) Detection method of license plate based on end-to-end multitask deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20190920)