CN110263739A

CN110263739A - Picture table recognition method based on OCR technology

Info

Publication number: CN110263739A
Application number: CN201910558402.6A
Authority: CN
Inventors: 吴信朝; 李开宇; 翟恩荣
Original assignee: Sichuan XW Bank Co Ltd
Current assignee: Sichuan XW Bank Co Ltd
Priority date: 2019-06-26
Filing date: 2019-06-26
Publication date: 2019-09-20

Abstract

The invention relates to a picture table recognition method based on OCR technology, comprising: A. performing row segmentation on the picture by OCR technology and judging whether the picture contains a table; if so, continuing, otherwise terminating; B. applying the dilation and erosion operations of OCR technology to the row slices to obtain the row lines and column lines of the table, and calculating the intersection coordinates of the row lines and column lines; C. cropping the picture according to the intersection coordinates to obtain a cell set; D. iterating over the cells in the cell set and performing row segmentation on each cell image to obtain the text-line images in each cell; E. recognizing the character text in all text-line images by OCR technology, and combining the corresponding character text into a complete structured text according to the position of each text line. The invention can achieve 100% table recognition accuracy, requires no pre-established table templates, is applicable to a wider range of fields, and consumes few resources.

Description

Picture table recognition method based on OCR technology

Technical field

The present invention relates to image recognition methods, and in particular to a picture table recognition method based on OCR technology.

Background art

In the field of image processing, considerable progress has been made in research on recognizing documents that contain tables. Before a table can be recognized, layout analysis must be performed on the document to extract the table, the table must then be located, and finally the text in the table is recognized according to the location result. For object detection and localization, common techniques include edge detection (Canny edge detection) and R-CNN, Faster R-CNN, YOLO and SSD. For OCR (optical character recognition), the main techniques include text classification based on supervised learning, CNNs (convolutional neural networks), and CRNN+CTC. At present, tables are mainly located by the following methods:

(1) Table localization based on rule templates:

This method collects a variety of tables and extracts a different rule template for each category of table. When a new table is parsed, it is first assigned to a category and then parsed with the rule template of that category.

(2) Learning-based localization:

This method locates tables by machine learning and is divided into a training stage and a prediction stage. In the training stage, a data set is first built, and a machine learning model then learns a fixed pattern from this training set for use in the prediction stage. In the prediction stage, the new table to be parsed is fed into the model, which parses the table automatically and outputs the recognition result.

Defects of the current schemes:

(1) Table localization based on rule templates:

The idea of this method is enumeration. If a new table does not belong to any existing category, localization fails. In addition, system efficiency gradually decreases as the number of table categories grows.

(2) Learning-based localization:

This method is a strategy based on machine learning. The biggest disadvantage of machine learning is that it cannot be 100% correct. Some application scenarios place very high demands on the recognition result, and an error rate of even 0.1% can cause very large losses, so a system based on this strategy is clearly not an ideal choice. Moreover, the system must be trained in advance and features must be extracted, both of which consume resources. Finally, if the sample set lacks representativeness and generality, the system fails to learn certain patterns, and the model ultimately fails to locate certain tables.

Summary of the invention

The present invention provides a picture table recognition method based on OCR technology that requires no pre-established table templates and can reach 100% accuracy.

The picture table recognition method based on OCR technology of the present invention comprises:

A. Row segmentation is performed on the picture by OCR (optical character recognition) technology, and it is judged whether the picture contains a table; if so, the method continues, otherwise it terminates;

B. The dilation and erosion operations of OCR technology are applied to the row slices formed by row segmentation to obtain the row lines and column lines of the table, and the intersection coordinates of the row lines and column lines are calculated;

C. The picture is cropped according to the intersection coordinates to obtain a cell set;

D. The cells in the cell set are iterated over, and row segmentation is performed on each cell image to obtain the text-line images in each cell;

E. The character text in all text-line images is recognized by OCR technology, and the corresponding character text is combined into a complete structured text according to the position of each text line.

Compared with existing methods, the invention is more flexible: after a new picture is input, no image segmentation expert is needed to establish corresponding table recognition rules in advance, which greatly reduces cost. Tests show that the recognition method of the invention can achieve 100% table recognition accuracy, is suitable for high-precision application fields, and consumes few resources.

Specifically, step A includes:

A1. The picture is projected onto the vertical axis by pixel row, forming a projection bar for each pixel row;

A2. The picture is cut by OCR technology at the pixel rows whose projection bar has zero length, forming at least one row slice;

A3. All row slices are iterated over to judge whether each row slice contains a table.

Further, step A3 includes: iterating over all row slices and applying the erosion operation of OCR technology to each row slice; judging from the result of the erosion operation whether the current row slice contains a rectangular frame; if it does, the picture contains a table; if none of the row slices contains a rectangular frame, the picture does not contain a table.
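As an illustration only, steps A1 and A2 might be sketched as follows in plain Python. The sketch assumes a binarized picture stored as a list of pixel rows in which object pixels are 0 and background pixels are 1; the function names `row_projection` and `split_rows` are hypothetical and do not come from the patent.

```python
def row_projection(binary):
    """Count the object pixels (value 0) in every pixel row."""
    return [sum(1 for px in row if px == 0) for row in binary]


def split_rows(binary):
    """Return (top, bottom) pairs, bottom exclusive, one per band of
    consecutive pixel rows whose projection is non-zero."""
    bands = []
    start = None
    for y, count in enumerate(row_projection(binary)):
        if count > 0 and start is None:
            start = y                      # a new band begins
        elif count == 0 and start is not None:
            bands.append((start, y))       # an empty row closes the band
            start = None
    if start is not None:                  # band running to the last row
        bands.append((start, len(binary)))
    return bands


picture = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
bands = split_rows(picture)
```

Each returned pair marks one band of non-empty pixel rows; step A3 would then examine each band for a rectangular frame.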

Specifically, step B includes:

B1. Windows of size "1x2/3w" and "2/3hx1" are slid over the row slice in turn, and the dilation operation of OCR technology is applied to the part of the row slice covered by the window, in order to strengthen blurred table borders and facilitate the subsequent analysis, where w is the width of the picture and h is the height of the picture;

B2. Windows of size "1x2/3w" and "2/3hx1" are slid over the row slice in turn, and the erosion operation is applied to the row slice that has undergone dilation, eliminating the text inside the table and yielding a row slice that contains only row lines and column lines.
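Steps B1 and B2 could be sketched as a dilation followed by an erosion with long, thin windows. The plain-Python sketch below assumes that dilation is a maximum filter and erosion a minimum filter over pixel values (as in common image-processing libraries), that background pixels are 1 and object pixels are 0, and that the window is centred on each pixel and clipped at the picture border. These choices and the names `extract_lines`, `_filter_rows` and `_transpose` are assumptions, not details given in the patent.

```python
def _filter_rows(img, radius, pick):
    """Apply `pick` (max or min) over a horizontal window centred on each
    pixel; the window spans [x - radius, x + radius], clipped at the border."""
    return [
        [pick(row[max(0, x - radius): x + radius + 1]) for x in range(len(row))]
        for row in img
    ]


def _transpose(img):
    return [list(col) for col in zip(*img)]


def extract_lines(binary):
    """Return (row_lines, col_lines) images from a 0/1 picture (object = 0)."""
    h, w = len(binary), len(binary[0])
    rw = (2 * w // 3) // 2   # half-width of a window about 2/3 of the width
    rh = (2 * h // 3) // 2   # half-height of a window about 2/3 of the height
    # Max filter then min filter along rows: only long horizontal runs of
    # object pixels survive; short runs such as text strokes are removed.
    row_lines = _filter_rows(_filter_rows(binary, rw, max), rw, min)
    # The same filters on the transposed picture keep long vertical runs.
    flipped = _transpose(binary)
    col_lines = _transpose(_filter_rows(_filter_rows(flipped, rh, max), rh, min))
    return row_lines, col_lines


picture = [
    [1, 1, 1, 1, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],   # a full-width row line
    [1, 1, 1, 1, 1, 1, 0, 1, 1],
    [1, 0, 0, 1, 1, 1, 0, 1, 1],   # two "text" pixels plus the column line
    [1, 1, 1, 1, 1, 1, 0, 1, 1],
]
row_lines, col_lines = extract_lines(picture)
```

In this toy picture the two text pixels disappear from both results, while the full-width row line and the full-height column line are preserved.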

Specifically, calculating the intersection coordinates of the row lines and column lines in step B includes:

B3. Using the obtained row lines and column lines of the table, the row-line image and the column-line image are added together, and all all-zero regions, which contain only object pixels and no background pixels, are found (after binarization, the background pixel value is "1" and the object pixel value is "0");

B4. The centre coordinates of each all-zero region are calculated; each centre corresponds to the intersection of one row line and one column line.
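Steps B3 and B4 might look like the following sketch, which adds the two line images pixel by pixel, groups the zero-valued pixels into 4-connected regions, and takes the mean pixel position of each region as its centre. The use of 4-connectivity and of the mean as the "centre" are assumptions, and the function names are hypothetical.

```python
from collections import deque


def add_images(a, b):
    """Pixel-wise sum of two equally sized images."""
    return [[pa + pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


def zero_region_centres(img):
    """Return the (x, y) centre of every 4-connected region of zeros."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    centres = []
    for y in range(h):
        for x in range(w):
            if img[y][x] != 0 or seen[y][x]:
                continue
            # Breadth-first search collects all pixels of this region.
            queue = deque([(y, x)])
            seen[y][x] = True
            ys, xs = [], []
            while queue:
                cy, cx = queue.popleft()
                ys.append(cy)
                xs.append(cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and img[ny][nx] == 0 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            centres.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return centres


# Toy line images: lines are 2 pixels thick, object = 0, background = 1.
size = 6
thick = {0, 1, 4, 5}
row_lines = [[0 if y in thick else 1 for _ in range(size)] for y in range(size)]
col_lines = [[0 if x in thick else 1 for x in range(size)] for _ in range(size)]
centres = zero_region_centres(add_images(row_lines, col_lines))
```

Because a sum of zero requires an object pixel in both images, only the places where a row line crosses a column line form all-zero regions.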

Specifically, step C includes:

C1. The obtained intersection coordinates are aligned by abscissa and by ordinate, so that all intersections in the same row share the same ordinate and all intersections in the same column share the same abscissa;

C2. The picture is cropped according to each group of four adjacent coordinates to obtain cell images; all cell images form the cell set.
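A possible reading of steps C1 and C2 is sketched below. Coordinates that differ by no more than a tolerance are snapped to a common value, and every group of four neighbouring aligned intersections defines one cell. The tolerance `tol`, the clustering rule, and all function names are assumptions; the patent does not say how the alignment is performed, and the sketch does not handle merged cells.

```python
def snap(values, tol):
    """Map each value to the rounded mean of its cluster. A cluster is a run
    of sorted distinct values whose consecutive gaps are at most `tol`."""
    ordered = sorted(set(values))
    if not ordered:
        return {}
    clusters = [[ordered[0]]]
    for v in ordered[1:]:
        if v - clusters[-1][-1] <= tol:
            clusters[-1].append(v)
        else:
            clusters.append([v])
    return {v: round(sum(c) / len(c)) for c in clusters for v in c}


def align_points(points, tol=2):
    """Step C1: snap x and y coordinates independently, drop duplicates."""
    xmap = snap([x for x, _ in points], tol)
    ymap = snap([y for _, y in points], tol)
    return sorted({(xmap[x], ymap[y]) for x, y in points})


def cell_boxes(aligned):
    """Step C2: one (left, top, right, bottom) box per group of four
    neighbouring intersections that are all present."""
    xs = sorted({x for x, _ in aligned})
    ys = sorted({y for _, y in aligned})
    present = set(aligned)
    boxes = []
    for top, bottom in zip(ys, ys[1:]):
        for left, right in zip(xs, xs[1:]):
            corners = {(left, top), (right, top), (left, bottom), (right, bottom)}
            if corners <= present:
                boxes.append((left, top, right, bottom))
    return boxes


def crop(img, box):
    """Cut the cell image out of a picture stored as a list of pixel rows."""
    left, top, right, bottom = box
    return [row[left:right + 1] for row in img[top:bottom + 1]]


points = [(0, 0), (10, 0), (20, 2), (0, 10), (12, 10), (20, 12)]
aligned = align_points(points)
boxes = cell_boxes(aligned)
```

For the six slightly misaligned intersections above, the sketch produces a 2 x 3 grid of aligned points and therefore two cells.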

Further, step E includes:

E1. The character text in all text-line images is recognized by OCR technology;

E2. The corresponding character text is concatenated into character strings according to the position of each text line;

E3. The table structure is restored according to the segmentation process of steps A to D, and each character string is filled into the corresponding cell of the table;

E4. The structured table text is saved, for example in JSON or XML format.
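Steps E2 to E4 could be sketched as follows, assuming each recognized text line is tagged with the row and column of its cell and with its line index inside the cell. This tuple layout, the newline used to join lines, the `rows` key, and the function name `build_table` are hypothetical choices made for the example; JSON is one of the formats the text mentions.

```python
import json


def build_table(lines, n_rows, n_cols):
    """Fill an n_rows x n_cols grid from (row, col, line_index, text) tuples,
    joining the text lines of one cell in line order."""
    table = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for row, col, _, text in sorted(lines, key=lambda item: item[:3]):
        if table[row][col]:
            table[row][col] += "\n" + text
        else:
            table[row][col] = text
    return table


# Hypothetical OCR output, deliberately out of order.
recognised = [
    (0, 1, 0, "Amount"),
    (0, 0, 0, "Name"),
    (1, 0, 1, "Ltd"),
    (1, 0, 0, "Example Co"),
    (1, 1, 0, "100"),
]
table = build_table(recognised, 2, 2)
structured = json.dumps({"rows": table}, ensure_ascii=False)
```

Sorting by cell position and line index before filling means the order in which the OCR results arrive does not matter.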

The picture table recognition method based on OCR technology of the present invention requires no pre-established table templates, adapts to a wider range of scenarios, greatly reduces the workload of establishing table templates in advance, can reach 100% accuracy, and greatly improves the accuracy of table recognition.

The above content of the invention is described in further detail below with reference to specific embodiments. This should not be interpreted as limiting the scope of the above subject matter of the invention to the following examples. Various substitutions or changes made according to ordinary technical knowledge and customary means in the art, without departing from the above technical idea of the invention, shall all fall within the scope of the invention.

Brief description of the drawings

Fig. 1 is a flow chart of the picture table recognition method based on OCR technology of the present invention.

Fig. 2 is the binary image of an original picture.

Fig. 3 is the projection of Fig. 2 onto the vertical axis.

Fig. 4 is an original picture containing a table and table text.

Fig. 5 is the row slice of Fig. 4 containing only row lines and column lines after the erosion operation.

Fig. 6 is an original picture containing a table.

Fig. 7 is the table cut out of Fig. 6.

Fig. 8 is one cell cut out of the table in Fig. 7.

Fig. 9 is the first line of text cut out of the cell in Fig. 8.

Detailed description of the embodiments

As shown in Fig. 1, the picture table recognition method based on OCR technology of the present invention comprises:

A. Row segmentation is performed on the picture by OCR (optical character recognition) technology, and it is judged whether the picture contains a table; if so, the method continues, otherwise it terminates. This specifically includes:

A1. As shown in Figs. 2 and 3, the picture is projected onto the vertical axis by pixel row, forming a projection bar for each pixel row.

A2. The picture is cut by OCR technology at the pixel rows whose projection bar has zero length, forming several row slices.

A3. All row slices are iterated over, and the erosion operation of OCR technology is applied to each row slice. It is judged from the result of the erosion operation whether the current row slice contains a rectangular frame; if it does, the picture contains a table; if none of the row slices contains a rectangular frame, the picture does not contain a table.

B. The dilation and erosion operations of OCR technology are applied to the row slices formed by row segmentation to obtain the row lines and column lines of the table, and the intersection coordinates of the row lines and column lines are calculated. This specifically includes:

B1. As shown in Fig. 4, windows of size "1x2/3w" and "2/3hx1" are slid over the row slice in turn, and the dilation operation of OCR technology is applied to the part of the row slice covered by the window, in order to strengthen blurred table borders and facilitate the subsequent analysis, where w is the width of the picture and h is the height of the picture;

B2. Windows of size "1x2/3w" and "2/3hx1" are slid over the row slice in turn, and the erosion operation is applied to the row slice that has undergone dilation, eliminating the text inside the table and yielding the row slice shown in Fig. 5, which contains only row lines and column lines.

C. The picture is cropped according to the intersection coordinates to obtain the cell set, specifically including:

D. Following the same principle as step A, the cells in the cell set are iterated over, and row segmentation is performed on each cell image to obtain the text-line images in each cell;

Figs. 6 to 9 respectively show the original picture, the table cut out of it, one cell cut out of the table, and the first line of text cut out of that cell.

E. The character text in the picture is recognized and combined into structured text, specifically:

E4. The structured table text is saved, for example in JSON or XML format.

Tests show that the recognition method of the invention can achieve 100% table recognition accuracy, is suitable for high-precision application fields, and consumes few resources.

Claims

1. A picture table recognition method based on OCR technology, characterized by comprising:

A. performing row segmentation on the picture by OCR technology and judging whether the picture contains a table; if so, continuing, otherwise terminating;

B. applying the dilation and erosion operations of OCR technology to the row slices formed by row segmentation to obtain the row lines and column lines of the table, and calculating the intersection coordinates of the row lines and column lines;

D. iterating over the cells in the cell set, and performing row segmentation on each cell image to obtain the text-line images in each cell;

E. recognizing the character text in all text-line images by OCR technology, and combining the corresponding character text into a complete structured text according to the position of each text line.

2. The picture table recognition method based on OCR technology according to claim 1, characterized in that step A includes:

3. The picture table recognition method based on OCR technology according to claim 2, characterized in that step A3 includes: iterating over all row slices and applying the erosion operation of OCR technology to each row slice; judging from the result of the erosion operation whether the current row slice contains a rectangular frame; if it does, the picture contains a table; if none of the row slices contains a rectangular frame, the picture does not contain a table.

4. The picture table recognition method based on OCR technology according to claim 1, characterized in that step B includes:

B1. sliding windows of size "1x2/3w" and "2/3hx1" over the row slice in turn, and applying the dilation operation of OCR technology to the part of the row slice covered by the window, where w is the width of the picture and h is the height of the picture;

B2. sliding windows of size "1x2/3w" and "2/3hx1" over the row slice in turn, and applying the erosion operation to the row slice that has undergone dilation, eliminating the text inside the table and yielding a row slice that contains only row lines and column lines.

5. The picture table recognition method based on OCR technology according to claim 1, characterized in that calculating the intersection coordinates of the row lines and column lines in step B includes:

B3. using the obtained row lines and column lines of the table, adding the row-line image and the column-line image together, and finding all all-zero regions, which contain only object pixels and no background pixels;

B4. calculating the centre coordinates of each all-zero region, each centre corresponding to the intersection of one row line and one column line.

6. The picture table recognition method based on OCR technology according to claim 1, characterized in that step C includes:

C1. aligning the obtained intersection coordinates by abscissa and by ordinate, so that all intersections in the same row share the same ordinate and all intersections in the same column share the same abscissa;

7. The picture table recognition method based on OCR technology according to claim 1, characterized in that step E includes:

E3. restoring the table structure according to the segmentation process of steps A to D, and filling each character string into the corresponding cell of the table;

E4. saving the structured table text.