CN102693424A - Document skew correction method based on Harr-like features - Google Patents

Document skew correction method based on Harr-like features

Info

Publication number
CN102693424A
Authority
CN
China
Prior art keywords
document
harr
objective function
characteristic
moving window
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012101702708A
Other languages
Chinese (zh)
Other versions
CN102693424B (en)
Inventor
宋利
刘兵
董莉莉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University
Priority to CN201210170270.8A
Publication of CN102693424A
Application granted
Publication of CN102693424B
Legal status: Expired - Fee Related
Anticipated expiration

Landscapes

  • Image Analysis (AREA)
  • Editing Of Facsimile Originals (AREA)
  • Character Input (AREA)

Abstract

A document skew correction method based on Harr-like features, belonging to the technical field of document image processing. The method first divides the input image into sub-regions, then computes an objective function over these sub-regions, estimates the document skew angle from the extremum of the objective function, and applies skew correction to the input document image. Unlike conventional techniques, the method builds on Harr-like features. It offers a new approach to document image skew correction and an effective solution for complex, general document images.

Description

Document skew correction method based on Harr-like features
Technical field
The present invention relates to a method in the field of document image processing, specifically a document skew correction method based on Harr-like features.
Background
Document image skew correction is an important document image preprocessing technique. General document analysis systems (DAS), such as document layout analysis, optical character recognition (OCR), and document retrieval, usually require the input image to be free of skew. In practice, however, when a document is converted to an image by an acquisition device (such as a digital camera or scanner), human factors or mechanical errors in the scanner itself mean that the resulting document image is inevitably skewed to some degree. If a skewed image is not corrected accurately, the performance of the document analysis system can be seriously degraded. In document layout analysis, for example, a skewed input image distorts the characters, and both character segmentation and OCR accuracy suffer considerably. Skew correction is therefore a key technique in document image preprocessing.
Document skew correction is generally divided into manual correction and automatic correction. Manual correction relies on human intervention to estimate the skew of the input image and uses a software tool to correct it. Because the volume of documents to be processed in practice is enormous, relying on manual correction alone wastes labor and is inefficient. Automatic correction, in which a computer estimates the skew angle of the input image, has therefore been widely studied. Many document skew correction techniques are already used in practical document analysis systems. These methods, however, often target specific document types and place requirements on content, language, and layout structure. General documents are harder to handle. On the one hand, they are highly varied and have complex layouts, may contain several languages, and some contain large figures, tables, formulas, and irregular handwriting. On the other hand, scanned document images may contain varying degrees of noise. These factors make skew correction a difficult problem in document image preprocessing, so a skew correction method applicable to general document images is of great value.
Conventional document image skew correction methods mainly include projection profile analysis, the Hough transform method, and connected component analysis.
Projection profile analysis first projects the document image in the horizontal or vertical direction and then computes the skew angle from a cost function of the projection histogram. Although simple, the method relies on exhaustive search and is therefore slow; its detection range is usually limited to [-15°, 15°], and its accuracy is not very high. J. Sadri and M. Cheriet, in "A New Approach for Skew Correction of Documents Based on Particle Swarm Optimization" (10th International Conference on Document Analysis and Recognition, 2009), search for the extremum of the objective function with a particle swarm optimization strategy, which greatly reduces the number of candidate angles evaluated and speeds up skew detection. A. Papandreou and B. Gatos, in "A Novel Skew Detection Technique Based on Vertical Projections" (11th International Conference on Document Analysis and Recognition, 2011), combine vertical projections with bounding boxes to improve skew detection accuracy. The Hough transform method estimates the skew angle by finding extrema in the Hough space of the document image. It is accurate and places no limit on the detectable angle range, but its computational complexity is high. Connected component analysis first divides the document image into connected components and then estimates the skew angle by analyzing the properties of these regions.
Summary of the invention
To solve the above problems of the prior art, the present invention provides a new image skew correction method, namely a document skew correction method based on Harr-like features, which produces accurate and reliable skew estimates for general document images.
The invention is realized through the following technical solution: using Harr-like features, the input image is first divided into sub-regions; an objective function is then computed over these sub-regions; the document skew angle is estimated from the extremum of the objective function; and skew correction is applied to the input document image.
Harr-like features (more commonly spelled Haar-like) were first proposed by Papageorgiou et al. for face feature representation. Researchers later extended them in many ways, yielding many types of Harr-like features. The present invention uses the simple Harr-like 2-rectangle feature for document image skew correction.
The Harr-like 2-rectangle feature consists of two adjacent rectangular regions, one black and one white. Depending on the arrangement, there are two kinds, V-type and H-type, corresponding to the vertical and horizontal directions respectively. The feature value is computed as V = sum(black) - sum(white), that is, the sum of all pixel grey values in the black region minus the sum of all pixel grey values in the white region. To compute feature values, the size of the sliding window must first be defined.
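The feature value described above can be illustrated with a minimal sketch. The function name `two_rect_feature`, the `orientation` argument, and the choice of which half counts as "black" are assumptions made for illustration; the text does not fix them, and the sign has no effect once absolute values are taken later.

```python
def two_rect_feature(window, orientation):
    """Return sum(black half) - sum(white half) for one window.

    window: list of rows of grey values.
    orientation: "rows" splits the window into an upper and a lower half,
                 "cols" splits it into a left and a right half.
    Assumption: the first half (upper or left) is treated as "black".
    """
    h, w = len(window), len(window[0])
    if orientation == "rows":
        black = sum(sum(row) for row in window[: h // 2])
        white = sum(sum(row) for row in window[h // 2 :])
    elif orientation == "cols":
        black = sum(sum(row[: w // 2]) for row in window)
        white = sum(sum(row[w // 2 :]) for row in window)
    else:
        raise ValueError("orientation must be 'rows' or 'cols'")
    return black - white


# A 4x4 window whose upper half is dark (0) and lower half is bright (255).
window = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [255, 255, 255, 255],
    [255, 255, 255, 255],
]
print(two_rect_feature(window, "rows"))  # -2040
print(two_rect_feature(window, "cols"))  # 0
```

A window split along the same direction as the intensity edge gives a large magnitude, while a split across it gives zero, which is what makes the feature sensitive to orientation.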
The sliding window divides the original input image into sub-regions; the feature values computed in these sub-regions are used to construct the objective function. The invention defines sliding windows of size C*H pixels and W*C pixels, corresponding to the H-type and V-type rectangle features respectively, where H is the height of the input image, W is its width, and C is an even constant such as 2, 4, 6, and so on. Once the window size is defined, the window is moved across the original image with a step of C/2 and the feature value is computed at each position. These feature values are combined into new aggregate feature values, which are then used to construct the objective function.
The new aggregate feature values are defined as:
$$\mathrm{Feature}_H = \sum_{i} |H(i)|, \quad i = 0,\ C/2,\ C,\ \dots \tag{1}$$
$$\mathrm{Feature}_V = \sum_{j} |V(j)|, \quad j = 0,\ C/2,\ C,\ \dots \tag{2}$$
Here H(i) is the H-type rectangle feature value in the sliding window and V(j) is the V-type rectangle feature value in the sliding window; i and j are the row index and column index of the input image, respectively. The sliding window is moved in the horizontal and vertical directions with a step of C/2 until the whole image has been traversed. Summing the absolute values of these feature values gives the aggregate feature values $\mathrm{Feature}_H$ and $\mathrm{Feature}_V$ for the whole image.
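A minimal sketch of the two aggregate sums follows. It rests on one interpretation that the text leaves open: the row-indexed feature is taken over full-width bands of C rows split into upper and lower halves, and the column-indexed feature over full-height bands of C columns split into left and right halves. The function name `band_features` is hypothetical.

```python
def band_features(image, c):
    """Return (feature_rows, feature_cols) for a grey image (list of rows).

    feature_rows: bands of c rows spanning the full width, split into an
                  upper and a lower half of c/2 rows, slid down by c/2.
    feature_cols: bands of c columns spanning the full height, split into
                  a left and a right half, slid right by c/2.
    The band orientation is an assumed reading of the description.
    """
    if c < 2 or c % 2:
        raise ValueError("c must be a positive even number")
    half = c // 2
    row_sums = [sum(row) for row in image]
    col_sums = [sum(col) for col in zip(*image)]

    def accumulate(sums):
        total = 0
        for start in range(0, len(sums) - c + 1, half):
            first = sum(sums[start : start + half])
            second = sum(sums[start + half : start + c])
            total += abs(first - second)  # |feature value| of this window
        return total

    return accumulate(row_sums), accumulate(col_sums)


# 8x8 image of horizontal stripes: two dark rows, two bright rows, repeated.
stripes = [[(y // 2) % 2] * 8 for y in range(8)]
print(band_features(stripes, 4))  # (48, 0)
```

Because each band spans the whole width or height, the window sums reduce to sums of row or column totals, so the row and column totals are computed once and reused for every window position.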
The objective function is defined as follows. The original input image is rotated by a range of angles; for each rotation angle θ, the larger of the two aggregate feature values is taken as the response F(θ), and this response is the objective function:
$$F(\theta) = \max\left[\mathrm{Feature}_H(\theta),\ \mathrm{Feature}_V(\theta)\right] \tag{3}$$
When the objective function F(θ) reaches its global maximum, the corresponding rotation angle is the estimate of the skew angle of the input image:
$$\hat{\theta} = \arg\max_{\theta} F(\theta)$$
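The search over rotation angles can be sketched as below. This is an illustrative sketch, not the patented implementation: the nearest-neighbour `rotate` helper, the band orientation inside `objective`, and the synthetic striped test page are all assumptions introduced here, and a real system would use a proper image rotation routine.

```python
import math


def rotate(image, degrees, fill=0):
    """Rotate a grey image about its centre (nearest neighbour, same size)."""
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    cos_t = math.cos(math.radians(degrees))
    sin_t = math.sin(math.radians(degrees))
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: locate the source pixel of each output pixel.
            sx = cos_t * (x - cx) + sin_t * (y - cy) + cx
            sy = -sin_t * (x - cx) + cos_t * (y - cy) + cy
            ix, iy = round(sx), round(sy)
            if 0 <= ix < w and 0 <= iy < h:
                out[y][x] = image[iy][ix]
    return out


def objective(image, c=8):
    """F = max(Feature_H, Feature_V) for one image (list of rows)."""
    half = c // 2
    row_sums = [sum(row) for row in image]
    col_sums = [sum(col) for col in zip(*image)]

    def accumulate(sums):
        return sum(
            abs(sum(sums[s : s + half]) - sum(sums[s + half : s + c]))
            for s in range(0, len(sums) - c + 1, half)
        )

    return max(accumulate(row_sums), accumulate(col_sums))


def estimate_skew(image, angles, c=8):
    """Return the rotation angle in `angles` that maximises F."""
    return max(angles, key=lambda a: objective(rotate(image, a), c))


# Synthetic "text lines": 4 dark rows followed by 4 bright rows, repeated.
page = [[(y // 4) % 2] * 64 for y in range(64)]
skewed = rotate(page, 5)
best = estimate_skew(skewed, range(-15, 16))
print(best)
```

On this synthetic page the maximising angle should be close to the rotation that undoes the 5 degree skew applied by `rotate`. The sign convention of the angle depends on the rotation helper, so it is only meaningful relative to that helper.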
Compared with conventional techniques, the present invention starts from Harr-like features and provides a new document image skew detection method, offering an effective solution to the difficult problem of skew correction for complex, general document images.
Description of drawings
Fig. 1 is a schematic diagram of the Harr-like 2-rectangle features;
Fig. 2 is a flow chart of the present invention;
Fig. 3 is a schematic diagram of the results of the embodiment.
Embodiment
An embodiment of the invention is described in detail below. The embodiment is implemented on the basis of the technical solution of the invention and gives a detailed implementation and a concrete operating procedure, but the scope of protection of the invention is not limited to the following embodiment.
As shown in Fig. 1, the Harr-like 2-rectangle feature consists of two adjacent rectangular regions, one black and one white. Depending on the arrangement, there are two kinds, V-type and H-type, corresponding to the vertical and horizontal directions respectively. The feature value is computed as V = sum(black) - sum(white), that is, the sum of all pixel grey values in the black region minus the sum of all pixel grey values in the white region.
Fig. 2 shows the flow chart of the method. The implementation details of each step are as follows:
(1) Input the document image: the image may be a greyscale image or a black-and-white binary image.
(2) Initialize the rotation angle: θ ∈ [-15°, 15°], with the initial value of θ set to -15°.
(3) Rotate the image by θ degrees: rotate the original image by θ using an image rotation function.
(4) Compute the objective function value and record the current maximum and its rotation angle: compute the objective function value of the rotated image according to formulas (1)-(3), with the sliding window size set by C = 8, and record the current maximum of the objective function together with the corresponding rotation angle.
(5) Output the rotation angle corresponding to the global maximum of the objective function: once the objective function has reached its global maximum over the whole range [-15°, 15°], the corresponding rotation angle θ is the estimate of the skew angle.
Note: to speed up the exhaustive search while preserving accuracy, the search can be split into two stages, a coarse search and a fine search. The coarse search uses a step of Δθ = 1° over the range θ ∈ [-15°, 15°] and yields a rough skew estimate θ'. The fine search uses a step of Δθ = 0.2° over the range θ ∈ [θ' - 0.9°, θ' + 0.9°] and yields a more accurate skew estimate θ''.
The final skew estimate is then chosen from the three candidates θ'' - 0.1°, θ'', and θ'' + 0.1° as the one with the largest objective function value.
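The coarse-to-fine search can be sketched for any objective function of the angle. The function name `coarse_to_fine` and the toy objective are hypothetical; the sketch also assumes that the fine search is centred on the coarse estimate and that the last step compares the fine estimate with its neighbours at -0.1° and +0.1°, which is how the garbled source text is read here.

```python
def coarse_to_fine(objective, lo=-15.0, hi=15.0):
    """Return the angle (degrees) maximising `objective`, to 0.1 degree."""
    # Coarse pass: 1 degree steps over [lo, hi].
    steps = int(round(hi - lo))
    coarse = max((lo + k for k in range(steps + 1)), key=objective)
    # Fine pass: 0.2 degree steps over [coarse - 0.9, coarse + 0.9].
    fine = max(
        (round(coarse - 0.9 + 0.2 * k, 1) for k in range(10)),
        key=objective,
    )
    # Final pass: best of fine - 0.1, fine, fine + 0.1.
    return max((round(fine + d, 1) for d in (-0.1, 0.0, 0.1)), key=objective)


def toy_objective(theta):
    """Hypothetical single-peak objective with its maximum at 3.37 degrees."""
    return -(theta - 3.37) ** 2


estimate = coarse_to_fine(toy_objective)
print(estimate)  # 3.4
```

This evaluates the objective at 31 + 10 + 3 = 44 angles, against roughly 300 for a single exhaustive pass at 0.1° resolution over the same range. The trade-off is that it can miss the true maximum if the objective has several comparable peaks within the coarse range.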
Results:
Following the steps above, experiments were carried out on a publicly available document image data set from the Internet. The data set consists of general document images. The languages include Simplified and Traditional Chinese, Japanese, English, and other foreign languages; the content includes text, large figures, and tables. According to language and content, the data set is divided into 5 groups: 1) English document images, 2) Simplified/Traditional Chinese and Japanese document images, 3) images containing large figures, 4) images containing tables, and 5) images in other languages. Each group contains 100 images, for 500 images in total.
As shown in Fig. 3, panels (a)-(e) are test samples drawn from the 5 image groups. The skew estimate produced by the invention is given below each panel, with the true skew angle of the image in parentheses. The figure shows that the method detects the skew of general document images accurately and reliably, which demonstrates the effectiveness and value of the invention.
To demonstrate the improvement offered by the invention, the method was compared quantitatively with a representative method, piecewise covering by parallelograms (PCP), proposed by C. H. Chou et al. Average error and variance are used as the metrics for evaluating the skew estimates of the two methods, where
Average error is defined as:
[Equation image not reproduced in the source text: definition of the average error]
Variance is defined as:
[Equation image not reproduced in the source text: definition of the variance]
The lower these two metrics, the more effective the method.
Testing both methods on the 5 image groups gives the following comparison:
[Table image not reproduced in the source text: average error and variance of the two methods on the 5 image groups]
The quantitative comparison shows that, although the average error of the present method on groups 3 and 4 is slightly worse than that of the PCP method, the present method outperforms PCP in both average error and variance on the other groups and on the full set of 500 images, which further demonstrates its value. The invention provides an effective solution to the difficult problem of skew correction for complex, general document images.
Although the invention has been described in detail through the preferred embodiment above, this description should not be regarded as limiting the invention. Various modifications and alternatives will be apparent to those skilled in the art after reading the foregoing. The scope of protection of the invention should therefore be defined by the appended claims.

Claims (5)

1. A document skew correction method based on Harr-like features, characterized in that, using the simple Harr-like 2-rectangle feature, the input image is first divided into sub-regions, an objective function is then computed over these sub-regions, the document skew angle is estimated from the extremum of the objective function, and skew correction is applied to the input document image.
2. The document skew correction method based on Harr-like features according to claim 1, characterized in that the Harr-like feature consists of two adjacent rectangular regions, one black and one white; depending on the arrangement, there are two kinds, V-type and H-type, corresponding to the vertical and horizontal directions respectively; the feature value is computed as:
V = sum(black) - sum(white), that is, the sum of all pixel grey values in the black region minus the sum of all pixel grey values in the white region; to construct the final objective function, the size of the sliding window must first be defined.
3. The document skew correction method based on Harr-like features according to claim 2, characterized in that the sliding window divides the original input image into sub-regions, and the feature values computed in these sub-regions are used to construct the objective function;
the sliding windows are defined with sizes C*H pixels and W*C pixels, corresponding to the H-type and V-type rectangle features respectively, where H is the height of the input image, W is the width of the input image, and C is an even constant;
once the window size is defined, the sliding window is moved across the original image with a step of C/2, the corresponding feature values are computed, and these feature values are used to construct new aggregate feature values for building the objective function.
4. The document skew correction method based on Harr-like features according to claim 3, characterized in that the new aggregate feature values are:
$$\mathrm{Feature}_H = \sum_{i} |H(i)|, \quad i = 0,\ C/2,\ C,\ \dots \tag{1}$$
$$\mathrm{Feature}_V = \sum_{j} |V(j)|, \quad j = 0,\ C/2,\ C,\ \dots \tag{2}$$
where H(i) is the H-type rectangle feature value in the sliding window, V(j) is the V-type rectangle feature value in the sliding window, and i and j are the row index and column index of the input image, respectively;
the sliding window is moved in the horizontal and vertical directions with a step of C/2 until the whole image has been traversed, and the absolute values of these feature values are summed to obtain the aggregate feature values $\mathrm{Feature}_H$ and $\mathrm{Feature}_V$ for the whole image.
5. The document skew correction method based on Harr-like features according to claim 4, characterized in that the objective function is defined as follows: the original input image is rotated by different angles, and for each rotation angle θ the larger of the two aggregate feature values is taken as the response F(θ), which is the objective function:
$$F(\theta) = \max\left[\mathrm{Feature}_H(\theta),\ \mathrm{Feature}_V(\theta)\right]$$
When the objective function F(θ) reaches its global maximum, the corresponding rotation angle is the estimate of the skew angle of the input image:
$$\hat{\theta} = \arg\max_{\theta} F(\theta)$$
CN201210170270.8A 2012-05-28 2012-05-28 Document skew correction method based on Harr-like features Expired - Fee Related CN102693424B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210170270.8A CN102693424B (en) 2012-05-28 2012-05-28 Document skew correction method based on Harr-like features

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210170270.8A CN102693424B (en) 2012-05-28 2012-05-28 Document skew correction method based on Harr-like features

Publications (2)

Publication Number Publication Date
CN102693424A (en) 2012-09-26
CN102693424B (en) 2014-07-02

Family

ID=46858842

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210170270.8A Expired - Fee Related CN102693424B (en) 2012-05-28 2012-05-28 Document skew correction method based on Harr-like features

Country Status (1)

Country Link
CN (1) CN102693424B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101149801A (en) * 2007-10-23 2008-03-26 北京大学 Complex structure file image inclination quick detection method
CN101697228A (en) * 2009-10-15 2010-04-21 东莞市步步高教育电子产品有限公司 Method for processing text images
CN101937508A (en) * 2010-09-30 2011-01-05 湖南大学 License plate localization and identification method based on high-definition image

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104077569A (en) * 2014-06-24 2014-10-01 纵横壹旅游科技(成都)有限公司 Image recognizing method and system
CN110232046A (en) * 2019-05-27 2019-09-13 武汉市润普网络科技有限公司 A kind of electronics folder is with case production method
CN110533036A (en) * 2019-08-28 2019-12-03 湖南长城信息金融设备有限责任公司 A kind of bill scan image quick slant correction method and system
CN110533036B (en) * 2019-08-28 2022-06-07 长城信息股份有限公司 Rapid inclination correction method and system for bill scanned image

Also Published As

Publication number Publication date
CN102693424B (en) 2014-07-02

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20140702
