CN104835119A - Method for positioning base line of bending book cover - Google Patents

Info

Publication number
CN104835119A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510198135.8A
Other languages
Chinese (zh)
Inventor
肖夏
田健飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN201510198135.8A
Publication of CN104835119A
Current legal status: Pending

Abstract

The invention relates to a method for locating the baselines of a curved book page, comprising the steps of: applying a grayscale transformation to the curved-surface image and binarizing it; removing the influence of noise to obtain a preprocessed binary image; defining a rectangle and using it to perform morphological opening and closing operations on the binary image, so that each line of text is joined into a single connected region; presetting a height threshold based on the text line height and a length threshold based on the text line length; computing the height and width of each connected region and removing regions whose height exceeds the height threshold; removing regions whose length is below the length threshold and deleting objects whose area is smaller than one character, which leaves the connected region of each text line; and fitting a cubic (third-order) function to obtain the baseline of each text line on the curved page. The method is stated to offer high precision and high speed.

Description

A method for locating the baselines of a curved page
Technical field
The invention belongs to the field of digital image processing and relates to a method for determining the baselines of a curved page.
Background
With the continuing progress of modern science and technology and the spread of electronic products, more and more people choose to read and study on electronic platforms. Many documents, however, exist only on paper, so large numbers of paper documents need to be digitized. Because digital cameras are portable and convenient, more and more people use them to capture images of documents. Some documents are too thick to be flattened completely, so the captured image usually resembles the surface of a cylinder: raised in the middle and sunken at both sides. Post-processing has to compute the curvature and depth of each part of the page from the baselines of the curved page image, and then use the relationship between the surface coordinate system and the plane coordinate system to unroll the curved page into a planar image; this requires extracting the baselines of the page. In other cases, digitization only needs the text portion of the document and discards blank areas and image regions of no interest, which also relies on the baseline information of the image. The precision and speed of baseline extraction therefore determine the quality of the subsequent work.
Summary of the invention
The object of the invention is to provide a baseline localization method for curved pages that is both accurate and fast. The technical scheme is as follows:
A method for locating the baselines of a curved page, comprising the following steps:
1) Capture a curved-surface image of the page;
2) Apply a grayscale transformation to the curved-surface image and binarize it;
3) Based on the pixel dimensions of the image and an empirical value, delete objects in the binary image whose area is smaller than that of a punctuation mark, removing the influence of noise and yielding the preprocessed binary image;
4) Define a rectangle whose length is determined by the horizontal distance between the centers of two characters and whose width is determined by 1/2 of the character height; use this rectangle to perform morphological opening and closing operations on the binary image, joining each line of text into a single connected region;
5) Preset a height threshold based on the text line height and a length threshold based on the text line length;
6) Compute the height and width of each connected region; remove regions whose height exceeds the height threshold; remove regions whose length is below the length threshold; then delete objects whose area is smaller than one character, finally obtaining the connected region of each text line;
7) For each text line connected region obtained in 6), find the upper and lower boundaries, compute the midpoint of the upper and lower boundaries at each corresponding horizontal coordinate, and fit a cubic function to these midpoints to obtain the baseline of each text line on the curved page.
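Step 2) can be sketched as follows. This is a minimal, dependency-free Python illustration and not the patent's implementation: the source does not say how the binarization threshold is chosen, so Otsu's method is used here purely as an assumption, and the names `to_gray`, `otsu_threshold`, and `binarize` are hypothetical.

```python
def to_gray(rgb):
    """Convert rows of (R, G, B) tuples to 0-255 gray levels (BT.601 luma weights)."""
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
            for row in rgb]


def otsu_threshold(gray):
    """Return the gray level t that maximizes the between-class variance.

    Assumption: the source names no thresholding rule; Otsu is a stand-in.
    """
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with gray level <= t
    sum0 = 0  # sum of the gray levels of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def binarize(gray, t):
    """Mark dark pixels (assumed to be ink) as 1 and light pixels as 0."""
    return [[1 if v <= t else 0 for v in row] for row in gray]
```

Dark text on a light page is assumed, so pixels at or below the threshold become foreground. A photograph with strongly uneven lighting may need a local threshold instead of a single global one.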
The invention uses a rectangular structuring element for the morphological opening and closing operations, so the upper and lower boundaries of each text line connected region fit closely to the upper and lower boundaries of the text line itself. The connected region is therefore not affected by the internal proportions of the characters, and the resulting center line has higher precision. The method uses only a few morphological opening and closing operations and a cubic curve fit, so the computation is simple and fast.
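The effect of the rectangular structuring element can be illustrated with a small pure-Python sketch. It is an illustration under assumptions, not the patent's code: the function names are hypothetical, rounding the rectangle to odd sizes is an assumption, and the source does not state the order or number of opening and closing passes, so both operations are simply provided.

```python
def _window(img, x, y, rx, ry):
    """In-bounds pixel values inside the rectangle centred on (x, y)."""
    h, w = len(img), len(img[0])
    return [img[j][i]
            for j in range(max(0, y - ry), min(h, y + ry + 1))
            for i in range(max(0, x - rx), min(w, x + rx + 1))]


def dilate(img, kw, kh):
    """Binary dilation with a kw x kh rectangle (kw and kh odd)."""
    return [[int(any(_window(img, x, y, kw // 2, kh // 2)))
             for x in range(len(img[0]))] for y in range(len(img))]


def erode(img, kw, kh):
    """Binary erosion with a kw x kh rectangle (kw and kh odd)."""
    return [[int(all(_window(img, x, y, kw // 2, kh // 2)))
             for x in range(len(img[0]))] for y in range(len(img))]


def closing(img, kw, kh):
    """Dilate then erode: fills gaps narrower than the rectangle."""
    return erode(dilate(img, kw, kh), kw, kh)


def opening(img, kw, kh):
    """Erode then dilate: removes objects smaller than the rectangle."""
    return dilate(erode(img, kw, kh), kw, kh)


def rect_size(char_spacing, char_height):
    """Rectangle per the text: length from the distance between two character
    centres, width from half the character height. Forcing odd sizes is an
    assumption made so the rectangle has a centre pixel."""
    kw = int(char_spacing) | 1
    kh = max(1, int(char_height) // 2) | 1
    return kw, kh


# Toy text line: three 2-pixel "characters" whose centres are 4 pixels apart.
row = [0] * 18
for i in (4, 5, 8, 9, 12, 13):
    row[i] = 1
image = [[0] * 18, row, [0] * 18]
kw, kh = rect_size(char_spacing=4, char_height=2)
merged = closing(image, kw, kh)  # the three characters become one region
```

Because the rectangle is wider than the gap between neighbouring characters but only about half a character tall, closing bridges characters within a line without bridging adjacent lines, provided the line spacing exceeds the rectangle's height.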
Brief description of the drawings
Fig. 1: Initial image.
Fig. 2: Blurred image after the morphological opening and closing operations.
Fig. 3: Blurred image containing only the text line regions.
Fig. 4: Partial enlargement of the center line localization result (the white line within each text line is the center line).
Fig. 5: Baseline localization result (the black line within each text line is the baseline).
Fig. 6: Partial enlargement of the baseline localization result (the black line within each text line is the baseline).
Fig. 7: Flow chart of baseline localization for a curved-surface document image.
Detailed description
The invention is described below with reference to the drawings and an embodiment.
The localization method provided by the invention treats the bounding box of each character on the curved page as a parallelogram whose upper and lower edges are parallel. Whether the page is enlarged, reduced, or bent, the center line between the upper and lower edges can be regarded as the center line of the text line, so the text line center line is used to locate the baseline.
The method comprises the following steps:
1) Capture a curved-surface image of the page, as shown in Fig. 1.
2) Apply a grayscale transformation to the curved-surface image and binarize it. Then, based on the pixel dimensions of the image and an empirical value, delete objects in the binary image whose area is smaller than that of a punctuation mark, removing the influence of noise and yielding the preprocessed binary image.
3) Define a rectangle whose length is determined by the horizontal distance between the centers of two characters and whose width is determined by 1/2 of the character height. Use this rectangle to perform morphological opening and closing operations on the image, joining each line of text into a single connected region, as shown in Fig. 2.
4) Preset a height threshold based on the text line height, about three times the text line height; preset a length threshold based on the text line length, about 3/4 of the maximum text line length.
5) Compute the height and width of each connected region. Remove regions whose height exceeds the height threshold, which eliminates the influence of tall illustrations on the page. Remove regions whose length is below the length threshold, which eliminates the influence of short text lines on the page. Then delete objects whose area is smaller than one character, leaving the remaining text line connected regions, as shown in Fig. 3.
6) For each text line connected region obtained in 5), find the upper and lower boundaries, then compute the midpoint of the upper and lower boundaries at each corresponding horizontal coordinate, as shown in Fig. 4. Fit a cubic function to the midpoints to obtain the baseline of each text line on the curved page and its equation $f_n(x) = a_n x^3 + b_n x^2 + c_n x + d_n$, $x \in (0, \mathrm{len})$, $n \in (1, N)$, where $N$ is the number of baselines, $\mathrm{len}$ is the width of the curved-surface image, $f_n(x)$ is the vertical coordinate of the baseline, $x$ is the horizontal coordinate of the baseline, $a_n$, $b_n$, $c_n$, $d_n$ are constants, and the subscript $n$ distinguishes the baselines. The baselines are shown in Figs. 5 and 6.
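The cubic fit in step 6) can be sketched as an ordinary least-squares problem. The following pure-Python version is an assumption about how the fit might be carried out (the source only says that a cubic function is fitted); `fit_cubic` and `eval_cubic` are hypothetical names, and the normal-equation solver is chosen for self-containment rather than numerical robustness.

```python
def fit_cubic(xs, ys):
    """Least-squares fit of y = a*x^3 + b*x^2 + c*x + d; returns (a, b, c, d)."""
    if len(set(xs)) < 4:
        raise ValueError("need at least 4 distinct x values for a cubic fit")
    # Normal equations (V^T V) p = V^T y, with Vandermonde rows [x^3, x^2, x, 1].
    rows = [[float(x) ** 3, float(x) ** 2, float(x), 1.0] for x in xs]
    m = [[sum(r[i] * r[j] for r in rows) for j in range(4)]
         + [sum(r[i] * y for r, y in zip(rows, ys))]
         for i in range(4)]
    # Gaussian elimination with partial pivoting on the augmented 4 x 5 matrix.
    for col in range(4):
        piv = max(range(col, 4), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, 4):
            factor = m[r][col] / m[col][col]
            for k in range(col, 5):
                m[r][k] -= factor * m[col][k]
    # Back substitution.
    p = [0.0] * 4
    for i in range(3, -1, -1):
        tail = sum(m[i][k] * p[k] for k in range(i + 1, 4))
        p[i] = (m[i][4] - tail) / m[i][i]
    return tuple(p)


def eval_cubic(coeffs, x):
    """Evaluate f_n(x) = a*x^3 + b*x^2 + c*x + d using Horner's rule."""
    a, b, c, d = coeffs
    return ((a * x + b) * x + c) * x + d
```

For real images, where x can run into the thousands, the normal equations become poorly conditioned. Rescaling x to a small range before fitting, or using a library routine such as `numpy.polyfit(xs, ys, 3)`, is likely to be a safer choice.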
The proposed method, which locates the baseline from the text line center line, offers high positioning precision with a small amount of computation and high speed. Text line regions are extracted by constraining the height and width of the connected regions: regions whose height is less than 3 times the text line height and whose width is greater than 1/10 of the page width are retained. The experimental result, shown in Fig. 3, is a good extraction.
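The height and width filtering of connected regions can be sketched in pure Python as below. This is an illustration under assumptions: the names `components` and `keep_text_lines` are hypothetical, 8-connectivity is assumed, and the two factors are exposed as parameters because the text gives about 3/4 of the longest line as the length threshold in step 4) and 1/10 of the page width in the experiment.

```python
from collections import deque


def components(img):
    """Yield each 8-connected foreground region as a list of (y, x) pixels."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    for y0 in range(h):
        for x0 in range(w):
            if not img[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue, region = deque([(y0, x0)]), []
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                region.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            yield region


def keep_text_lines(img, line_height, max_line_len, char_area,
                    height_factor=3.0, length_factor=0.75):
    """Keep only regions that look like full text lines.

    Drops regions taller than height_factor * line_height (illustrations),
    shorter than length_factor * max_line_len (short lines), or with fewer
    than char_area pixels (residual specks).
    """
    h, w = len(img), len(img[0])
    max_h = height_factor * line_height
    min_w = length_factor * max_line_len
    out = [[0] * w for _ in range(h)]
    for region in components(img):
        ys = [y for y, _ in region]
        xs = [x for _, x in region]
        height = max(ys) - min(ys) + 1
        width = max(xs) - min(xs) + 1
        if height > max_h or width < min_w or len(region) < char_area:
            continue
        for y, x in region:
            out[y][x] = 1
    return out
```

The input is expected to be the binary image after the morphological step, so that each text line is already one connected region; applied to raw character blobs, the length test would reject nearly everything.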
Locating the center line only requires the upper and lower boundaries of each text line connected region, followed by the center line between them. The amount of computation is small and well suited to hardware implementation, which gives the method practical value.
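The boundary and center line computation described above is simple enough to sketch in a few lines of Python. The sketch assumes a binary mask that contains a single text line region, with row index y increasing downward; `boundaries` and `midline` are hypothetical names.

```python
def boundaries(mask):
    """For each column x that contains foreground, return (x, top_y, bottom_y)."""
    h, w = len(mask), len(mask[0])
    result = []
    for x in range(w):
        ys = [y for y in range(h) if mask[y][x]]
        if ys:
            result.append((x, ys[0], ys[-1]))  # smallest y is the upper boundary
    return result


def midline(mask):
    """Midpoint of the upper and lower boundary at each horizontal coordinate."""
    return [(x, (top + bottom) / 2) for x, top, bottom in boundaries(mask)]
```

The resulting (x, y) pairs are the points to which the cubic function of step 6) would be fitted. Each column is visited once, which is consistent with the text's remark that the computation is small.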

Claims (1)

1. A method for locating the baselines of a curved page, comprising the following steps:
1) Capture a curved-surface image of the page;
2) Apply a grayscale transformation to the curved-surface image and binarize it;
3) Based on the pixel dimensions of the image and an empirical value, delete objects in the binary image whose area is smaller than that of a punctuation mark, removing the influence of noise and yielding the preprocessed binary image;
4) Define a rectangle whose length is determined by the horizontal distance between the centers of two characters and whose width is determined by 1/2 of the character height; use this rectangle to perform morphological opening and closing operations on the binary image, joining each line of text into a single connected region;
5) Preset a height threshold based on the text line height and a length threshold based on the text line length;
6) Compute the height and width of each connected region; remove regions whose height exceeds the height threshold; remove regions whose length is below the length threshold; then delete objects whose area is smaller than one character, finally obtaining the connected region of each text line;
7) For each text line connected region obtained in 6), find the upper and lower boundaries, compute the midpoint of the upper and lower boundaries at each corresponding horizontal coordinate, and fit a cubic function to these midpoints to obtain the baseline of each text line on the curved page.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510198135.8A CN104835119A (en) 2015-04-23 2015-04-23 Method for positioning base line of bending book cover

Publications (1)

Publication Number Publication Date
CN104835119A true CN104835119A (en) 2015-08-12

Family

ID=53812989

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510198135.8A Pending CN104835119A (en) 2015-04-23 2015-04-23 Method for positioning base line of bending book cover

Country Status (1)

Country Link
CN (1) CN104835119A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021155575A1 (en) * 2020-02-07 2021-08-12 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Electric device, method of controlling electric device, and computer readable storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7024043B1 (en) * 1998-12-11 2006-04-04 Fujitsu Limited Color document image recognizing apparatus
CN101267493A (en) * 2007-03-16 2008-09-17 Fujitsu Limited Correction device and method for perspective distortion document image
CN202533964U (en) * 2012-02-10 2012-11-14 North China University of Technology Text recognition system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zeng Fanfeng et al.: "Fast correction method for warped documents based on text line reconstruction", Computer Engineering and Design *

Similar Documents

Publication Publication Date Title
CN106156761B (en) Image table detection and identification method for mobile terminal shooting
CN104809436B (en) One kind bending written recognition methods
CN111127339B (en) Method and device for correcting trapezoidal distortion of document image
CN103034856B (en) The method of character area and device in positioning image
US20120294528A1 (en) Method of Detecting and Correcting Digital Images of Books in the Book Spine Area
CN107832767A (en) Container number identification method, device and electronic equipment
Zhang et al. A unified framework for document restoration using inpainting and shape-from-shading
KR20150037374A (en) Method, apparatus and computer-readable recording medium for converting document image captured by camera to the scanned document image
CN107133929B (en) The low quality file and picture binary coding method minimized based on background estimating and energy
CN109472249A (en) A kind of method and device of determining script superiority and inferiority grade
CN103488986A (en) Method for segmenting and extracting characters in self-adaptation mode
CN104077775A (en) Shape matching method and device combined with framework feature points and shape contexts
CN110309830A (en) Inscriptions on bones or tortoise shells word automatic division method based on mathematical morphology and the connectivity of region
CN110807454A (en) Character positioning method, device and equipment based on image segmentation and storage medium
Kaundilya et al. Automated text extraction from images using OCR system
CN102737240A (en) Method of analyzing digital document images
CN111784587A (en) Invoice photo position correction method based on deep learning network
CN103914829A (en) Method for detecting edge of noisy image
Sehad et al. Gabor filters for degraded document image binarization
CN104835120A (en) Bended book cover flattening method based on datum line
CN104835119A (en) Method for positioning base line of bending book cover
CN102915429A (en) Scanning picture matching method and device
CN108335266A (en) A kind of antidote of file and picture distortion
Chakraborty et al. Marginal Noise Reduction in Historical Handwritten Documents--A Survey
Machhale et al. Implementation of number recognition using adaptive template matching and feature extraction method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by sipo to initiate substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 2015-08-12)