CN102073862B - Method for quickly calculating layout structure of document image - Google Patents


Info

Publication number
CN102073862B
Authority
CN
China
Prior art keywords
image
text
layout structure
line
character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN 201110040357
Other languages
Chinese (zh)
Other versions
CN102073862A (en)
Inventor
马磊
刘江
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANDONG SHANDA OUMA SOFTWARE CO., LTD.
Original Assignee
SHANDONG SHANDA OUMA SOFTWARE CO Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHANDONG SHANDA OUMA SOFTWARE CO Ltd filed Critical SHANDONG SHANDA OUMA SOFTWARE CO Ltd
Priority to CN 201110040357 priority Critical patent/CN102073862B/en
Publication of CN102073862A publication Critical patent/CN102073862A/en
Application granted granted Critical
Publication of CN102073862B publication Critical patent/CN102073862B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention discloses a method for quickly calculating the layout structure of a document image. The method comprises the following steps: (1) inputting an image, and performing grayscale conversion if the input image is a true color image; (2) performing horizontal gradient calculation on the input image; (3) merging the regions within and between characters of the input image; (4) marking the text lines of the input image; (5) performing target tracking and positioning on the text lines; (6) performing length filtering on the input image to obtain an image of the layout structure; and (7) outputting the image of the layout structure and a grayscale image. The layout structure calculation is simple and effective, the method has a degree of adaptability and can handle images skewed by a small angle, and tests on aliased text lines in English handwriting show that the method is robust.

Description

A fast method for calculating the layout structure of a document image
Technical field
The present invention relates to a method for calculating layout structure, and specifically to a fast method for calculating the layout structure of a document image.
Background art
The layout structure of a document carries both geometric and logical meaning; for example, a typical document image contains basic information such as titles, paragraphs, and lines. A layout structure can be regarded as a set of mutually disjoint rectangular blocks, and layout analysis computes the features of these blocks in order to describe the structural characteristics of the document image.
Basic algorithms for analyzing the page geometry of document images fall into three classes: bottom-up, top-down, and hybrid methods. Bottom-up methods use local image information and progressively merge regions that share the same attributes (characters into words, words into sentences, sentences into paragraphs) to arrive at an understanding of the page layout. They can handle documents with different layouts and documents with some skew, but their time cost is high and their region-merging rules are complex. Top-down methods start from the image as a whole and rely on its projection profile to segment the image step by step until its geometric structure is obtained. They have a degree of adaptability, and projection is simple, fast, and easy to implement, but they perform poorly on complex documents; factors that limit their effectiveness include the randomness of text line positions, the irregularity of region shapes, and the skew of the document image. Some researchers have therefore combined the two approaches to improve both adaptability and algorithm performance. The present invention adopts a bottom-up analysis of character structure while avoiding costly region-merging rules, and eliminates non-text regions by length filtering. From the top-down approach it takes the linking of text line features in order to eliminate the staircase effect of text blocks. The layout is therefore expressed as a set of continuous line segments, whose positions are abstracted as a set of points in space.
Layout analysis can be used in many areas of image processing. In document image content retrieval, matching by layout structure increases retrieval reliability. In mixed text-and-graphics segmentation, layout analysis distinguishes the image properties of different regions so that different enhancement algorithms can be applied to improve image quality and subjective perception. In OCR, recognition performance is closely tied to the accuracy of character segmentation, and layout analysis plays a very important role in character segmentation. Research on layout analysis therefore has significant research value and a practical application background.
Summary of the invention
The technical problem to be solved by the present invention is to provide a fast method for calculating the layout structure of a document image. The layout structure calculation is simple and effective, has a degree of adaptability, can handle images skewed by a small angle, and is robust.
The present invention achieves this object by the following method:
A fast method for calculating the layout structure of a document image, characterized in that it comprises the following steps:
(1) inputting an image, and performing grayscale conversion if the input image is a true color image;
(2) performing horizontal gradient calculation on the input image;
(3) merging the regions within and between characters of the input image;
(4) marking the text lines of the input image;
(5) performing target tracking and positioning on the text lines;
(6) performing length filtering on the input image to obtain the layout structure image;
(7) outputting the layout structure image and the grayscale image.
As a further limitation of this technical solution, step (2) comprises the following steps:
(2.1) Let the convolution kernel of the horizontal gradient be [formula image 001], and let [formula image 002] denote the gray value of the pixel at width coordinate [formula image 003] and height coordinate [formula image 004]. The corresponding gray value [formula image 006] of the gradient image [formula image 005] is then expressed as:
[formula image 007] (1)
(2.2) The gray-level histogram [formula image 010] of the gradient image [formula image 005] is obtained by counting the probability [formula image not extracted] with which pixels of gray value [formula image 008] occur in the image. Let the one-dimensional information entropy corresponding to gray value [formula image 008] be [formula image 011]; the calculation of the segmentation point [formula image 012] is then equivalent to:
[formula image 013] (2)
(2.3) The segmentation point corresponding to the maximum information entropy is [formula image 012]. In formula (2), [formula image 011] is computed as the information entropy of the normalized probabilities [formula image 014] before and after the segmentation point.
As a further limitation of this technical solution, step (3) comprises the following steps:
(3.1) Let the expansion factor perpendicular to the text direction be [formula image 015], the expansion factor along the text direction be [formula image 016], the average character width be [formula image 017], and the length filtering factor be [formula image 018]. The following relation then holds:
[formula image 019] (3)
(3.2) Let [formula image 020] be the image obtained by segmenting [formula image 005] according to formula (2), and let [formula image 021] be the result of merging the regions within and between characters of [formula image 020]. With [formula image 022] denoting the gray value of the pixel at width coordinate [formula image 003] and height coordinate [formula image 004] of the image, then:
[formula image 023] (4)
As a further limitation of this technical solution, step (4) comprises the following steps:
(4.1) Let the vertical gradient detection kernel be [formula image 024] and the text line marking image be [formula image 025]. Taking the upper edge as the text line mark, the following relation holds:
[formula image 026] (5)
(4.2) The text line mark uses the upper or lower edge. If a text line is interrupted, or the unevenness of the character image makes [formula image 021] jagged, then to overcome this defect [formula image 021] is repaired into [formula image 027] before [formula image 025] is computed. [formula image 027] is smoother, which helps the tracking of valid targets.
Compared with the prior art, the advantages and positive effects of the present invention are as follows. The layout structure calculation is simple and effective, has a degree of adaptability, and can handle images skewed by a small angle; tests on aliased text lines in English handwriting show that the algorithm is robust. The algorithmic complexity of the method is low: length filtering and the tracking and positioning of text line targets avoid connectivity computation, and image thinning is replaced by a vertical gradient algorithm, so the method can be used in real-time image processing systems. The method is applicable to many image processing fields, such as document image retrieval, character segmentation, document image classification, document image analysis, and text-and-graphics segmentation.
Description of drawings
Fig. 1 is the single-character image model of the present invention.
Fig. 2 is a schematic diagram of the connectivity of a single character image.
Fig. 3 is a schematic diagram of the spacing between characters.
Fig. 4 is a schematic diagram of the line spacing feature of character lines.
Fig. 5 is a schematic diagram of line block repair.
Fig. 6 is a schematic diagram of line target tracking.
Fig. 7 is a schematic diagram of length filtering of non-text regions.
Fig. 8 is the overall flowchart of the preferred embodiment of the present invention.
Fig. 9 is the page layout image of preferred embodiment one.
Fig. 10 is the grayscale-converted image of preferred embodiment one.
Fig. 11 is the line block marking image of preferred embodiment one.
Fig. 12 is the vertical gradient image of preferred embodiment one.
Fig. 13 is the layout structure image of preferred embodiment one after length filtering.
Fig. 14 is a combined comparison of the layout structure of preferred embodiment one.
Fig. 15 is the layout structure image of preferred embodiment one after skewing.
Fig. 16 is a combined comparison of the skewed layout structure of preferred embodiment one.
Fig. 17 is the document image of preferred embodiment two.
Fig. 18 is the layout structure image of preferred embodiment two.
Fig. 19 shows the staircase elimination effect for preferred embodiment two.
Fig. 20 is a combined comparison for preferred embodiment two.
Embodiment
The present invention is described in further detail below with reference to the drawings and preferred embodiments.
1. Character features of image text lines:
A document image generally means an image containing only character information, but document images encountered in practice are very complex (mixed text and graphics, calligraphy, tables, and so on), which poses a major challenge for layout analysis. Character images nevertheless have some special properties: for example, they place low demands on color resolution but high demands on spatial resolution, the opposite of natural images. Making full use of the features of text images is the key to layout analysis. The special properties of document images are mainly the following:
(1) Gradients are large in all directions
Character images carry strong edge information, and the gradient reflects edge sharpness, so a pixel with a strong edge (gradient) represents a possible text region. Expanding the text region represented by such a pixel by a certain size effectively connects the text blocks that belong to one text line. Combined with the linear nature of text lines, this motivates the horizontal gradient detection and horizontal text region expansion used here.
As shown in Fig. 1, the single-character image model is a standard circle. The pixels on each circle carry strong edge information, indicating a possible text region; when several character models are effectively linked, they form a valid text block.
(2) The parts of a single character image do not form one complete connected component
A single character in a character image, especially a Chinese character image, may consist of several separate components. When a pixel is a possible text region, the expansion size therefore needs to exceed the gaps between the components of a basic character; otherwise holes appear in the computed text block, which hampers layout analysis.
As shown in Fig. 2, when the text region marked by each pixel is expanded, the vertical expansion size should be at least greater than the number of pixels represented by d in the figure.
(3) Characters in the same text line are separated by gaps
The gap between characters is an important cue for character segmentation algorithms. In layout analysis, text region expansion connects different character regions into one valid text block; non-text regions lack this key property. This amounts to abstracting each character image as a circle, and connecting these circles yields the layout analysis result.
As shown in Fig. 3, the spacing between characters indicates that, for effective text region merging, the text region expansion size must be at least greater than the largest character spacing. Because this size is far smaller than the distance between valid gradient pixels in non-text regions, length filtering can effectively remove non-text regions.
(4) Text lines are separated by gaps
Layout structure calculation always applies a larger text region expansion along the text direction, so as to connect the components within characters and the different characters. Expansion perpendicular to the text direction causes different text lines to alias, so the text line gap is used to constrain the expansion of the text region represented by each pixel.
As shown in Fig. 4, the line spacing d is a limiting condition: if the vertical expansion of the text region exceeds this distance, two valid text lines can no longer be distinguished.
(5) Text lines are linear
The layout analysis result is expressed as a set of straight lines, and the linear nature of text lines is used to track and locate them. This feature can also be used to estimate and correct the skew angle of a document image.
(6) Characters vary widely in size, language, color, font, and other respects.
2. Layout structure calculation
Based on the above analysis of text line character features, this section describes the calculation method and the representation model of the layout structure.
2.1 Horizontal gradient calculation:
Regions with a large horizontal gradient represent possible text line regions. The threshold that decides whether a gradient in the document image counts as large is determined by the one-dimensional maximum entropy segmentation method.
Let the convolution kernel of the horizontal gradient be [formula image 001], and let [formula image 002] denote the gray value of the pixel at width coordinate [formula image 003] and height coordinate [formula image 004]. The corresponding gray value [formula image 006] of the gradient image [formula image 005] is then expressed as:
[formula image 007] (1)
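The kernel of formula (1) is not legible in this copy, so the following is only a minimal sketch of a horizontal gradient, assuming a central-difference kernel `(-1, 0, 1)`, border clamping, and an absolute value clipped to 255. The function name `horizontal_gradient` and these choices are illustrative assumptions, not the patent's definition. Images are plain lists of rows to keep the sketch dependency-free.

```python
def horizontal_gradient(gray, kernel=(-1, 0, 1)):
    """Apply a 1-D kernel along each row and return the absolute response.

    gray   : list of rows, each a list of ints in 0..255
    kernel : ASSUMED central-difference kernel; the patent's kernel is unknown
    """
    half = len(kernel) // 2
    out = []
    for row in gray:
        width = len(row)
        new_row = []
        for x in range(width):
            acc = 0
            for k, coeff in enumerate(kernel):
                # Clamp the sample position at the image border.
                xx = min(max(x + k - half, 0), width - 1)
                acc += coeff * row[xx]
            # The sign of the response is discarded, so flipping the
            # kernel (convolution vs. correlation) gives the same result.
            new_row.append(min(abs(acc), 255))
        out.append(new_row)
    return out


edge = horizontal_gradient([[0, 0, 255, 255]])
```

For the one-row example, the two pixels adjacent to the dark-to-bright transition receive the full response, while the flat ends receive zero.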
Given the complexity of document images, the gradient threshold should be chosen adaptively. The gray-level histogram [formula image 010] of the gradient image is obtained by statistics. Let the one-dimensional information entropy corresponding to gray value [formula image 008] be [formula image 011]; the calculation of the segmentation point [formula image 012] is then equivalent to:
[formula image 013] (2)
The segmentation point corresponding to the maximum information entropy is [formula image 012]. In formula (2), [formula image 011] is computed as the information entropy of the normalized probabilities [formula image 014] before and after the segmentation point.
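Formula (2) is not legible in this copy. The description (an entropy computed from probabilities normalized separately before and after the segmentation point, then maximized) matches the widely used Kapur-style maximum entropy threshold, so the sketch below implements that variant as an assumption. The function name `max_entropy_threshold` is hypothetical.

```python
import math


def max_entropy_threshold(image, levels=256):
    """Return the gray level t that maximizes H(levels <= t) + H(levels > t).

    ASSUMPTION: Kapur-style one-dimensional maximum entropy thresholding;
    the patent's exact formula (2) is not available.
    """
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1
    total = sum(hist)

    best_t, best_h = 0, float("-inf")
    low = 0  # number of pixels with gray level <= t
    for t in range(levels - 1):
        low += hist[t]
        high = total - low
        if low == 0 or high == 0:
            continue  # one side is empty, entropy is undefined
        # Entropy of each side, using probabilities normalized per side.
        h_low = -sum((c / low) * math.log(c / low) for c in hist[: t + 1] if c)
        h_high = -sum((c / high) * math.log(c / high) for c in hist[t + 1:] if c)
        if h_low + h_high > best_h:
            best_h, best_t = h_low + h_high, t
    return best_t


# Two clusters of gradient magnitudes: weak (0, 1) and strong (200, 201).
t = max_entropy_threshold([[0, 0, 1, 1], [200, 200, 201, 201]])
binary = [[1 if v > t else 0 for v in row]
          for row in [[0, 0, 1, 1], [200, 200, 201, 201]]]
```

Pixels whose gradient exceeds the returned threshold are treated as candidate text pixels in the later steps.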
2.2 Merging regions within and between characters:
Computing layout features requires merging possible text line regions. Experiments also showed that character spacing and the spacing between text lines are hard to estimate. Text line spacing is generally taken to be at least 2 pixels, so the expansion factor perpendicular to the text direction is set to 2, which ensures that text lines are not confused with one another in the layout. The gaps within a single character and between characters are usually smaller than the character width, so the expansion factor along the text direction should be greater than the character width and is usually chosen as twice the character width; too large an expansion factor produces errors in the detected line block length. Once the expansion factor along the text direction has been fixed, the effective filter length equals that expansion factor. Let the expansion factor perpendicular to the text direction be [formula image 015], the expansion factor along the text direction be [formula image 016], the average character width be [formula image 017], and the length filtering factor be [formula image 018]. The following relation then holds:
[formula image 019] (3)
The only quantity that needs to be determined is the average character width. In practice, a suitably chosen character width satisfies most application needs, because most character spacings are smaller than the character width. Once two characters have been successfully connected, the block length is about 3 times the character width, which suits length filtering well; this is also the important difference between text and non-text regions.
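Formula (3) itself is not legible in this copy, but the paragraph above states the relations in words. A hedged reading, using the hypothetical symbols $e_v$ (expansion factor perpendicular to the text direction), $e_h$ (expansion factor along the text direction), $w$ (average character width), and $L_f$ (length filtering factor), none of which appear in the source, would be:

```latex
\begin{aligned}
e_v &= 2, \\
e_h &= 2w, \\
L_f &= e_h .
\end{aligned}
```

Under this reading, two connected characters of width $w$ separated by a gap $g < w$ already span $2w + g$ before any expansion, which lies between $2w$ and $3w$ and therefore exceeds $L_f = 2w$. This is consistent with the statement that a connected pair of characters passes the length filter.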
Let [formula image 020] be the image obtained by segmenting [formula image 005] according to formula (2), and let [formula image 021] be the result of merging the regions within and between characters of [formula image 020]. With [formula image 022] denoting the gray value of the pixel at width coordinate [formula image 003] and height coordinate [formula image 004] of the image, then:
[formula image 023] (4)
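Formula (4) is not legible in this copy. One plausible way to realize the merging step is a rectangular expansion of every candidate text pixel, sketched below. The anchoring of the rectangle (extending right and down from the pixel) and the function name `merge_regions` are assumptions made for illustration; a centered window would be an equally valid reading.

```python
def merge_regions(binary, expand_h, expand_v=2):
    """Expand each text pixel into an expand_h x expand_v rectangle.

    binary   : list of rows of 0/1 values (1 = candidate text pixel)
    expand_h : expansion along the text direction, e.g. twice the
               average character width
    expand_v : expansion perpendicular to the text direction (2 in the text)
    ASSUMPTION: the rectangle extends right and down from the pixel.
    """
    height = len(binary)
    width = len(binary[0]) if height else 0
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if binary[y][x]:
                for yy in range(y, min(y + expand_v, height)):
                    for xx in range(x, min(x + expand_h, width)):
                        out[yy][xx] = 1
    return out


merged = merge_regions(
    [[1, 0, 0, 1, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0, 0, 0]],
    expand_h=3,
)
```

In the example, two pixels three columns apart are joined into one continuous run, while the third row stays empty because the vertical expansion is limited to 2.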
2.3 Text line marking method
The text line marking algorithm reduces the layout information to single-pixel width, which simplifies subsequent processing. [formula image 021] is a binary image, so its upper or lower edge alone suffices to describe the position and length of a text line.
Let the vertical gradient detection kernel be [formula image 024] and the text line marking image be [formula image 025]. Taking the upper edge as the text line mark, the following relation holds:
[formula image 026] (5)
The text line mark uses the upper or lower edge. If a text line is interrupted, or the unevenness of the character image makes [formula image 021] jagged, then to overcome this defect [formula image 021] is repaired into [formula image 027] before [formula image 025] is computed. [formula image 027] is smoother, which helps the tracking of valid targets. As shown in Fig. 5, the numbers 1 to 9 each denote a pixel and black shading denotes the text region; when a step appears, position 4 or 6 is filled in as text region.
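Formula (5) and its kernel are not legible in this copy. For a binary image, taking the upper edge amounts to keeping each text pixel whose neighbor directly above is background, which is what a simple vertical difference detects. The sketch below assumes exactly that; the function name `mark_upper_edge` is hypothetical, and the step repair of Fig. 5 is not covered.

```python
def mark_upper_edge(binary):
    """Keep only text pixels whose upper neighbor is background.

    binary : list of rows of 0/1 values (the merged text-region image)
    ASSUMPTION: stands in for the vertical gradient of formula (5).
    """
    marked = []
    for y, row in enumerate(binary):
        marked.append([
            1 if value and (y == 0 or not binary[y - 1][x]) else 0
            for x, value in enumerate(row)
        ])
    return marked


edge = mark_upper_edge([[0, 0, 0],
                        [1, 1, 0],
                        [1, 1, 1]])
```

The example block has a step at its right end, so the marked edge drops by one row there. This is the kind of staircase that the repair described above is meant to reduce.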
2.4 Text line target tracking and positioning
To obtain a valid text line together with its length and position, the line tracking algorithm examines the neighboring pixels in three directions. Eliminating steps makes the tracking and positioning of text line targets very accurate. The present invention uses the number of pixels on the continuous curve to represent the length of a text line.
As shown in Fig. 6, target marking examines the pixel states in 3 directions. Because layout analysis of text lines uses the upper or lower edge of the text line block, the layout structure feature is guaranteed to be one pixel wide, which makes the marking algorithm simple and effective: it only needs to check whether any of the three neighboring directions of the current pixel contains a layout structure pixel. The result of the marking algorithm describes the basic features of a line with a five-tuple [formula image 028]:
[formula image 029] (6)
In the five-tuple, X_min is the minimum width coordinate of the bounding rectangle of the line, X_max is its maximum width coordinate, y_min is its minimum height coordinate, y_max is its maximum height coordinate, and p_total is the effective length of the line.
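The tracking step can be sketched as follows. The text says three neighboring directions are examined but does not name them in this copy; the sketch assumes they are right, upper-right, and lower-right, and scans columns from left to right so that each trace starts at the left end of a line. The function name `track_lines` is hypothetical.

```python
def track_lines(marked):
    """Trace single-pixel-wide lines and return one five-tuple per line.

    marked : list of rows of 0/1 values (the text line marking image)
    Returns a list of (x_min, x_max, y_min, y_max, p_total).
    ASSUMPTION: the three directions are right, upper-right, lower-right.
    """
    height = len(marked)
    width = len(marked[0]) if height else 0
    visited = [[False] * width for _ in range(height)]
    lines = []
    for x in range(width):          # column-major scan finds left ends first
        for y in range(height):
            if not marked[y][x] or visited[y][x]:
                continue
            cx, cy = x, y
            x_min = x_max = x
            y_min = y_max = y
            total = 0
            while True:
                visited[cy][cx] = True
                total += 1
                x_max = max(x_max, cx)
                y_min, y_max = min(y_min, cy), max(y_max, cy)
                step = None
                for dy in (0, -1, 1):   # right, upper-right, lower-right
                    nx, ny = cx + 1, cy + dy
                    if (nx < width and 0 <= ny < height
                            and marked[ny][nx] and not visited[ny][nx]):
                        step = (nx, ny)
                        break
                if step is None:
                    break
                cx, cy = step
            lines.append((x_min, x_max, y_min, y_max, total))
    return lines


demo = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],   # line 1 starts here ...
    [0, 0, 0, 0, 1, 1],   # ... and steps down by one row
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],   # line 2
]
lines = track_lines(demo)
```

Because `p_total` is accumulated during the trace, a later length filter on whole lines only needs to compare the fifth element of each tuple with the filter length.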
2.5 Length filtering
Length filtering takes place after the regions within and between characters have been merged. Its purpose is to filter out non-text regions. The basic idea is to check whether the run on both sides of a possible text pixel satisfies the length requirement; the length associated with the pixel is computed by target tracking. If this length exceeds the filter length, the computation stops and the pixel is taken to be a true text pixel; otherwise it is a non-text pixel.
As shown in Fig. 7, "*" marks a possible text region (the current target) and white pixels denote non-text regions. There are four text pixels on each of the left and right sides of the current target, so the effective length of the current pixel is 8. In fact, every text pixel shown in the figure has an effective length of 8; if the filter length parameter is 5, all the text pixels shown are kept.
Text line features are described by formula (6). Depending on the application, length filtering may be applied again; since the length of each text line has already been computed during feature description, small targets can be filtered out very easily by length.
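The Fig. 7 example can be reproduced with a minimal sketch, assuming the effective length of a pixel is the number of contiguous text pixels to its left and right in the same row (the pixel itself not counted, which gives 8 for a 9-pixel run). The function name `length_filter` and the row-wise simplification are assumptions; the patent computes the length by target tracking.

```python
def length_filter(binary, filter_len):
    """Keep text pixels whose horizontal run has effective length > filter_len.

    binary     : list of rows of 0/1 values
    filter_len : filter length parameter
    ASSUMPTION: effective length = run length - 1, measured along the row.
    """
    out = []
    for row in binary:
        new_row = [0] * len(row)
        x = 0
        while x < len(row):
            if not row[x]:
                x += 1
                continue
            start = x
            while x < len(row) and row[x]:
                x += 1
            # Every pixel in a run shares the same effective length.
            if (x - start) - 1 > filter_len:
                for i in range(start, x):
                    new_row[i] = 1
        out.append(new_row)
    return out


# A 9-pixel run (effective length 8) and a 3-pixel run (effective length 2).
row = [0] + [1] * 9 + [0, 0] + [1] * 3 + [0]
kept = length_filter([row], filter_len=5)
```

With a filter length of 5, the 9-pixel run is kept, as in the Fig. 7 description, and the short run is discarded as non-text.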
3. Overall flowchart of the present invention
The algorithm avoids preprocessing the document image and uses one-dimensional maximum entropy segmentation of the gradient image to determine the segmentation point between text and non-text regions. The layout analysis takes full account of the basic features that distinguish document images from natural images: it establishes a single-character model and considers intra-character spacing, character spacing, and line spacing. After length filtering, very good results are obtained.
As shown in Fig. 8, if the input image is a true color image, grayscale conversion is performed first. Horizontal gradient calculation is then applied to the image. After line block marking, vertical gradient calculation is applied; it uses the same formula as the horizontal gradient calculation and is not described again here. Target tracking then marks the image, length filtering is optionally applied, and steps are eliminated. To make it easy to compare the effect of the layout structure calculation, the output image contains both the layout structure and the grayscale image.
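The first step of the flow, grayscale conversion, is not specified further in the text. A minimal sketch, assuming the common ITU-R BT.601 luma weights (0.299, 0.587, 0.114) and RGB pixels given as tuples, is shown below; both the weights and the function name `to_gray` are assumptions rather than the patent's stated method.

```python
def to_gray(rgb_image):
    """Convert a true color image to grayscale.

    rgb_image : list of rows, each a list of (r, g, b) tuples in 0..255
    ASSUMPTION: BT.601 luma weights; the patent does not state the formula.
    """
    return [
        [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
        for row in rgb_image
    ]


gray = to_gray([[(255, 255, 255), (0, 0, 0), (255, 0, 0)]])
```

The resulting grayscale image is the input to the horizontal gradient step and is also kept so that the final layout structure can be shown alongside it for comparison.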
Embodiment one
To test the effectiveness of the algorithm, a Sohu web page with a relatively complex layout structure was chosen as the test sample. The parameters were initialized as:
[formula image 030] (7)
(1) Layout analysis of a mixed text-and-graphics image: see Figs. 9 to 14.
(2) Effect of image skew on the algorithm
See Figs. 15 and 16. To test the adaptability of the algorithm, the original image was rotated by a certain angle (5 degrees), using bilinear interpolation for the rotation. Viewed along the horizontal direction, long lines become aliased. Bilinear interpolation tends to blur boundaries, and skew produces more staircase steps along straight lines, but the test case shows that this has little effect on length filtering. The experiment shows that the method is reasonably stable for images with a small skew angle. When the skew angle is too large, the angle can be estimated and the image corrected before layout analysis is performed. Skew angle estimation based on document image content already has some research basis; morphological methods combined with the Hough transform can effectively improve the reliability and accuracy of skew angle calculation.
Embodiment two
Layout analysis of English handwriting
See Figs. 17 to 20. English handwritten document images have some characteristic features: for example, adjacent lines often alias, and the staircase effect in line blocks is especially pronounced. The experiment shows that the algorithm is robust for the layout analysis of such images and that the effect is significant.
In summary, the layout structure calculation of the present invention is simple and effective, has a degree of adaptability, and can handle images skewed by a small angle; tests on aliased text lines in English handwriting show that the invention is robust. The algorithmic complexity of the method is low: length filtering and the tracking and positioning of text line targets avoid connectivity computation, and image thinning is replaced by a vertical gradient algorithm, so the method can be used in real-time image processing systems. The method is applicable to many image processing fields, such as document image retrieval, character segmentation, document image classification, document image analysis, and text-and-graphics segmentation.

Claims (3)

1. A fast method for calculating the layout structure of a document image, characterized in that it comprises the following steps:
(1) inputting an image, and performing grayscale conversion if the input image is a true color image;
(2) performing horizontal gradient calculation on the input image;
(3) merging the regions within and between characters of the input image;
(4) marking the text lines of the input image;
(5) performing target tracking and positioning on the text lines;
(6) performing length filtering on the input image to obtain the layout structure image;
(7) outputting the layout structure image and the grayscale image.
2. The method for calculating the layout structure of a document image according to claim 1, characterized in that step (2) comprises the following steps:
(2.1) Let the convolution kernel of the horizontal gradient be [formula image 001], and let [formula image 002] denote the gray value of the pixel at width coordinate [formula image 003] and height coordinate [formula image 004]. The corresponding gray value [formula image 006] of the gradient image [formula image 005] is then expressed as:
[formula image 007] (1)
(2.2) The gray-level histogram [formula image 010] of the gradient image [formula image 005] is obtained by counting the probability [formula image not extracted] with which pixels of gray value [formula image 008] occur in the image. Let the one-dimensional information entropy corresponding to gray value [formula image 008] be [formula image 011]; the calculation of the segmentation point [formula image 012] is then equivalent to:
[formula image 013] (2)
(2.3) The segmentation point corresponding to the maximum information entropy is [formula image 012]. In formula (2), [formula image 011] is computed as the information entropy of the normalized probabilities [formula image 014] before and after the segmentation point.
3. The method for calculating the layout structure of a document image according to claim 2, characterized in that step (3) comprises the following steps:
(3.1) Let the expansion factor perpendicular to the text direction be [formula image 015], the expansion factor along the text direction be [formula image 016], the average character width be [formula image 017], and the length filtering factor be [formula image 018]. The following relation then holds:
[formula image 019] (3)
(3.2) Let [formula image 020] be the image obtained by segmenting [formula image 005] according to formula (2), and let [formula image 021] be the result of merging the regions within and between characters of [formula image 020]. With [formula image 022] denoting the gray value of the pixel at width coordinate [formula image 003] and height coordinate [formula image 004] of the image, then:
[formula image 023] (4).
4. The document image layout structure calculation method according to claim 2, characterized in that step (4) comprises the following steps:
(4.1) Denote the vertical gradient detection kernel as [image 024] and the text line marker image as [image 025]; taking the upper edge as the text line marker, the following relation holds:
[formula image 026] (5)
(4.2) The text line marker uses the upper edge or the lower edge. To guard against interrupted text lines, or fluctuations in [image 021] caused by the undulation of the character image, [symbol missing] is corrected to [image 027] before [symbol missing] is calculated; [image 027] has a better smoothing effect, which facilitates effective target tracking.
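The upper-edge marking in step (4.1) can be illustrated with a vertical difference: a pixel is marked when it is foreground and the pixel directly above it is background. This is a sketch under assumptions, since formula (5) and the actual kernel are not recoverable here; the function name `mark_upper_edges` and the two-tap kernel [-1, +1] are hypothetical, and the smoothing correction of step (4.2) is not sketched because the claim does not state its form.

```python
def mark_upper_edges(merged):
    """Mark the upper edge of each foreground region in a 0/1 image.

    Applies an assumed vertical difference kernel [-1, +1] column by column:
    the response merged[y][x] - merged[y-1][x] is positive exactly where a
    foreground run begins when scanning downward. Pixels above the first
    row are treated as background.
    """
    rows, cols = len(merged), len(merged[0])
    marks = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            above = merged[y - 1][x] if y > 0 else 0
            if merged[y][x] - above > 0:
                marks[y][x] = 1
    return marks


# A two-pixel-tall merged text line yields a one-pixel-tall marker on its top row.
line_image = [
    [0, 0, 0],
    [1, 1, 1],
    [1, 1, 1],
    [0, 0, 0],
]
markers = mark_upper_edges(line_image)
```

Marking the lower edge instead only requires comparing each pixel with the one below it.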
CN 201110040357 2011-02-18 2011-02-18 Method for quickly calculating layout structure of document image Active CN102073862B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110040357 CN102073862B (en) 2011-02-18 2011-02-18 Method for quickly calculating layout structure of document image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201110040357 CN102073862B (en) 2011-02-18 2011-02-18 Method for quickly calculating layout structure of document image

Publications (2)

Publication Number Publication Date
CN102073862A CN102073862A (en) 2011-05-25
CN102073862B true CN102073862B (en) 2013-04-17

Family

ID=44032396

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110040357 Active CN102073862B (en) 2011-02-18 2011-02-18 Method for quickly calculating layout structure of document image

Country Status (1)

Country Link
CN (1) CN102073862B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108804978B (en) * 2017-04-28 2022-04-12 腾讯科技(深圳)有限公司 Layout analysis method and device
CN107943780B (en) * 2017-12-18 2021-07-06 科大讯飞股份有限公司 Layout column dividing method and device
CN109993161B (en) * 2019-02-25 2021-08-03 众安信息技术服务有限公司 Text image rotation correction method and system
CN110490904B (en) * 2019-08-12 2022-11-11 中国科学院光电技术研究所 Weak and small target detection and tracking method
CN111539412B (en) * 2020-04-21 2021-02-26 上海云从企业发展有限公司 Image analysis method, system, device and medium based on OCR

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5848184A (en) * 1993-03-15 1998-12-08 Unisys Corporation Document page analyzer and method
CN100568221C (en) * 2004-11-22 2009-12-09 北京北大方正技术研究院有限公司 Method for recovering the text reading order of a newspaper layout

Also Published As

Publication number Publication date
CN102073862A (en) 2011-05-25

Similar Documents

Publication Publication Date Title
CN105184292B (en) The structural analysis of handwritten form mathematical formulae and recognition methods in natural scene image
Farooq et al. Pre-processing methods for handwritten Arabic documents
Das et al. A fast algorithm for skew detection of document images using morphology
CN102073862B (en) Method for quickly calculating layout structure of document image
CN104408455B (en) Conglutination segmentation method
CN102629322B (en) Character feature extraction method based on stroke shape of boundary point and application thereof
CN104376318A (en) Removal of underlines and table lines in document images while preserving intersecting character strokes
KR20110057536A (en) Character recognition device and control method thereof
US20110305387A1 (en) Method and system for preprocessing an image for optical character recognition
KR20150017755A (en) Form recognition method and device
CN103336961B (en) A kind of interactively natural scene Method for text detection
US20110280477A1 (en) Method and system for preprocessing an image for optical character recognition
CN101777124A (en) Method for extracting video text message and device thereof
CN107944451B (en) Line segmentation method and system for ancient Tibetan book documents
CN101673338A (en) Fuzzy license plate identification method based on multi-angle projection
CN102332097B (en) Method for segmenting complex background text images based on image segmentation
CN104574401A (en) Image registration method based on parallel line matching
CN111353961B (en) Document curved surface correction method and device
Nagabhushan et al. Tracing and straightening the baseline in handwritten persian/arabic text-line: A new approach based on painting-technique
Mello et al. Automatic image segmentation of old topographic maps and floor plans
CN103226824A (en) Video retargeting system for maintaining visual saliency
CN103377379A (en) Text detection device and method and text information extraction system and method
Li An effective approach to offline arabic handwriting recognition
Ziaratban et al. An adaptive script-independent block-based text line extraction
Cao et al. A stroke regeneration method for cleaning rule-lines in handwritten document images

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C56 Change in the name or address of the patentee
CP03 Change of name, title or address

Address after: No. 128 Bole Road, High-tech Zone, Jinan, Shandong 250100

Patentee after: SHANDONG SHANDA OUMA SOFTWARE CO., LTD.

Address before: No. 1318 Tianchen Road, High-tech Zone, Jinan, Shandong Province 250100

Patentee before: Shandong Shanda Ouma Software Co., Ltd.