CN106909897A

CN106909897A - A fast method for detecting inverted text images

Info

Publication number: CN106909897A
Application number: CN201710090240.9A
Authority: CN
Inventors: 王建; 庞彦伟
Original assignee: Tianjin University
Current assignee: Tianjin University
Priority date: 2017-02-20
Filing date: 2017-02-20
Publication date: 2017-06-30
Anticipated expiration: 2037-02-20
Also published as: CN106909897B

Abstract

The invention relates to a fast method for detecting inverted text images. The input text image is preprocessed to obtain a binarization result B; valid text lines are detected to obtain a valid text line sequence; and the text lines are classified as follows: 1) for each valid text line s in the valid text line sequence, the gaps between adjacent characters of s are filled; 2) the projection of each valid text line s in the vertical direction is computed and denoted V(c), where c is the column index; 3) the left and right boundaries of valid text line s are obtained; 4) the left and right boundaries of the valid text line sequence are obtained; 5) each line is judged to be a "left-indented text line", a "right-indented text line", or a "non-indented text line". Finally, the text image is tested for inversion.

Description

A fast method for detecting inverted text images

Technical field

The invention relates to text image enhancement, and in particular to detecting whether a scanned text image is upside down.

Background

With the continuing development of computer technology, text image digitization based on OCR (optical character recognition) has come into wide use. In OCR, the orientation of the text in the image is critical to recognition performance. If the text is skewed and not corrected, the recognition rate drops severely, particularly when the text is inverted (that is, rotated about 180° from the normal orientation). Therefore, before OCR is performed, it is necessary to determine whether the text image is inverted; an inverted image should first be rotated so that the subsequent recognition proceeds normally.

For a skewed text image, existing correction algorithms can detect the skew angle and correct it accordingly. However, most existing deskewing methods assume that the skew of the input text image lies within a limited range: they first estimate the skew angle and then correct it. When the input text image is fully inverted, existing skew-angle detection methods largely fail. Zeng Fanfeng et al. proposed a fast method for detecting inverted text images based on punctuation marks. That method first detects the text characters; then, using the structural features of Chinese characters and punctuation marks, it picks out the punctuation marks in the text image and determines their types from their pixel distribution; finally, it uses punctuation usage conventions to decide whether the Chinese text image is inverted. Zhu Min et al. (patent publication No. CN102831421A) proposed a method for detecting the up-down orientation of text based on punctuation marks. That patent determines the text orientation from the position of punctuation marks relative to the text line, and its basic idea is similar to the method of Zeng Fanfeng et al. Methods of this kind rely entirely on punctuation features and do not work for text images that contain few punctuation marks, so their range of application is limited and they lack generality.

Summary of the invention

The object of the invention is to overcome the above deficiencies of the prior art and to provide a fast method for detecting whether a text image is inverted. The technical scheme is as follows:

A fast method for detecting inverted text images, comprising the following steps:

Step 1: Preprocess the input text image to obtain the binarization result B;

Step 2: Detect valid text lines to obtain a valid text line sequence;

Step 3: Classify the text lines as follows:

1) For each valid text line s in the valid text line sequence, apply a dilation with a rectangular structuring element to fill the gaps between adjacent characters of s;

2) Compute the projection of each valid text line s in the vertical direction, denoted V(c), where c is the column index;

3) Collect the values of c that satisfy V(c) > 0.5 × R_hei(s). The minimum such c is denoted c_min and called the left boundary of valid text line s; the maximum is denoted c_max and called the right boundary of valid text line s. The length of the line is R_leg = c_max - c_min;

4) For each valid text line m in the same valid text line sequence, obtain c_min(m) and c_max(m). The minimum of c_min(m) is called the left boundary of the valid text line sequence and denoted c_lef; the maximum of c_max(m) is called the right boundary of the valid text line sequence and denoted c_rgt;

5) For a valid text line m, if 0.6 < |c_min(m) - c_lef| / |c_max(m) - c_min(m)| < 0.9, line m is judged to be a "left-indented text line"; if 0.6 < |c_rgt - c_max(m)| / (c_max(m) - c_min(m)) < 0.9, line m is judged to be a "right-indented text line"; if neither condition holds, the line is judged to be a "non-indented text line";

Step 4: Detect whether the text image is inverted, as follows:

Count the left-indented and right-indented text lines in a single text image, denoted N_lef and N_rgt respectively, and determine whether the text image is inverted using the following formula:

Preferably, the method for second step is as follows：

1) each row projection value in the horizontal direction in B is calculated, is represented with H (r), wherein r represents line number sequence number；

2) maximum of H (r) is calculated, H is used_maxRepresent；

3) for r scan lines, if meeting H (r)>0.5×H_max, then the row is judged to an active line；

4) distribution situation of each active line is counted, if detecting continuous m rows is judged to active line, and is met m>M/100, then constitute an effective line of text sequence by this continuous m active line；

Determine the line number of the top and bottom active line in effective line of text sequence, use R_top(s) and R_bot(s) The up-and-down boundary of effective line of text sequence is represented respectively, and the height for defining effective line of text sequence is R_hei(s)=| R_top (s)-R_bot(s) |, symbol | | the symbol that takes absolute value is represented, s is the sequence number of effective line of text in formula.

Brief description of the drawings

Fig. 1 is a flow chart of the proposed method.

Fig. 2 is a schematic diagram of the key definitions used in the invention.

Fig. 3 is a schematic diagram of the text line types defined in the invention.

Fig. 4 is a schematic diagram of the numbers of left-indented and right-indented text lines in the text images used in the experiments.

Detailed description of the embodiments

The input color text image is first preprocessed by grayscale conversion, bilateral filtering, contrast enhancement, and binarization to improve the visual quality of the document image. Valid text lines in the text image are then detected by horizontal projection analysis and classified according to their position and length. Finally, whether the text image is inverted is determined from the relative numbers of left-indented and right-indented text lines. Fig. 1 shows a block diagram of the proposed method.

Some useful definitions are given first. A text image consists of multiple paragraphs, and the characters within each paragraph have essentially the same font, format, and other features. In the invention, the leftmost and rightmost positions at which a character can appear in a paragraph are called the "paragraph left boundary" and the "paragraph right boundary"; Fig. 2 illustrates both. Each paragraph may contain one or more text lines. For any text line, the left edge of its leftmost character and the right edge of its rightmost character are called the "line left boundary" and the "line right boundary" of that text line; Fig. 2 also illustrates these.

If the left and right boundaries of a text line essentially coincide with those of the paragraph it belongs to, the line is called a "full text line". If the left boundary of a text line lies 2 to 4 characters from the left boundary of its paragraph while its right boundary essentially coincides with the paragraph right boundary, the line is called a "left-indented text line". If the right boundary of a text line lies 2 to 4 characters from the right boundary of its paragraph while its left boundary essentially coincides with the paragraph left boundary, the line is called a "right-indented text line". Fig. 3 illustrates these three types of text line.

According to Chinese and English writing conventions, the first line of each paragraph is usually indented to the right by 2 to 4 characters, so a paragraph containing two or more text lines normally contains one left-indented text line. If the text image is upright, multiple left-indented text lines will be detected; conversely, if the text image is inverted, multiple right-indented text lines will be detected. The invention determines whether a text image is inverted by detecting and comparing the numbers of left-indented and right-indented text lines in it.
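The reasoning above rests on the fact that a 180° rotation mirrors every line horizontally, so a gap on the left becomes a gap on the right. The following is a minimal sketch of that observation on a toy binary image; the helper names `rotate_180` and `margins` and the sample `page` are illustrative assumptions, not part of the patent.

```python
from typing import List, Tuple

def rotate_180(img: List[List[int]]) -> List[List[int]]:
    """Rotate a binary image by 180 degrees (reverse the rows and each row)."""
    return [row[::-1] for row in img[::-1]]

def margins(row: List[int]) -> Tuple[int, int]:
    """Return (left gap, right gap) of a row containing at least one text pixel."""
    cols = [c for c, v in enumerate(row) if v]
    return cols[0], len(row) - 1 - cols[-1]

# Toy page: 1 = text, 0 = background (hypothetical data).
page = [
    [0, 0, 1, 1, 1, 1, 1, 1],  # first line of a paragraph, indented by 2
    [1, 1, 1, 1, 1, 1, 1, 1],  # full line
    [1, 1, 1, 1, 1, 1, 1, 1],  # full line
]
flipped = rotate_180(page)

print(margins(page[0]))      # gaps of the indented line in the upright page
print(margins(flipped[-1]))  # gaps of the same line after rotation
```

In the upright page the indented line has its gap on the left; after rotation the same line ends up at the bottom with its gap on the right.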

The proposed method comprises four main steps: preprocessing, text line detection, text line classification, and text orientation inversion detection.

1. Preprocessing

The purpose of preprocessing is to improve the visual quality of the document image. It mainly comprises grayscale conversion, smoothing filtering, contrast enhancement, and binarization.

(1) Grayscale conversion

Determine whether the input text image is a grayscale image. If it is, leave it unchanged. If it is a color image, let C_R, C_G, and C_B denote the red, green, and blue color channels; the grayscale image, denoted I, is computed with formula (1).

I(x, y) = min{C_R(x, y), C_G(x, y), C_B(x, y)}    (1)

where x = 0, 1, 2, ..., M-1 and y = 0, 1, 2, ..., N-1, and M and N are the height and width of the text image, that is, the total number of scan rows and the total number of scan columns.
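Formula (1) can be sketched in a few lines of plain Python. This is only an illustration: the function name `to_gray_min`, the nested-list image representation, and the sample pixels are assumptions made for the example.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255

def to_gray_min(rgb: List[List[Pixel]]) -> List[List[int]]:
    """Formula (1): I(x, y) is the minimum of the three color channels."""
    return [[min(pixel) for pixel in row] for row in rgb]

# Hypothetical 2 x 2 color image.
image = [
    [(255, 255, 255), (20, 30, 200)],
    [(10, 10, 10), (250, 240, 90)],
]
gray = to_gray_min(image)
print(gray)
```

Taking the channel minimum keeps colored ink dark in the grayscale result, since a saturated color has at least one low channel.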

(2) Smoothing filtering

Because text images are contaminated by noise during acquisition and digitization, the grayscale image I is filtered with a bilateral filter to reduce the influence of noise. The image after bilateral filtering is denoted G.

(3) Contrast enhancement

Because of illumination and other factors, the contrast of the text image may be low. The filtered image G is enhanced by histogram equalization, and the result is denoted E.
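The text does not say which variant of histogram equalization is used. The sketch below assumes the textbook mapping through the cumulative histogram for 8-bit images; the function name `equalize` and the sample data are illustrative.

```python
from typing import List

def equalize(gray: List[List[int]], levels: int = 256) -> List[List[int]]:
    """Textbook histogram equalization via the cumulative distribution (assumed variant)."""
    hist = [0] * levels
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    cdf, running = [], 0
    for h in hist:
        running += h
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:  # constant image: nothing to stretch
        return [row[:] for row in gray]
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[v] for v in row] for row in gray]

# Hypothetical low-contrast image G.
G = [[100, 100], [110, 120]]
E = equalize(G)
print(E)
```

The darkest gray level present maps to 0 and the brightest to 255, which stretches a narrow intensity range over the full scale.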

(4) Binarization

The global threshold of E is computed with the classical Otsu method and denoted T_h. E is binarized with T_h, and the result is denoted B, computed as follows:

where a point with value 1 in B is a text point and a point with value 0 is a background point.
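The binarization formula itself is not reproduced in this text. The sketch below shows one plausible reading: Otsu's threshold is found by maximizing the between-class variance, and pixels at or below the threshold are marked as text. The polarity (dark ink on a light background) and the names `otsu_threshold` and `binarize` are assumptions.

```python
from typing import List

def otsu_threshold(gray: List[List[int]], levels: int = 256) -> int:
    """Return the threshold T_h that maximizes the between-class variance."""
    hist = [0] * levels
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]          # pixels with value <= t
        sum0 += t * hist[t]
        w1 = total - w0        # pixels with value > t
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def binarize(gray: List[List[int]], t: int) -> List[List[int]]:
    """Assumption: text is darker than the background, so values <= t become 1."""
    return [[1 if v <= t else 0 for v in row] for row in gray]

# Hypothetical enhanced image E with dark text pixels and a light background.
E = [[10, 12, 200], [11, 210, 205]]
T_h = otsu_threshold(E)
B = binarize(E, T_h)
print(T_h, B)
```

If the scanned pages had light text on a dark background, the comparison in `binarize` would need to be reversed.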

2. Valid text line detection

Valid text lines are detected with the following algorithm:

Valid text line detection algorithm:

1) Compute the projection of each row of B in the horizontal direction, denoted H(r), where r is the row index.

2) Compute the maximum of H(r), denoted H_max.

3) For scan row r, if H(r) > 0.5 × H_max, the row is judged to be a valid row.

4) Examine the distribution of the valid rows: if m consecutive rows are judged to be valid rows and m > M/100, where M is the image height, these m consecutive valid rows form one valid text line.

5) Determine the row indices of the topmost and bottommost valid rows in the valid text line. R_top(s) and R_bot(s) denote the upper and lower boundaries of the text line, and its height is defined as R_hei(s) = |R_top(s) - R_bot(s)|, where | | denotes the absolute value and s is the index of the valid text line.
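The five steps above can be sketched as follows. This is a minimal illustration under the stated thresholds; the function name `detect_text_lines`, the list-of-lists image format, and the sample image are assumptions.

```python
from typing import List, Tuple

def detect_text_lines(B: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (R_top, R_bot) row indices for each valid text line in binary image B."""
    M = len(B)                                   # image height in rows
    H = [sum(row) for row in B]                  # horizontal projection H(r)
    h_max = max(H)
    valid = [h > 0.5 * h_max for h in H]         # valid rows
    lines: List[Tuple[int, int]] = []
    start = None
    for r in range(M + 1):                       # one extra pass closes a trailing run
        is_valid = r < M and valid[r]
        if is_valid and start is None:
            start = r
        elif not is_valid and start is not None:
            m = r - start                        # length of the run of valid rows
            if m > M / 100:
                lines.append((start, r - 1))
            start = None
    return lines

# Hypothetical 10 x 8 binary image with two text lines and one sparse row.
B = [[0] * 8 for _ in range(10)]
for r in (1, 2):
    B[r] = [1] * 8
for r in (5, 6, 7):
    B[r] = [1] * 6 + [0] * 2
B[8] = [1] * 3 + [0] * 5                         # below 0.5 * H_max, so not valid

lines = detect_text_lines(B)
heights = [abs(top - bot) for top, bot in lines]  # R_hei(s) as defined in the text
print(lines, heights)
```

With R_hei(s) defined as |R_top(s) - R_bot(s)|, a line made of m valid rows has height m - 1.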

3. Text line classification

Text lines are classified with the following algorithm:

Text line classification algorithm:

1) For a valid text line s, apply a dilation with a rectangular structuring element to the line to fill the gaps between its adjacent characters. The rectangular structuring element is 2 pixels high, and its width is 50% of the height of the text line.

2) Compute the projection of the text line in the vertical direction, denoted V(c), where c is the column index.

3) Collect the values of c that satisfy V(c) > 0.5 × R_hei(s). The minimum such c is denoted c_min and called the left boundary of the text line; the maximum is denoted c_max and called the right boundary of the text line. The length of the line is R_leg = c_max - c_min.

4) For each valid text line m in the same paragraph, obtain c_min(m) and c_max(m). The minimum of c_min(m) is called the left boundary of the paragraph and denoted c_lef; the maximum of c_max(m) is called the right boundary of the paragraph and denoted c_rgt.

5) For a valid text line m, if 0.6 < |c_min(m) - c_lef| / |c_max(m) - c_min(m)| < 0.9, the line is judged to be a "left-indented text line"; if 0.6 < |c_rgt - c_max(m)| / (c_max(m) - c_min(m)) < 0.9, the line is judged to be a "right-indented text line"; if neither condition holds, the line is judged to be a "non-indented text line".
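The classification steps above can be sketched in plain Python as follows. The thresholds 0.5, 0.6, and 0.9 and the 2-pixel structuring element are taken from the text; the anchor placement of the structuring element, the function names (`dilate_rect`, `line_bounds`, `classify`), the label strings, and the sample boundaries are assumptions made for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = text, 0 = background

def dilate_rect(region: Image, width: int, height: int = 2) -> Image:
    """Dilate with a height x width rectangle (anchor placement is an assumption)."""
    rows, cols = len(region), len(region[0])
    half = width // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = (region[rr][cc]
                      for rr in range(max(0, r - height + 1), r + 1)
                      for cc in range(max(0, c - half), min(cols, c + half + 1)))
            out[r][c] = 1 if any(window) else 0
    return out

def line_bounds(B: Image, top: int, bot: int) -> Tuple[int, int]:
    """Steps 1) to 3): return (c_min, c_max) for the line spanning rows top..bot."""
    r_hei = abs(top - bot)
    region = dilate_rect(B[top:bot + 1], width=max(1, r_hei // 2))
    V = [sum(row[c] for row in region) for c in range(len(region[0]))]
    hits = [c for c, v in enumerate(V) if v > 0.5 * r_hei]
    if not hits:
        raise ValueError("no column exceeds the projection threshold")
    return hits[0], hits[-1]

def classify(bounds: List[Tuple[int, int]],
             lo: float = 0.6, hi: float = 0.9) -> List[str]:
    """Steps 4) and 5): label each line from its (c_min, c_max) pair."""
    c_lef = min(c_min for c_min, _ in bounds)   # paragraph left boundary
    c_rgt = max(c_max for _, c_max in bounds)   # paragraph right boundary
    labels = []
    for c_min, c_max in bounds:
        length = c_max - c_min
        if length <= 0:                          # degenerate line, cannot form a ratio
            labels.append("non-indented")
        elif lo < abs(c_min - c_lef) / length < hi:
            labels.append("left-indented")
        elif lo < abs(c_rgt - c_max) / length < hi:
            labels.append("right-indented")
        else:
            labels.append("non-indented")
    return labels

# Hypothetical (c_min, c_max) pairs for four lines of one paragraph.
bounds = [(0, 100), (45, 100), (0, 55), (0, 100)]
print(classify(bounds))
```

Note that the dilation widens each line by roughly half the structuring-element width on both sides, so the boundaries found by `line_bounds` sit slightly outside the original character edges.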

4. Text image inversion detection

Count the left-indented and right-indented text lines in a single text image, denoted N_lef and N_rgt respectively. Whether the text image is inverted is determined using formula (3):
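Formula (3) is not reproduced in this text. As a hedged sketch only, the decision below assumes the simplest rule consistent with the rest of the description: the image is treated as inverted when right-indented lines outnumber left-indented ones. The function name `is_inverted`, the label strings, and the tie handling are assumptions, and the actual formula may differ.

```python
from typing import List

def is_inverted(labels: List[str]) -> bool:
    """Assumed decision rule: inverted if N_rgt > N_lef (ties count as upright)."""
    n_lef = labels.count("left-indented")
    n_rgt = labels.count("right-indented")
    return n_rgt > n_lef

# Hypothetical per-line labels for two pages.
upright_page = ["left-indented", "non-indented", "non-indented", "left-indented"]
inverted_page = ["non-indented", "right-indented", "right-indented", "left-indented"]

print(is_inverted(upright_page))
print(is_inverted(inverted_page))
```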

An experimental example follows:

The experimental simulation platform was Matlab 2015a running on Windows 10 Professional; the hardware platform was an Intel i5-6200U CPU with 8 GB of memory.

The test set consists of 90 text images collected by the patent applicant, of which 78 are inverted and 12 are upright. Of the 90 text images, 56 (62%) are Chinese text images and 34 (38%) are English text images. The test images were processed with the proposed method, and 100% of the inverted images were correctly detected. Fig. 4 shows the distribution of the numbers of left-indented and right-indented text lines in the 90 document images. As the figure shows, for upright text images the number of left-indented text lines is significantly greater than the number of right-indented text lines; conversely, for inverted text images the number of right-indented text lines is greater than the number of left-indented text lines. The images separate clearly into two classes: inverted text images (marked with "*" in the figure) and upright text images (marked with "o" in the figure).

The test images are 1944 × 2592 pixels, a resolution of about 5 million pixels, and the average processing time per image is about 2300 ms. If the algorithm were rewritten in C, which executes more efficiently, processing would be faster and could meet the requirements of real-time processing.

The experimental results show that the method of the invention can quickly and effectively determine whether an input scanned text image is inverted, and can handle text images in multiple languages, including Chinese and English.

The steps of the invention are summarized as follows:

Step 1: Determine the type of the input scanned text image. If it is a grayscale image, leave it unchanged; if it is a color image, convert it to a grayscale image with formula (1). The grayscale image is denoted I.

Step 2: Filter the grayscale image I with a bilateral filter; the filtered result is denoted G.

Step 3: Enhance the filtered image G by histogram equalization; the result is denoted E.

Step 4: Compute the global threshold of the enhanced image with the Otsu method and, using formula (2), binarize E; the result is denoted B.

Step 5: Detect the valid text lines in the scanned text image with the valid text line detection algorithm.

Step 6: Classify each valid text line with the text line classification algorithm, and determine the numbers N_lef and N_rgt of left-indented and right-indented text lines.

Step 7: Using formula (3), determine whether the scanned text image is inverted.

Claims

1. A fast method for detecting inverted text images, comprising the following steps:

Step 3: Classify the text lines as follows:

1) For each valid text line s in the valid text line sequence, apply a dilation with a rectangular structuring element to fill the gaps between adjacent characters of valid text line s;

3) Collect the values of c that satisfy V(c) > 0.5 × R_hei(s). The minimum such c is denoted c_min and called the left boundary of valid text line s; the maximum is denoted c_max and called the right boundary of valid text line s. The length of the line is R_leg = c_max - c_min;

4) For each valid text line m in the same valid text line sequence, obtain c_min(m) and c_max(m). The minimum of c_min(m) is called the left boundary of the valid text line sequence and denoted c_lef; the maximum of c_max(m) is called the right boundary of the valid text line sequence and denoted c_rgt;

5) For a valid text line m, if 0.6 < |c_min(m) - c_lef| / |c_max(m) - c_min(m)| < 0.9, valid text line m is judged to be a "left-indented text line"; if 0.6 < |c_rgt - c_max(m)| / (c_max(m) - c_min(m)) < 0.9, valid text line m is judged to be a "right-indented text line"; if neither condition holds, the line is judged to be a "non-indented text line";

Step 4: Detect whether the text image is inverted, as follows:

Count the left-indented and right-indented text lines in a single text image, denoted N_lef and N_rgt respectively, and determine whether the text image is inverted using the following formula:

2. The fast method for detecting inverted text images according to claim 1, characterized in that Step 2 is carried out as follows:

2) Compute the maximum of H(r), denoted H_max;

4) Examine the distribution of the valid rows: if m consecutive rows are judged to be valid rows and m > M/100, these m consecutive valid rows form a valid text line sequence;

Determine the row indices of the topmost and bottommost valid rows in the valid text line sequence. R_top(s) and R_bot(s) denote the upper and lower boundaries of the valid text line sequence, and its height is defined as R_hei(s) = |R_top(s) - R_bot(s)|, where | | denotes the absolute value and s is the index of the valid text line.