CN106909897A - Text image inversion rapid detection method - Google Patents

Text image inversion rapid detection method

Info

Publication number
CN106909897A
Authority
CN
China
Prior art keywords
text
effective line
line
row
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710090240.9A
Other languages
Chinese (zh)
Other versions
CN106909897B (en)
Inventor
王建
庞彦伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN201710090240.9A
Publication of CN106909897A
Application granted
Publication of CN106909897B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/413 Classification of content, e.g. text, photographs or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/24 Aligning, centring, orientation detection or correction of the image
    • G06V10/242 Aligning, centring, orientation detection or correction of the image by image rotation, e.g. by 90 degrees

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Character Input (AREA)
  • Character Discrimination (AREA)

Abstract

The invention relates to a method for rapidly detecting inverted text images. The method comprises: preprocessing the input text image to obtain a binarization result B; detecting valid text lines to obtain a sequence of valid text lines; classifying the text lines; and detecting whether the text image is inverted. The text lines are classified as follows: 1) for each valid text line s in the valid text line sequence, fill the blanks between its adjacent characters; 2) compute the vertical projection of each valid text line s, denoted V(c), where c is the column index; 3) obtain the left and right boundaries of valid text line s; 4) obtain the left and right boundaries of the valid text line sequence; 5) label each line as a "left-indented text line", a "right-indented text line", or a "non-indented text line".

Description

Text image inversion rapid detection method
Technical field
The present invention relates to text image enhancement, and in particular to detecting whether a scanned text image is upside down.
Background
With the continuing development of computer technology, text image digitization based on OCR (optical character recognition) has come into wide use. The orientation of the text in an image is critical to character recognition performance. If skewed text is not corrected, the recognition rate drops sharply, and the problem is most severe when the text is inverted (rotated by roughly 180° from its normal orientation). Therefore, before OCR is performed, the text image must be checked for inversion, and an inverted image must first be rotated so that the subsequent recognition can proceed normally.
For skewed text images, existing deskewing algorithms can estimate the skew angle and correct it accordingly. However, most of these methods assume that the skew of the input image lies within a limited range; they first estimate the skew angle and then correct it. When the input text image is fully inverted, existing skew detection methods essentially fail. Zeng Fanfeng et al. proposed a fast inversion detection method for text images based on punctuation marks. That method first detects the text characters; it then uses the structural features of Chinese characters and punctuation marks to pick out the punctuation marks in the image and determines their type from their pixel distribution; finally, it uses punctuation usage conventions to decide whether the Chinese text image is inverted. Zhu Min et al. (patent publication No. CN102831421A) proposed a punctuation-based method for detecting the up-down orientation of text. That patent determines the text orientation from the position of punctuation marks relative to the text line, and its basic idea is similar to that of Zeng Fanfeng et al. Methods of this kind rely entirely on punctuation features and fail on text images that contain few punctuation marks, so their applicability is limited and they are not general.
Summary of the invention
The purpose of the invention is to overcome the above shortcomings of the prior art by providing a method for rapidly detecting inverted text images. The technical solution is as follows:
A method for rapidly detecting inverted text images, comprising the following steps:
First step: Preprocess the input text image to obtain the binarization result B.
Second step: Detect valid text lines to obtain a valid text line sequence.
Third step: Classify the text lines as follows:
1) For each valid text line s in the valid text line sequence, apply a dilation with a rectangular structuring element to fill the blanks between the adjacent characters of s.
2) Compute the vertical projection of each valid text line s, denoted V(c), where c is the column index.
3) Collect the values of c that satisfy V(c) > 0.5 × R_hei(s). The minimum such c is denoted c_min and called the left boundary of valid text line s; the maximum is denoted c_max and called the right boundary of s. The length of the line is R_leg = c_max - c_min.
4) For each valid text line m in the same valid text line sequence, obtain c_min(m) and c_max(m). The minimum of c_min(m) is called the left boundary of the valid text line sequence and denoted c_lef; the maximum of c_max(m) is called the right boundary of the sequence and denoted c_rgt.
5) For a valid text line m: if 0.6 < |c_min(m) - c_lef| / |c_max(m) - c_min(m)| < 0.9, label m a "left-indented text line"; if 0.6 < |c_rgt - c_max(m)| / (c_max(m) - c_min(m)) < 0.9, label m a "right-indented text line"; if neither condition holds, label it a "non-indented text line".
Fourth step: Detect inversion as follows:
Count the left-indented and right-indented text lines in the single text image, denoted N_lef and N_rgt respectively, and decide whether the text image is inverted using the following formula:
Preferably, the method for second step is as follows:
1) each row projection value in the horizontal direction in B is calculated, is represented with H (r), wherein r represents line number sequence number;
2) maximum of H (r) is calculated, H is usedmaxRepresent;
3) for r scan lines, if meeting H (r)>0.5×Hmax, then the row is judged to an active line;
4) distribution situation of each active line is counted, if detecting continuous m rows is judged to active line, and is met m>M/100, then constitute an effective line of text sequence by this continuous m active line;
Determine the line number of the top and bottom active line in effective line of text sequence, use Rtop(s) and Rbot(s) The up-and-down boundary of effective line of text sequence is represented respectively, and the height for defining effective line of text sequence is Rhei(s)=| Rtop (s)-Rbot(s) |, symbol | | the symbol that takes absolute value is represented, s is the sequence number of effective line of text in formula.
Brief description of the drawings
Fig. 1 is a flow chart of the proposed method.
Fig. 2 is a schematic diagram of the key definitions used in the invention.
Fig. 3 is a schematic diagram of the text line types defined in the invention.
Fig. 4 is a plot of the numbers of left-indented and right-indented text lines in the text images used in the experiments.
Detailed description
The input color text image is first preprocessed by grayscale conversion, bilateral filtering, contrast enhancement, and binarization to improve the visual quality of the document image. Horizontal projection analysis then detects the valid text lines in the image, and the lines are classified by their position and length. Finally, the relative numbers of left-indented and right-indented text lines determine whether the text image is inverted. Fig. 1 shows the block diagram of the proposed method.
Some useful definitions are given first. A text image consists of several paragraphs, and the characters within each paragraph are largely uniform in font, format, and similar features. In the invention, the leftmost and rightmost positions at which a character can appear in a paragraph are called the "paragraph left boundary" and the "paragraph right boundary". Fig. 2 illustrates the left and right boundaries of a paragraph. Each paragraph contains one or more text lines. For any text line, the left side of its leftmost character and the right side of its rightmost character are called the "line left boundary" and the "line right boundary" of that line. Fig. 2 also illustrates the line left and right boundaries.
If the left and right boundaries of a text line essentially coincide with those of its paragraph, the line is called a "full text line". If the left boundary of a text line lies 2 to 4 characters from the left boundary of its paragraph while its right boundary essentially coincides with the paragraph right boundary, the line is called a "left-indented text line". If the right boundary of a text line lies 2 to 4 characters from the right boundary of its paragraph while its left boundary essentially coincides with the paragraph left boundary, the line is called a "right-indented text line". Fig. 3 illustrates these three types of text line.
In both Chinese and English writing, the first line of each paragraph is usually indented to the right by 2 to 4 characters, so a paragraph with two or more text lines necessarily contains one left-indented text line. If the text image is upright, multiple left-indented text lines will be detected. Conversely, if the text image is inverted, multiple right-indented text lines will be detected. The invention decides whether a text image is inverted by detecting and comparing the numbers of left-indented and right-indented text lines in it.
The proposed method comprises four main steps: preprocessing, text line detection, text line classification, and inversion detection.
1. Preprocessing
The purpose of preprocessing is to improve the visual quality of the document image. It consists mainly of grayscale conversion, smoothing, contrast enhancement, and binarization.
(1) Grayscale conversion
Determine whether the input text image is a grayscale image. If it is, leave it unchanged. If it is a color image, let C_R, C_G and C_B denote its red, green, and blue channels, and compute the grayscale image, denoted I, with formula (1):
I(x, y) = min{C_R(x, y), C_G(x, y), C_B(x, y)}    (1)
where x = 0, 1, 2, ..., M-1 and y = 0, 1, 2, ..., N-1, and M and N are the height and width of the text image, i.e. the total number of scan rows and the total number of scan columns.
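Formula (1) can be illustrated with a minimal Python sketch. The function name `to_gray_min` and the nested-list image representation are assumptions made for illustration only; the source reports experiments in Matlab and gives no code.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255 (assumed representation)


def to_gray_min(rgb: List[List[Pixel]]) -> List[List[int]]:
    """Formula (1): I(x, y) is the minimum of the three color channels."""
    return [[min(pixel) for pixel in row] for row in rgb]


image = [
    [(255, 255, 255), (200, 30, 40)],
    [(10, 20, 5), (0, 0, 255)],
]
print(to_gray_min(image))  # [[255, 30], [5, 0]]
```

Taking the channel minimum maps any saturated colored stroke to a dark gray value, so colored text stays dark against a light page.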
(2) Smoothing
Because text images pick up noise during acquisition and digitization, the grayscale image I is filtered with a bilateral filter to reduce the influence of noise. The bilaterally filtered image is denoted G.
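The source does not give the bilateral filter parameters. The following is a minimal sketch of a textbook bilateral filter in pure Python; the window radius and the two sigma values are illustrative assumptions, not values from the patent.

```python
import math
from typing import List


def bilateral_filter(gray: List[List[int]], radius: int = 1,
                     sigma_s: float = 1.0, sigma_r: float = 25.0) -> List[List[int]]:
    """Weight each neighbor by spatial closeness and by gray-level similarity.

    radius, sigma_s and sigma_r are assumed values chosen for illustration.
    """
    rows, cols = len(gray), len(gray[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            center = gray[y][x]
            num = den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < rows and 0 <= xx < cols:
                        v = gray[yy][xx]
                        w = math.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2)
                                     - (v - center) ** 2 / (2 * sigma_r ** 2))
                        num += w * v
                        den += w
            out[y][x] = round(num / den)
    return out


step = [[0, 0, 255, 255] for _ in range(4)]
print(bilateral_filter(step, sigma_r=10.0) == step)  # True: the sharp edge is preserved
```

Unlike a plain Gaussian blur, the range term suppresses averaging across strong edges, which is why the filter is a reasonable fit for text strokes.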
(3) Contrast enhancement
Illumination and other factors may leave the text image with low contrast, so the filtered image G is enhanced by histogram equalization. The result is denoted E.
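As an illustration of this step, here is a sketch of standard global histogram equalization in pure Python. The source only names the technique, so the exact variant (global, 256 levels, integer rounding by floor division) is an assumption.

```python
from typing import List


def equalize_histogram(gray: List[List[int]], levels: int = 256) -> List[List[int]]:
    """Map gray levels through the normalized cumulative histogram."""
    hist = [0] * levels
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)  # CDF value at the darkest level present
    if total == cdf_min:                      # constant image: nothing to stretch
        return [row[:] for row in gray]
    # Integer arithmetic keeps the mapping deterministic.
    lut = [max(0, (c - cdf_min) * (levels - 1) // (total - cdf_min)) for c in cdf]
    return [[lut[v] for v in row] for row in gray]


print(equalize_histogram([[100, 100], [110, 120]]))  # [[0, 0], [127, 255]]
```

The low-contrast input range 100..120 is stretched to the full 0..255 range, which gives the subsequent global threshold more separation to work with.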
(4) Binarization
The global threshold of E is computed with the classical Otsu method and denoted T_h. E is then binarized with T_h, and the result is denoted B. Specifically:
where a point with value 1 in B is a text point and a point with value 0 is a background point.
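The binarization formula itself is not reproduced in this text. The sketch below computes an Otsu threshold and then binarizes under the assumption that text is darker than the background, so that pixels with E(x, y) <= T_h become 1 (text) and all others 0 (background). Both the direction of the comparison and the handling of equality are assumptions.

```python
from typing import List


def otsu_threshold(gray: List[List[int]], levels: int = 256) -> int:
    """Return the level t that maximizes the between-class variance (class 0 = values <= t)."""
    hist = [0] * levels
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of gray values of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize_dark_text(E: List[List[int]], t_h: int) -> List[List[int]]:
    """ASSUMPTION: dark pixels are text, so E <= T_h maps to 1 and the rest to 0."""
    return [[1 if v <= t_h else 0 for v in row] for row in E]


E = [[20, 25, 200], [210, 22, 205]]
t_h = otsu_threshold(E)
print(t_h, binarize_dark_text(E, t_h))  # 25 [[1, 1, 0], [0, 1, 0]]
```

With 1 marking text, the row and column sums of B directly count text pixels, which is what the projection steps that follow rely on.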
2. Valid text line detection
Valid text lines are detected with the following algorithm.
Valid text line detection algorithm:
1) Compute the horizontal projection of each row of B, denoted H(r), where r is the row index.
2) Compute the maximum of H(r), denoted H_max.
3) For the r-th scan row, if H(r) > 0.5 × H_max, mark the row as a valid row.
4) Examine the distribution of the valid rows: if m consecutive rows are marked valid and m > M/100, where M is the total number of scan rows, these m consecutive valid rows form one valid text line.
5) Determine the row indices of the topmost and bottommost valid rows of the valid text line, and denote them R_top(s) and R_bot(s), the upper and lower boundaries of the line. The height of the text line is defined as R_hei(s) = |R_top(s) - R_bot(s)|, where |·| denotes the absolute value and s is the index of the valid text line.
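The five steps above can be sketched in pure Python as follows. The function name `detect_valid_text_lines` and the tuple layout `(R_top, R_bot, R_hei)` are hypothetical choices for illustration; B is assumed to be a nested list with 1 for text and 0 for background.

```python
from typing import List, Tuple


def detect_valid_text_lines(B: List[List[int]]) -> List[Tuple[int, int, int]]:
    """Return (R_top, R_bot, R_hei) for every run of valid rows longer than M/100."""
    M = len(B)
    H = [sum(row) for row in B]                 # step 1: horizontal projection H(r)
    H_max = max(H) if H else 0                  # step 2
    valid = [H_max > 0 and h > 0.5 * H_max for h in H]   # step 3
    lines = []
    r = 0
    while r < M:
        if not valid[r]:
            r += 1
            continue
        start = r
        while r < M and valid[r]:
            r += 1
        m = r - start                           # length of this run of valid rows
        if m > M / 100:                         # step 4
            top, bot = start, r - 1
            lines.append((top, bot, abs(top - bot)))     # step 5
    return lines


def _row(n: int, width: int = 10) -> List[int]:
    return [1] * n + [0] * (width - n)


B = [_row(0), _row(10), _row(8), _row(0), _row(3), _row(9), _row(10), _row(0)]
print(detect_valid_text_lines(B))  # [(1, 2, 1), (5, 6, 1)]
```

In the toy image, the sparse row with only 3 text pixels falls below 0.5 × H_max and is rejected, so it does not merge the two text lines.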
3. Text line classification
Text lines are classified with the following algorithm.
Text line classification algorithm:
1) For a valid text line s, apply a dilation with a rectangular structuring element to fill the blanks between the adjacent characters of the line. The structuring element is 2 pixels high, and its width is 50% of the height of the text line.
2) Compute the vertical projection of the text line, denoted V(c), where c is the column index.
3) Collect the values of c that satisfy V(c) > 0.5 × R_hei(s). The minimum such c is denoted c_min and called the left boundary of the text line; the maximum is denoted c_max and called the right boundary of the text line. The length of the line is R_leg = c_max - c_min.
4) For each valid text line m in the same paragraph, obtain c_min(m) and c_max(m). The minimum of c_min(m) is called the left boundary of the paragraph and denoted c_lef; the maximum of c_max(m) is called the right boundary of the paragraph and denoted c_rgt.
5) For a valid text line m: if 0.6 < |c_min(m) - c_lef| / |c_max(m) - c_min(m)| < 0.9, label the line a "left-indented text line"; if 0.6 < |c_rgt - c_max(m)| / (c_max(m) - c_min(m)) < 0.9, label the line a "right-indented text line"; if neither condition holds, label the line a "non-indented text line".
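The classification steps above can be sketched as follows in pure Python. The function names, the label strings, and the placement of the structuring element's anchor are assumptions for illustration; the thresholds 0.5, 0.6 and 0.9 and the 2-pixel by 50%-of-height element are taken from the text as written.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = text, 0 = background


def dilate_rect(img: Image, width: int, height: int = 2) -> Image:
    """Binary dilation with a height x width rectangle (anchor placement is an assumption)."""
    rows, cols = len(img), len(img[0])
    left, up = width // 2, height // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if img[r][c]:
                for rr in range(max(0, r - up), min(rows, r - up + height)):
                    for cc in range(max(0, c - left), min(cols, c - left + width)):
                        out[rr][cc] = 1
    return out


def line_bounds(B: Image, top: int, bot: int) -> Tuple[int, int]:
    """Steps 1-3: dilate the line, project it vertically, return (c_min, c_max)."""
    r_hei = abs(top - bot)
    strip = dilate_rect([row[:] for row in B[top:bot + 1]], width=max(1, r_hei // 2))
    V = [sum(col) for col in zip(*strip)]                 # vertical projection V(c)
    cols = [c for c, v in enumerate(V) if v > 0.5 * r_hei]
    return min(cols), max(cols)


def classify_lines(bounds: List[Tuple[int, int]]) -> List[str]:
    """Steps 4-5: label each line relative to the outer boundaries c_lef and c_rgt."""
    c_lef = min(b[0] for b in bounds)
    c_rgt = max(b[1] for b in bounds)
    labels = []
    for c_min, c_max in bounds:
        length = c_max - c_min
        left_ratio = abs(c_min - c_lef) / length if length else 0.0
        right_ratio = abs(c_rgt - c_max) / length if length else 0.0
        if 0.6 < left_ratio < 0.9:
            labels.append("left-indented")
        elif 0.6 < right_ratio < 0.9:
            labels.append("right-indented")
        else:
            labels.append("non-indented")
    return labels


print(classify_lines([(40, 100), (0, 100), (0, 55)]))
# ['left-indented', 'non-indented', 'right-indented']
```

In the usage example, the first line starts 40 columns in and is 60 columns long (ratio about 0.67), and the third ends 45 columns short of the right boundary with length 55 (ratio about 0.82), so both fall inside the stated 0.6 to 0.9 window.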
4. Inversion detection
Count the left-indented and right-indented text lines in the single text image, denoted N_lef and N_rgt respectively. Formula (3) is used to decide whether the text image is inverted:
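Formula (3) is not reproduced in this text. Based only on the stated principle (upright images show more left-indented lines, inverted images more right-indented lines), a decision rule can be sketched as a plain comparison of the two counts. The strict comparison, the handling of ties, and the label strings are all assumptions, not the patent's formula.

```python
from typing import List


def is_inverted(labels: List[str]) -> bool:
    """Decide inversion from per-line labels.

    ASSUMPTION: a simple majority comparison N_rgt > N_lef stands in for
    formula (3), which is missing from the source text. Ties count as upright.
    """
    n_lef = labels.count("left-indented")
    n_rgt = labels.count("right-indented")
    return n_rgt > n_lef


print(is_inverted(["right-indented", "non-indented", "right-indented", "left-indented"]))  # True
print(is_inverted(["left-indented", "non-indented", "non-indented"]))                      # False
```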
The embodiment is as follows:
The experiments were run in Matlab 2015a under Windows 10 Professional, on an Intel i5-6200U CPU with 8 GB of memory.
A test set of 90 text images collected by the applicants was used, of which 78 were inverted and 12 were upright. Of the 90 images, 56 (62%) were Chinese text images and 34 (38%) were English text images. When the test images were processed with the proposed method, 100% of the inverted images were detected correctly. Fig. 4 shows the distribution of the numbers of left-indented and right-indented text lines across the 90 document images. As the figure shows, an upright text image has markedly more left-indented than right-indented text lines, while an inverted text image has more right-indented than left-indented text lines. The images separate clearly into two classes: inverted text images (marked "*" in the figure) and upright text images (marked "o" in the figure).
The test images are 1944 × 2592 pixels, i.e. about 5 million pixels each, and processing one image takes about 2300 ms on average. Rewriting the algorithm in C, which executes more efficiently, would make processing faster and could meet real-time requirements.
The experimental results show that the method can quickly and effectively determine whether an input scanned text image is inverted, and that it can handle text images in multiple languages, including Chinese and English.
The steps of the invention are summarized as follows:
Step 1: Determine the type of the input scanned text image. If it is a grayscale image, leave it unchanged; if it is a color image, convert it to a grayscale image with formula (1). The grayscale image is denoted I.
Step 2: Filter the grayscale image I with a bilateral filter; the result is denoted G.
Step 3: Enhance the filtered image G by histogram equalization; the result is denoted E.
Step 4: Compute the global threshold of the enhanced image with the Otsu method and binarize E according to formula (2); the result is denoted B.
Step 5: Detect the valid text lines in the scanned text image with the valid text line detection algorithm.
Step 6: Classify each valid text line with the text line classification algorithm, and determine the numbers N_lef and N_rgt of left-indented and right-indented text lines.
Step 7: Use formula (3) to decide whether the scanned text image is inverted.

Claims (2)

1. A method for rapidly detecting inverted text images, comprising the following steps:
First step: Preprocess the input text image to obtain the binarization result B.
Second step: Detect valid text lines to obtain a valid text line sequence.
Third step: Classify the text lines as follows:
1) For each valid text line s in the valid text line sequence, apply a dilation with a rectangular structuring element to fill the blanks between the adjacent characters of s.
2) Compute the vertical projection of each valid text line s, denoted V(c), where c is the column index.
3) Collect the values of c that satisfy V(c) > 0.5 × R_hei(s). The minimum such c is denoted c_min and called the left boundary of valid text line s; the maximum is denoted c_max and called the right boundary of s. The length of the line is R_leg = c_max - c_min.
4) For each valid text line m in the same valid text line sequence, obtain c_min(m) and c_max(m). The minimum of c_min(m) is called the left boundary of the valid text line sequence and denoted c_lef; the maximum of c_max(m) is called the right boundary of the sequence and denoted c_rgt.
5) For a valid text line m: if 0.6 < |c_min(m) - c_lef| / |c_max(m) - c_min(m)| < 0.9, label m a "left-indented text line"; if 0.6 < |c_rgt - c_max(m)| / (c_max(m) - c_min(m)) < 0.9, label m a "right-indented text line"; if neither condition holds, label it a "non-indented text line".
Fourth step: Detect inversion as follows:
Count the left-indented and right-indented text lines in the single text image, denoted N_lef and N_rgt respectively, and decide whether the text image is inverted using the following formula:
2. The method for rapidly detecting inverted text images according to claim 1, characterized in that the second step is carried out as follows:
1) Compute the horizontal projection of each row of B, denoted H(r), where r is the row index.
2) Compute the maximum of H(r), denoted H_max.
3) For the r-th scan row, if H(r) > 0.5 × H_max, mark the row as a valid row.
4) Examine the distribution of the valid rows: if m consecutive rows are marked valid and m > M/100, these m consecutive valid rows form one valid text line sequence.
Determine the row indices of the topmost and bottommost valid rows in the valid text line sequence, and denote them R_top(s) and R_bot(s), the upper and lower boundaries of the sequence. The height of the valid text line sequence is defined as R_hei(s) = |R_top(s) - R_bot(s)|, where |·| denotes the absolute value and s is the index of the valid text line.
CN201710090240.9A 2017-02-20 2017-02-20 Text image inversion rapid detection method Expired - Fee Related CN106909897B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710090240.9A CN106909897B (en) 2017-02-20 2017-02-20 Text image inversion rapid detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710090240.9A CN106909897B (en) 2017-02-20 2017-02-20 Text image inversion rapid detection method

Publications (2)

Publication Number Publication Date
CN106909897A (en) 2017-06-30
CN106909897B (en) 2020-03-13

Family

ID=59208458

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710090240.9A Expired - Fee Related CN106909897B (en) 2017-02-20 2017-02-20 Text image inversion rapid detection method

Country Status (1)

Country Link
CN (1) CN106909897B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107609482A (en) * 2017-08-15 2018-01-19 天津大学 A kind of Chinese text image inversion method of discrimination based on Chinese-character stroke feature
CN111414866A (en) * 2020-03-24 2020-07-14 上海眼控科技股份有限公司 Vehicle application form detection method and device, computer equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102831421A (en) * 2012-08-29 2012-12-19 华东师范大学 Method for detecting document up-down direction based on punctuation marks
CN106097254A (en) * 2016-06-07 2016-11-09 天津大学 A kind of scanning document image method for correcting error

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
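Zeng Fanfeng et al.: "Fast detection algorithm for inverted Chinese text images", Computer Engineering and Design *
Zhu Qimeng: "Research and application of text image orientation based on character structure features", China Master's Theses Full-text Database, Information Science and Technology *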

Also Published As

Publication number Publication date
CN106909897B (en) 2020-03-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20200313