CN104408449B - Intelligent mobile terminal scene text processing method - Google Patents

Intelligent mobile terminal scene text processing method

Info

Publication number
CN104408449B
CN104408449B CN201410581464.6A CN201410581464A
Authority
CN
China
Prior art keywords
text region
candidate
stroke width
pixel
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410581464.6A
Other languages
Chinese (zh)
Other versions
CN104408449A (en)
Inventor
卢朝阳
李静
刘晓佩
姜维
通天意
汪文芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
XIDIAN-NINGBO INFORMATION TECHNOLOGY INSTITUTE
Original Assignee
XIDIAN-NINGBO INFORMATION TECHNOLOGY INSTITUTE
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by XIDIAN-NINGBO INFORMATION TECHNOLOGY INSTITUTE filed Critical XIDIAN-NINGBO INFORMATION TECHNOLOGY INSTITUTE
Priority to CN201410581464.6A priority Critical patent/CN104408449B/en
Publication of CN104408449A publication Critical patent/CN104408449A/en
Application granted granted Critical
Publication of CN104408449B publication Critical patent/CN104408449B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/22Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Input (AREA)

Abstract

The present invention relates to a scene text processing method for intelligent mobile terminals, comprising: step 1, coarse text detection based on edges; step 2, obtaining the stroke width map T of the input scene image I, performing stroke width and geometric feature analysis on each candidate text region in the candidate text region set S, rejecting non-text regions that do not meet the requirements, and finally outputting the localization result map L1; step 3, recognition preprocessing; step 4, normalizing the single characters obtained after segmentation and extracting directional element features; step 5, fine classification based on Gabor features. Compared with the prior art, the advantages of the invention are: the detection accuracy is significantly improved, the recall rate is higher, the time performance is markedly improved, and the accuracy of character recognition is substantially improved.

Description

Intelligent mobile terminal scene text processing method
Technical field
The present invention relates to the field of pattern recognition, and in particular to a scene text processing method for intelligent mobile terminals, used to recognize scene text captured by an intelligent mobile terminal.
Background technology
With the rapid development of information technology, pattern recognition has been widely applied and valued in many scientific and technological fields, such as artificial intelligence, medicine, neurobiology, weapons manufacturing, and navigation. In these fields, common applications include fingerprint recognition, face recognition, optical character recognition, text recognition, precision guidance, fault detection, speech recognition, and translation. The rapid development and wide application of pattern recognition technology have greatly promoted the development of the national economy and the modernization of national defense science and technology.
Text processing is an important branch of pattern recognition. In the real world, people cannot live without text, and the processing of natural scene text has always been one of the hot topics in pattern recognition. Since the 1990s, the International Conference on Document Analysis and Recognition (ICDAR) has been held every two years, which has greatly promoted the development of text processing technology.
With the popularization and development of mobile intelligent terminals, smart phones are increasingly favored for their convenience and intelligence. In daily life, users can photograph text of interest with their phones at any time and then extract the text information, which saves the trouble of handwriting input and makes life more convenient. Meanwhile, text processing on mobile terminals can be applied to many other fields: recognizing street signs and combining them with GPS positioning can provide navigation for the blind; recognizing license plates can facilitate traffic police management and record keeping; extracting the text of shop signs and translating it into a language known to the user can facilitate travel abroad. Therefore, text processing on smart phones has great application prospects.
However, realizing the above applications on a smart phone poses considerable technical challenges, mainly in two respects. On the one hand, the diversity and uncertainty of text in natural scenes make natural scene text processing extremely difficult. On the other hand, the limited CPU and GPU of a smart phone place higher requirements on the accuracy and real-time performance of a text processing method.
In summary, natural scene text processing has always been a difficult problem in the field of image recognition, especially text processing on smart phones. Research on scene text processing on smart phones has practical significance for the development of artificial intelligence and also plays an important role in the informatization of China.
Summary of the invention
The technical problem to be solved by the invention is to provide, in view of the above-mentioned prior art, a scene text processing method for intelligent mobile terminals that balances speed and accuracy and is suitable for use on mobile platforms.
The technical solution adopted by the present invention to solve the above technical problem is a scene text processing method for intelligent mobile terminals, characterized by comprising the following steps:
Step 1: coarse text detection based on edges, specifically including:
(1-1) performing edge detection on the scene image I input to the intelligent mobile terminal using a color image edge detection method to obtain a first intermediate processed image;
(1-2) performing morphological operations on the first intermediate processed image to connect broken characters and adjacent characters in the first intermediate processed image, obtaining a second intermediate processed image;
(1-3) processing the second intermediate processed image by finding connected components, thereby obtaining the candidate text region set S of the input scene image I;
Step 2: obtaining the stroke width map T of the input scene image I, performing stroke width and geometric feature analysis on each candidate text region in the candidate text region set S, rejecting non-text regions that do not meet the requirements, and finally outputting the localization result map L1;
Step 3: recognition preprocessing, specifically including:
(3-1) performing contrast enhancement on the text regions of the localization result map L1;
(3-2) performing median filtering on the enhanced text regions;
(3-3) performing binarization on the text regions after median filtering;
(3-4) performing character segmentation on the text regions after binarization;
Step 4: normalizing the single characters obtained after segmentation and extracting directional element features, specifically including:
(4-1) cropping each segmented character to remove the white background around the character, and uniformly resizing each cropped character image to N × N using bilinear interpolation;
(4-2) extracting the contour of each size-normalized character and computing its directional element features;
(4-3) recognizing each character with a distance classifier to obtain the X closest candidate characters for each character;
Step 5: fine classification based on Gabor features, specifically including:
(5-1) uniformly resizing each character to M × M using bilinear interpolation;
(5-2) applying a Gabor transform to the size-normalized character and extracting Gabor features;
(5-3) on the basis of the X closest candidate characters of each character obtained in (4-3), recognizing again with a distance classifier to obtain the recognition result of each character.
As an improvement, step 2 specifically includes:
(2-1) performing edge detection on the input scene image I using the Canny edge detection method to obtain the edge map of the input scene image I, while recording the gradient direction of each edge pixel;
(2-2) performing the stroke width transform on the edge pixels:
(2-2-1) assume p is an edge pixel and let dp be the gradient direction of edge pixel p; search along the ray r = p + n·dp (n ≥ 0) among the edge pixels of the edge map for a matching edge pixel q; let dq be the gradient direction of edge pixel q, where dq is roughly opposite to dp, i.e., dq = −dp ± π/2;
if p does not find a matching pixel q, or dq is not opposite to dp, the ray r = p + n·dp is discarded, and a new edge pixel p must be selected to search for its matching edge pixel q;
if a matching pixel q is found, the stroke width value of every pixel on the segment [p, q] is set to ||p − q||, the Euclidean distance between pixel p and pixel q; if a pixel on the segment [p, q] already has a stroke width value S, the smaller of S and ||p − q|| is taken as the actual stroke width value of that pixel;
(2-2-2) repeating (2-2-1) until the stroke width values of the pixels on all rays that were not discarded have been computed;
(2-2-3) traversing again all rays that were not discarded; for each ray, computing the mean stroke width M of all pixels on the ray, finding all pixels on the ray whose stroke width value exceeds M and setting their stroke width value to M; after all rays have been traversed, the stroke width map T of the input scene image I is finally obtained;
(2-3) on the basis of the stroke width map T of the input scene image I obtained in step (2-2), finding the corresponding candidate text region set S obtained in step 1, and then screening the candidate text region set S according to the following rules:
(2-3-a) rejecting candidate text regions whose aspect ratio is not between 0.1 and 10;
(2-3-b) rejecting candidate text regions whose character width is not between W/20 and W pixels or whose height is not between H/20 and H pixels, where W and H are the width and height of the image, respectively;
(2-3-c) deleting candidate text regions whose area is smaller than 20 pixels;
(2-3-d) binarizing the candidate text region set S, computing the ratio Rb of black pixels, and rejecting candidate text regions whose black pixel ratio Rb is not between 0.2 and 0.8, where Rb is defined as
where f(i, j) is the pixel value at position (i, j) in the candidate text region image, w and h are the width and height of the candidate text region, respectively, and ⊕ denotes exclusive OR;
(2-3-e) binarizing the candidate text region set S, computing the crossing rate Rcc of the region, and rejecting candidate text regions whose crossing rate Rcc is not between 0.05 and 0.6, where Rcc is defined as:
where f(i, j) is the pixel value at position (i, j) in the candidate text region image, f(i, j+1) is the pixel value at position (i, j+1) in the candidate text region image, w and h are the width and height of the candidate text region, respectively, and ⊕ denotes exclusive OR;
(2-3-f) performing the stroke width transform on the candidate text region set S to obtain a first stroke width map of every candidate text region, then inverting the candidate text region set S and performing the stroke width transform again to obtain a second stroke width map of every candidate text region; if, in either the first stroke width map or the second stroke width map of a candidate text region, the stroke width variance exceeds half of the mean stroke width and the stroke width ratio of adjacent pixels exceeds 3.0, rejecting that candidate text region;
(2-4) text detection output: after the screening of (2-3), the final text regions are obtained; they are then sorted and numbered according to their positions, from top to bottom and from left to right, and after sorting the text regions are output.
Preferably, in (3-1) the contrast of the text regions of the localization result map L1 is enhanced using a histogram equalization algorithm; in (3-2) median filtering is performed on the enhanced regions using a 3 × 3 rectangular sliding template, i.e., using a 3 × 3 rectangular sliding template, the pixels within the template are sorted by pixel value to generate a monotonically increasing or decreasing data sequence, and the value of the central pixel of the template is replaced by the median of this group before output; in step (3-3) the regions after median filtering are binarized using the maximum between-class variance (Otsu) method.
Compared with the prior art, the advantages of the invention are:
(1) Compared with text detection methods based purely on edges, the accuracy of the present invention is significantly improved, because the candidate regions are screened with the stroke width transform, which effectively eliminates many non-text regions with uneven stroke widths and thereby reduces the false detection rate of text regions; compared with pure stroke-width text detection methods, the recall rate of the present invention is higher, because an edge-based text detection algorithm is used for the coarse detection;
(2) Compared with recognition based on Gabor features alone, the recognition performance of the present invention decreases slightly, but the time performance is significantly improved and the average recognition time of a single character is shortened by about 41%, because directional element features are used as coarse features for a first pass over the candidate characters; compared with using directional element features alone, the character recognition accuracy is significantly improved, because Gabor features are used as fine features to enhance the discriminative ability between characters. The present invention therefore fully combines the speed of directional element feature extraction with the accuracy of Gabor feature recognition, balances speed and accuracy well, and is thus more suitable for use on mobile platforms.
Brief description of the drawings
Fig. 1 is a flowchart of the intelligent mobile terminal scene text processing method in an embodiment of the present invention.
Embodiment
The present invention is described in further detail below with reference to the accompanying drawing and the embodiment.
The invention provides a scene text processing method for intelligent mobile terminals, which comprises the following steps, as shown in Fig. 1:
Step 1: coarse text detection based on edges:
Coarse text detection is the first step; its main task is to detect as much of the text in the input scene image I as possible. Only when the recall of the coarse detection is high does the subsequent screening of candidate text regions make sense, and only then can the overall detection accuracy be high. Because edge detection is relatively fast and has a high recall, it is suitable for use on a smart phone, so the coarse text detection method adopted by the present invention is an edge-based text detection algorithm, which specifically includes:
(1-1) performing edge detection on the scene image I input to the intelligent mobile terminal using a color image edge detection method to obtain a first intermediate processed image. The present invention uses a color image edge detection method because it works well on color images and the detected edge lines are thicker, which facilitates the subsequent coarse text detection. Color image edge detection is a conventional prior-art method: within a 3 × 3 neighborhood, edges are computed separately for the R, G and B components of the image, the maximum over four directions is taken as the edge value of the current component, and after the edge values of all pixels have been obtained, the edges are binarized with the Niblack algorithm, finally giving the first intermediate processed image;
(1-2) performing morphological operations on the first intermediate processed image to connect broken characters and adjacent characters in the first intermediate processed image, obtaining a second intermediate processed image. Morphological operations are also conventional prior-art algorithms; the present invention applies them to better support the subsequent edge-based text detection. The morphological operations used here are a dilation of 3 pixels in the vertical and horizontal directions of the image, followed by a closing of 3 pixels in the vertical and horizontal directions, respectively;
(1-3) processing the second intermediate processed image by finding connected components, thereby obtaining the candidate text region set S of the input scene image I; finding connected components is also a conventional prior-art method;
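As a concrete illustration of steps (1-1) to (1-3), the following Python/OpenCV sketch approximates the described processing. The patent only names the techniques, so the per-channel Sobel magnitude (standing in for the four-direction edge operator), the Niblack window size and coefficient, and the exact 3-pixel structuring elements are assumptions, not the patented parameters.

```python
import cv2
import numpy as np

def coarse_detect_text(bgr):
    """Edge-based coarse text detection (steps 1-1 to 1-3), sketched with assumed parameters."""
    # (1-1) per-channel edge strength: max gradient magnitude over the B, G, R components
    edges = np.zeros(bgr.shape[:2], dtype=np.float64)
    for c in cv2.split(bgr):
        gx = cv2.Sobel(c, cv2.CV_64F, 1, 0, ksize=3)
        gy = cv2.Sobel(c, cv2.CV_64F, 0, 1, ksize=3)
        edges = np.maximum(edges, np.hypot(gx, gy))
    # Niblack binarization, T = mean + k*std (assumed 25x25 window, k = -0.2)
    win, k = 25, -0.2
    mean = cv2.boxFilter(edges, cv2.CV_64F, (win, win))
    sq_mean = cv2.boxFilter(edges * edges, cv2.CV_64F, (win, win))
    std = np.sqrt(np.maximum(sq_mean - mean * mean, 0))
    first = (edges > mean + k * std).astype(np.uint8) * 255
    # (1-2) morphology: 3-pixel dilation then closing, horizontally and vertically
    h_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 1))
    v_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 3))
    second = cv2.dilate(cv2.dilate(first, h_kernel), v_kernel)
    second = cv2.morphologyEx(cv2.morphologyEx(second, cv2.MORPH_CLOSE, h_kernel),
                              cv2.MORPH_CLOSE, v_kernel)
    # (1-3) connected components give the candidate text region set S (as bounding boxes)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(second, connectivity=8)
    return [tuple(stats[i, :4]) for i in range(1, n)]  # (x, y, w, h) per candidate
```

Each returned bounding box corresponds to one element of the candidate set S that is screened in step 2.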
Step 2: obtaining the stroke width map T of the input scene image I, performing stroke width and geometric feature analysis on each candidate text region in the candidate text region set S, rejecting non-text regions that do not meet the requirements, and finally outputting the localization result map L1.
Screening the candidate text regions is the second step of the present invention; its purpose is to analyze the results of the coarse detection and to screen out and reject non-text regions. Research shows that text elements in natural scenes have a nearly constant stroke width, and the stroke widths of characters within an adjacent text region are roughly equal, so this property can be used to distinguish text regions from non-text regions. Based on the fact that character strokes in natural scenes tend to have a fixed width, the present invention proposes a candidate text region screening method based on the stroke width transform, implemented as follows:
(2-1) performing edge detection on the input scene image I using the Canny edge detection method to obtain the edge map of the input scene image I, while recording the gradient direction of each edge pixel;
(2-2) performing the stroke width transform on the edge pixels:
(2-2-1) assume p is an edge pixel and let dp be the gradient direction of edge pixel p; search along the ray r = p + n·dp (n ≥ 0) among the edge pixels of the edge map for a matching edge pixel q; let dq be the gradient direction of edge pixel q, where dq is roughly opposite to dp, i.e., dq = −dp ± π/2;
if p does not find a matching pixel q, or dq is not opposite to dp, the ray r = p + n·dp is discarded, and a new edge pixel p must be selected to search for its matching edge pixel q;
if a matching pixel q is found, the stroke width value of every pixel on the segment [p, q] is set to ||p − q||, the Euclidean distance between pixel p and pixel q; if a pixel on the segment [p, q] already has a stroke width value S, the smaller of S and ||p − q|| is taken as the actual stroke width value of that pixel;
(2-2-2) repeating (2-2-1) until the stroke width values of the pixels on all rays that were not discarded have been computed;
(2-2-3) traversing again all rays that were not discarded; for each ray, computing the mean stroke width M of all pixels on the ray, finding all pixels on the ray whose stroke width value exceeds M and setting their stroke width value to M; after all rays have been traversed, the stroke width map T of the input scene image I is finally obtained;
It should be pointed out that the above procedure mainly targets "positive" text, i.e., dark characters on a bright background; in practice there may also be "reverse" text, i.e., bright characters on a dark background. Therefore, in (2-2), steps (2-2-1), (2-2-2) and (2-2-3) are repeated once; when repeating, in (2-2-1), the matching edge pixel q is searched for among the edge pixels of the edge map along the ray r = p + n·dp with n ≤ 0. In addition, it can be seen that the number of pixels to be examined during the stroke width transform is greatly reduced, because the gradient feature of a pixel is only effective when another pixel with a matching, opposite gradient direction is found.
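The stroke width transform of (2-2) can be sketched as follows in Python/OpenCV. The Canny thresholds and the exact opposite-gradient tolerance are assumptions, and only a single pass is shown; the reverse pass for bright-on-dark text simply walks the ray in the opposite direction, as noted above.

```python
import cv2
import numpy as np

def stroke_width_transform(gray, dark_on_light=True):
    """Single-pass SWT sketch for steps (2-2-1)-(2-2-3); call again with dark_on_light=False for reverse text."""
    edges = cv2.Canny(gray, 100, 200)                      # assumed thresholds
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    mag = np.hypot(gx, gy) + 1e-9
    dx, dy = gx / mag, gy / mag
    if not dark_on_light:                                  # reverse text: walk along -dp
        dx, dy = -dx, -dy
    h, w = gray.shape
    swt = np.full((h, w), np.inf)
    rays = []
    ys, xs = np.nonzero(edges)
    for y, x in zip(ys, xs):
        ray = [(x, y)]
        cx, cy = float(x), float(y)
        while True:
            cx, cy = cx + dx[y, x], cy + dy[y, x]          # r = p + n*dp, n >= 0
            ix, iy = int(round(cx)), int(round(cy))
            if not (0 <= ix < w and 0 <= iy < h):
                break                                      # ray leaves the image: discard it
            ray.append((ix, iy))
            if edges[iy, ix]:                              # candidate matching edge pixel q
                # keep the ray only if dq is roughly opposite to dp (within pi/2 of -dp)
                if dx[y, x] * dx[iy, ix] + dy[y, x] * dy[iy, ix] < 0:
                    width = np.hypot(ix - x, iy - y)       # ||p - q||
                    for px, py in ray:
                        swt[py, px] = min(swt[py, px], width)
                    rays.append(ray)
                break
    # (2-2-3): clip pixels on each kept ray to the ray's mean stroke width M
    for ray in rays:
        m = np.mean([swt[py, px] for px, py in ray])
        for px, py in ray:
            swt[py, px] = min(swt[py, px], m)
    swt[np.isinf(swt)] = 0
    return swt
```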
(2-3) on the basis of the stroke width map T of the input scene image I obtained in step (2-2), finding the corresponding candidate text region set S obtained in step 1, and then screening the candidate text region set S according to the following rules:
(2-3-a) rejecting candidate text regions whose aspect ratio is not between 0.1 and 10. The aspect ratio of a candidate text region lies within a certain range, generally between 0.1 and 10; regions whose aspect ratio is too large or too small and does not satisfy this condition should be removed;
(2-3-b) rejecting candidate text regions whose character width is not between W/20 and W pixels or whose height is not between H/20 and H pixels, where W and H are the width and height of the image, respectively. A character should be neither too large nor too small: its width should be between W/20 and W pixels and its height between H/20 and H pixels; character regions that do not satisfy this condition should be removed;
(2-3-c) deleting candidate text regions whose area is smaller than 20 pixels. If the area of a candidate region is too small, it is judged to be a non-text region, so candidate text regions with an area of fewer than 20 pixels should be deleted;
(2-3-d) the ratio of black pixels in a region should be neither too large nor too small; in a text region it generally lies between 0.2 and 0.8. The candidate text region set S is binarized, the ratio Rb of black pixels is computed, and candidate text regions whose black pixel ratio Rb is not between 0.2 and 0.8 are rejected, where Rb is defined as
where f(i, j) is the pixel value at position (i, j) in the candidate text region image, w and h are the width and height of the candidate text region, respectively, and ⊕ denotes exclusive OR;
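The formula for Rb is not reproduced in this text; a plausible reconstruction of the black-pixel ratio, assuming the binarized region takes the value 1 at black pixels, is:

R_b = \frac{1}{w \times h} \sum_{i=1}^{h} \sum_{j=1}^{w} f(i,j)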
(2-3-e) the crossing rate of a character region differs from that of a non-character region: in general, the crossings of a non-text region are irregular, whereas characters are regularly arranged, so the crossing rate of a character region lies within a certain range. The candidate text region set S is therefore binarized, the crossing rate Rcc of the region is computed, and candidate text regions whose crossing rate Rcc is not between 0.05 and 0.6 are rejected, where Rcc is defined as:
where f(i, j) is the pixel value at position (i, j) in the candidate text region image, f(i, j+1) is the pixel value at position (i, j+1) in the candidate text region image, w and h are the width and height of the candidate text region, respectively, and ⊕ denotes exclusive OR;
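The Rcc formula itself is likewise not reproduced here; a plausible reconstruction consistent with the symbols explained above, counting black/white transitions between horizontally adjacent pixels of the binarized region, is:

R_{cc} = \frac{1}{w \times h} \sum_{i=1}^{h} \sum_{j=1}^{w-1} \left( f(i,j) \oplus f(i,j+1) \right)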
(2-3-f) some scene elements in natural scenes, such as leaves, resemble text elements and are difficult to distinguish from characters; moreover, the stroke widths of characters in natural scenes are not always equal, but when they differ, the variation is small. In general, the stroke width variance of a text region does not exceed half of the mean stroke width, and the stroke width ratio of adjacent pixels does not exceed 3.0, so candidate regions whose stroke width varies too much should be rejected. The present invention performs the stroke width transform on the candidate text region set S to obtain a first stroke width map of every candidate text region, then inverts the candidate text region set S and performs the stroke width transform again to obtain a second stroke width map of every candidate text region; if, in either the first stroke width map or the second stroke width map of a candidate text region, the stroke width variance exceeds half of the mean stroke width and the stroke width ratio of adjacent pixels exceeds 3.0, that candidate text region is rejected;
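Rules (2-3-a) through (2-3-f) amount to a filter over the candidate regions, as in the sketch below. It assumes the stroke_width_transform helper sketched earlier, Otsu binarization with black (text) pixels mapped to 1, application of the width/height bounds to the candidate region box, and the thresholds exactly as listed; none of these implementation details beyond the thresholds are given in the patent.

```python
import cv2
import numpy as np

def keep_candidate(region_gray, img_w, img_h):
    """Apply screening rules (2-3-a)-(2-3-f) to one candidate text region (True = keep)."""
    h, w = region_gray.shape
    # (2-3-a) aspect ratio in [0.1, 10]; (2-3-b) size bounds; (2-3-c) minimum area
    if not (0.1 <= w / h <= 10):
        return False
    if not (img_w / 20 <= w <= img_w and img_h / 20 <= h <= img_h):
        return False
    if w * h < 20:
        return False
    # binarize with Otsu; take black (text) pixels as 1
    _, binary = cv2.threshold(region_gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    f = (binary > 0).astype(np.uint8)
    # (2-3-d) black pixel ratio Rb in [0.2, 0.8]
    rb = f.mean()
    if not (0.2 <= rb <= 0.8):
        return False
    # (2-3-e) crossing rate Rcc in [0.05, 0.6]
    rcc = np.logical_xor(f[:, :-1], f[:, 1:]).sum() / (w * h)
    if not (0.05 <= rcc <= 0.6):
        return False
    # (2-3-f) stroke width statistics on the region and on its inverse
    for g in (region_gray, 255 - region_gray):
        swt = stroke_width_transform(g)            # sketch defined earlier
        widths = swt[swt > 0]
        if widths.size == 0:
            continue
        ratio = np.divide(swt[:, 1:], swt[:, :-1],
                          out=np.ones_like(swt[:, 1:]), where=swt[:, :-1] > 0)
        if widths.var() > widths.mean() / 2 and ratio.max() > 3.0:
            return False
    return True
```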
(2-4) text detection output: after the screening of (2-3), the final text regions are obtained; they are then sorted and numbered according to their positions, from top to bottom and from left to right, and after sorting the text regions are output; the output is the localization result map L1;
Step 3: recognition preprocessing, specifically including:
(3-1) performing contrast enhancement on the text regions of the localization result map L1. To save computation time, the present invention uses a histogram equalization algorithm, which is computationally simple and effective, to enhance the contrast of the text regions of the localization result map L1. Histogram equalization is a conventional prior-art algorithm; the enhanced image has a larger dynamic range of pixel gray values, thereby achieving the effect of enhanced image contrast;
(3-2) performing median filtering on the enhanced text regions. The present invention uses a 3 × 3 rectangular sliding template, a conventional prior-art method: the pixels within the template are sorted by pixel value to generate a monotonically increasing or decreasing data sequence, and the value of the central pixel of the template is replaced by the median of this group before output. The image after median filtering not only preserves the edge information of the original image well, but also makes the gray levels of the image smoother;
(3-3) performing binarization on the text regions after median filtering. Considering the execution efficiency of the algorithm and the fact that the text regions may suffer from uneven illumination, the present invention uses the maximum between-class variance (Otsu) method, which is a conventional prior-art algorithm;
(3-4) performing character segmentation on the text regions after binarization. The present invention uses a projection-based segmentation method to split the text; this is also a conventional prior-art method. It requires the edge image of the text before segmentation and then performs projection segmentation on it. A text region may contain multiple rows as well as multiple columns, so both row segmentation and column segmentation are needed. The algorithmic complexity of this method is low and it runs fast;
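Steps (3-1) to (3-4) map directly onto standard OpenCV calls plus a simple projection split, as sketched below. The fixed 3 × 3 median window and Otsu follow the text; the _runs helper, the inverse-binary convention (text pixels become non-zero), and the row-then-column split are my assumptions, since the patent does not specify the projection details.

```python
import cv2
import numpy as np

def preprocess_region(region_gray):
    """Recognition preprocessing (3-1)-(3-4): enhance, median filter, Otsu binarize, project-and-cut."""
    eq = cv2.equalizeHist(region_gray)                 # (3-1) histogram equalization
    smooth = cv2.medianBlur(eq, 3)                     # (3-2) 3x3 median filter
    _, binary = cv2.threshold(smooth, 0, 255,          # (3-3) Otsu (maximum between-class variance)
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    # (3-4) projection segmentation: first cut rows, then cut columns inside each row
    chars = []
    for top, bottom in _runs(binary.sum(axis=1) > 0):
        row = binary[top:bottom]
        for left, right in _runs(row.sum(axis=0) > 0):
            chars.append(row[:, left:right])
    return chars

def _runs(mask):
    """Return (start, end) index pairs of consecutive True runs in a 1-D boolean mask."""
    padded = np.concatenate(([False], mask, [False]))
    starts = np.flatnonzero(~padded[:-1] & padded[1:])
    ends = np.flatnonzero(padded[:-1] & ~padded[1:])
    return list(zip(starts, ends))
```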
Step 4: normalizing the single characters obtained after segmentation and extracting directional element features, specifically including:
(4-1) cropping each segmented character to remove the white background around the character, and uniformly resizing each cropped character image to N × N using bilinear interpolation. Each character region must be cropped to remove the surrounding white background; because characters differ in size, the features of the same character would otherwise differ, so every character image must be normalized before feature extraction, transforming character regions of different sizes into regions of a uniform size. The present invention normalizes all single character regions to a 64 × 64 rectangular region;
(4-2) extracting the contour of each size-normalized character and computing its directional element features;
(4-3) recognizing each character with a distance classifier based on the directional element features, obtaining the X closest candidate characters for each character;
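The following sketch illustrates (4-1) to (4-3): bilinear normalization to 64 × 64, a simple four-direction (horizontal, vertical, two diagonals) contour-element histogram over an 8 × 8 grid as a stand-in for the directional element features, and a Euclidean nearest-template search returning the X closest classes. The 8 × 8 grid, the exact direction coding, and the template dictionary are assumptions; the patent only names the feature type and the classifier.

```python
import cv2
import numpy as np

def directional_element_features(char_img, grid=8):
    """Rough sketch of directional element features on a 64x64 normalized character."""
    norm = cv2.resize(char_img, (64, 64), interpolation=cv2.INTER_LINEAR)   # (4-1) bilinear to N x N
    _, b = cv2.threshold(norm, 0, 1, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    contour = b - cv2.erode(b, np.ones((3, 3), np.uint8))                   # (4-2) character contour
    feat = np.zeros((grid, grid, 4), dtype=np.float32)
    cell = 64 // grid
    ys, xs = np.nonzero(contour)
    for y, x in zip(ys, xs):
        # assign the contour pixel to a direction by looking at its contour neighbours
        for d, (dy, dx) in enumerate([(0, 1), (1, 0), (1, 1), (1, -1)]):    # -, |, \, /
            ny, nx = y + dy, x + dx
            if 0 <= ny < 64 and 0 <= nx < 64 and contour[ny, nx]:
                feat[y // cell, x // cell, d] += 1
    return feat.ravel() / max(len(ys), 1)

def coarse_classify(char_img, templates, x=100):
    """(4-3) Euclidean distance classifier: return the x closest template classes."""
    f = directional_element_features(char_img)
    dists = {label: np.linalg.norm(f - t) for label, t in templates.items()}
    return sorted(dists, key=dists.get)[:x]
```

Here `templates` is a hypothetical dictionary mapping each character class to its mean feature vector, learned offline from training samples.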
Step 5: fine classification based on Gabor features, specifically including:
(5-1) uniformly resizing each character to M × M using bilinear interpolation, where M is set to 40 here;
(5-2) applying a Gabor transform to the size-normalized character and extracting Gabor features;
(5-3) on the basis of the X closest candidate characters of each character obtained in (4-3), recognizing again with a distance classifier to obtain the recognition result of each character.
When the present invention classifies the directional element features with a Euclidean distance classifier, the top-1 recognition rate is only slightly above 45%, while the recognition rate within the top 100 candidate characters exceeds 89%, so setting X to 100 is appropriate, and the Gabor features are then used for further fine classification within these 100 candidate characters. Gabor features are the fine features of the present invention and have good discriminative power for Chinese characters; when the modified cosine angle is used as the classifier for the Gabor features, the top-1 recognition rate reaches more than 78%. Therefore, the present invention can use the modified cosine angle as the classifier for the Gabor features, select the best-matching result from the 100 candidates produced by the preceding classifier, and output this result as the final recognition result. The present invention thus adopts a cascaded scheme: directional element features are first extracted to coarsely classify the Chinese character to be recognized, narrowing the candidates to 100; then, within this small range, the fine Gabor features are extracted for accurate recognition and the final recognition result is output. Compared with recognition based on Gabor features alone, the recognition performance decreases slightly, but the time performance is significantly improved and the average recognition time of a single character is shortened by about 41%, because directional element features are used as coarse features to reduce the candidate characters from 3755 to 100. Compared with using directional element features alone, the character recognition accuracy is significantly improved, because Gabor features are used as fine features to enhance the discriminative ability between characters. The present invention fully combines the speed of directional element feature extraction with the accuracy of Gabor feature recognition, balances speed and accuracy, and is therefore more suitable for use on mobile platforms.
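Step 5 and the cascade with step 4 can be sketched as follows. The Gabor bank (four orientations, one scale), the block-averaged feature layout, the kernel parameters, and the use of plain cosine similarity as a stand-in for the modified cosine angle are all assumptions; the patent states only that Gabor features and a cosine-angle classifier are applied to the 100 coarse candidates.

```python
import cv2
import numpy as np

def gabor_features(char_img, m=40, orientations=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """(5-1)/(5-2): resize to M x M and extract block-averaged Gabor responses (assumed filter bank)."""
    norm = cv2.resize(char_img, (m, m), interpolation=cv2.INTER_LINEAR).astype(np.float32)
    feat = []
    for theta in orientations:
        kernel = cv2.getGaborKernel((9, 9), sigma=3.0, theta=theta,
                                    lambd=8.0, gamma=0.5, psi=0, ktype=cv2.CV_32F)
        resp = np.abs(cv2.filter2D(norm, cv2.CV_32F, kernel))
        # average the response over a 5x5 grid of blocks -> 4 * 25 = 100-dim feature
        blocks = resp.reshape(5, m // 5, 5, m // 5).mean(axis=(1, 3))
        feat.append(blocks.ravel())
    return np.concatenate(feat)

def fine_classify(char_img, candidate_labels, gabor_templates):
    """(5-3): cosine-angle classification restricted to the 100 coarse candidates from step 4."""
    f = gabor_features(char_img)
    best, best_score = None, -np.inf
    for label in candidate_labels:
        t = gabor_templates[label]
        score = float(np.dot(f, t) / (np.linalg.norm(f) * np.linalg.norm(t) + 1e-9))
        if score > best_score:
            best, best_score = label, score
    return best
```

As with the coarse stage, `gabor_templates` is a hypothetical per-class template dictionary; restricting the search to `candidate_labels` is what realizes the described cascade and its reported speed-up.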

Claims (4)

  1. An intelligent mobile terminal scene text processing method, characterized by comprising the following steps:
    Step 1: coarse text detection based on edges, specifically comprising:
    (1-1) performing edge detection on the scene image I input to the intelligent mobile terminal using a color image edge detection method to obtain a first intermediate processed image;
    (1-2) performing morphological operations on the first intermediate processed image to connect broken characters and adjacent characters in the first intermediate processed image, obtaining a second intermediate processed image;
    (1-3) processing the second intermediate processed image by finding connected components, thereby obtaining a candidate text region set S of the input scene image I;
    Step 2: obtaining a stroke width map T of the input scene image I, performing stroke width and geometric feature analysis on each candidate text region in the candidate text region set S, rejecting non-text regions that do not meet the requirements, and finally outputting a localization result map L1;
    Step 3: recognition preprocessing, specifically comprising:
    (3-1) performing contrast enhancement on the text regions of the localization result map L1;
    (3-2) performing median filtering on the enhanced text regions;
    (3-3) performing binarization on the text regions after median filtering;
    (3-4) performing character segmentation on the text regions after binarization;
    Step 4: normalizing the single characters obtained after segmentation and extracting directional element features, specifically comprising:
    (4-1) cropping each segmented character to remove the white background around the character, and uniformly resizing each cropped character image to N × N using bilinear interpolation;
    (4-2) extracting the contour of each size-normalized character and computing its directional element features;
    (4-3) recognizing each character with a distance classifier to obtain the X closest candidate characters for each character;
    Step 5: fine classification based on Gabor features, specifically comprising:
    (5-1) uniformly resizing each character to M × M using bilinear interpolation;
    (5-2) applying a Gabor transform to the size-normalized character and extracting Gabor features;
    (5-3) on the basis of the X closest candidate characters of each character obtained in (4-3), recognizing again with a distance classifier to obtain the recognition result of each character.
  2. The intelligent mobile terminal scene text processing method according to claim 1, characterized in that step 2 specifically comprises:
    (2-1) performing edge detection on the input scene image I using the Canny edge detection method to obtain an edge map of the input scene image I, while recording the gradient direction of each edge pixel;
    (2-2) performing a stroke width transform on the edge pixels:
    (2-2-1) assuming p is an edge pixel and dp is the gradient direction of edge pixel p, searching along the ray r = p + n·dp (n ≥ 0) among the edge pixels of the edge map for a matching edge pixel q, where dq is the gradient direction of edge pixel q and dq is roughly opposite to dp, i.e., dq = −dp ± π/2;
    if p does not find a matching pixel q, or dq is not opposite to dp, the ray r = p + n·dp is discarded, and a new edge pixel p must be selected to search for its matching edge pixel q;
    if a matching pixel q is found, the stroke width value of every pixel on the segment [p, q] is set to ||p − q||, the Euclidean distance between pixel p and pixel q; if a pixel on the segment [p, q] already has a stroke width value S, the smaller of S and ||p − q|| is taken as the actual stroke width value of that pixel;
    (2-2-2) repeating (2-2-1) until the stroke width values of the pixels on all rays that were not discarded have been computed;
    (2-2-3) traversing again all rays that were not discarded; for each ray, computing the mean stroke width M of all pixels on the ray, finding all pixels on the ray whose stroke width value exceeds M and setting their stroke width value to M; after all rays have been traversed, finally obtaining the stroke width map T of the input scene image I;
    (2-3) on the basis of the stroke width map T of the input scene image I obtained in step (2-2), finding the corresponding candidate text region set S obtained in step 1, and then screening the candidate text region set S according to the following rules:
    (2-3-a) rejecting candidate text regions whose aspect ratio is not between 0.1 and 10;
    (2-3-b) rejecting candidate text regions whose character width is not between W/20 and W pixels or whose height is not between H/20 and H pixels, where W and H are the width and height of the image, respectively;
    (2-3-c) deleting candidate text regions whose area is smaller than 20 pixels;
    (2-3-d) binarizing the candidate text region set S, computing the ratio Rb of black pixels, and rejecting candidate text regions whose black pixel ratio Rb is not between 0.2 and 0.8, where Rb is defined as
    where f(i, j) is the pixel value at position (i, j) in the candidate text region image, w and h are the width and height of the candidate text region, respectively, and ⊕ denotes exclusive OR;
    (2-3-e) binarizing the candidate text region set S, computing the crossing rate Rcc of the region, and rejecting candidate text regions whose crossing rate Rcc is not between 0.05 and 0.6, where Rcc is defined as:
    where f(i, j) is the pixel value at position (i, j) in the candidate text region image, f(i, j+1) is the pixel value at position (i, j+1) in the candidate text region image, w and h are the width and height of the candidate text region, respectively, and ⊕ denotes exclusive OR;
    (2-3-f) performing the stroke width transform on the candidate text region set S to obtain a first stroke width map of every candidate text region, inverting the candidate text region set S and performing the stroke width transform again to obtain a second stroke width map of every candidate text region; if, in either the first stroke width map or the second stroke width map of a candidate text region, the stroke width variance exceeds half of the mean stroke width and the stroke width ratio of adjacent pixels exceeds 3.0, rejecting that candidate text region;
    (2-4) text detection output: after the screening of (2-3), the final text regions are obtained; they are then sorted and numbered according to their positions, from top to bottom and from left to right, and after sorting the text regions are output.
  3. The intelligent mobile terminal scene text processing method according to claim 1, characterized in that: in (3-1), the contrast of the text regions of the localization result map L1 is enhanced using a histogram equalization algorithm; in (3-2), median filtering is performed on the enhanced regions using a 3 × 3 rectangular sliding template, i.e., using a 3 × 3 rectangular sliding template, the pixels within the template are sorted by pixel value to generate a monotonically increasing or decreasing two-dimensional data sequence, and the value of the central pixel of the template is replaced by the median of this group before output; in step (3-3), the regions after median filtering are binarized using the maximum between-class variance (Otsu) method.
  4. The intelligent mobile terminal scene text processing method according to claim 2, characterized in that: in (2-2), steps (2-2-1), (2-2-2) and (2-2-3) are repeated once; when repeating, in (2-2-1), the matching edge pixel q is searched for among the edge pixels of the edge map along the ray r = p + n·dp (n ≤ 0).
CN201410581464.6A 2014-10-27 2014-10-27 Intelligent mobile terminal scene text processing method Active CN104408449B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410581464.6A CN104408449B (en) 2014-10-27 2014-10-27 Intelligent mobile terminal scene text processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410581464.6A CN104408449B (en) 2014-10-27 2014-10-27 Intelligent mobile terminal scene text processing method

Publications (2)

Publication Number Publication Date
CN104408449A CN104408449A (en) 2015-03-11
CN104408449B true CN104408449B (en) 2018-01-30

Family

ID=52646080

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410581464.6A Active CN104408449B (en) 2014-10-27 2014-10-27 Intelligent mobile terminal scene text processing method

Country Status (1)

Country Link
CN (1) CN104408449B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104980765B (en) * 2015-06-15 2018-07-27 北京维鲸视界科技有限公司 A kind of plain text frame monitoring method
CN105046254A (en) * 2015-07-17 2015-11-11 腾讯科技(深圳)有限公司 Character recognition method and apparatus
CN106156767A (en) 2016-03-02 2016-11-23 平安科技(深圳)有限公司 Driving license effect duration extraction method, server and terminal
CN106127118A (en) * 2016-06-15 2016-11-16 珠海迈科智能科技股份有限公司 A kind of English word recognition methods and device
CN107545261A (en) * 2016-06-23 2018-01-05 佳能株式会社 The method and device of text detection
CN106845475A (en) * 2016-12-15 2017-06-13 西安电子科技大学 Natural scene character detecting method based on connected domain
CN107516004A (en) * 2017-07-06 2017-12-26 贵阳朗玛信息技术股份有限公司 The identifying processing method and device of medical image picture
CN108509860A (en) * 2018-03-09 2018-09-07 西安电子科技大学 HOh Xil Tibetan antelope detection method based on convolutional neural networks
CN108596250B (en) * 2018-04-24 2019-05-14 深圳大学 Characteristics of image coding method, terminal device and computer readable storage medium
CN109409356B (en) * 2018-08-23 2021-01-08 浙江理工大学 Multi-direction Chinese print font character detection method based on SWT
CN111783781B (en) * 2020-05-22 2024-04-05 深圳赛安特技术服务有限公司 Malicious term recognition method, device and equipment based on product agreement character recognition
CN112101324B (en) * 2020-11-18 2021-03-16 鹏城实验室 Multi-view image coexisting character detection method, equipment and computer storage medium
CN113642556A (en) * 2021-08-04 2021-11-12 五八有限公司 Image processing method and device, electronic equipment and storage medium
CN117894030A (en) * 2024-01-18 2024-04-16 广州宏途数字科技有限公司 Text recognition method and system for campus smart pen

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1459761A (en) * 2002-05-24 2003-12-03 清华大学 Character identification technique based on Gabor filter set
CN1581159A (en) * 2003-08-04 2005-02-16 中国科学院自动化研究所 Trade-mark searching method
CN101615252A (en) * 2008-06-25 2009-12-30 中国科学院自动化研究所 A kind of method for extracting text information from adaptive images

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Handwritten Arabic character recognition based on adaptive fusion of multiple components; Xu Yamei; Journal of Xidian University; 2012-12-31; Vol. 39, No. 6; full text *
Research on text segmentation and recognition in natural scenes; Ge Qiaorui; China Excellent Master's Theses Full-text Database; 2012-07-31; full text *

Also Published As

Publication number Publication date
CN104408449A (en) 2015-03-11

Similar Documents

Publication Publication Date Title
CN104408449B (en) Intelligent mobile terminal scene text processing method
CN109154978B (en) System and method for detecting plant diseases
CN104751142B (en) A kind of natural scene Method for text detection based on stroke feature
CN103577475B (en) A kind of picture mechanized classification method, image processing method and its device
CN107491730A (en) A kind of laboratory test report recognition methods based on image procossing
CN113128442B (en) Chinese character handwriting style identification method and scoring method based on convolutional neural network
CN111401372A (en) Method for extracting and identifying image-text information of scanned document
CN109740572A (en) A kind of human face in-vivo detection method based on partial color textural characteristics
Tian et al. Natural scene text detection with MC–MR candidate extraction and coarse-to-fine filtering
CN104504383B (en) A kind of method for detecting human face based on the colour of skin and Adaboost algorithm
CN109242400A (en) A kind of logistics express delivery odd numbers recognition methods based on convolution gating cycle neural network
CN108197644A (en) A kind of image-recognizing method and device
CN110298376A (en) A kind of bank money image classification method based on improvement B-CNN
CN103295013A (en) Pared area based single-image shadow detection method
CN112163511A (en) Method for identifying authenticity of image
CN110956167B (en) Classification, discrimination, strengthening and separation method based on positioning characters
CN112883926B (en) Identification method and device for form medical images
CN109086772A (en) A kind of recognition methods and system distorting adhesion character picture validation code
Huang et al. Text detection and recognition in natural scene images
CN112446259A (en) Image processing method, device, terminal and computer readable storage medium
CN107169996A (en) Dynamic human face recognition methods in a kind of video
CN108288061A (en) A method of based on the quick positioning tilt texts in natural scene of MSER
CN114581928A (en) Form identification method and system
CN109741351A (en) A kind of classification responsive type edge detection method based on deep learning
CN110766001B (en) Bank card number positioning and end-to-end identification method based on CNN and RNN

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant