CN103440487B - Natural scene text localization method based on local hue difference - Google Patents
Natural scene text localization method based on local hue difference
- Publication number: CN103440487B
- Application number: CN201310377443.8A
- Authority
- CN
- China
- Legal status: Expired - Fee Related (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The present invention provides a natural scene text localization method based on local hue difference. The method exploits not only the textural characteristics of text but also the difference in hue between a text region and its surroundings, so that text in a scene is located effectively. The average hue difference in the neighbourhood of edge pixels is computed and compared against a threshold to decide whether a region contains text; this incorporates the local colour information of the region and uses the colour consistency of text, as opposed to the background, to locate it. The threshold is obtained adaptively as the mean of the dominant-hue differences between every candidate box and the region surrounding it, so that the colour information of the whole image contributes to each local decision; the resulting threshold characterizes the hue difference between the text regions and the background of the scene image. The method locates text in natural scenes quickly and accurately.
Description
Technical field
The invention belongs to the fields of image processing and computer vision, and in particular relates to a natural scene text localization method.
Background technology
Automatically detecting, segmenting and recognizing text in scene pictures would greatly help people acquire information, and is also of great significance for the automatic understanding and retrieval of the semantic information of images. In a vehicle navigation system, if road signs, shop names and traffic signs ahead can be located and recognized automatically, travel safety is improved: the driver can be reminded to slow down and to correct the route. With the rapid development of multimedia and computing, pictures have become an important medium of communication thanks to their vivid and concrete form. Keyword-based retrieval can no longer meet users' needs, and content-based image retrieval has become the trend of development; within such retrieval, the localization and recognition of text is a key technology that attracts the attention of more and more researchers. Text localization can also provide reading assistance for the blind.
Summarizing the existing work on scene text localization, the methods can be broadly divided into two classes: 1) texture-based text localization; 2) region-based text localization. Texture-based methods use textural features to distinguish text from non-text and cluster individual characters or blocks of text together; they are robust, but this also makes the algorithms computationally expensive. Region-based methods distinguish text from non-text by requiring the pixels of a region to satisfy some similarity criterion, for example using colour consistency within a region as the feature that separates text from the background. Such methods are simple, but a single feature rarely covers all cases; their robustness is insufficient and they perform poorly on scene pictures with complex backgrounds.
Summary of the invention
The technical problem to be solved by the invention is to provide a text localization method that can effectively locate text in natural scenes while being fast and practical. The technical scheme adopted by the invention to solve the above problem is a natural scene text localization method based on local hue difference, comprising the following steps:
1) Scan the scene picture with a classifier to obtain the candidate boxes corresponding to candidate text regions;
2) Convert the scene picture to the HSI colour model, extract the hue component H, and compute the mean dominant-hue difference hue_aver between all candidate boxes box(i) and their adjacent regions:

hue_aver = (1/N) * Σ_{i=1..N} |box_domihue(i) - box_neighbour_domihue(i)|

where box_domihue(i) is the dominant hue of the i-th candidate box box(i), box_neighbour_domihue(i) is the dominant hue of the region adjacent to candidate box box(i), and N is the total number of candidate boxes in the current scene picture;
3) Take the edge pixels of the scene picture and, for each candidate box, compute the average hue difference local_hue(i) between all edge pixels and their neighbouring pixels;
4) Compare the average hue difference local_hue(i) of each candidate box with the dominant-hue difference hue_aver: if local_hue(i) is greater than hue_aver, treat the current candidate box as a region containing text; otherwise discard it. When all candidate boxes have been judged, the final scene text localization is complete.
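The decision rule of step 4) above can be sketched in a few lines of code. This is an illustrative sketch, not part of the patent; the function name `keep_text_boxes` and the list-based interface are assumptions:

```python
def keep_text_boxes(local_hues, hue_aver):
    """Return the indices of candidate boxes judged to contain text.

    local_hues: per-box average hue differences local_hue(i);
    hue_aver:   the adaptive threshold from step 2).
    A box is kept only when its local hue difference exceeds the threshold.
    """
    return [i for i, lh in enumerate(local_hues) if lh > hue_aver]


# Example: boxes 1 and 2 exceed the threshold and are kept.
print(keep_text_boxes([5.0, 20.0, 12.0], 10.0))  # → [1, 2]
```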
The invention exploits not only the textural features of text but also the difference in hue between a text region and its surroundings, and thereby locates text in the scene effectively. The average hue difference in the neighbourhood of edge pixels is computed and compared against a threshold to decide whether a region contains text; this incorporates the local colour information of the region and uses the colour consistency of text, as opposed to the background, to locate it. Furthermore, the invention uses an adaptive thresholding method: the threshold is obtained as the mean of the dominant-hue differences between every candidate box and the region surrounding it, so that the colour information of the whole image contributes to each local decision, and the resulting threshold characterizes the hue difference between the text regions and the background of the scene image.
The beneficial effect of the invention is that text in natural scenes can be located quickly and accurately.
Brief description of the drawings
Fig. 1: schematic flow chart of candidate-box processing in the embodiment;
Fig. 2: input natural scene picture;
Fig. 3: text localization result of the scene image after feature extraction and classification;
Fig. 4: text localization result of the scene image after local hue-difference processing.
Detailed description of the invention
The text localization of the embodiment comprises the following steps:
Step 1: training of text features and design of the classifier.
1) Build a sample library containing 3000 positive samples (containing text), labelled 1, and 7000 negative samples (containing no text), labelled -1; all samples are normalized to 48*96 pixels.
2) For every sample in the library, extract a histogram of oriented gradients (HOG) and a rotation-invariant uniform-pattern LBP feature, and concatenate them into one feature vector.
3) Feed the feature vectors of all 10000 positive and negative samples into a classifier and train it.
The training of the text classifier is not the focus of the invention; those skilled in the art can design a classifier according to the existing disclosure combined with actual needs.
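As an illustrative sketch (not part of the patent), the rotation-invariant uniform LBP feature of sub-step 2) could be computed as follows. The function name and the 8-neighbour, radius-1 configuration are assumptions; in practice a library such as scikit-image (`local_binary_pattern(..., method="uniform")`) would normally be used:

```python
import numpy as np

def lbp_riu2_histogram(gray):
    """Rotation-invariant uniform LBP (8 neighbours, radius 1).

    Uniform patterns (at most 2 bit transitions around the circle) are
    mapped to their number of '1' bits (0..8); all non-uniform patterns
    share one bin (9), giving a normalized 10-bin histogram per image.
    """
    g = np.asarray(gray, dtype=np.int32)
    c = g[1:-1, 1:-1]                      # centre pixels
    # 8 neighbours in circular (clockwise) order
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    bits = np.stack([(g[1 + dy:g.shape[0] - 1 + dy,
                        1 + dx:g.shape[1] - 1 + dx] >= c).astype(np.int32)
                     for dy, dx in offsets])
    # number of 0/1 transitions when walking around the circle
    transitions = np.sum(bits != np.roll(bits, 1, axis=0), axis=0)
    ones = bits.sum(axis=0)
    codes = np.where(transitions <= 2, ones, 9)
    hist = np.bincount(codes.ravel(), minlength=10).astype(np.float64)
    return hist / hist.sum()
```

A flat image produces only the all-ones pattern (code 8), so its entire histogram mass falls in bin 8; the HOG part of the feature vector would be computed separately and concatenated.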
Step 2: obtain the scene picture, as shown in Fig. 2, and scan the natural scene picture with the trained classifier to obtain the candidate text regions.
1) Convert the scene picture to a grey-scale image ImGray, scale the scene image, and scan the scene picture with a 48*96 sliding window;
2) Judge each sliding window with the classifier obtained in Step 1: if the result is 1, keep the window, otherwise discard it; expand the kept windows by the scaling ratio of the scene image to obtain candidate boxes;
3) Examine all candidate boxes: if the ratio of the intersection area of two windows to the total (union) area of the two windows is greater than 0.5, merge them into one candidate box, yielding the final candidate boxes box, as shown in Fig. 3.
This embodiment gives a preferred way of obtaining candidate boxes; overlapping candidate boxes are merged to simplify subsequent processing.
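The candidate-box merging of sub-step 3) can be sketched as follows (an illustrative sketch, not part of the patent; the `(x1, y1, x2, y2)` tuple representation and the function name are assumptions). Two boxes are merged into their bounding box whenever intersection area over union area exceeds 0.5, repeating until no pair qualifies:

```python
def merge_boxes(boxes, iou_thresh=0.5):
    """Merge candidate boxes whose intersection/union area ratio
    exceeds iou_thresh. boxes: list of (x1, y1, x2, y2) tuples."""
    boxes = [tuple(b) for b in boxes]
    changed = True
    while changed:                      # repeat until stable
        changed = False
        out = []
        while boxes:
            a = boxes.pop(0)
            i = 0
            while i < len(boxes):
                b = boxes[i]
                ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
                iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
                inter = ix * iy
                union = ((a[2] - a[0]) * (a[3] - a[1])
                         + (b[2] - b[0]) * (b[3] - b[1]) - inter)
                if union > 0 and inter / union > iou_thresh:
                    # replace the pair with its bounding box
                    a = (min(a[0], b[0]), min(a[1], b[1]),
                         max(a[2], b[2]), max(a[3], b[3]))
                    boxes.pop(i)
                    changed = True
                else:
                    i += 1
            out.append(a)
        boxes = out
    return boxes
```

For example, `(0, 0, 10, 10)` and `(1, 1, 11, 11)` overlap with ratio 81/119 ≈ 0.68 > 0.5 and are merged, while a distant box is left untouched.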
Step 3: apply the local hue-difference judgement to the candidate boxes to obtain the final text localization windows, as shown in Fig. 1.
1) Convert the RGB model of the scene image to the HSI (Hue-Saturation-Intensity) model and extract the hue component H of the scene image. In the H component, extract one border strip on each side of every candidate box box(i) obtained in Step 2; if a side of candidate box box(i) lies on the image boundary, no strip is taken in that direction. Combine the obtained top, bottom, left and right strips into one region, which serves as the surrounding region box_neighbour(i) of the candidate box.
2) Compute the histogram of the hue component H for each box_neighbour(i) and box(i). This embodiment takes the hue with the largest histogram value as the dominant hue of a box; those skilled in the art can also determine the dominant hue in other ways according to actual needs. Subtract the dominant hue of each box(i) from that of box_neighbour(i) and take the absolute value as the dominant-hue difference of box(i); average over all candidate boxes to obtain the mean dominant-hue difference hue_aver, which is used as the threshold in the next step:

hue_aver = (1/N) * Σ_{i=1..N} |box_domihue(i) - box_neighbour_domihue(i)|

where i denotes the i-th candidate box, N the number of candidate boxes, box_domihue(i) the dominant hue of the i-th box, and box_neighbour_domihue(i) the dominant hue of the neighbourhood of the i-th box.
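The dominant-hue extraction and the adaptive threshold hue_aver can be sketched as follows. This is illustrative only: the 360-bin histogram over a 0-360 degree hue range and the function names are assumptions; the patent only requires taking the hue with the largest histogram value:

```python
import numpy as np

def dominant_hue(h_values, bins=360):
    """Dominant hue of a region: centre of the hue-histogram bin
    with the highest count (hue assumed in degrees, 0..360)."""
    hist, edges = np.histogram(h_values, bins=bins, range=(0, 360))
    k = int(np.argmax(hist))
    return 0.5 * (edges[k] + edges[k + 1])

def hue_aver(box_domihue, neigh_domihue):
    """Adaptive threshold: mean absolute dominant-hue difference
    between each candidate box and its surrounding region."""
    d = np.abs(np.asarray(box_domihue, float)
               - np.asarray(neigh_domihue, float))
    return float(d.mean())
```

With unit-width bins, a region whose pixels are mostly hue 10 has dominant hue 10.5 (the centre of bin [10, 11)); two boxes with dominant-hue differences 20 and 10 give hue_aver = 15.0.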
3) Apply the Canny operator to the grey-scale image ImGray of the scene to obtain the edge map ImCanny. In each candidate box obtained in Step 2, find the pixels where ImCanny is non-zero, i.e. the edge pixels. For each edge pixel, compute the difference of the hue component H between the pixels above and below it, and between the pixels to its left and right, and take the mean of the vertical and horizontal hue differences; then average over all non-zero ImCanny pixels in candidate box box(i) to obtain the average hue difference local_hue(i):

local_hue(i) = (1/M) * Σ_x ( |pixel_up(x) - pixel_down(x)| + |pixel_left(x) - pixel_right(x)| ) / 2

Compare local_hue(i) with the threshold hue_aver: if it is greater than the threshold, keep the box, otherwise discard it, yielding the final scene text localization, as shown in Fig. 4.
Here x denotes the position of an edge pixel in the candidate box; pixel_up(x), pixel_down(x), pixel_left(x) and pixel_right(x) denote the hue values of the pixels above, below, left and right of pixel x respectively, and M denotes the total number of edge pixels in candidate box box(i).
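The local hue difference of sub-step 3) can be sketched as follows (illustrative, not part of the patent; the boolean edge-map interface and the discarding of edge pixels lying on the image border are assumptions). In a full pipeline the edge map would come from a Canny operator, e.g. OpenCV's `cv2.Canny`:

```python
import numpy as np

def local_hue(h, edges):
    """Average local hue difference over the edge pixels of a box.

    h:     2-D hue channel of the candidate box;
    edges: boolean edge map of the same shape (non-zero Canny output).
    For every edge pixel the mean of |up - down| and |left - right|
    hue differences is taken, then averaged over all M edge pixels.
    """
    h = np.asarray(h, dtype=np.float64)
    ys, xs = np.nonzero(edges)
    # keep edge pixels whose 4-neighbourhood lies inside the image
    keep = ((ys > 0) & (ys < h.shape[0] - 1)
            & (xs > 0) & (xs < h.shape[1] - 1))
    ys, xs = ys[keep], xs[keep]
    if ys.size == 0:
        return 0.0
    vert = np.abs(h[ys - 1, xs] - h[ys + 1, xs])     # |up - down|
    horiz = np.abs(h[ys, xs - 1] - h[ys, xs + 1])    # |left - right|
    return float(((vert + horiz) / 2.0).mean())
```

A box is then kept when `local_hue(h, edges) > hue_aver`, matching step 4) of the method.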
Claims (5)
1. A natural scene text localization method based on local hue difference, characterized by comprising the following steps:
1) scanning the scene picture with a classifier to obtain the candidate boxes corresponding to candidate text regions;
2) converting the scene picture to the HSI colour model, extracting the hue component H, and computing the mean dominant-hue difference hue_aver between all candidate boxes box(i) and their adjacent regions:

hue_aver = (1/N) * Σ_{i=1..N} |box_domihue(i) - box_neighbour_domihue(i)|

wherein box_domihue(i) is the dominant hue of the i-th candidate box box(i), box_neighbour_domihue(i) is the dominant hue of the region adjacent to candidate box box(i), and N is the total number of candidate boxes in the current scene picture;
3) taking the edge pixels of the scene picture and computing, for each candidate box, the average hue difference local_hue(i) between all edge pixels and their neighbouring pixels;
4) comparing the average hue difference local_hue(i) of each candidate box with the dominant-hue difference hue_aver: when local_hue(i) of a candidate box is greater than hue_aver, treating the current candidate box as a region containing text, otherwise discarding it; when all candidate boxes have been judged, the final scene text localization is complete.
2. The natural scene text localization method based on local hue difference according to claim 1, characterized in that the adjacent region of a candidate box is extracted as follows:
when no side of candidate box box(i) lies on the boundary of the scene picture, one border strip is extracted on each side of box(i); when a side of box(i) lies on the boundary of the scene picture, no strip is taken in that direction; after the adjacent strips of box(i) have been extracted, they are combined into one region, which serves as the adjacent region box_neighbour(i) of the candidate box.
3. The natural scene text localization method based on local hue difference according to claim 1, characterized in that the average hue difference local_hue(i) between all edge pixels and their neighbouring pixels in each candidate box is computed as:

local_hue(i) = (1/M) * Σ_x ( |pixel_up(x) - pixel_down(x)| + |pixel_left(x) - pixel_right(x)| ) / 2

wherein x denotes the position of an edge pixel in the i-th candidate box; pixel_up(x), pixel_down(x), pixel_left(x) and pixel_right(x) denote the hue values of the pixels above, below, left and right of pixel x respectively, and M denotes the total number of edge pixels in the i-th candidate box.
4. The natural scene text localization method based on local hue difference according to claim 1, characterized in that in step 3) the edge pixels are obtained from the edge map computed with the Canny operator.
5. The natural scene text localization method based on local hue difference according to claim 1, characterized in that the dominant hue is the hue with the maximum value in the histogram of the H component.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN201310377443.8A | 2013-08-27 | 2013-08-27 | Natural scene text localization method based on local hue difference
Publications (2)
Publication Number | Publication Date
---|---
CN103440487A | 2013-12-11
CN103440487B | 2016-11-02
Family
ID=49694180
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
WO2017059576A1 | 2015-10-09 | 2017-04-13 | Beijing SenseTime Technology Development Co., Ltd | Apparatus and method for pedestrian detection
CN108564084A | 2018-05-08 | 2018-09-21 | Beijing SenseTime Technology Development Co., Ltd | Character detection method, device, terminal and storage medium
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN102163284A | 2011-04-11 | 2011-08-24 | Xidian University | Chinese-environment-oriented complex scene text localization method
US8331684B2 | 2010-03-12 | 2012-12-11 | Sony Corporation | Color and intensity based meaningful object of interest detection
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
US7418141B2 | 2003-03-31 | 2008-08-26 | American Megatrends, Inc. | Method, apparatus, and computer-readable medium for identifying character coordinates
Non-Patent Citations (2)
Title
---
"Fast and robust text detection in images and video frames"; Qixiang Ye et al.; Image and Vision Computing; 2005; pp. 565-576
"Natural scene text localization based on colour dispersion analysis" (基于颜色散布分析的自然场景文本定位); Zhou Huican et al.; Computer Engineering (计算机工程); April 2010; Vol. 36, No. 8; pp. 197-200
Legal Events
Date | Code | Title | Description
---|---|---|---
 | C06 | Publication |
 | PB01 | Publication |
 | C10 | Entry into substantive examination |
 | SE01 | Entry into force of request for substantive examination |
 | C14 | Grant of patent or utility model |
 | GR01 | Patent grant |
 | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 2016-11-02; termination date: 2019-08-27