CN103942797B - Scene image text detection method and system based on histogram and super-pixels - Google Patents
- Publication number: CN103942797B
- Application number: CN201410168244.0A
- Authority: CN (China)
- Prior art keywords: pixel, edge, module, stroke width, super
- Legal status: Expired - Fee Related (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Landscapes: Image Analysis
Abstract
The invention relates to a scene image text detection method based on a histogram and super-pixels. The scene image text detection method comprises the steps that firstly, stroke width values of text which may exist in a target image are estimated, and the stroke histogram is generated on the basis of the stroke width values; secondly, edge detection is conducted on the target image, comparison and correction are conducted, and connected domains with the highest edge detection quality are obtained; thirdly, skeletonization is conducted on the connected domains, skeleton pixels are obtained, and a high-precision stroke width is estimated according to the skeleton pixels; fourthly, characters and non-characters are filtered according to the high-precision stroke width; fifthly, the characters and the non-characters are further filtered through spatial distribution of the connected domains by means of geometric constraint, and text lines and non-text lines are filtered; sixthly, detection of the characters and the text lines in the target image is completed. According to the scene image text detection method based on the histogram and the super-pixels, a high-speed and high-precision stroke width calculation method is provided, and therefore precision and efficiency of filtering the connected domains between text and non-text can be improved.
Description
Technical field
The present invention relates to a scene image text detection method and system based on a histogram and super-pixels, and belongs to the fields of information security and computer vision.
Background art
In recent years, as mobile devices with built-in cameras have multiplied, the number of pictures taken in natural scenes has grown explosively. Many valuable applications, such as picture retrieval based on text information, intelligent driving assistance, and reading assistance and scene understanding for visually impaired people, rely on methods for obtaining text information from pictures. Text extraction and recognition in natural scenes, as the key problem in processing this new data source, has therefore become a hot topic in computer vision research.
Text detection methods fall into methods based on connected domain analysis and methods based on sliding windows. Methods based on connected domain analysis analyze the connected domains in a picture and separate characters from non-characters using constraints on the spatial distribution of text and on geometric properties. Epshtein et al. [1] proposed extracting the edges in a picture with an edge detection algorithm and using gradient information to compute the "stroke" width of the regions enclosed by those edges as the basis for classification. Building on the work of Epshtein, Huang Lin et al. [2] proposed that the color consistency of a "stroke" should be maintained when computing the "stroke" width, and used covariance descriptors to filter the detected text lines and characters. The other class of text detection algorithms relies mainly on sliding windows: for example, Cunzhao Shi et al. [3] proposed a part-based tree-structured text detection algorithm built on histograms of gradients, and Jung et al. [4] proposed multi-scale text detection using a stroke filter.
Compared with sliding-window methods, connected-domain methods have low computational complexity but depend heavily on the quality of edge detection, so they perform somewhat worse under complex illumination or low picture quality. Because text color, font and other properties vary widely in scene images, sliding-window methods must analyze the image at multiple scales; their computational complexity is therefore higher, and they usually need a large training set to train a classifier. Among connected-domain methods, the "stroke"-width algorithm has attracted much attention for its simplicity and effectiveness, and several improved variants have appeared. However, when the text is partially occluded or the picture is noisy, the accuracy of edge detection and gradient estimation suffers, and the performance of these algorithms is still unsatisfactory.
Summary of the invention
The technical problem to be solved by the present invention is the failure of edge detection in complex environments in the prior art. To address this deficiency, the invention uses super-pixels to correct the edges and provides a scene image text detection method, based on a stroke width histogram and super-pixels, that improves the recall and accuracy of the detection algorithm.
The technical solution of the present invention to the above technical problem is a scene image text detection method based on a histogram and super-pixels, comprising the following steps:
Step 1: estimate the widths of text that may be present in the target image to obtain stroke width values, and generate a stroke histogram from the stroke width values;
Step 2: set the stroke width values in the stroke histogram as the step parameter of the super-pixels; perform edge detection on the target image; compare the super-pixels generated with that step parameter against the edge detection result and correct the edges, obtaining the connected domains with the highest edge detection quality for that stroke width value;
Step 3: skeletonize the connected domains to obtain skeleton pixels, and estimate a high-precision stroke width from the skeleton pixels;
Step 4: filter the target image according to the high-precision stroke width, distinguishing characters from non-characters, to obtain characters;
Step 5: further filter the obtained characters using geometric constraints on the spatial distribution of the connected domains to obtain accurate characters, and distinguish text lines from non-text lines in the target image based on the accurate characters to obtain text lines;
Step 6: complete the detection of the accurate characters and text lines in the target image.
The beneficial effects of the present invention are as follows: the invention improves edge detection quality by exploiting the edge characteristics of text in the text detection problem, and it proposes a fast, high-precision stroke width calculation method that improves the precision and efficiency of filtering text and non-text connected domains.
On the basis of the above technical solution, the present invention can be further improved as follows.
Further, the method also includes step 7: collect statistics on the distances between the accurate characters in a text line, and set an intra-word character distance threshold and an inter-word distance threshold;
Step 8: split the text line into accurate characters according to the character distance threshold and the inter-word distance threshold.
The benefit of this further scheme is that splitting the text line into characters according to the two thresholds facilitates subsequent character recognition.
Further, the geometric constraints in step 5 include stroke width consistency, aspect ratio, overlap between connected domains, and the like.
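The three geometric constraints named above can be illustrated with a small filter. This is only a sketch under stated assumptions: the `Component` record, the `passes_geometry` function and every threshold value are hypothetical and are not given in the patent text.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """Hypothetical record for one connected domain (not from the patent)."""
    x: int                # left edge of the bounding box
    y: int                # top edge of the bounding box
    w: int                # bounding-box width
    h: int                # bounding-box height
    stroke_mean: float    # mean stroke width inside the component
    stroke_std: float     # standard deviation of the stroke width


def overlap_ratio(a, b):
    """Intersection area divided by the area of the smaller bounding box."""
    ix = max(0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    iy = max(0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    return (ix * iy) / min(a.w * a.h, b.w * b.h)


def passes_geometry(c, others, max_cv=0.5, min_aspect=0.1,
                    max_aspect=10.0, max_overlap=0.5):
    """Apply the three constraints; all thresholds are illustrative guesses."""
    # Stroke width consistency: low variation relative to the mean.
    if c.stroke_mean <= 0 or c.stroke_std / c.stroke_mean > max_cv:
        return False
    # Aspect ratio: reject extremely flat or extremely tall boxes.
    if not (min_aspect <= c.w / c.h <= max_aspect):
        return False
    # Overlap: reject components that overlap heavily with another one.
    return all(overlap_ratio(c, o) <= max_overlap
               for o in others if o is not c)
```

In practice the thresholds would have to be tuned on data; the sketch only shows how the three tests combine into one accept or reject decision per connected domain.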
Further, step 1 comprises the following steps:
Step 1.1: compute the edge pixels in the target image using the Canny edge detection operator; compute the gradient of the target image using the Sobel operator; obtain the gradient of every edge pixel in the target image;
Step 1.2: take one edge pixel as the reference edge pixel and search along its gradient direction for other edge pixels; judge whether there is a matching edge pixel paired with the reference edge pixel; if so, go to step 1.3; otherwise, discard this edge pixel as a reference edge pixel and return to step 1.2;
Step 1.3: judge whether the difference between the gradient direction of the matching edge pixel and that of the reference edge pixel lies between 150 and 210 degrees; if so, go to step 1.4; otherwise, discard this edge pixel as a reference edge pixel and return to step 1.2;
Step 1.4: compute the distance between the matching edge pixel and the reference edge pixel to obtain a stroke width value;
Step 1.5: judge whether any edge pixels remain; if so, return to step 1.2; otherwise, go to step 1.6;
Step 1.6: generate the stroke histogram from the stroke width values obtained in step 1.4.
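Steps 1.2 to 1.4 can be sketched as a ray search over a precomputed edge map. The sketch assumes the Canny and Sobel stages have already produced a dictionary mapping each edge pixel to its gradient direction; the names `stroke_widths`, `angle_diff_deg` and the `max_len` search limit are illustrative assumptions, not part of the patent.

```python
import math


def angle_diff_deg(a, b):
    """Absolute difference between two directions (radians in), in [0, 360)."""
    return abs(math.degrees(a - b)) % 360.0


def stroke_widths(edges, max_len=50):
    """edges maps (x, y) -> gradient direction in radians.

    For every reference edge pixel, walk along its gradient direction until
    another edge pixel is met (step 1.2), accept the pair only if the two
    gradient directions differ by 150 to 210 degrees (step 1.3), and record
    the Euclidean distance between the pair as a stroke width (step 1.4).
    """
    widths = []
    for (x0, y0), theta in edges.items():
        dx, dy = math.cos(theta), math.sin(theta)
        for step in range(1, max_len + 1):
            p = (round(x0 + step * dx), round(y0 + step * dy))
            if p == (x0, y0):
                continue  # rounding kept us on the reference pixel
            if p in edges:
                if 150.0 <= angle_diff_deg(edges[p], theta) <= 210.0:
                    widths.append(math.hypot(p[0] - x0, p[1] - y0))
                break  # the first edge pixel met ends the search
    return widths
```

For example, two vertical edges at x = 2 and x = 7 whose gradients point toward each other yield a stroke width of 5 for every edge pixel, whereas two edges whose gradients point the same way yield nothing.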
Further, step 2 comprises the following steps:
Step 2.1: select the several stroke width values with the highest frequencies in the stroke histogram as the search step values of the super-pixels;
Step 2.2: lay out lattice points spaced at the search step value, and select the position with the smallest gradient near each lattice point as the initial centroid of a super-pixel;
Step 2.3: iterate steps 2.1 and 2.2, updating and computing the actual centroid and boundary of each super-pixel on the image;
Step 2.4: lower the threshold of the Canny edge detection operator and detect a new, wider-range set of edges in the image;
Step 2.5: compare the wide-range edges with the super-pixel boundaries and correct them, and remove from the corrected wide-range edges the interference that does not match the current stroke width, obtaining wide-range edges that conform to the stroke width rule;
Step 2.6: perform connected domain analysis on the wide-range edges of the image and compute the Euclidean distance transform map of the wide-range edges (using any common prior-art distance transform algorithm), obtaining the connected domains with the highest edge detection quality for this stroke width value.
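Step 2.1 amounts to picking the peaks of the stroke histogram. A minimal sketch follows, assuming stroke widths are binned to the nearest integer pixel and that "several" means the top `k` bins; both the rounding and the value of `k` are assumptions for illustration.

```python
from collections import Counter


def stroke_histogram(widths):
    """Histogram of stroke widths rounded to the nearest pixel."""
    return Counter(round(w) for w in widths)


def dominant_widths(hist, k=3):
    """The k most frequent stroke widths, most frequent first."""
    return [w for w, _ in hist.most_common(k)]
```

Each returned width would then be used in turn as the super-pixel step value, so the rest of step 2 runs once per dominant stroke width.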
Further, step 3 is specifically: compute the gradient of the Euclidean distance transform map using the Sobel operator, and mark the pixels whose gradient is close to zero as skeleton pixels; then estimate the stroke width from the skeleton pixels to obtain the high-precision stroke width.
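The skeletonization in step 3 can be sketched on a toy example. The code below uses a brute-force Euclidean distance transform and a hand-written 3x3 Sobel response so that it has no dependencies; the function names, the `eps` tolerance and the rule "stroke width = 2 x mean skeleton distance" are assumptions made for illustration, since the text only says the width is estimated from the skeleton pixels.

```python
import math


def distance_to_edges(edges, width, height):
    """Brute-force Euclidean distance from every pixel to the nearest edge pixel."""
    return [[min(math.hypot(x - ex, y - ey) for ex, ey in edges)
             for x in range(width)] for y in range(height)]


def sobel_magnitude(d, x, y):
    """Gradient magnitude of map d at (x, y) using the 3x3 Sobel kernels."""
    gx = ((d[y - 1][x + 1] + 2 * d[y][x + 1] + d[y + 1][x + 1])
          - (d[y - 1][x - 1] + 2 * d[y][x - 1] + d[y + 1][x - 1]))
    gy = ((d[y + 1][x - 1] + 2 * d[y + 1][x] + d[y + 1][x + 1])
          - (d[y - 1][x - 1] + 2 * d[y - 1][x] + d[y - 1][x + 1]))
    return math.hypot(gx, gy)


def skeleton_stroke_width(edges, width, height, eps=1e-6):
    """Return (skeleton pixels, estimated stroke width).

    Skeleton pixels are non-edge pixels, away from the image border, where
    the Sobel gradient of the distance map is close to zero.
    """
    d = distance_to_edges(edges, width, height)
    skeleton = [(x, y)
                for y in range(1, height - 1)
                for x in range(1, width - 1)
                if d[y][x] > 0 and sobel_magnitude(d, x, y) < eps]
    if not skeleton:
        return skeleton, None
    mean_dt = sum(d[y][x] for x, y in skeleton) / len(skeleton)
    # Assumption: the stroke spans twice the skeleton-to-edge distance.
    return skeleton, 2.0 * mean_dt
```

With two horizontal edges at rows 1 and 7 of a 7 x 9 image, the skeleton is the middle row (row 4) and the estimated width is 6, which matches the distance between the two edges. A real implementation would use an optimized distance transform and restrict the search to one connected domain at a time.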
The technical problem to be solved by the present invention is likewise the failure of edge detection in complex environments in the prior art. To address this deficiency, the invention uses super-pixels to correct the edges and provides a scene image text detection system, based on a stroke width histogram and super-pixels, that improves the recall and accuracy of the detection algorithm.
The technical solution of the present invention to the above technical problem is a scene image text detection system based on a histogram and super-pixels, comprising an estimation module, an edge detection module, a skeletonization module, a filtering module and a secondary filtering module;
The estimation module estimates the widths of text that may be present in the target image to obtain stroke width values, generates a stroke histogram from the stroke width values, and sends the stroke histogram to the edge detection module;
The edge detection module sets the stroke width values in the stroke histogram as the step parameter of the super-pixels; performs edge detection on the target image; compares the super-pixels generated with that step parameter against the edge detection result and corrects the edges, obtaining the connected domains with the highest edge detection quality for that stroke width value; and sends the obtained connected domains to the skeletonization module;
The skeletonization module skeletonizes the connected domains to obtain skeleton pixels, estimates a high-precision stroke width from the skeleton pixels, and sends the high-precision stroke width to the filtering module;
The filtering module filters the target image according to the high-precision stroke width, distinguishing characters from non-characters, to obtain characters;
The secondary filtering module further filters the obtained characters using geometric constraints on the spatial distribution of the connected domains to obtain accurate characters, and distinguishes text lines from non-text lines in the target image based on the accurate characters to obtain text lines.
The beneficial effects of the present invention are as follows: the invention improves edge detection quality by exploiting the edge characteristics of text in the text detection problem, and it proposes a fast, high-precision stroke width calculation method that improves the precision and efficiency of filtering text and non-text connected domains.
On the basis of the above technical solution, the present invention can be further improved as follows.
Further, the system also includes a statistics module and a segmentation module;
The statistics module collects statistics on the distances between the accurate characters in a text line and sets an intra-word character distance threshold and an inter-word distance threshold;
The segmentation module splits the text line into accurate characters according to the character distance threshold and the inter-word distance threshold.
Further, the geometric constraints used by the secondary filtering module include stroke width consistency, aspect ratio, overlap between connected domains, and the like.
Further, the estimation module includes a gradient module, a pair search module, a mapping search module and a computing module;
The gradient module computes the edge pixels in the target image using the Canny edge detection operator, computes the gradient of the target image using the Sobel operator, and obtains the gradient of every edge pixel in the target image;
The pair search module takes one edge pixel as the reference edge pixel, searches along its gradient direction for other edge pixels, and looks for a matching edge pixel paired with the reference edge pixel;
The mapping search module finds the matching edge pixels whose gradient direction differs from that of the reference edge pixel by between 150 and 210 degrees, and sends these matching edge pixels to the computing module;
The computing module computes the distance between the matching edge pixel and the reference edge pixel to obtain a stroke width value.
Further, the edge detection module includes a step selection module, a centroid selection module, an iterative update module, a wide-range detection module, a correction module and a connected domain analysis module;
The step selection module selects the several stroke width values with the highest frequencies in the stroke histogram as the search step values of the super-pixels;
The centroid selection module lays out lattice points spaced at the search step value and selects the position with the smallest gradient near each lattice point as the initial centroid of a super-pixel;
The iterative update module iteratively updates and computes the actual centroid and boundary of each super-pixel on the image;
The wide-range detection module lowers the threshold of the Canny edge detection operator and detects a new, wider-range set of edges in the image;
The correction module compares the wide-range edges with the super-pixel boundaries and corrects them, and removes from the corrected wide-range edges the interference that does not match the current stroke width, obtaining wide-range edges that conform to the stroke width rule;
The connected domain analysis module performs connected domain analysis on the wide-range edges of the image and computes the Euclidean distance transform map of the wide-range edges (using any common prior-art distance transform algorithm).
Further, the skeletonization module is specifically configured to compute the gradient of the Euclidean distance transform map using the Sobel operator, mark the pixels whose gradient is close to zero as skeleton pixels, and estimate the stroke width from the skeleton pixels to obtain the high-precision stroke width.
Brief description of the drawings
Fig. 1 is a flow chart of the scene image text detection method based on a histogram and super-pixels according to the present invention;
Fig. 2 is a detailed flow chart of step 1 of the scene image text detection method based on a histogram and super-pixels according to the present invention;
Fig. 3 is a detailed flow chart of step 2 of the scene image text detection method based on a histogram and super-pixels according to the present invention;
Fig. 4 is a structural block diagram of the scene image text detection system based on a histogram and super-pixels according to the present invention.
In the drawings, the reference numerals denote the following parts:
1, estimation module; 2, edge detection module; 3, skeletonization module; 4, filtering module; 5, secondary filtering module; 6, statistics module; 7, segmentation module; 11, gradient module; 12, pair search module; 13, mapping search module; 14, computing module; 21, step selection module; 22, centroid selection module; 23, iterative update module; 24, wide-range detection module; 25, correction module; 26, connected domain analysis module.
Detailed description of the embodiments
The principles and features of the present invention are described below with reference to the accompanying drawings. The examples serve only to explain the invention and are not intended to limit its scope.
As shown in Fig. 1, the scene image text detection method based on a histogram and super-pixels according to the present invention comprises the following steps:
Step 1: estimate the widths of text that may be present in the target image to obtain stroke width values, and generate a stroke histogram from the stroke width values;
Step 2: set the stroke width values in the stroke histogram as the step parameter of the super-pixels; perform edge detection on the target image; compare the super-pixels generated with that step parameter against the edge detection result and correct the edges, obtaining the connected domains with the highest edge detection quality for that stroke width value;
Step 3: skeletonize the connected domains to obtain skeleton pixels, and estimate a high-precision stroke width from the skeleton pixels;
Step 4: filter the target image according to the high-precision stroke width, distinguishing characters from non-characters, to obtain characters;
Step 5: further filter the obtained characters using geometric constraints on the spatial distribution of the connected domains to obtain accurate characters, and distinguish text lines from non-text lines in the target image based on the accurate characters to obtain text lines;
Step 6: complete the detection of the accurate characters and text lines in the target image;
Step 7: collect statistics on the distances between the accurate characters in a text line, and set an intra-word character distance threshold and an inter-word distance threshold;
Step 8: split the text line into accurate characters according to the character distance threshold and the inter-word distance threshold.
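Steps 7 and 8 can be illustrated with a gap-based split of one text line. The text does not say how the two thresholds are derived from the distance statistics, so the sketch below makes a simplifying assumption: a single threshold equal to a multiple of the median gap decides whether a gap separates two words. The function name and the `word_gap_factor` parameter are hypothetical.

```python
def split_into_words(boxes, word_gap_factor=2.0):
    """Group the characters of one text line into words.

    boxes is a list of (x_left, x_right) extents, one per accurate character.
    A gap wider than word_gap_factor times the median gap starts a new word.
    """
    boxes = sorted(boxes)
    # Horizontal gap between each character and the next one.
    gaps = [b[0] - a[1] for a, b in zip(boxes, boxes[1:])]
    if not gaps:
        return [boxes]
    median_gap = sorted(gaps)[len(gaps) // 2]
    threshold = word_gap_factor * max(median_gap, 1)
    words, current = [], [boxes[0]]
    for gap, box in zip(gaps, boxes[1:]):
        if gap > threshold:
            words.append(current)  # gap is an inter-word gap
            current = [box]
        else:
            current.append(box)    # gap is an intra-word gap
    words.append(current)
    return words
```

For characters at (0, 8), (10, 18), (20, 28), (40, 48) and (50, 58), the gaps are 2, 2, 12 and 2, so the line splits into a three-character word and a two-character word.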
The geometric constraints in step 5 include stroke width consistency, aspect ratio, overlap between connected domains, and the like.
As shown in Fig. 2, step 1 comprises the following steps:
Step 1.1: compute the edge pixels in the target image using the Canny edge detection operator; compute the gradient of the target image using the Sobel operator; obtain the gradient of every edge pixel in the target image;
Step 1.2: take one edge pixel as the reference edge pixel and search along its gradient direction for other edge pixels; judge whether there is a matching edge pixel paired with the reference edge pixel; if so, go to step 1.3; otherwise, discard this edge pixel as a reference edge pixel and return to step 1.2;
Step 1.3: judge whether the difference between the gradient direction of the matching edge pixel and that of the reference edge pixel lies between 150 and 210 degrees; if so, go to step 1.4; otherwise, discard this edge pixel as a reference edge pixel and return to step 1.2;
Step 1.4: compute the distance between the matching edge pixel and the reference edge pixel to obtain a stroke width value;
Step 1.5: judge whether any edge pixels remain; if so, return to step 1.2; otherwise, go to step 1.6;
Step 1.6: generate the stroke histogram from the stroke width values obtained in step 1.4.
As shown in Fig. 3, step 2 comprises the following steps:
Step 2.1: select the several stroke width values with the highest frequencies in the stroke histogram as the search step values of the super-pixels;
Step 2.2: lay out lattice points spaced at the search step value, and select the position with the smallest gradient near each lattice point as the initial centroid of a super-pixel;
Step 2.3: iterate steps 2.1 and 2.2, updating and computing the actual centroid and boundary of each super-pixel on the image;
Step 2.4: lower the threshold of the Canny edge detection operator and detect a new, wider-range set of edges in the image;
Step 2.5: compare the wide-range edges with the super-pixel boundaries and correct them, and remove from the corrected wide-range edges the interference that does not match the current stroke width, obtaining wide-range edges that conform to the stroke width rule;
Step 2.6: perform connected domain analysis on the wide-range edges of the image and compute the Euclidean distance transform map of the wide-range edges (using any common prior-art distance transform algorithm).
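The centroid initialization in step 2.2 can be sketched as follows. The sketch assumes that "near the lattice point" means the 3x3 neighborhood, which is the convention of the original SLIC algorithm; the patent text does not state the neighborhood size, and the function name is hypothetical.

```python
def init_superpixel_seeds(grad, step):
    """Place one seed per lattice cell at the lowest-gradient nearby pixel.

    grad is a 2-D list of gradient magnitudes and step is the lattice spacing
    (one of the dominant stroke widths). Seeds are returned as (x, y) pairs.
    """
    h, w = len(grad), len(grad[0])
    seeds = []
    for cy in range(step // 2, h, step):
        for cx in range(step // 2, w, step):
            # 3x3 neighborhood around the lattice point, clipped to the image.
            candidates = [(grad[y][x], x, y)
                          for y in range(max(cy - 1, 0), min(cy + 2, h))
                          for x in range(max(cx - 1, 0), min(cx + 2, w))]
            _, x, y = min(candidates)
            seeds.append((x, y))
    return seeds
```

Moving each seed to a low-gradient position keeps the initial centroids off edges, so the later iterative update in step 2.3 starts from inside homogeneous regions.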
Step 3 is specifically: compute the gradient of the Euclidean distance transform map using the Sobel operator, and mark the pixels whose gradient is close to zero as skeleton pixels; then estimate the stroke width from the skeleton pixels to obtain the high-precision stroke width.
As shown in Fig. 4, the scene image text detection system based on a histogram and super-pixels according to the present invention includes an estimation module 1, an edge detection module 2, a skeletonization module 3, a filtering module 4 and a secondary filtering module 5;
The estimation module 1 estimates the widths of text that may be present in the target image to obtain stroke width values, generates a stroke histogram from the stroke width values, and sends the stroke histogram to the edge detection module 2;
The edge detection module 2 sets the stroke width values in the stroke histogram as the step parameter of the super-pixels; performs edge detection on the target image; compares the super-pixels generated with that step parameter against the edge detection result and corrects the edges, obtaining the connected domains with the highest edge detection quality for that stroke width value; and sends the obtained connected domains to the skeletonization module 3;
The skeletonization module 3 skeletonizes the connected domains to obtain skeleton pixels, estimates a high-precision stroke width from the skeleton pixels, and sends the high-precision stroke width to the filtering module 4;
The filtering module 4 filters the target image according to the high-precision stroke width, distinguishing characters from non-characters, to obtain characters;
The secondary filtering module 5 further filters the obtained characters using geometric constraints on the spatial distribution of the connected domains to obtain accurate characters, and distinguishes text lines from non-text lines in the target image based on the accurate characters to obtain text lines.
The system also includes a statistics module 6 and a segmentation module 7;
The statistics module 6 collects statistics on the distances between the accurate characters in a text line and sets an intra-word character distance threshold and an inter-word distance threshold;
The segmentation module 7 splits the text line into accurate characters according to the character distance threshold and the inter-word distance threshold.
The geometric constraints used by the secondary filtering module 5 include stroke width consistency, aspect ratio, overlap between connected domains, and the like.
The estimation module 1 includes a gradient module 11, a pair search module 12, a mapping search module 13 and a computing module 14;
The gradient module 11 computes the edge pixels in the target image using the Canny edge detection operator, computes the gradient of the target image using the Sobel operator, and obtains the gradient of every edge pixel in the target image;
The pair search module 12 takes one edge pixel as the reference edge pixel, searches along its gradient direction for other edge pixels, and looks for a matching edge pixel paired with the reference edge pixel;
The mapping search module 13 finds the matching edge pixels whose gradient direction differs from that of the reference edge pixel by between 150 and 210 degrees, and sends these matching edge pixels to the computing module 14;
The computing module 14 computes the distance between the matching edge pixel and the reference edge pixel to obtain a stroke width value.
The edge detection module 2 includes a step selection module 21, a centroid selection module 22, an iterative update module 23, a wide-range detection module 24, a correction module 25 and a connected domain analysis module 26;
The step selection module 21 selects the several stroke width values with the highest frequencies in the stroke histogram as the search step values of the super-pixels;
The centroid selection module 22 lays out lattice points spaced at the search step value and selects the position with the smallest gradient near each lattice point as the initial centroid of a super-pixel;
The iterative update module 23 iteratively updates and computes the actual centroid and boundary of each super-pixel on the image;
The wide-range detection module 24 lowers the threshold of the Canny edge detection operator and detects a new, wider-range set of edges in the image;
The correction module 25 compares the wide-range edges with the super-pixel boundaries and corrects them, and removes from the corrected wide-range edges the interference that does not match the current stroke width, obtaining wide-range edges that conform to the stroke width rule;
The connected domain analysis module 26 performs connected domain analysis on the wide-range edges of the image and computes the Euclidean distance transform map of the wide-range edges (using any common prior-art distance transform algorithm).
The skeletonization module 3 is specifically configured to compute the gradient of the Euclidean distance transform map using the Sobel operator, mark the pixels whose gradient is close to zero as skeleton pixels, and estimate the stroke width from the skeleton pixels to obtain the high-precision stroke width.
The present invention mainly covers two aspects: (1) improving edge detection quality by exploiting the edge characteristics of text in the text detection problem; (2) proposing a fast, high-precision stroke width calculation method to improve the precision and efficiency of filtering text and non-text connected domains.
Text detection based on stroke width is a connected domain analysis method; it assumes that the strokes of the text within one text line have roughly the same width. The advantages of such methods are that they are simple and need no adjustment for a particular language. However, like other connected domain analysis methods, they depend on high-quality edge detection. When the picture is noisy, the lighting is poor, or the text is occluded by railings and the like so that edge detection fails, their text detection performance is poor. In addition, such methods suffer from low precision and low speed.
To address these problems, the present invention uses super-pixels to correct the failure of edge detection in complex environments, and proposes a fast, high-precision stroke width calculation method to improve the accuracy and efficiency of connected domain filtering. The invention comprises the following:
First, the stroke width transform (SWT) algorithm is used to estimate the stroke widths of text that may be present in the target image, and a stroke width histogram is then built from this information;
The step parameter of the super-pixels is set according to the stroke widths in the stroke histogram. Experiments show that when the super-pixel step value is close to the stroke width value, the edge detection result improves markedly and partial occlusions and text-like regions can be removed. The boundaries between super-pixels are then compared with the Canny edge detection result and the edges are corrected, so as to achieve the highest edge detection quality for a given stroke width;
The detected connected domains are skeletonized using a distance transform and a gradient operator, and the skeleton pixels obtained are used to re-estimate a high-precision stroke width, which serves as the basis for separating characters from non-characters;
Geometric constraints on the spatial distribution of the connected domains, such as stroke width consistency, aspect ratio and overlap between connected domains, are used to further separate characters from non-characters and text lines from non-text lines;
Experimental results on large-scale public data sets demonstrate the effectiveness of the proposed stroke width histogram, the super-pixel algorithm, and the stroke width calculation method based on fast skeletonization of connected domains.
The super-pixel initialization method based on the stroke width histogram and the stroke width computation method based on fast skeletonization of connected domains comprise the following four steps:
(1) Edges in the picture are detected with the Canny edge detection operator, and the gradient of the whole picture is computed with the Sobel operator. For each edge pixel, a search is made along its gradient direction for a paired edge pixel. If a paired edge pixel is found and its gradient direction differs from that of the initial edge point by between 150 and 210 degrees, the distance between the two pixels is computed and taken as the stroke width;
(2) The stroke widths obtained in step (1) are used to generate the stroke width histogram. To reduce computational complexity, the histogram bin width h is determined by minimizing the L2 risk, which is evaluated from the mean and the standard deviation v of the pixel counts of the individual histogram bins;
(3) The simple linear iterative clustering (SLIC) algorithm is used as the super-pixel algorithm. The several dominant stroke widths with the highest frequencies in the stroke width histogram are chosen as the search step sizes of the super-pixels and, correspondingly, the positions of minimum local gradient near the grid points spaced one step apart are chosen as the initial centroids of the super-pixels. The actual centroid and boundary of each super-pixel on the picture are then updated iteratively. The threshold of the Canny edge detection operator is lowered so that the edges in the picture are detected more completely, and these edges are then corrected by comparison with the super-pixel boundaries. This removes interference that does not match the current stroke width, makes the edge detection result conform to the stroke width as far as possible, and improves edge detection.
(4) Connected domain analysis is performed on the new edge detection result, and the Euclidean distance transform map of the edges is computed. The gradient of the distance transform map is then computed with the Sobel operator. Because the distance transform values change slowly at the stroke center of a connected domain, pixels whose gradient is approximately zero are regarded as skeleton pixels. The stroke width of the connected domain can then be obtained from the distance transform values of these skeleton pixels.
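Step (4) can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: it assumes a filled binary mask of one connected domain rather than an edge map, computes the distance transform by brute force, and converts a skeleton distance d to a width with the discretization rule 2d − 1. The function names, the `eps` tolerance, and that conversion rule are all assumptions made for illustration.

```python
import math
from statistics import median

def distance_map(mask):
    """Brute-force Euclidean distance from each foreground pixel to the
    nearest background pixel (assumption: filled mask, not an edge map)."""
    h, w = len(mask), len(mask[0])
    background = [(y, x) for y in range(h) for x in range(w) if not mask[y][x]]
    dist = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                dist[y][x] = min(math.hypot(y - by, x - bx) for by, bx in background)
    return dist

def sobel_magnitude(img, y, x):
    """Gradient magnitude of img at an interior pixel using 3x3 Sobel kernels."""
    gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]) \
       - (img[y - 1][x - 1] + 2 * img[y][x - 1] + img[y + 1][x - 1])
    gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]) \
       - (img[y - 1][x - 1] + 2 * img[y - 1][x] + img[y - 1][x + 1])
    return math.hypot(gx, gy)

def skeleton_stroke_width(mask, eps=1e-6):
    """Return (median stroke width, skeleton pixels) for one connected domain.
    Skeleton pixels are foreground pixels whose distance-map gradient is ~0."""
    dist = distance_map(mask)
    h, w = len(mask), len(mask[0])
    skeleton = [(y, x)
                for y in range(1, h - 1) for x in range(1, w - 1)
                if mask[y][x] and sobel_magnitude(dist, y, x) < eps]
    if not skeleton:
        return None, []
    # Hypothetical conversion: a centre pixel at distance d from the nearest
    # background pixel lies in a stroke roughly 2*d - 1 pixels wide.
    width = median(2 * dist[y][x] - 1 for y, x in skeleton)
    return width, skeleton
```

On real images the tolerance `eps` would need to be larger than in this idealized sketch, since distance values along a stroke center are only approximately flat.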
The above steps yield a high-precision stroke width for each connected domain. The connected domains can then be preliminarily filtered according to whether the stroke width within each connected domain is consistent. Because characters seldom appear alone in scene images, the connected domains are further filtered using the characteristics of text lines: within one text line, character size, aspect ratio, stroke width and color should be similar, and connected domains that do not satisfy these constraints are filtered out. Finally, based on the statistics of the distances between the characters in a text line, an intra-word character distance threshold and an inter-word distance threshold are set, and the text line is segmented into characters for use by a subsequent character recognition module.
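The final segmentation step can be sketched as follows. The text only says that the thresholds are set from statistics of the inter-character distances; the rule used here (a gap larger than `gap_factor` times the median gap starts a new group) and the function name `split_by_gaps` are assumptions chosen for illustration.

```python
from statistics import median

def split_by_gaps(boxes, gap_factor=3.0):
    """Group the character boxes of one text line by horizontal gaps.

    boxes: list of (x_left, x_right) extents, one per character.
    A gap larger than gap_factor * median gap is treated as an inter-word
    gap (hypothetical rule); smaller gaps are intra-word character gaps.
    """
    boxes = sorted(boxes)
    if not boxes:
        return []
    # Horizontal gap between each pair of neighbouring characters.
    gaps = [max(0, right[0] - left[1]) for left, right in zip(boxes, boxes[1:])]
    if not gaps:
        return [boxes]
    threshold = gap_factor * median(gaps)
    groups = [[boxes[0]]]
    for gap, box in zip(gaps, boxes[1:]):
        if gap > threshold:
            groups.append([box])      # inter-word gap: start a new group
        else:
            groups[-1].append(box)    # intra-word gap: same group
    return groups
```

For example, five characters with gaps of 2, 2, 16 and 2 pixels are split into a group of three and a group of two, because only the 16-pixel gap exceeds three times the median gap.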
To verify the effectiveness of the invention, experiments were carried out on the public datasets ICDAR 2005 and ICDAR 2011. The ICDAR 2005 dataset contains 509 color pictures with resolutions between 307 × 93 and 1280 × 960; the training set and the test set contain 258 and 251 pictures respectively, and the pictures contain 1114 characters in total. The ICDAR 2011 dataset contains 484 pictures, comprising 229 training pictures and 255 test pictures, with 1189 characters in total. All experimental results are evaluated at the text line level. The comparison of the present invention with other mainstream detection algorithms of recent years on ICDAR 2005 and ICDAR 2011 is shown in Table 1 and Table 2. The experimental results show that the present invention obtains the best detection results, with the highest f-measure on both datasets.
Algorithm | Precision | Recall | f-measure |
---|---|---|---|
The present invention | 81% | 67% | 73% |
Epshtein [1] | 73% | 60% | 66% |
Fabrizio [5] | 46% | 39% | 43% |
Huang [2] | 81% | 74% | 72% |
Yao [6] | 69% | 66% | 67% |
Table 1: Comparison of the results of the present invention and other algorithms on ICDAR 2005
Algorithm | Precision | Recall | f-measure |
---|---|---|---|
The present invention | 80% | 69% | 74% |
Huang [2] | 82% | 75% | 73% |
Neumann [7] | 73% | 65% | 69% |
Yi [8] | 76% | 68% | 67% |
Neumann [9] | 67% | 58% | 62% |
Table 2: Comparison of the results of the present invention and other algorithms on ICDAR 2011
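The text does not state how the f-measure column is computed. Assuming it is the usual harmonic mean of precision and recall, the values reported for the present invention can be reproduced with the short sketch below; the function name `f_measure` is illustrative.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Rows for the present invention in the two tables above.
icdar2005 = f_measure(0.81, 0.67)
icdar2011 = f_measure(0.80, 0.69)
print(round(icdar2005 * 100), round(icdar2011 * 100))  # prints: 73 74
```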
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (8)
1. A scene image text detection method based on a histogram and super-pixels, characterized in that it comprises the following steps:
Step 1: estimate the width values of text that may exist in the target image to obtain stroke width values, and generate a stroke histogram based on the stroke width values;
Step 2: set the stroke width values in the stroke histogram as the step parameter of the super-pixels; perform edge detection on the target image, compare the super-pixels obtained with this step parameter against the edge detection result and correct the result, and obtain the connected domains with the highest edge detection quality for the given stroke width values;
Step 3: skeletonize the connected domains to obtain skeleton pixels; estimate the stroke width value from the skeleton pixels to obtain a high-precision stroke width;
Step 4: filter the target image according to the high-precision stroke width, distinguishing characters from non-characters, to obtain characters;
Step 5: further filter the obtained characters using geometric constraints on the spatial distribution of the connected domains to obtain accurate characters, and distinguish text lines from non-text lines in the target image based on the accurate characters to obtain text lines;
Step 6: complete the detection of the accurate characters and text lines in the target image;
wherein step 2 specifically comprises the following steps:
Step 2.1: select the several stroke width values with the highest frequencies in the stroke histogram as the search step values of the super-pixels;
Step 2.2: locate the grid points spaced at the search step value, and select the position of minimum gradient near each grid point as the initial centroid of a super-pixel;
Step 2.3: iteratively perform steps 2.1 and 2.2, updating and computing the actual centroid and boundary of each super-pixel on the image;
Step 2.4: lower the threshold of the Canny edge detection operator and detect new wide-range edges of the image;
Step 2.5: compare the wide-range edges with the super-pixel boundaries and correct them, removing from the corrected wide-range edges the interference that does not match the current stroke width, to obtain wide-range edges of the image that conform to the stroke width;
Step 2.6: perform connected domain analysis on the wide-range edges of the image and compute the Euclidean distance transform map of the wide-range edges, obtaining the connected domains with the highest edge detection quality for the given stroke width values.
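As an informal illustration of steps 2.1 and 2.2 (not part of the claim language), the sketch below picks the most frequent stroke widths from a histogram and places initial super-pixel centroids on a grid, moving each one to the lowest-gradient pixel nearby. The 3 × 3 search neighbourhood follows common SLIC practice and is an assumption, as are the function names and the tie-breaking rules.

```python
def dominant_widths(histogram, k=2):
    """Return the k most frequent stroke widths.

    histogram: dict mapping stroke width -> count.
    Ties are broken in favour of the smaller width (assumption).
    """
    ranked = sorted(histogram.items(), key=lambda item: (-item[1], item[0]))
    return [width for width, _ in ranked[:k]]

def initial_centroids(gradient, step):
    """Place seeds on a grid with spacing `step`, then move each seed to the
    lowest-gradient pixel in its 3x3 neighbourhood (assumed neighbourhood).

    gradient: 2-D list of gradient magnitudes for the image.
    """
    h, w = len(gradient), len(gradient[0])
    seeds = []
    for gy in range(step // 2, h, step):
        for gx in range(step // 2, w, step):
            candidates = [
                # Sort key: gradient first, then distance to the grid point,
                # so a seed stays put when its neighbourhood is flat.
                (gradient[y][x], abs(y - gy) + abs(x - gx), y, x)
                for y in range(max(0, gy - 1), min(h, gy + 2))
                for x in range(max(0, gx - 1), min(w, gx + 2))
            ]
            _, _, y, x = min(candidates)
            seeds.append((y, x))
    return seeds
```

Each dominant width returned by `dominant_widths` would be passed in turn as `step`, giving one super-pixel segmentation per candidate stroke width.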
2. The scene image text detection method based on a histogram and super-pixels according to claim 1, characterized in that it further comprises step 7: compute statistics of the distances between the accurate characters in a text line, and set an intra-word character distance threshold and an inter-word distance threshold;
Step 8: segment the text line into accurate characters according to the intra-word character distance threshold and the inter-word distance threshold.
3. The scene image text detection method based on a histogram and super-pixels according to claim 2, characterized in that step 3 is specifically: compute the gradient of the Euclidean distance transform map using the Sobel operator, and set the pixels whose gradient is close to zero as skeleton pixels; estimate the stroke width value from the skeleton pixels to obtain the high-precision stroke width;
the geometric constraints in step 5 include stroke width consistency, aspect ratio, and overlap between connected domains.
4. The scene image text detection method based on a histogram and super-pixels according to claim 3, characterized in that step 1 specifically comprises the following steps:
Step 1.1: compute the edge pixels in the target image using the Canny edge detection operator; compute the gradient of the target image using the Sobel operator; obtain the gradient of every edge pixel in the target image;
Step 1.2: taking one edge pixel as the reference edge pixel, search along the gradient direction of the reference edge pixel among all existing edge pixels; determine whether a mapped edge pixel paired with the reference edge pixel exists; if so, perform step 1.3; otherwise, delete the edge pixel serving as the reference edge pixel and return to step 1.2;
Step 1.3: determine whether the difference between the gradient direction of the mapped edge pixel and that of the reference edge pixel is between 150 degrees and 210 degrees; if so, perform step 1.4; otherwise, delete the edge pixel serving as the reference edge pixel and return to step 1.2;
Step 1.4: compute the distance between the mapped edge pixel and the reference edge pixel to obtain a stroke width value;
Step 1.5: determine whether any edge pixel remains; if so, return to step 1.2; otherwise, perform step 1.6;
Step 1.6: generate the stroke histogram based on the stroke width values obtained in step 1.4.
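The pairing procedure of steps 1.2 to 1.6 can be sketched informally as below (this illustration is not part of the claim language). The sketch assumes the Canny and Sobel stages have already run and that their output is available as a dictionary from edge pixel coordinates to gradient direction in degrees. That input format, the `max_steps` limit, and the function names are assumptions for illustration.

```python
import math
from collections import Counter

def estimate_stroke_widths(edges, max_steps=50):
    """Pair each edge pixel with the first edge pixel met along its gradient.

    edges: dict mapping (row, col) -> gradient direction in degrees
           (assumed output of the Canny and Sobel stages).
    Returns the list of stroke widths of the accepted pairs.
    """
    widths = []
    for (row, col), theta in edges.items():
        dx = math.cos(math.radians(theta))
        dy = math.sin(math.radians(theta))
        for step in range(1, max_steps + 1):
            r = round(row + step * dy)
            c = round(col + step * dx)
            if (r, c) == (row, col):
                continue
            if (r, c) in edges:
                # Accept the pair only if the two gradients roughly oppose.
                diff = abs(edges[(r, c)] - theta) % 360.0
                if 150.0 <= diff <= 210.0:
                    widths.append(math.hypot(r - row, c - col))
                break  # stop at the first edge pixel on the ray
    return widths

def stroke_histogram(widths):
    """Count stroke widths after rounding to whole pixels (assumed binning)."""
    return Counter(round(w) for w in widths)
```

For a vertical stroke whose left edge at column 2 has gradient direction 0 degrees and whose right edge at column 7 has direction 180 degrees, every edge pixel pairs with its opposite at a distance of 5 pixels, so the histogram has a single bin at width 5.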
5. A scene image text detection system based on a histogram and super-pixels, characterized in that it comprises: an estimation module, an edge detection module, a skeletonization module, a filtering module and a secondary filtering module;
the estimation module estimates the width values of text that may exist in the target image to obtain stroke width values, generates a stroke histogram based on the stroke width values, and sends the stroke histogram to the edge detection module;
the edge detection module sets the stroke width values in the stroke histogram as the step parameter of the super-pixels; performs edge detection on the target image, compares the super-pixels obtained with this step parameter against the edge detection result and corrects the result, obtaining the connected domains with the highest edge detection quality for the given stroke width values; and sends the obtained connected domains to the skeletonization module;
the skeletonization module skeletonizes the connected domains to obtain skeleton pixels; estimates the stroke width value from the skeleton pixels to obtain a high-precision stroke width, and sends the high-precision stroke width to the filtering module;
the filtering module filters the target image according to the high-precision stroke width, distinguishing characters from non-characters, to obtain characters;
the secondary filtering module further filters the obtained characters using geometric constraints on the spatial distribution of the connected domains to obtain accurate characters, and distinguishes text lines from non-text lines in the target image based on the accurate characters to obtain text lines;
the edge detection module comprises: a step selection module, a centroid selection module, an iterative update module, a wide-range detection module, a correction module and a connected domain analysis module;
the step selection module selects the several stroke width values with the highest frequencies in the stroke histogram as the search step values of the super-pixels;
the centroid selection module locates the grid points spaced at the search step value, and selects the position of minimum gradient near each grid point as the initial centroid of a super-pixel;
the iterative update module iteratively updates and computes the actual centroid and boundary of each super-pixel on the image;
the wide-range detection module lowers the threshold of the Canny edge detection operator and detects new wide-range edges of the image;
the correction module compares the wide-range edges with the super-pixel boundaries and corrects them, removing from the corrected wide-range edges the interference that does not match the current stroke width, to obtain wide-range edges of the image that conform to the stroke width;
the connected domain analysis module performs connected domain analysis on the wide-range edges of the image and computes the Euclidean distance transform map of the wide-range edges.
6. The scene image text detection system based on a histogram and super-pixels according to claim 5, characterized in that it further comprises a statistics module and a segmentation module;
the statistics module computes statistics of the distances between the accurate characters in a text line, and sets an intra-word character distance threshold and an inter-word distance threshold;
the segmentation module segments the text line into accurate characters according to the intra-word character distance threshold and the inter-word distance threshold.
7. The scene image text detection system based on a histogram and super-pixels according to claim 6, characterized in that the skeletonization module is specifically configured to compute the gradient of the Euclidean distance transform map using the Sobel operator and to set the pixels whose gradient is close to zero as skeleton pixels; the stroke width value is estimated from the skeleton pixels to obtain the high-precision stroke width; the geometric constraints in the secondary filtering module include stroke width consistency, aspect ratio, and overlap between connected domains.
8. The scene image text detection system based on a histogram and super-pixels according to claim 7, characterized in that the estimation module comprises: a gradient module, a pairing search module, a mapping search module and a computation module;
the gradient module computes the edge pixels in the target image using the Canny edge detection operator; computes the gradient of the target image using the Sobel operator; and obtains the gradient of every edge pixel in the target image;
the pairing search module takes one edge pixel as the reference edge pixel and searches along the gradient direction of the reference edge pixel among all existing edge pixels for a mapped edge pixel paired with the reference edge pixel;
the mapping search module finds the mapped edge pixels whose gradient direction differs from that of the reference edge pixel by between 150 degrees and 210 degrees, and sends the mapped edge pixels obtained to the computation module;
the computation module computes the distance between the mapped edge pixel and the reference edge pixel to obtain a stroke width value.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410168244.0A CN103942797B (en) | 2014-04-24 | 2014-04-24 | Scene image text detection method and system based on histogram and super-pixels |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410168244.0A CN103942797B (en) | 2014-04-24 | 2014-04-24 | Scene image text detection method and system based on histogram and super-pixels |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103942797A (en) | 2014-07-23 |
CN103942797B (en) | 2017-01-25 |
Family
ID=51190448
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410168244.0A Expired - Fee Related CN103942797B (en) | 2014-04-24 | 2014-04-24 | Scene image text detection method and system based on histogram and super-pixels |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103942797B (en) |
Families Citing this family (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104599275B (en) * | 2015-01-27 | 2018-06-12 | 浙江大学 | The RGB-D scene understanding methods of imparametrization based on probability graph model |
CN105005764B (en) * | 2015-06-29 | 2018-02-13 | 东南大学 | The multi-direction Method for text detection of natural scene |
CN106845474B (en) * | 2015-12-07 | 2020-05-08 | 富士通株式会社 | Image processing apparatus and method |
CN107301651A (en) * | 2016-04-13 | 2017-10-27 | 索尼公司 | Object tracking apparatus and method |
CN106446920B (en) * | 2016-09-05 | 2019-10-01 | 电子科技大学 | A kind of stroke width transform method based on gradient amplitude constraint |
CN107844803B (en) * | 2017-10-30 | 2021-12-28 | 中国银联股份有限公司 | Picture comparison method and device |
CN108573260A (en) * | 2018-03-29 | 2018-09-25 | 广东欧珀移动通信有限公司 | Information processing method and device, electronic equipment, computer readable storage medium |
CN108921155A (en) * | 2018-04-23 | 2018-11-30 | 新疆大学 | A kind of hand script Chinese input equipment Uighur words Slant Rectify method |
CN109117843B (en) * | 2018-08-01 | 2022-04-15 | 百度在线网络技术(北京)有限公司 | Character occlusion detection method and device |
CN109472221A (en) * | 2018-10-25 | 2019-03-15 | 辽宁工业大学 | A kind of image text detection method based on stroke width transformation |
CN110047083B (en) * | 2019-04-01 | 2021-01-29 | 江西博微新技术有限公司 | Image noise point identification method, server and storage medium |
CN111639646B (en) * | 2020-05-18 | 2021-04-13 | 山东大学 | Test paper handwritten English character recognition method and system based on deep learning |
CN111709419A (en) * | 2020-06-10 | 2020-09-25 | 中国工商银行股份有限公司 | Method, system and equipment for positioning banknote serial number and readable storage medium |
CN112801088B (en) * | 2020-12-31 | 2024-05-31 | 科大讯飞股份有限公司 | Method and related device for correcting distorted text line image |
CN117831037B (en) * | 2024-01-04 | 2024-08-02 | 北京和气聚力教育科技有限公司 | Method and device for determining answer condition of objective questions in answer sheet |
- 2014-04-24: CN application CN201410168244.0A, patent CN103942797B, status: Expired - Fee Related
Also Published As
Publication number | Publication date |
---|---|
CN103942797A (en) | 2014-07-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103942797B (en) | Scene image text detection method and system based on histogram and super-pixels | |
Lalimi et al. | A vehicle license plate detection method using region and edge based methods | |
CN102999886B (en) | Image Edge Detector and scale grating grid precision detection system | |
CN104751142B (en) | A kind of natural scene Method for text detection based on stroke feature | |
CN107045634B (en) | Text positioning method based on maximum stable extremum region and stroke width | |
CN104361336A (en) | Character recognition method for underwater video images | |
Paunwala et al. | A novel multiple license plate extraction technique for complex background in Indian traffic conditions | |
CN104899554A (en) | Vehicle ranging method based on monocular vision | |
CN102096821A (en) | Number plate identification method under strong interference environment on basis of complex network theory | |
CN108038481A (en) | A kind of combination maximum extreme value stability region and the text positioning method of stroke width change | |
CN106815583B (en) | Method for positioning license plate of vehicle at night based on combination of MSER and SWT | |
CN114299275A (en) | Hough transform-based license plate inclination correction method | |
CN109409356B (en) | Multi-direction Chinese print font character detection method based on SWT | |
CN111353961B (en) | Document curved surface correction method and device | |
CN103793708A (en) | Multi-scale license plate precise locating method based on affine correction | |
Hidayatullah et al. | Optical character recognition improvement for license plate recognition in Indonesia | |
CN105354571B (en) | Distortion text image baseline estimation method based on curve projection | |
CN104008542A (en) | Fast angle point matching method for specific plane figure | |
CN110335280A (en) | A kind of financial documents image segmentation and antidote based on mobile terminal | |
Wei et al. | Detection of lane line based on Robert operator | |
Choudhury et al. | A new zone based algorithm for detection of license plate from Indian vehicle | |
Ziaratban et al. | An adaptive script-independent block-based text line extraction | |
CN110276260B (en) | Commodity detection method based on depth camera | |
CN109410227B (en) | GVF model-based land utilization pattern spot contour extraction algorithm | |
CN111241862B (en) | Bar code positioning method based on edge characteristics |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20170125 |