CN100565559C - Image text location method and device based on connected component and support vector machine - Google Patents

Publication number
CN100565559C
Authority
CN
China
Prior art keywords
connected component
text region
threshold value
feature
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2007100643881A
Other languages
Chinese (zh)
Other versions
CN101266654A (en)
Inventor
姚金良
杨一平
台宪青
薛文芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science filed Critical Institute of Automation of Chinese Academy of Science
Priority to CNB2007100643881A priority Critical patent/CN100565559C/en
Publication of CN101266654A publication Critical patent/CN101266654A/en
Application granted granted Critical
Publication of CN100565559C publication Critical patent/CN100565559C/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention discloses a method and a device for locating text in images based on connected components and a support vector machine. The device comprises an image segmentation unit; a connected domain analysis unit; a connected component feature extraction and threshold confirmation unit; a support vector machine classification unit; a unit that combines connected components into text regions; and a text region statistical feature extraction and confirmation unit. The method segments the input image to obtain a layered result and applies connected domain analysis to each foreground layer to obtain a set of connected components. It extracts connected component features and uses a cascade of threshold classifiers to exclude a large number of non-character connected components, then uses support vector machine classification to decide whether each remaining candidate is a character connected component. The remaining connected components are combined into candidate text regions according to feature consistency and spatial proximity; features of these candidate text regions are extracted and compared against empirical thresholds to confirm whether each is a text region.

Description

Image text location method and device based on connected component and support vector machine
Technical field
The invention belongs to the field of preprocessing for optical character recognition (OCR) in computer vision. It relates to a method and a device, based on connected components and a support vector machine, for locating text in real-scene images or video sequence images, for use in intelligent digital image analysis and understanding.
Background technology
Text in digital images or video frames carries a large amount of semantic information, for example road signs, advertisements and notices. A robust method for locating text in images with complex backgrounds, combined with character recognition, therefore enables many practical applications, such as content-based indexing and retrieval of images and video, driver assistance, and visual navigation for mobile robots. Combined with a machine translation system, such a method can also help international visitors overcome language barriers. However, because locating text in complex-background images is difficult, traditional OCR designed for scanned document images is hard to apply directly to characters in general images. To recognize text embedded in a complex image, the bounding boxes of the character regions must first be located accurately; only then can existing OCR technology be used effectively.
In recent years many research institutions have worked on this problem and have proposed methods that achieve some success; see Zhong Y., Karu K. and Jain A.K., "Locating text in complex color images", Pattern Recognition, Vol. 28, No. 10, 1995, pp. 1523-1535. These methods fall into two classes: texture-based methods and connected-component-based methods. Texture-based methods regard a text region as a kind of texture and borrow heavily from texture segmentation. Such methods first define a window that slides over the image, extract features inside the window, and classify the window with a classifier. Because large characters have weak texture characteristics, these methods generally build a pyramid decomposition of the original image and run recognition on every level. They usually cannot locate the bounding box of a text region accurately, have difficulty with text regions that contain few characters, and struggle to reject texture-rich objects such as leaves and windows. Connected-component-based (region-based) methods assume that characters have a consistent color: the image is first segmented, connected component analysis is applied to each segmented layer, and rules are then used to decide whether each connected component is a character. Because they rely only on hand-written rules, these methods have difficulty handling complex backgrounds.
Summary of the invention
Existing text region localization methods are not very robust, and some of them rest on too many hand-made assumptions. The object of the invention is therefore to provide a robust method and device, based on connected components and a support vector machine, for locating text regions in digital images with a wide range of complex backgrounds. The method locates text robustly in difficult images in which character size, font, color and background complexity vary, and thereby prepares the input for subsequent character recognition.
To achieve this object, a first aspect of the invention provides a method for locating text in digital images based on connected components and a support vector machine, comprising the following steps:
Step S1: segment the image to be processed according to its gray-value information to obtain a layered image;
Step S2: treat each layer in turn as foreground and perform connected domain analysis to obtain a set of candidate character connected components;
Step S3: extract features of the candidate character connected components and exclude non-character connected components with a cascade of threshold classifiers, the threshold of each classifier being obtained from statistics over sample data;
Step S4: for the candidate character connected components not excluded by the cascade, use a support-vector-machine-based classifier to decide whether each is a character connected component, the feature vector of the support vector machine being composed of the connected component features obtained above;
Step S5: combine the connected components classified as characters by the support vector machine according to their positional relationship in the image and the consistency of their features, obtaining subsets of character connected components; the smallest rectangle containing all connected components of a subset is called the text region of that subset, and the subset is called the connected component set of that text region;
Step S6: compute the variances of the connected component features within each text region's connected component set as the features of the candidate text region, and use empirical thresholds to confirm whether it is a text region.
Specifically, the image segmentation operates on a gray-level image; a color input image is first converted to a single-channel gray-level image. The layer of each pixel is then determined from the pixel's gray value together with the mean and variance of the gray values in a window centered on that pixel.
Specifically, feature extraction and threshold classification are arranged as a cascade of threshold classifiers: as soon as one feature has been obtained, its threshold classifier decides whether to exclude the connected component, so that subsequent features need not be computed for components that have already been excluded.
Specifically, the thresholds of the threshold classifiers are set from statistics of the corresponding features of the character connected components in a sample database, and are chosen so that every character connected component in the samples is accepted as a character connected component.
Specifically, the support vector machine uses the LIBSVM tool with a radial basis function kernel.
Specifically, the combination in step S5 comprises the following steps. Step S51: for any two connected components in the candidate set, decide whether they belong to the same text region by checking whether they are adjacent and have consistent features; if they do, add an edge between them, so that the candidate set becomes an undirected graph. Step S52: run a depth-first graph traversal on the undirected graph to obtain its connected subgraphs; each connected subgraph corresponds to the connected component set of one candidate text region.
Specifically, for each candidate text region's connected component set, if it contains more than one element, the variances of four connected component features are extracted: stroke width, connected component height, connected component width, and connected component gray value. A threshold is set for each of the four variances to confirm whether the set consists of characters: if any variance exceeds its threshold, the set is considered not to consist of character connected components. If the set contains exactly one element, that element is checked again against the character connected component feature thresholds.
Specifically, for each confirmed text region connected component set, the smallest rectangle containing every connected component in the set is taken as the text region localization result.
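The variance check and the bounding rectangle described above can be illustrated with a minimal Python sketch. The dictionary keys, the `[x0, y0, x1, y1]` box layout, the use of population variance and the numeric limits in the usage note are assumptions made for illustration only; the source states only that four variances are compared with empirical thresholds.

```python
from statistics import pvariance

# Hypothetical feature names for the four quantities named in the text.
FEATURES = ("stroke_width", "height", "width", "gray")

def is_text_region(group, max_variance):
    """Accept a multi-component region only if every feature variance is within its limit.

    group: list of dicts holding the four features of each connected component.
    max_variance: dict mapping each feature name to its (empirical) variance limit.
    """
    if len(group) < 2:
        raise ValueError("single-component regions are checked per component instead")
    # ASSUMPTION: population variance; the source does not say which estimator is used.
    return all(pvariance([c[f] for c in group]) <= max_variance[f] for f in FEATURES)

def region_box(group):
    """Smallest axis-aligned rectangle containing every component box [x0, y0, x1, y1]."""
    return (min(c["box"][0] for c in group), min(c["box"][1] for c in group),
            max(c["box"][2] for c in group), max(c["box"][3] for c in group))
```

For example, two components of heights 20 and 22 have a height variance of 1 and would pass a limit of 4, whereas heights 20 and 60 give a variance of 400 and would be rejected.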
To achieve this object, a second aspect of the invention provides a device for locating text in digital images based on connected components and a support vector machine. The device implements the above method for locating text regions in digital images and comprises:
An image segmentation unit, for layering the input digital image;
A connected domain analysis unit, for performing connectivity analysis on the layered image, obtaining the set of candidate character connected components together with simple connected component features;
A connected component feature extraction and threshold confirmation unit, for extracting features of the candidate character connected components and checking them with a cascade of threshold classifiers so as to exclude obvious non-character connected components;
A support vector machine classification and confirmation unit, for excluding, by support vector machine classification, non-character connected components among the candidates that the simple thresholds did not exclude;
A unit for combining connected components into text regions, for combining the character connected components obtained according to positional adjacency and feature consistency, each resulting subset being the set of connected components of one text region;
A text region statistical feature extraction and confirmation unit, for computing the variances of the connected component features in each text region's connected component set as the statistical features of the text region, and for using empirical thresholds to decide whether the candidate text region consists of character connected components.
The invention is a connected-component-based method that makes full use of multiple features of character connected components and uses support vector machine classification for recognition. The cascade of threshold classifiers reduces the computational load of the support vector machine classification, overcoming the weakness that such classification is computationally intensive while retaining its strong classification performance. In addition, statistics of the connected components in each combined text region are extracted as features; these features resemble texture features and allow effective confirmation of text regions, so that texture-based and connected-component-based methods are fused to some extent, giving higher precision and recall. The thresholds of the method were obtained, and the support vector machine was trained, on the public training image database of the 2003 International Conference on Document Analysis and Recognition (ICDAR 2003); testing on the corresponding test image database gave a precision of 0.67 and a recall of 0.61.
Description of drawings
Fig. 1 is a flow block diagram of the overall process of the connected-component-based character region locating device and method of the invention.
Fig. 2 is a structural block diagram of the cascade of threshold classifiers of the invention.
Fig. 3 shows two examples of neighborhood configurations in which a connected component edge pixel is judged to be a rough point.
Fig. 4(a) is a test image used in the embodiment.
Fig. 4(b) is the test image after image segmentation.
Fig. 4(c) is the result after the test image passes through the cascade of threshold classifiers.
Fig. 4(d) is the result after text region confirmation; the black boxes mark the located text regions.
Fig. 4(e) is the result obtained without support vector machine classification; there is a false-positive text box in the lower left corner.
Fig. 4(f) is the result obtained by the complete system.
Embodiment
The invention is described in detail below with reference to the accompanying drawings. Note that the embodiments described are intended only to aid understanding of the invention and do not limit it in any way.
In the invention, the input image may come from any image acquisition device, for example an image taken with a digital camera, a camera phone or a camera-equipped PDA, or a frame of a video. If the input is an analog signal, an analog-to-digital converter is needed to convert the analog image into a digital image for processing. The image may be in any coding format, for example JPEG or BMP, provided that it can be converted to a bitmap. In this embodiment the input image is assumed to be a bitmap already. In the following description, unless otherwise indicated, "image" means a digital image. The parameters of the method were learned from the public training and test image databases of the International Conference on Document Analysis and Recognition (ICDAR). All images in these databases contain English characters, so the parameters of this embodiment were trained for English characters; the method is nevertheless equally applicable to other languages.
Embodiments of the invention are described in detail below with reference to the accompanying drawings.
Fig. 1 is a flow block diagram of the text localization method of the invention based on connected components and a support vector machine.
Referring to Fig. 1, the image segmentation unit 10 uses a locally adaptive segmentation method: the segmentation thresholds of each pixel are obtained from the mean and variance of the gray values in a window centered on that pixel. The image segmentation unit 10 layers the input image; segmentation is the process of dividing the image into different layers according to its color or gray-level information. Any existing segmentation method can be used for this part of the overall method. This embodiment layers the image with a segmentation method based on the local mean and variance of the pixel gray values, computed as follows:
$$T_{\pm}(x,y) = \mathrm{Mean}(x,y,W_B) \pm k \cdot \mathrm{Variance}(x,y,W_F)$$
$I(x,y)$ is a pixel of the input image, and $T_{\pm}(x,y)$ are the upper and lower thresholds for pixel $I(x,y)$. They are obtained from the local mean $\mathrm{Mean}(x,y,W_B)$ and the local variance $\mathrm{Variance}(x,y,W_F)$. $\mathrm{Mean}(x,y,W_B)$ is the mean gray value of the pixels in the window of size $W_B$ centered on $I(x,y)$; $\mathrm{Variance}(x,y,W_F)$ is the variance of the gray values in the window of size $W_F$ centered on $I(x,y)$. "Offset" is a positive integer that causes more pixels to be assigned to the gray layer, so that these pixels need no further processing; this suppresses the number of noise connected components and reduces subsequent processing time. The parameters $k$, offset, $W_B$ and $W_F$ are set empirically to 0.2, 3, 71 and 11 respectively.
To speed up the computation, the method does not compute the variance separately for every pixel's window. Instead, all 9 pixels of each 3 × 3 block share the variance computed over the window of size $W_F$ centered on the block's center pixel. The segmentation result is a two-dimensional array of size image width × image height, in which the value 255 marks a pixel of the white layer, 0 a pixel of the black layer, and 100 a pixel of the middle (gray) layer. Fig. 4(a) is the original test image, and Fig. 4(b) is the result after image segmentation.
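The three-layer segmentation can be sketched in plain Python as follows. This is a minimal illustration rather than the embodiment's implementation, and it rests on several assumptions: the offset is taken to widen the band around the local mean (the text gives its purpose but not its place in the formula), pixels above the upper threshold are assigned to the white layer and pixels below the lower threshold to the black layer, and windows are clipped at the image border. The 3 × 3 shared-variance speed-up is omitted for brevity, so the sketch is slow on real images.

```python
from statistics import mean, pvariance

def window_values(img, x, y, size):
    """Gray values in a size x size window centered on (x, y), clipped to the image."""
    h, w = len(img), len(img[0])
    r = size // 2
    return [img[j][i]
            for j in range(max(0, y - r), min(h, y + r + 1))
            for i in range(max(0, x - r), min(w, x + r + 1))]

def segment(img, k=0.2, offset=3, w_b=71, w_f=11):
    """Assign each pixel to the white (255), black (0) or middle (100) layer."""
    h, w = len(img), len(img[0])
    out = [[100] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            local_mean = mean(window_values(img, x, y, w_b))
            local_var = pvariance(window_values(img, x, y, w_f))
            # ASSUMPTION: offset widens the band, which sends more pixels to the gray layer.
            band = k * local_var + offset
            if img[y][x] > local_mean + band:
                out[y][x] = 255   # ASSUMPTION: brighter than the upper threshold means white
            elif img[y][x] < local_mean - band:
                out[y][x] = 0     # ASSUMPTION: darker than the lower threshold means black
    return out
```

On a flat image every pixel falls inside the band and ends up in the middle layer; a single pixel that is clearly brighter or darker than its surroundings is moved to the white or black layer.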
The connected domain analysis unit 20 uses a region-growing algorithm for connected component analysis. The analysis is applied to the segmentation result: it assigns a unique label to each group of adjacent pixels that belong to the same segmentation layer, and records information such as the size and position of each connected component. In this embodiment, the labeled connected components obtained by this analysis are called the set of candidate character connected components. The overall process is as follows. First, the pixels with segmentation value 255 are taken as the foreground layer and all other values as background, and connected domain analysis is performed. The same procedure is then applied with the black layer (value 0) as foreground. The gray layer (value 100) is not analyzed: characters are usually written on a background of a particular color and differ from it in color, so this segmentation method rarely assigns character pixels to the gray layer.
There are many connected component analysis methods, and the invention is not limited to the one used in this embodiment. This embodiment uses the region-growing algorithm described in Wesley E. Snyder and Hairong Qi, Machine Vision (Chinese translation by Zhao Qingjie et al., China Machine Press, first edition, p. 142). Eight-connectivity is used. The algorithm is modified so that, while each connected component is being labeled, its basic features are computed: the number of pixels, the number of edge pixels (pixels adjacent to a pixel that does not carry this component's label), the size and position of the bounding box (the smallest axis-aligned rectangle containing all pixels of the component), and the layer of the segmentation result (white or black) to which the component belongs. With these features recorded, simple thresholds can filter out a large number of non-character connected components. The procedure is described below for the white layer; the black layer is processed with the same steps:
Step 1): Find an unlabeled foreground pixel, i.e. one with SegmentResult[x, y] = 255 and LabelArr[x, y] = 0, and choose a new label for it (increment N by one). If all foreground pixels are already labeled, the algorithm terminates.
Step 2): Set LabelArr[x, y] = N and increment the pixel count of this connected component. Update the top-left and bottom-right corners of the bounding box.
Step 3): For each of the eight neighbors (x-1, y), (x+1, y), (x, y+1), (x, y-1), (x-1, y+1), (x-1, y-1), (x+1, y+1) and (x+1, y-1): if SegmentResult at the neighbor equals 255 and LabelArr at the neighbor equals 0, push the neighbor's coordinates onto the stack.
Neighbors outside the image are skipped: x-1 must be greater than or equal to zero, x+1 less than the image width, y-1 greater than or equal to zero, and y+1 less than the image height.
If any of the eight neighbors has a value other than 255, increment the edge pixel count of this connected component.
Step 4): If the stack is not empty, pop a value as the new (x, y) and go to Step 2). If the stack is empty, save the pixel count, edge pixel count and bounding box of this connected component, reset the temporary variables to their initial values, and go to Step 1).
With this procedure, all connected components can be obtained from the segmented image, including character connected components and a large number of non-character connected components.
The character feature extraction and threshold confirmation unit 30 uses a cascade of threshold classifiers, shown in Fig. 2, whose input is a connected component. First, "feature one" of the connected component is computed and compared with its threshold to decide whether the component may be a character connected component. If so, the component is passed to the next feature extractor, which computes "feature two"; if not, the component is discarded and no further features are computed for it. A component that every threshold classifier in the cascade accepts is confirmed by the cascade as a character connected component. The cascade structure speeds up the system: once a feature fails its threshold the component is excluded, which avoids computing the features of the later stages. The features used in this embodiment are: the number of pixels of the connected component, the number of edge pixels, the bounding box size, roughness, stroke width, stroke width variance, and contrast. Combining these features gives features that are independent of character size, such as: the ratio of bounding box height to width, the ratio of the number of pixels to the bounding box area, the ratio of the number of pixels to the square of the number of edge pixels, the ratio of roughness to bounding box height, the ratio of stroke width to height, and the ratio of stroke width variance to stroke width. These simple combinations make the features more effective and allow the method to locate characters of any size without a multiresolution decomposition of the image. The computation of each feature is described below.
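The early-exit behavior of the cascade can be sketched in a few lines of Python. The stage layout `(feature function, low, high)`, the two example feature functions and every numeric threshold are hypothetical; in the embodiment the thresholds come from statistics over hand-picked training samples.

```python
def cascade_accepts(component, stages):
    """Run threshold classifiers in order; stop at the first feature that fails.

    stages: list of (feature_fn, low, high). Each feature is computed only if all
    earlier stages accepted the component, so costly features are often skipped.
    """
    for feature_fn, low, high in stages:
        value = feature_fn(component)
        if not (low <= value <= high):
            return False
    return True

def pixel_count(component):
    """Cheap feature, already available from connected component analysis."""
    return component["pixels"]

def height_width_ratio(component):
    """Size-independent feature built from the bounding box [x0, y0, x1, y1]."""
    x0, y0, x1, y1 = component["box"]
    return (y1 - y0 + 1) / (x1 - x0 + 1)

# Hypothetical thresholds, for illustration only.
EXAMPLE_STAGES = [(pixel_count, 20, 5000), (height_width_ratio, 0.2, 5.0)]
```

Placing cheap, highly selective features first, as the embodiment does, means that most non-character components are rejected before roughness or stroke width is ever computed.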
The number of pixels, the number of edge pixels and the bounding box size are obtained during connected component analysis. Roughness rests on the assumption that the edge of a character connected component consists mainly of fairly straight segments, so that edge pixels have relatively few burrs. Roughness can be computed with morphological filtering operations. However, morphological methods tend to assign a spuriously high roughness to character connected components whose stroke width is less than three pixels, so this embodiment instead decides whether an edge pixel is a rough point from the configuration of its 8-neighborhood. The roughness of a connected component is the number of its rough points divided by the number of its edge pixels.
Fig. 3 shows two examples of 8-neighborhood configurations of a rough point, where 1 denotes foreground and 0 denotes background. The method defines 180 such configurations as rough-point configurations; the criterion is that burrs on the edge are defined as rough points. The invention is not limited to this way of computing roughness; other computations are also applicable.
Stroke width and stroke width variance are, respectively, the mean and the variance of twice the shortest distance from each pixel on the medial axis of the connected component to a pixel outside the connected component. Character connected components consist of strokes of relatively uniform width, so their stroke width variance should be small. The method uses the fast parallel thinning algorithm of Zhang and Suen (T.Y. Zhang and C.Y. Suen, "A fast parallel algorithm for thinning digital patterns", Commun. ACM, vol. 27, no. 3, pp. 236-239, 1984) to compute the medial axis of the strokes, and then computes the stroke width and its variance.
Contrast is the distance between the color of the connected component and the background color. The color of a character connected component is generally assumed to be far from the background color. The mean color of the pixels of the connected component is taken as the foreground color, and the mean color of the pixels inside the bounding box that do not belong to the connected component is taken as the background color; the Euclidean distance between the two is the contrast. If the input image is a gray-level image, the difference of gray values is used as the contrast. This embodiment uses the gray-level difference.
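The gray-level contrast described above can be sketched as follows. The function name, the `[x0, y0, x1, y1]` box layout and the choice to return 0.0 when the bounding box contains no background pixel are assumptions for illustration; the source does not say how that corner case is handled.

```python
def gray_contrast(gray, labels, label, box):
    """Absolute difference between mean component gray and mean background gray in the box.

    gray: 2-D list of gray values; labels: 2-D list of component labels;
    label: the label of the component of interest; box: [x0, y0, x1, y1], inclusive.
    """
    x0, y0, x1, y1 = box
    fg, bg = [], []
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            (fg if labels[y][x] == label else bg).append(gray[y][x])
    if not fg or not bg:
        return 0.0  # ASSUMPTION: no background inside the box means no measurable contrast
    return abs(sum(fg) / len(fg) - sum(bg) / len(bg))
```

For a color image the same structure would apply with per-channel means and a Euclidean distance between the two mean colors.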
The above are the methods for computing each connected component feature. The order of the features in the cascade of threshold classifiers is arranged according to the computation time of each feature and its ability to exclude non-character connected components. In this embodiment the order is: number of pixels, number of edge pixels, bounding box, roughness, stroke width and its variance, and contrast. The threshold of each classifier is determined from statistics of the feature values of the character connected components in the sample database. To build these samples, the character connected components produced by the method of unit 10 are selected by hand, and all of the above features are computed for them.
For each feature, the maximum and minimum over all character connected components of the images in the database are computed. If these maxima and minima are used as the classifier thresholds, the classifiers reach 100% recall on the training samples, but precision is relatively low. Precision and recall can be balanced by adjusting the thresholds. After the cascade of threshold classifiers, a large number of non-character connected components have been excluded, but some non-character connected components that closely resemble characters remain, so thresholds on connected component features alone are not enough to obtain a good localization result.
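Deriving the thresholds from sample minima and maxima can be sketched as follows. The `margin` parameter is a hypothetical knob, not something the source defines: it shrinks the accepted range by a fraction of its width to illustrate trading recall on the training samples for precision.

```python
def learn_thresholds(samples, margin=0.0):
    """Return {feature name: (low, high)} from hand-picked character components.

    samples: list of dicts mapping feature name to value.
    margin=0.0 keeps the exact min and max, so every training sample is accepted.
    """
    thresholds = {}
    for name in samples[0]:
        values = [s[name] for s in samples]
        low, high = min(values), max(values)
        pad = margin * (high - low)   # hypothetical tightening of the accepted range
        thresholds[name] = (low + pad, high - pad)
    return thresholds
```

With `margin=0.0` the resulting cascade accepts all training characters, which corresponds to the 100% training recall mentioned above.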
Fig. 4(a) is the original test image; Fig. 4(c) is the result after the cascade of threshold classifiers has excluded non-character connected components.
The support vector machine classification unit 40 uses the support vector machine classification algorithm. The support vector machine is an effective machine learning classification method, particularly when the number of samples is not very large. In the present embodiment, the open-source LibSvm support vector machine application programming interface library is used for the computation. The feature vector consists of all the basic, non-combined connected component features computed above; it has 13 dimensions, and normalization is applied both in training and in classification.
The parameters used in the embodiment to train the support vector model are as follows: the error penalty coefficient C is 2000, gamma is 1.8445, and the kernel function is the radial basis function (RBF). The positive training samples are the character connected components used when obtaining the connected component feature thresholds; the negative samples are non-character connected components, likewise selected by hand from the sample database images. The training process used 4374 positive and 4374 negative samples. The trained model has 1512 support vectors, of which 397 are positive. The trained model can effectively classify unlabeled connected components.
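To make the classification step concrete, the sketch below evaluates an RBF-kernel decision function of the kind a trained LIBSVM model computes, f(x) = sum_i coef_i * K(sv_i, x) + bias, using gamma = 1.8445 from the embodiment. Everything else is an assumption for illustration: the source says only that "normalization" is applied, so min-max scaling to [0, 1] is a guess, and the function names and any support vectors passed in are hypothetical, not the patent's trained model.

```python
import math

GAMMA = 1.8445  # RBF parameter stated in the embodiment

def fit_min_max(rows):
    """Column-wise minima and maxima of the training feature vectors."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def normalize(x, lo, hi):
    """Min-max scale to [0, 1] (assumed scheme); constant columns map to 0."""
    return [(v - a) / (b - a) if b > a else 0.0
            for v, a, b in zip(x, lo, hi)]

def rbf_kernel(u, v, gamma=GAMMA):
    """K(u, v) = exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def svm_decision(x, support_vectors, dual_coefs, bias, gamma=GAMMA):
    """Signed decision value; dual_coefs[i] is alpha_i * y_i."""
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + bias

def is_character(x, support_vectors, dual_coefs, bias):
    """Classify a normalized feature vector as character (True) or not."""
    return svm_decision(x, support_vectors, dual_coefs, bias) > 0
```

The same scaling bounds learned from the training set have to be reused at classification time, which is what "normalization in both training and classification" amounts to in this sketch. In the embodiment the vectors would have 13 dimensions.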
Fig. 4(f) is the result obtained with the support vector machine, and Fig. 4(e) is the result obtained without it; Fig. 4(e) shows many false detections in its lower-left corner.
The unit 50 for combining connected components into text regions works as follows. It first determines whether pairs of connected components belong to the same text region, by checking whether every two connected components not excluded by the support vector machine classification have consistent features and are close in position. It then uses a depth-first graph traversal algorithm to find all connected subgraphs; each connected subgraph corresponds to the set of connected components in one candidate text region. In the present embodiment, all text in the sample database is arranged approximately horizontally, so the combination process amounts to finding all character connected components that are horizontally aligned and close together. Vertically arranged text, which does occur in Chinese, would also have to be considered, and it is handled in the same way. The input is the set of candidate character connected components obtained after support vector machine classification. Connected components in this set that have similar features, lie roughly on the same horizontal line, and are adjacent to one another are combined into a subset, which is the set of connected components of one candidate text region.
In the present embodiment, two kinds of constraint, a position constraint and a feature constraint, are used to judge whether two connected components in the set belong to the same candidate text region; if they do, an edge is created between them. The combination constraints for the horizontal direction are given by the following formulas. CCi and CCj are any two connected components in the candidate connected component set, and CCj_XXX denotes an attribute of connected component CCj; for example, CCj_Width is the width of connected component CCj.
(1) position constraint
MinHeight = Min(CCi_Height, CCj_Height)
CCi_Bottom - CCj_Top > k1 * MinHeight
CCj_Bottom - CCi_Top > k1 * MinHeight    (1)
Formula (1) ensures that two connected components to be combined lie on the same horizontal line; k1 is a parameter controlling how far the text line may tilt. In the present embodiment, k1 is set to 0.75.
CCi_Right - CCj_Left > k2 * MinHeight
|| CCj_Right - CCi_Left > k2 * MinHeight    (2)
Formula (2) ensures that two connected components to be combined are close to each other; k2 is a parameter controlling the permitted distance between them. In the present embodiment, k2 is set to 3.
(2) attribute constraint:
CCi_GreyValue - CCj_GreyValue < k3    (3)
|CCi_StrokeWidth - CCj_StrokeWidth| / (CCi_StrokeWidth + CCj_StrokeWidth) < k4    (4)
k5 * MinHeight > MaxHeight    (5)
In formulas (3), (4) and (5), k3, k4 and k5 are 23, 0.15 and 2.1, respectively.
If all of the above constraints are satisfied, the two connected components can be merged into the same candidate text region; that is, an edge connects them. After all pairs of connected components have been traversed, the connected component set together with the edges obtained forms an undirected graph. A depth-first traversal of this graph yields each of its connected subgraphs, and all connected components in one connected subgraph are defined as one candidate text region. The position, size, and gray level of the candidate text region are computed from the positions, sizes, and gray levels of its connected components and serve as the features of the candidate text region.
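The pairwise test and the graph traversal can be sketched together as follows, using the embodiment's values for k1 to k5. Several points are assumptions rather than the patent's definitions: the `CC` record and function names are hypothetical, height is taken as bottom minus top, MaxHeight is taken as the larger of the two heights, the gray difference in formula (3) is treated as an absolute value, and formula (2) is read according to its stated intent (the horizontal gap between the two boxes must be below k2 * MinHeight), since the printed inequality leaves the direction unclear.

```python
from dataclasses import dataclass

K1, K2, K3, K4, K5 = 0.75, 3, 23, 0.15, 2.1  # values from the embodiment

@dataclass
class CC:
    left: int
    top: int
    right: int
    bottom: int
    gray: float
    stroke_width: float  # assumed to be positive

    @property
    def height(self):
        return self.bottom - self.top

def same_region(a, b):
    """True if a and b satisfy the position and attribute constraints."""
    min_h, max_h = min(a.height, b.height), max(a.height, b.height)
    aligned = (a.bottom - b.top > K1 * min_h and
               b.bottom - a.top > K1 * min_h)              # formula (1)
    gap = max(a.left, b.left) - min(a.right, b.right)
    near = gap < K2 * min_h                                 # assumed reading of (2)
    gray_ok = abs(a.gray - b.gray) < K3                     # (3), abs assumed
    stroke_ok = (abs(a.stroke_width - b.stroke_width)
                 / (a.stroke_width + b.stroke_width)) < K4  # formula (4)
    height_ok = K5 * min_h > max_h                          # formula (5)
    return aligned and near and gray_ok and stroke_ok and height_ok

def group_components(components):
    """Build the undirected graph and return its connected groups by DFS."""
    n = len(components)
    adjacency = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if same_region(components[i], components[j]):
                adjacency[i].append(j)
                adjacency[j].append(i)
    seen, groups = [False] * n, []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, members = [start], []
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            members.append(node)
            for neighbour in adjacency[node]:
                if not seen[neighbour]:
                    seen[neighbour] = True
                    stack.append(neighbour)
        groups.append(sorted(members))
    return groups

def region_box(components, members):
    """Smallest rectangle (left, top, right, bottom) covering a group."""
    group = [components[i] for i in members]
    return (min(c.left for c in group), min(c.top for c in group),
            max(c.right for c in group), max(c.bottom for c in group))
```

An explicit stack is used instead of recursion so that long text lines cannot hit Python's recursion limit; the pairwise loop is quadratic in the number of candidates, which is acceptable after the two classification stages have pruned most components.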
Fig. 4(d) is the result for the test image after text region confirmation, but without using the support vector machine to confirm the candidate connected components; the located text regions are shown in black boxes.
The text region statistical feature extraction and confirmation unit 60 computes the variances of the character connected component features within a text region and then uses empirical thresholds to confirm whether the candidate text region is made up of characters, because the character connected components in one text region generally have consistent color, stroke width, and height. After the candidate text regions are obtained in step 50, for each candidate text region containing more than one connected component, the variances of the connected component features in the region (gray level, stroke width, bounding box height) are computed. If the region is a text region, these variances are generally small, so thresholds can confirm text regions effectively. In the present embodiment, a candidate text region is accepted as a text region if its gray variance is less than 28, the variance of the stroke width divided by the mean stroke width is less than 0.4, and the variance of the height divided by the mean height is less than 0.3. If the text region contains only one connected component, stricter connected component feature thresholds are used to judge whether that connected component is a character connected component; "stricter" means that the feature thresholds obtained for the cascade threshold classifier are adjusted so as to exclude as many non-character connected components as possible. These thresholds are all set empirically; they can also be obtained from statistics of the collected samples.
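The multi-component check can be sketched as below with the embodiment's three thresholds. It assumes the population variance is meant (the source does not say which estimator is used), and the function and constant names are hypothetical. Single-component regions are left to the stricter per-component thresholds and are not handled here.

```python
from statistics import mean, pvariance

GRAY_VARIANCE_MAX = 28   # thresholds stated in the embodiment
STROKE_RATIO_MAX = 0.4   # stroke-width variance / mean stroke width
HEIGHT_RATIO_MAX = 0.3   # height variance / mean height

def is_text_region(grays, stroke_widths, heights):
    """Confirm a candidate region that holds two or more components.

    Each argument lists one feature value per connected component.
    """
    if len(grays) < 2:
        raise ValueError("single-component regions need the stricter "
                         "per-component thresholds instead")
    return (pvariance(grays) < GRAY_VARIANCE_MAX
            and pvariance(stroke_widths) / mean(stroke_widths) < STROKE_RATIO_MAX
            and pvariance(heights) / mean(heights) < HEIGHT_RATIO_MAX)
```

Dividing the stroke-width and height variances by their means makes those two tests less sensitive to the absolute text size, whereas the gray-level test uses the raw variance.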
Embodiments of the present invention have been described in detail above. It should be understood that a person of ordinary skill in the art can, without departing from the scope of the invention, make changes and adjustments within the scope set out above and in the appended claims, and that such changes and adjustments likewise achieve the purpose of the invention.

Claims (9)

1. An image text location method based on connected components and a support vector machine, characterized by the following steps:
Step S1: segment the image in which text is to be located according to its gray-value information, obtaining a layered image result;
Step S2: perform connected component analysis on each layer after segmentation, taking that layer as the foreground, to obtain a set of candidate character connected components;
Step S3: extract features of the candidate character connected components and exclude non-character connected components with a cascade threshold classifier structure; the threshold of each threshold classifier is obtained from statistics of sample data;
Step S4: for the candidate character connected components not excluded by the cascade threshold classifier, use a classification method based on a support vector machine to classify whether each is a character connected component; the feature vector of the support vector machine consists of all the features of the candidate character connected components extracted above;
Step S5: combine the connected components classified as characters by the support vector machine according to their positional relationship in the image and the consistency of their features, obtaining subsets of the candidate character connected component set; the smallest rectangle containing all connected components of a subset is called the text region corresponding to that subset, and the subset is called the connected component set of the text region;
Step S6: compute the variances of the connected component features in the connected component set of the text region as features of the candidate text region, and use empirical thresholds to confirm whether it is a text region.
2. The method of claim 1, characterized in that the image segmentation operates on a grayscale image; if the input image is a color image, it is first converted into a single-channel grayscale image, and the layer to which each pixel belongs is then determined from the pixel's gray value and from the mean and variance of the gray values of the pixels in a window centered on that pixel.
3. The method of claim 1, characterized in that the cascade threshold classifier structure consists of connected component feature extraction and threshold classifier confirmation.
4. The method of claim 3, characterized in that the thresholds of the threshold classifiers are set from statistics of the corresponding features of the character connected components in the sample database, the thresholds being chosen so that every character connected component in the samples is confirmed as a character connected component.
5. The method of claim 1, characterized in that the support vector machine uses the LIBSVM tool, with the radial basis function as its kernel function.
6. The method of claim 1, characterized in that the combination in step S5 uses the following steps:
Step S51: determine whether any two connected components in the candidate connected component set belong to the same text region by judging whether they are adjacent and whether they have consistent features; if they belong to the same text region, create an edge between them, so that the candidate connected component set is converted into an undirected graph;
Step S52: run a depth-first graph traversal algorithm on the resulting undirected graph to obtain its connected subgraphs, each connected subgraph corresponding to the set of connected components in one candidate text region.
7. The method of claim 6, characterized in that, for the connected component set of a resulting candidate text region, if it contains more than one element, the following features of the connected components are extracted: the variance of the stroke width, the variance of the connected component height, the variance of the connected component width, and the variance of the connected component gray value;
Thresholds are set for these four variances to confirm whether the connected component set of the candidate text region is made up of characters:
If any variance is greater than its given threshold, the connected component set of the candidate text region is considered not to be made up of character connected components;
If the connected component set of the candidate text region contains exactly one element, the connected component feature thresholds are reduced and the sole element of the candidate text region is confirmed again.
8. The method of claim 7, characterized in that, for a confirmed connected component set of a text region, the smallest rectangle that contains every connected component in the set is obtained as the text region location result.
9. An image text location device based on connected components and a support vector machine, characterized by comprising:
An image segmentation unit (10), configured to segment the image in which text is to be located according to its gray-value information, obtaining a layered image result;
A connected component analysis unit (20), configured to perform connected component analysis on each layer after segmentation, taking that layer as the foreground, to obtain a set of candidate character connected components;
A connected component feature extraction and threshold confirmation unit (30), configured to extract features of the candidate character connected components and to exclude obvious non-character connected components with a cascade threshold classifier structure; the threshold of each threshold classifier is obtained from statistics of sample data;
A support vector machine classification unit (40), configured to classify, with a classification method based on a support vector machine, whether each candidate character connected component not excluded by the cascade threshold classifier is a character connected component; the feature vector of the support vector machine consists of the features of the candidate character connected components extracted above;
A unit (50) for combining connected components into text regions, configured to combine the connected components classified as characters by the support vector machine according to their positional relationship in the image and the consistency of their features into connected component subsets; the smallest rectangle containing all connected components of a subset is called the text region corresponding to that subset, and the subset serves as the connected component set of that text region;
A text region statistical feature extraction and confirmation unit (60), configured to compute the variances of the connected component features in the connected component set of the text region as features of the candidate text region, and to confirm with empirical thresholds whether it is a text region.
CNB2007100643881A 2007-03-14 2007-03-14 Image text location method and device based on connected component and support vector machine Expired - Fee Related CN100565559C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2007100643881A CN100565559C (en) 2007-03-14 2007-03-14 Image text location method and device based on connected component and support vector machine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2007100643881A CN100565559C (en) 2007-03-14 2007-03-14 Image text location method and device based on connected component and support vector machine

Publications (2)

Publication Number Publication Date
CN101266654A CN101266654A (en) 2008-09-17
CN100565559C true CN100565559C (en) 2009-12-02

Family

ID=39989059

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2007100643881A Expired - Fee Related CN100565559C (en) 2007-03-14 2007-03-14 Image text location method and device based on connected component and support vector machine

Country Status (1)

Country Link
CN (1) CN100565559C (en)

Also Published As

Publication number Publication date
CN101266654A (en) 2008-09-17

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20091202

Termination date: 20200314
