CN102081731A - Method and device for extracting text from image - Google Patents


Info

Publication number
CN102081731A
Authority
CN
China
Prior art keywords
image
feature
homogeneity
carried out
utilize
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2009102415658A
Other languages
Chinese (zh)
Other versions
CN102081731B (en)
Inventor
舒波
孔轶
陈东明
李英
黄昭文
李志锋
吕汉鑫
黄克书
林茂
陈涛
雷志勇
余士韬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Group Guangdong Co Ltd
Original Assignee
China Mobile Group Guangdong Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Group Guangdong Co Ltd filed Critical China Mobile Group Guangdong Co Ltd
Priority to CN200910241565 (granted as CN102081731B)
Publication of CN102081731A
Application granted
Publication of CN102081731B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention provides a method and device for extracting text from an image. The method comprises the following steps: applying nonlinear dynamic compression to the gray levels of an original image to obtain an enhanced image; extracting texture features and edge features of the enhanced image; combining the texture features and edge features into a homogeneity feature, and mapping the enhanced image into the homogeneity space according to that feature to obtain a feature image; extracting text regions from the feature image with a text-region detector; and extracting and recognizing characters from the text regions of the enhanced image. The invention effectively improves the accuracy and robustness of text detection.

Description

Method and apparatus for extracting text from an image
Technical field
The invention belongs to the technical fields of pattern recognition and computer vision, and in particular relates to a method and apparatus for extracting text from an image.
Background technology
With the development of multimedia information retrieval, the Internet and 3G streaming-media technology, images and video have become the mainstream carriers of multimedia information exchange and services. Text embedded in images and video is therefore increasingly important for representing and retrieving massive amounts of information, and the automatic detection and extraction of text in images is the first step towards text-based image retrieval and image-sensitivity screening.
Text regions in an image have features that clearly distinguish them from non-text regions: they contain rich edges and distinctive texture, they usually consist of one or more lines of characters arranged horizontally or vertically, and the characters have consistent color and strong contrast against the background. These features can be used to discriminate text regions from non-text regions. The main approach to detecting, extracting and recognizing text in images is to locate candidate text regions using such features and rules as prior knowledge, enhance the image quality of those regions, separate text from background by binarization, and finally extract and recognize the characters with OCR software.
Text-region extraction methods fall into two broad classes: region-based and texture-based. Region-based methods use the color and gray-level differences between text and background as features for text detection. They adopt a bottom-up strategy: the image is first divided into many sub-images, candidate text regions are then determined from the structure of those sub-images, and features such as character size, character aspect ratio and text-line projection are used for further filtering before the text regions are finally determined. These methods are insensitive to character size and font and run quickly.
Depending on how the sub-image structure is built, region-based methods can be further divided into connected-component methods and edge-detection methods. Connected-component methods assume that the characters in an image have consistent color; they use color clustering to determine candidate character regions and then filter those regions with heuristic rules. Edge-detection methods exploit the relatively high contrast between text and background: edges are detected first, connected into character regions with morphological operators, and finally filtered with heuristic rules.
Texture-based methods treat character regions as a special kind of texture and use the different texture characteristics of text and background regions to detect, extract and recognize text. Typically a pixel window is defined and slid across the image with a fixed step, the texture of each small region is examined, and a trained classifier decides whether the current region is text. All text sub-regions are finally merged into candidate text regions, on which character extraction and recognition are performed. Under complex backgrounds, texture-based methods are usually more robust than connected-component methods and generalize better.
In the method for prior art, based on the method in zone when image background complexity or picture quality are relatively poor, very difficult extraction is connected domain accurately, the formulation of the heuristic rule that is adopted when the screening of character area depends on priori in addition, and these prioris generally are difficult to obtain exactly, and it is rigidity that a lot of threshold values are established a capital really, causes the robustness of algorithm poor.
Texture-based methods generalize better, but their computation is complex and costly, and they are relatively sensitive to character size and font, which lowers the generality of the classifier and the localization accuracy of the character regions. These methods also run into difficulty when the image background contains periodic textures similar to characters.
Summary of the invention
The technical problem to be solved by the invention is to provide a method and apparatus for extracting text from an image, so as to improve the accuracy and robustness of text detection.
To solve the above technical problem, the invention provides the following technical solutions:
A method for extracting text from an image comprises:
applying nonlinear dynamic compression to the gray levels of an original image to obtain an enhanced image;
extracting texture features and edge features of the enhanced image;
combining the texture features and edge features into a homogeneity feature, and mapping the enhanced image into the homogeneity space according to that feature to obtain a feature image;
extracting text regions from the feature image with a text-region detector;
extracting and recognizing characters from the text regions of the enhanced image.
In the above method, the nonlinear dynamic compression of the gray levels of the original image comprises:
transforming the gray levels of the original image with an equalization transform function and constructing the equalized gray-level histogram;
applying nonlinear dynamic compression to the gray levels of that histogram with a monotonically increasing S-type function.
In the above method, the equalization transform function is:

$$k = \mathrm{INT}\Big[(L-1)\sum_{j=0}^{k^*} P(j) + 0.5\Big]$$

where P(j) is the normalized gray-level histogram of the original image, k* = 0, 1, …, L−1 is a gray level of the original image, k = 0, 1, …, L−1 is the corresponding equalized gray level, and L is the number of gray levels of the original image.
In the above method, the monotonically increasing S-type function is:

$$S(k;\alpha,\beta,\gamma)=\begin{cases}0, & k\le\alpha\\ 2\left(\frac{k-\alpha}{\gamma-\alpha}\right)^{2}, & \alpha<k\le\beta\\ 1-2\left(\frac{k-\gamma}{\gamma-\alpha}\right)^{2}, & \beta<k<\gamma\\ 1, & k\ge\gamma\end{cases}$$

where α is the gray level of the first peak of the equalized gray-level histogram, γ is the gray level of its last peak, and β is a parameter value determined by the maximum-information-entropy criterion.
In the above method, wavelet packet decomposition is used to extract the texture features of the enhanced image.
In the above method, the Sobel operator is used to extract the edge features of the enhanced image.
In the above method, the homogeneity feature is constructed by normalizing the texture features and the edge features separately.
The above method further comprises:
obtaining the text-region detector by training on sample images in the homogeneity space.
In the above method, obtaining the text-region detector by training on sample images in the homogeneity space comprises:
applying nonlinear dynamic compression to the gray levels of a sample image to obtain an enhanced sample image;
extracting texture features and edge features of the enhanced sample image;
combining those texture features and edge features into a homogeneity feature, and assembling the homogeneity features of all pixels of the enhanced sample image into a feature vector;
constructing a weak classifier for each feature in the feature vector;
selecting a plurality of weak classifiers from all the weak classifiers with the AdaBoost method, and determining the weight of each weak classifier on the principle of performance complementarity;
combining the selected weak classifiers into one strong classifier according to those weights to obtain the text-region detector.
In the above method, extracting and recognizing characters from the text regions of the enhanced image comprises:
extracting the edge information of the text regions with the Canny operator;
projecting the edge information horizontally and thresholding the projection to separate out text-line images;
projecting the edges of each text-line image vertically to separate out character images;
constructing the gray-level histogram of each character image;
determining the optimal segmentation threshold of that histogram with the Otsu method, binarizing the character image with the optimal segmentation threshold, and removing residual background with connected-component analysis to obtain a background-removed character image;
recognizing the background-removed character image.
In the above method, recognizing the background-removed character image comprises:
enlarging the background-removed character image with a B-spline interpolation function;
performing character recognition on the enlarged character image with OCR software.
A device for extracting text from an image comprises:
an image enhancement unit for applying nonlinear dynamic compression to the gray levels of an original image to obtain an enhanced image;
a feature extraction unit for extracting texture features and edge features of the enhanced image;
a homogeneity mapping unit for combining the texture features and edge features into a homogeneity feature and mapping the enhanced image into the homogeneity space according to that feature to obtain a feature image;
a text-region extraction unit for extracting text regions from the feature image with a text-region detector;
a character extraction and recognition unit for extracting and recognizing characters from the text regions of the enhanced image.
In the above device, the image enhancement unit comprises:
an equalization transform subunit for transforming the gray levels of the original image with an equalization transform function and constructing the equalized gray-level histogram;
a nonlinear dynamic compression subunit for applying nonlinear dynamic compression to the gray levels of that histogram with a monotonically increasing S-type function.
In the above device, the equalization transform function is:

$$k = \mathrm{INT}\Big[(L-1)\sum_{j=0}^{k^*} P(j) + 0.5\Big]$$

where P(j) is the normalized gray-level histogram of the original image, k* = 0, 1, …, L−1 is a gray level of the original image, k = 0, 1, …, L−1 is the corresponding equalized gray level, and L is the number of gray levels of the original image.
In the above device, the monotonically increasing S-type function is:

$$S(k;\alpha,\beta,\gamma)=\begin{cases}0, & k\le\alpha\\ 2\left(\frac{k-\alpha}{\gamma-\alpha}\right)^{2}, & \alpha<k\le\beta\\ 1-2\left(\frac{k-\gamma}{\gamma-\alpha}\right)^{2}, & \beta<k<\gamma\\ 1, & k\ge\gamma\end{cases}$$

where α is the gray level of the first peak of the equalized gray-level histogram, γ is the gray level of its last peak, and β is a parameter value determined by the maximum-information-entropy criterion.
In the above device, the feature extraction unit comprises:
a texture feature extraction subunit for extracting the texture features of the enhanced image by wavelet packet decomposition;
an edge feature extraction subunit for extracting the edge features of the enhanced image with the Sobel operator.
In the above device, the homogeneity mapping unit comprises:
a homogeneity feature construction subunit for constructing the homogeneity feature by normalizing the texture features and the edge features separately.
The above device further comprises:
a training unit for obtaining the text-region detector by training on sample images in the homogeneity space.
In the above device, the training unit is further used to:
apply nonlinear dynamic compression to the gray levels of a sample image to obtain an enhanced sample image;
extract texture features and edge features of the enhanced sample image;
combine those texture features and edge features into a homogeneity feature, and assemble the homogeneity features of all pixels of the enhanced sample image into a feature vector;
construct a weak classifier for each feature in the feature vector;
select a plurality of weak classifiers from all the weak classifiers with the AdaBoost method, and determine their weights from the performance of each weak classifier on the principle of performance complementarity;
combine the selected weak classifiers into one strong classifier according to those weights to obtain the text-region detector.
In the above device, the character extraction and recognition unit comprises:
an edge information extraction subunit for extracting the edge information of the text regions with the Canny operator;
a horizontal projection subunit for projecting the edge information horizontally and thresholding the projection to separate out text-line images;
a vertical projection subunit for projecting the edges of each text-line image vertically to separate out character images;
a histogram construction subunit for constructing the gray-level histogram of each character image;
a binarization subunit for determining the optimal segmentation threshold of that histogram with the Otsu method, binarizing the character image with the optimal segmentation threshold, and removing residual background with connected-component analysis to obtain a background-removed character image;
a character recognition subunit for recognizing the background-removed character image.
In the above device, the character recognition subunit is further used to:
enlarge the background-removed character image with a B-spline interpolation function;
perform character recognition on the enlarged character image with OCR software.
By applying nonlinear dynamic compression to the gray levels of the original image, the embodiments of the invention improve the contrast of the image and make the features of its text regions more prominent. By constructing a homogeneity feature from the texture and edge information of the image and mapping the image into the homogeneity space, non-text information is suppressed while text information is highlighted. Furthermore, a feature classifier can be trained in the homogeneity space of the image to finely separate text regions from non-text regions. The technical solutions of the embodiments can therefore significantly improve the accuracy and robustness of text detection.
Description of drawings
Fig. 1 is a flowchart of the method for extracting text from an image according to an embodiment of the invention;
Fig. 2 is a schematic diagram of the monotonically increasing S-type function in an embodiment of the invention;
Fig. 3 is a structural diagram of the device for extracting text from an image according to an embodiment of the invention.
Embodiment
To make the purpose, technical solutions and advantages of the invention clearer, the invention is described below with reference to the accompanying drawings and specific embodiments.
With reference to Fig. 1, the method for detecting text in an image according to an embodiment of the invention mainly comprises the following steps:
Step 101: apply nonlinear dynamic compression to the gray levels of the original image to obtain an enhanced image.
During generation, transmission or conversion an image is subject to various influences that degrade its quality. Applying nonlinear dynamic compression to the gray levels of the original image fully accentuates attributes such as edges and texture, suppresses useless information, improves the usability of the image, and provides more distinct features for the subsequent text-region extraction. The concrete steps are as follows:
(1a) Adjust the gray-level histogram of the original image by introducing the equalization transform function

$$k = \mathrm{INT}\Big[(L-1)\sum_{j=0}^{k^*} P(j) + 0.5\Big]$$

to adjust the gray-level range of the original image and increase its clarity, where INT[·] is the rounding operator, P(j) is the normalized gray-level histogram of the original image, k* = 0, 1, …, L−1 is a gray level of the original image, k = 0, 1, …, L−1 is the corresponding equalized gray level, and L is the number of gray levels of the original image.
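The equalization transform of step (1a) can be sketched in Python (a minimal NumPy-only illustration; the function name and the 8-bit test values are assumptions for illustration, not part of the patent):

```python
import numpy as np

def equalize_gray_levels(img, L=256):
    """Map each gray level k* to k = INT[(L-1) * sum_{j<=k*} P(j) + 0.5],
    following the equalization transform above, and return the equalized
    image together with its gray-level histogram."""
    hist = np.bincount(img.ravel(), minlength=L).astype(np.float64)
    P = hist / hist.sum()                     # normalized histogram P(j)
    cdf = np.cumsum(P)                        # sum_{j=0}^{k*} P(j)
    mapping = np.floor((L - 1) * cdf + 0.5).astype(np.uint8)  # INT[...]
    eq = mapping[img]                         # apply the k* -> k lookup
    eq_hist = np.bincount(eq.ravel(), minlength=L)
    return eq, eq_hist
```

The lookup-table form makes the transform O(L) to build and O(1) per pixel to apply.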
(1b) Process the equalized image by choosing a monotonically increasing S-type function as the nonlinear dynamic compression transform, as shown in Fig. 2, of the form:

$$S(k;\alpha,\beta,\gamma)=\begin{cases}0, & k\le\alpha\\ 2\left(\frac{k-\alpha}{\gamma-\alpha}\right)^{2}, & \alpha<k\le\beta\\ 1-2\left(\frac{k-\gamma}{\gamma-\alpha}\right)^{2}, & \beta<k<\gamma\\ 1, & k\ge\gamma\end{cases}$$
The parameters of the monotonically increasing S-type function are determined as follows:
(1b1) Set α to the gray level of the first peak of the equalized gray-level histogram and γ to the gray level of its last peak; clearly α < γ.
(1b2) Determine β by the maximum-information-entropy criterion. Let P(k) be the gray-level histogram of the original image after equalization, defined by P(k) = n_k / n, k = 0, 1, …, L−1, so that P(k) is the statistical probability of gray level k, where n_k is the number of pixels with gray value k and n is the total number of pixels. Choose a gray threshold v that divides the image into two regions. The region with gray values 0 to v has entropy

$$H_1(v) = -\sum_{k=0}^{v}\frac{P(k)}{P(v)}\ln\frac{P(k)}{P(v)}$$

and the region with gray values v+1 to L−1 has entropy

$$H_2(v) = -\sum_{k=v+1}^{L-1}\frac{P(k)}{1-P(v)}\ln\frac{P(k)}{1-P(v)}$$

where $P(v)=\sum_{k=0}^{v}P(k)$, giving the total entropy H(v) = H_1(v) + H_2(v). Information theory tells us that two classes of data are best separated when the entropy is maximal, so the v that maximizes the total entropy H(v) is the optimal threshold, i.e. $v^{*}=\arg\max_{v}\,[H_1(v)+H_2(v)]$. Setting β = v* completes the setting of the parameter β.
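Steps (1b1)–(1b2) can be sketched as follows (a simplified NumPy-only illustration; the function names are assumptions, and the peak detection for α and γ is left to the caller since the patent does not specify it):

```python
import numpy as np

def max_entropy_threshold(P):
    """Return v* = argmax_v H1(v) + H2(v) over the normalized
    histogram P(k), as in the maximum-entropy criterion above."""
    L = len(P)
    best_v, best_H = 0, -np.inf
    for v in range(L - 1):
        Pv = P[:v + 1].sum()
        if Pv <= 0 or Pv >= 1:            # degenerate split, skip
            continue
        p1 = P[:v + 1] / Pv               # P(k)/P(v) for the low region
        p2 = P[v + 1:] / (1 - Pv)         # P(k)/(1-P(v)) for the high region
        H = -(p1[p1 > 0] * np.log(p1[p1 > 0])).sum() \
            - (p2[p2 > 0] * np.log(p2[p2 > 0])).sum()
        if H > best_H:
            best_v, best_H = v, H
    return best_v

def s_compress(k, alpha, beta, gamma):
    """Piecewise monotonically increasing S-type function S(k; a, b, g)."""
    k = np.asarray(k, dtype=np.float64)
    out = np.zeros_like(k)
    mid1 = (k > alpha) & (k <= beta)
    mid2 = (k > beta) & (k < gamma)
    out[mid1] = 2 * ((k[mid1] - alpha) / (gamma - alpha)) ** 2
    out[mid2] = 1 - 2 * ((k[mid2] - gamma) / (gamma - alpha)) ** 2
    out[k >= gamma] = 1.0
    return out
```

In practice one would compute α and γ from the histogram peaks, set β = `max_entropy_threshold(P)`, and apply `s_compress` to every equalized gray level.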
By using a monotonically increasing S-type function as the nonlinear dynamic compression transform, the present embodiment fully accentuates attributes of the image such as edges and texture, and greatly improves the performance of the image enhancement and segmentation algorithms.
Step 102: extract texture features and edge features of the enhanced image.
Homogeneity is closely related to the local information of an image: it reflects the degree of uniformity of an image region and plays an important role in classifying image information. The texture and edge information of the image are chosen to compose its homogeneity feature. Compared with the non-text regions of an image, text regions have richer texture and edge information and poorer uniformity, hence lower homogeneity, and this property can be used to detect them. In addition, because wavelet packet decomposition can keep decomposing the high-frequency parts of a signal, it can be used to extract the texture information of the image. The texture and edge information are extracted as follows:
(2a) Extract the texture features of the enhanced image with wavelet packet decomposition.
(2a1) A given function W_0(x) generates a group of orthogonal wavelet bases:

$$W_{2m}(x)=\sqrt{2}\sum_{k}h(k)\,W_{m}(2x-k)$$
$$W_{2m+1}(x)=\sqrt{2}\sum_{k}g(k)\,W_{m}(2x-k)$$

where W_{2m}(x) denotes the scaling function, W_{2m+1}(x) denotes the wavelet function, and h(k) and g(k) are the filter coefficients of the orthogonal wavelet. The wavelet packet basis is then W_m(2^n x − k), where n is the scale parameter, k is the translation parameter, m is the oscillation parameter, n, k ∈ Z and m ∈ N.
(2a2) Taking the inner product of two one-dimensional wavelet packet bases along the horizontal or vertical direction gives the two-dimensional filters:

h_LL(k,n) = h(k)·h(n),  h_LH(k,n) = h(k)·g(n)
h_HL(k,n) = g(k)·h(n),  h_HH(k,n) = g(k)·g(n)

where the first and second subscripts of a filter indicate whether the high-pass (H) or low-pass (L) filter is taken in the x and y directions respectively.
(2a3) Apply a two-level wavelet packet decomposition to the enhanced image with these two-dimensional filters, upsampling the impulse responses of the filters, to obtain a number of translation-invariant sub-images f(x, y) containing the mid-frequency (texture) information; among them, choose a predetermined number (for example, 8) of sub-images with the largest variance.
(2a4) Describe the texture features with first-order gray-level statistics: for each sub-image, extract an energy feature, an entropy feature and a mean-deviation feature, defined respectively as:

$$F_1(x,y)=\frac{1}{255^{2}}\sum_{i=x-n}^{x+n}\sum_{j=y-n}^{y+n}f(i,j)^{2}\,p[f(i,j)]$$

$$F_2(x,y)=-\frac{1}{\log 255}\sum_{i=x-n}^{x+n}\sum_{j=y-n}^{y+n}p[f(i,j)]\log\{p[f(i,j)]\}$$

$$F_3(x,y)=\frac{1}{(2n+1)^{2}}\sum_{i=x-n}^{x+n}\sum_{j=y-n}^{y+n}\left|f(i,j)-\bar{f}(i,j)\right|$$

where p[f(i,j)] is the probability of the gray level at point (i, j) within the sub-image, and $\bar{f}(i,j)$ is the mean gray level of all pixels in the operation window centred at point (i, j). Computing these three features pointwise over each of the chosen feature sub-images yields, for every point of the texture image, a 24-dimensional texture feature vector $F_i^{m}(x,y)$, i = 1, 2, …, 8, m = 1, 2, 3, where $F_i^{m}(x,y)$ denotes the m-th feature extracted at point (x, y) in the i-th feature sub-image.
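The three first-order statistics of step (2a4) can be sketched for a single sub-image and window as follows (a minimal NumPy illustration; the function name is an assumption, and estimating p[f(i,j)] from the whole sub-image's histogram is one reasonable reading of the definition):

```python
import numpy as np

def window_features(f, x, y, n=2, levels=256):
    """Energy F1, entropy F2 and mean-deviation F3 of the (2n+1)^2
    window centred at (x, y) in sub-image f, following the first-order
    gray-level statistics above."""
    # probability of each gray level, estimated over the whole sub-image
    hist = np.bincount(f.ravel(), minlength=levels) / f.size
    win = f[x - n:x + n + 1, y - n:y + n + 1].astype(np.float64)
    p = hist[win.astype(int)]                       # p[f(i, j)] per pixel
    F1 = (win ** 2 * p).sum() / 255.0 ** 2          # energy
    pos = p[p > 0]
    F2 = -(pos * np.log(pos)).sum() / np.log(255.0) # entropy
    F3 = np.abs(win - win.mean()).sum() / (2 * n + 1) ** 2  # mean deviation
    return F1, F2, F3
```

Evaluating the three statistics at every pixel of each of the 8 chosen sub-images gives the 24-dimensional texture vector per point.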
(2b) Extract the edge feature information of the enhanced image with the Sobel operator, computing the edge feature E[I(x, y)].
Step 103: combine the texture features and edge features into a homogeneity feature, and map the enhanced image into the homogeneity space according to that feature to obtain a feature image.
Normalize the texture features and edge features separately:

$$F_t^{*}(x,y)=\frac{F_t(x,y)-F_t^{\min}(x,y)}{F_t^{\max}(x,y)-F_t^{\min}(x,y)},\qquad E^{*}[I(x,y)]=\frac{E[I(x,y)]-E_{\min}[I(x,y)]}{E_{\max}[I(x,y)]-E_{\min}[I(x,y)]}$$

Then construct the homogeneity feature of the image at (x, y):

$$Y(x,y)=\{F_1^{*}(x,y),F_2^{*}(x,y),\ldots,F_{24}^{*}(x,y),E^{*}[I(x,y)]\}$$
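The normalization and the per-pixel feature vector Y(x, y) can be sketched as (a minimal NumPy illustration; the function names and array layout are assumptions):

```python
import numpy as np

def minmax(a):
    """Min-max normalize a feature map to [0, 1]."""
    return (a - a.min()) / (a.max() - a.min())

def homogeneity_vector(texture_maps, edge_map):
    """Stack the 24 normalized texture maps and the normalized edge
    map into Y(x, y): one 25-dimensional vector per pixel."""
    maps = [minmax(t) for t in texture_maps] + [minmax(edge_map)]
    return np.stack(maps, axis=-1)   # shape (H, W, 25)
```

The resulting (H, W, 25) array is the feature image of the enhanced image in the homogeneity space.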
The present embodiment takes both the texture information and the edge information of the image into account, choosing them to compose the homogeneity of the image and obtaining the feature image by mapping. It makes full use of the difference in homogeneity between text and non-text regions to distinguish them, thereby suppressing non-text information and highlighting the features of text regions.
Step 104: extract text regions from the feature image with the text-region detector.
Inputting the feature image of the image under detection into the text-region detector yields the text regions.
The text-region detector is obtained by training on sample images in the homogeneity space, as follows:
(4a) Let (s_1, z_1), …, (s_n, z_n) denote the text and non-text data: if z_i is 1, s_i is a text sample; if z_i is 0, s_i is a non-text sample.
(4b) Apply the processes of steps 101–103 to each text and non-text sample image s_i to obtain the feature image of the sample in the homogeneity space.
(4c) Assemble the homogeneity features of all pixels of the sample image into a feature vector Q.
(4d) Construct a weak classifier c_j for each feature in Q.
(4e) Select the T best-performing weak classifiers with the AdaBoost method and determine the weight τ_i of each weak classifier, which is inversely related to the error rate of c_i. According to these weights, combine the T feature weak classifiers into one strong classifier C, which is the text-region detector:

$$C(s)=\begin{cases}1, & \sum_{i=1}^{T}\tau_{i}\,c_{i}(s)\ \ge\ \frac{1}{2}\sum_{i=1}^{T}\tau_{i}\\ 0, & \text{otherwise}\end{cases}$$

If the result of C(s) is 1, the classifier judges s to be text; otherwise it judges s to be non-text.
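The training procedure (4a)–(4e) can be sketched with a small hand-rolled discrete AdaBoost over single-feature threshold stumps (a simplified illustration: the stump form, the weight formula τ = ½·ln((1−ε)/ε) and all function names are common AdaBoost conventions assumed here, not taken from the patent):

```python
import numpy as np

def train_stump(X, z, w):
    """Best single-feature threshold stump under sample weights w.
    Returns (feature, threshold, polarity, weighted_error)."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = (pol * (X[:, j] - t) > 0).astype(int)
                err = w[pred != z].sum()
                if best is None or err < best[3]:
                    best = (j, t, pol, err)
    return best

def adaboost(X, z, T=5):
    """Select T weak classifiers and weight each by tau, which grows
    as its weighted error rate shrinks."""
    n = len(z)
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(T):
        j, t, pol, err = train_stump(X, z, w)
        err = max(err, 1e-10)
        tau = 0.5 * np.log((1 - err) / err)   # small error -> large weight
        pred = (pol * (X[:, j] - t) > 0).astype(int)
        w *= np.exp(-tau * np.where(pred == z, 1, -1))
        w /= w.sum()                          # re-normalize sample weights
        ensemble.append((j, t, pol, tau))
    return ensemble

def strong_classify(ensemble, x):
    """C(x) = 1 iff the tau-weighted votes reach half the total weight."""
    votes = sum(tau * ((pol * (x[j] - t)) > 0) for j, t, pol, tau in ensemble)
    total = sum(tau for _, _, _, tau in ensemble)
    return int(votes >= 0.5 * total)
```

Here each stump plays the role of a per-feature weak classifier c_j, and `strong_classify` is the weighted-majority strong classifier C.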
By combining the features of text regions with AdaBoost classifier training, the present embodiment obtains a text-region detector with finer discriminating ability that better matches the characteristics of text, while greatly reducing the amount of computation and improving detection efficiency.
Step 105: extract and recognize characters from the text regions of the enhanced image.
Because of the complexity of the character background in the extracted text regions, a simple threshold method cannot separate the characters from the background. The background of an individual character, however, is relatively uniform, so the character region is first decomposed into individual character regions and binarization is then performed on each character image. The concrete steps are as follows:
(5a) Extract the edge information of the text regions with the Canny operator.
(5b) Project the edge information horizontally and threshold the projection to separate out text-line images.
(5c) Project the edges of each text-line image vertically to separate out character images.
(5d) Construct the gray-level histogram of each character image.
(5e) Determine the optimal segmentation threshold of that histogram with the Otsu method, binarize the character image with the optimal segmentation threshold, and remove residual background with connected-component analysis to obtain a background-removed character image.
(5f) Recognize the background-removed character image, specifically:
(5f1) enlarge the background-removed character image with a B-spline interpolation function to improve its resolution;
(5f2) perform character recognition on the enlarged character image with OCR software.
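The projection segmentation and Otsu binarization steps above can be sketched in pure NumPy (a simplified illustration: the Canny edge extraction, connected-component cleanup, B-spline enlargement and OCR are omitted, and the function names and run-splitting rule are assumptions):

```python
import numpy as np

def projection_runs(profile, thresh):
    """Split a projection profile (e.g. row sums of an edge map) into
    runs whose value exceeds thresh; returns (start, end) pairs with
    end exclusive. Used for both text-line and character separation."""
    mask = profile > thresh
    runs, start = [], None
    for i, m in enumerate(mask):
        if m and start is None:
            start = i
        elif not m and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(mask)))
    return runs

def otsu_threshold(img, levels=256):
    """Otsu's optimal threshold: maximize the between-class variance
    of the gray-level histogram."""
    hist = np.bincount(img.ravel(), minlength=levels).astype(np.float64)
    p = hist / hist.sum()
    cum_p = np.cumsum(p)
    cum_mean = np.cumsum(p * np.arange(levels))
    mu = cum_mean[-1]
    best_t, best_var = 0, -1.0
    for t in range(levels - 1):
        w0 = cum_p[t]
        if w0 <= 0 or w0 >= 1:
            continue
        mu0 = cum_mean[t] / w0
        mu1 = (mu - cum_mean[t]) / (1 - w0)
        var = w0 * (1 - w0) * (mu0 - mu1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t
```

A text-line image would be cut at the horizontal runs of the edge map, each line cut again at vertical runs, and each resulting character image binarized with `img > otsu_threshold(img)`.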
With reference to Fig. 3, the device that extracts text from image of the embodiment of the invention comprises: image enhancing unit, feature extraction unit, homogeney map unit, text filed extraction unit, character extraction and recognition unit, training unit.Wherein:
The image enhancement unit is used to perform nonlinear dynamic compression of the gray levels of the original image, obtaining an enhanced image. Specifically, the image enhancement unit comprises:
an equalization transform subunit, used to apply the equalization transform function to the gray levels of the original image and to construct the equalized gray-level histogram;
The equalization transform function is:
$$k = \mathrm{INT}\!\left[(L-1)\cdot\sum_{j=0}^{k^{*}} P(j) + 0.5\right]$$
where $P(j)$ is the gray-level histogram of the original image, $k^{*}$ is an original-image gray level ($k^{*}=0,1,\ldots,L-1$), $k$ is the gray level after the equalization transform ($k=0,1,\ldots,L-1$), and $L$ is the number of gray levels of the original image.
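The equalization transform above can be sketched as follows. The function name is ours; the mapping follows $k = \mathrm{INT}[(L-1)\cdot\sum_{j=0}^{k^*}P(j)+0.5]$, with $P$ taken as the normalized histogram of the input image.

```python
import numpy as np

def equalize_gray_levels(image, L=256):
    """Histogram equalization per the transform
    k = INT[(L-1) * sum_{j=0}^{k*} P(j) + 0.5],
    where P is the normalized gray-level histogram of the image."""
    hist = np.bincount(image.ravel(), minlength=L).astype(float)
    P = hist / hist.sum()                 # normalized histogram P(j)
    cdf = np.cumsum(P)                    # sum_{j=0}^{k*} P(j)
    mapping = np.floor((L - 1) * cdf + 0.5).astype(np.uint8)  # INT[...]
    return mapping[image]                 # apply the lookup table
```

Because the mapping is built from the cumulative histogram, it is monotonically non-decreasing, so the relative ordering of gray levels is preserved while contrast is stretched.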
a nonlinear dynamic compression subunit, used to apply a monotonically increasing S-type function to the gray levels of the equalized histogram, performing the nonlinear dynamic compression.
The monotonically increasing S-type function is:
$$S(k;\alpha,\beta,\gamma)=\begin{cases}0, & k\le\alpha\\[4pt] 2\left(\dfrac{k-\alpha}{\gamma-\alpha}\right)^{2}, & \alpha<k\le\beta\\[4pt] 1-2\left(\dfrac{k-\gamma}{\gamma-\alpha}\right)^{2}, & \beta<k<\gamma\\[4pt] 1, & k\ge\gamma\end{cases}$$
where $\alpha$ is the gray level at the first peak of the equalized histogram, $\gamma$ is the gray level at the last peak of the equalized histogram, and $\beta$ is a parameter value determined with the maximum information entropy criterion.
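A sketch of the S-type compression function above, assuming $\alpha$ and $\gamma$ are the first- and last-peak gray levels and $\beta$ has already been chosen by the entropy criterion (the criterion itself is not shown):

```python
import numpy as np

def s_compress(k, alpha, beta, gamma):
    """Monotonically increasing S-type function S(k; alpha, beta, gamma)
    used for nonlinear dynamic compression of equalized gray levels.
    Output is in [0, 1]; levels below alpha are clipped to 0 and
    levels above gamma to 1."""
    k = np.atleast_1d(np.asarray(k, dtype=float))
    out = np.empty_like(k)
    out[k <= alpha] = 0.0
    mid1 = (k > alpha) & (k <= beta)
    out[mid1] = 2.0 * ((k[mid1] - alpha) / (gamma - alpha)) ** 2
    mid2 = (k > beta) & (k < gamma)
    out[mid2] = 1.0 - 2.0 * ((k[mid2] - gamma) / (gamma - alpha)) ** 2
    out[k >= gamma] = 1.0
    return out
```

With $\beta$ at the midpoint of $[\alpha, \gamma]$ the two quadratic branches meet continuously at $S(\beta) = 1/2$, which is why the midpoint is a common default before the entropy criterion refines $\beta$.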
The feature extraction unit is used to extract the texture features and edge features of the enhanced image. Specifically, the feature extraction unit comprises:
a texture feature extraction subunit, used to extract the texture features of the enhanced image with wavelet packet decomposition;
an edge feature extraction subunit, used to extract the edge features of the enhanced image with the Sobel operator.
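A minimal Sobel edge-strength map, as one plausible reading of the edge feature extraction subunit. The $|G_x| + |G_y|$ magnitude is our choice; the patent does not specify the magnitude formula.

```python
import numpy as np

def sobel_edge_strength(gray):
    """Edge feature map via the Sobel operator: per-pixel gradient
    magnitude |Gx| + |Gy| from the two 3x3 Sobel kernels, with
    edge-replication padding at the borders."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    padded = np.pad(gray.astype(float), 1, mode="edge")
    h, w = gray.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(3):            # accumulate the 3x3 correlation
        for j in range(3):
            window = padded[i:i + h, j:j + w]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    return np.abs(gx) + np.abs(gy)
```

On a vertical step edge the response is strong along the step and zero in the flat regions, which is what makes the map usable as a text-stroke edge feature.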
The homogeneity mapping unit is used to construct a homogeneity feature from the texture features and edge features, and to map the enhanced image into the homogeneity space according to the homogeneity feature, obtaining a feature image. The homogeneity mapping unit includes a homogeneity feature construction subunit (not shown in the figure), used to construct the homogeneity feature by normalizing the texture features and edge features respectively.
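The homogeneity feature construction can be illustrated as below. The text states only that the texture and edge feature maps are normalized respectively; how they are then combined into one per-pixel value is not specified, so the unweighted mean used here is purely an assumption, and the function names are ours.

```python
import numpy as np

def homogeneity_map(texture, edge):
    """Combine per-pixel texture and edge feature maps into a single
    homogeneity feature image: min-max normalize each map to [0, 1],
    then average them (the averaging rule is an assumption)."""
    def normalize(f):
        f = f.astype(float)
        span = f.max() - f.min()
        return (f - f.min()) / span if span > 0 else np.zeros_like(f)
    return 0.5 * (normalize(texture) + normalize(edge))
```

Because both inputs are rescaled to [0, 1] before combination, neither feature dominates merely by having a larger numeric range, which is the usual motivation for normalizing the features "respectively".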
The text region extraction unit is used to extract text regions from the feature image with the text region detector.
The training unit is used to obtain the text region detector by training on sample images in the homogeneity space. Specifically, the training unit trains the text region detector as follows:
perform nonlinear dynamic compression of gray levels on a sample image, obtaining an enhanced image of the sample image;
extract the texture features and edge features of the enhanced sample image;
construct the texture features and edge features of the enhanced sample image into a homogeneity feature, and assemble the homogeneity features of all pixels of the enhanced sample image into a feature vector;
construct a weak classifier for each feature in the feature vector;
select a plurality of weak classifiers from all the weak classifiers with the AdaBoost method, and determine the weight of each weak classifier on the principle of complementary performance;
integrate the selected weak classifiers into one strong classifier according to the weights, obtaining the text region detector.
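The training steps above can be sketched with decision-stump weak classifiers (one candidate threshold classifier per feature) combined by AdaBoost into a weighted-vote strong classifier. This is a generic AdaBoost sketch, not the patent's exact procedure: the "complementary performance" weight rule is read here as the standard AdaBoost weight $\alpha = \tfrac{1}{2}\ln\frac{1-\varepsilon}{\varepsilon}$, and all names are ours. Labels are +1 (text) / -1 (non-text).

```python
import numpy as np

def train_adaboost(X, y, n_rounds=10):
    """AdaBoost over per-feature threshold ('stump') weak classifiers.
    Each round picks the stump with lowest weighted error, assigns it
    weight alpha, and reweights the training samples."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)            # sample weights
    stumps = []                        # (feature, threshold, polarity, alpha)
    for _ in range(n_rounds):
        best = None
        for f in range(d):             # one weak classifier per feature
            for thr in np.unique(X[:, f]):
                for pol in (1, -1):
                    pred = np.where(pol * (X[:, f] - thr) >= 0, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, f, thr, pol, pred)
        err, f, thr, pol, pred = best
        err = max(err, 1e-10)          # avoid log(inf) on perfect stumps
        alpha = 0.5 * np.log((1 - err) / err)
        stumps.append((f, thr, pol, alpha))
        w *= np.exp(-alpha * y * pred) # up-weight misclassified samples
        w /= w.sum()
    return stumps

def predict(stumps, X):
    """Strong classifier: sign of the weighted vote of the stumps."""
    score = sum(alpha * np.where(pol * (X[:, f] - thr) >= 0, 1, -1)
                for f, thr, pol, alpha in stumps)
    return np.where(score >= 0, 1, -1)
```

In the patent's setting, each row of `X` would be the homogeneity feature vector of a sample window and `y` whether it contains text.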
The character extraction and recognition unit is used to extract and recognize characters from the text region of the enhanced image. Specifically, the character extraction and recognition unit comprises:
an edge information extraction subunit, used to extract the edge information of the text region with the Canny operator;
a horizontal projection subunit, used to compute the horizontal projection of the edge information and apply a threshold to it, to isolate text-line images;
a vertical projection subunit, used to compute the vertical projection of each text-line edge image, to isolate character images;
a histogram construction subunit, used to construct the gray-level histogram of each character image;
a binarization subunit, used to determine the optimal segmentation threshold of the gray-level histogram of the character image with Otsu's method, binarize the character image according to the optimal segmentation threshold, and remove residual background with connected-component analysis, obtaining a background-free character image;
a character recognition subunit, used to recognize the background-free character image, specifically: enlarge the background-free character image with B-spline interpolation, then perform character recognition on the enlarged image with OCR software.
In summary, the embodiment of the invention improves image contrast by applying nonlinear dynamic compression of gray levels to the original image, making the features of text regions in the image more distinct. Constructing a homogeneity feature from the extracted texture and edge information and mapping the image into the homogeneity space suppresses non-text information while highlighting text information. Further, a feature classifier can be trained in the homogeneity space of the image to finely separate text regions from non-text regions. The technical scheme of the embodiment of the invention can therefore significantly improve the accuracy and robustness of text detection.
Finally, it should be noted that the above embodiments merely illustrate, and do not limit, the technical scheme of the present invention. Those of ordinary skill in the art will understand that the technical scheme of the present invention may be modified or equivalently substituted without departing from its spirit and scope, and all such modifications should be encompassed within the scope of the claims of the present invention.

Claims (22)

1. A method of extracting text from an image, characterized in that it comprises:
performing nonlinear dynamic compression of gray levels on an original image, obtaining an enhanced image;
extracting texture features and edge features of the enhanced image;
constructing the texture features and edge features into a homogeneity feature, and mapping the enhanced image into a homogeneity space according to the homogeneity feature, obtaining a feature image;
extracting a text region from the feature image with a text region detector;
extracting and recognizing characters from the text region of the enhanced image.
2. the method for claim 1 is characterized in that, the described Nonlinear Dynamic compression that original image is carried out gray level comprises:
Utilize the equalization transforming function transformation function that the gray level of described original image is carried out the equalization conversion, and the grey level histogram after the conversion of structure equalization;
Utilize monotone increasing formula S type function that the gray level in the described grey level histogram is carried out the Nonlinear Dynamic compression.
3. The method of claim 2, characterized in that the equalization transform function is:
$$k = \mathrm{INT}\!\left[(L-1)\cdot\sum_{j=0}^{k^{*}} P(j) + 0.5\right]$$
where $P(j)$ is the gray-level histogram of the original image, $k^{*}$ is an original-image gray level ($k^{*}=0,1,\ldots,L-1$), $k$ is the gray level after the equalization transform ($k=0,1,\ldots,L-1$), and $L$ is the number of gray levels of the original image.
4. The method of claim 3, characterized in that the monotonically increasing S-type function is:
$$S(k;\alpha,\beta,\gamma)=\begin{cases}0, & k\le\alpha\\[4pt] 2\left(\dfrac{k-\alpha}{\gamma-\alpha}\right)^{2}, & \alpha<k\le\beta\\[4pt] 1-2\left(\dfrac{k-\gamma}{\gamma-\alpha}\right)^{2}, & \beta<k<\gamma\\[4pt] 1, & k\ge\gamma\end{cases}$$
where $\alpha$ is the gray level at the first peak of the equalized histogram, $\gamma$ is the gray level at the last peak of the equalized histogram, and $\beta$ is a parameter value determined with the maximum information entropy criterion.
5. the method for claim 1 is characterized in that:
Utilize WAVELET PACKET DECOMPOSITION to extract described enhancing image texture features.
6. the method for claim 1 is characterized in that:
Utilize the Sobel operator to extract described enhancing edge of image feature.
7. the method for claim 1 is characterized in that:
By being carried out normalized respectively, described textural characteristics and edge feature construct described homogeney feature.
8. the method for claim 1 is characterized in that, also comprises:
Obtain described text filed detecting device by sample image being trained in described homogeney space.
9. The method of claim 8, characterized in that obtaining the text region detector by training on sample images in the homogeneity space comprises:
performing nonlinear dynamic compression of gray levels on a sample image, obtaining an enhanced image of the sample image;
extracting texture features and edge features of the enhanced sample image;
constructing the texture features and edge features of the enhanced sample image into a homogeneity feature, and assembling the homogeneity features of all pixels of the enhanced sample image into a feature vector;
constructing a weak classifier for each feature in the feature vector;
selecting a plurality of weak classifiers from all the weak classifiers with the AdaBoost method, and determining the weight of each weak classifier on the principle of complementary performance;
integrating the selected weak classifiers into one strong classifier according to the weights, obtaining the text region detector.
10. the method for claim 1 is characterized in that, described extract from described enhancing image described text filed and identification character comprises:
Utilize the described text filed marginal information of Canny operator extraction;
Described marginal information is carried out getting threshold value after the horizontal projection, to isolate the literal line image;
Described literal line edge of image is carried out vertical projection, to isolate character picture;
Construct the grey level histogram of described character picture;
Utilize the Otus method to determine the optimum segmentation threshold value of the grey level histogram of described character picture, there is most segmentation threshold that described character picture is carried out binary conversion treatment according to described, and remove residual background based on the connected domain analytical approach, obtain removing the character picture after the background;
Character picture after the described removal background is discerned.
11. method as claimed in claim 10, described character picture after the described removing background is discerned comprises:
Character picture after utilizing the B-spline interpolation function to described removing background amplifies;
Utilize OCR software that the character picture after amplifying is carried out character recognition.
12. A device for extracting text from an image, characterized in that it comprises:
an image enhancement unit, used to perform nonlinear dynamic compression of gray levels on an original image, obtaining an enhanced image;
a feature extraction unit, used to extract texture features and edge features of the enhanced image;
a homogeneity mapping unit, used to construct the texture features and edge features into a homogeneity feature, and to map the enhanced image into a homogeneity space according to the homogeneity feature, obtaining a feature image;
a text region extraction unit, used to extract a text region from the feature image with a text region detector;
a character extraction and recognition unit, used to extract and recognize characters from the text region of the enhanced image.
13. The device of claim 12, characterized in that the image enhancement unit comprises:
an equalization transform subunit, used to apply an equalization transform function to the gray levels of the original image and to construct the equalized gray-level histogram;
a nonlinear dynamic compression subunit, used to apply a monotonically increasing S-type function to the gray levels of the equalized histogram.
14. The device of claim 13, characterized in that the equalization transform function is:
$$k = \mathrm{INT}\!\left[(L-1)\cdot\sum_{j=0}^{k^{*}} P(j) + 0.5\right]$$
where $P(j)$ is the gray-level histogram of the original image, $k^{*}$ is an original-image gray level ($k^{*}=0,1,\ldots,L-1$), $k$ is the gray level after the equalization transform ($k=0,1,\ldots,L-1$), and $L$ is the number of gray levels of the original image.
15. The device of claim 14, characterized in that the monotonically increasing S-type function is:
$$S(k;\alpha,\beta,\gamma)=\begin{cases}0, & k\le\alpha\\[4pt] 2\left(\dfrac{k-\alpha}{\gamma-\alpha}\right)^{2}, & \alpha<k\le\beta\\[4pt] 1-2\left(\dfrac{k-\gamma}{\gamma-\alpha}\right)^{2}, & \beta<k<\gamma\\[4pt] 1, & k\ge\gamma\end{cases}$$
where $\alpha$ is the gray level at the first peak of the equalized histogram, $\gamma$ is the gray level at the last peak of the equalized histogram, and $\beta$ is a parameter value determined with the maximum information entropy criterion.
16. The device of claim 12, characterized in that the feature extraction unit comprises:
a texture feature extraction subunit, used to extract the texture features of the enhanced image with wavelet packet decomposition.
17. The device of claim 12, characterized in that the feature extraction unit comprises:
an edge feature extraction subunit, used to extract the edge features of the enhanced image with the Sobel operator.
18. The device of claim 12, characterized in that the homogeneity mapping unit comprises:
a homogeneity feature construction subunit, used to construct the homogeneity feature by normalizing the texture features and edge features respectively.
19. The device of claim 12, characterized by further comprising:
a training unit, used to obtain the text region detector by training on sample images in the homogeneity space.
20. The device of claim 19, characterized in that the training unit is further used to:
perform nonlinear dynamic compression of gray levels on a sample image, obtaining an enhanced image of the sample image;
extract texture features and edge features of the enhanced sample image;
construct the texture features and edge features of the enhanced sample image into a homogeneity feature, and assemble the homogeneity features of all pixels of the enhanced sample image into a feature vector;
construct a weak classifier for each feature in the feature vector;
select a plurality of weak classifiers from all the weak classifiers with the AdaBoost method, and determine the weight of each weak classifier on the principle of complementary performance;
integrate the selected weak classifiers into one strong classifier according to the weights, obtaining the text region detector.
21. The device of claim 12, characterized in that the character extraction and recognition unit comprises:
an edge information extraction subunit, used to extract the edge information of the text region with the Canny operator;
a horizontal projection subunit, used to compute the horizontal projection of the edge information and apply a threshold to it, to isolate text-line images;
a vertical projection subunit, used to compute the vertical projection of each text-line edge image, to isolate character images;
a histogram construction subunit, used to construct the gray-level histogram of each character image;
a binarization subunit, used to determine the optimal segmentation threshold of the gray-level histogram of the character image with Otsu's method, binarize the character image according to the optimal segmentation threshold, and remove residual background with connected-component analysis, obtaining a background-free character image;
a character recognition subunit, used to recognize the background-free character image.
22. The device of claim 21, characterized in that the character recognition subunit is further used to:
enlarge the background-free character image with B-spline interpolation;
perform character recognition on the enlarged image with OCR software.
CN 200910241565 (filed 2009-11-26) — Method and device for extracting text from image. Published as CN102081731A (2011-06-01); granted as CN102081731B (2013-01-23). Legal status: Active.

