CN102081731A - Method and device for extracting text from image - Google Patents
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention provides a method and device for extracting text from an image. The method comprises the following steps: performing nonlinear dynamic compression of the gray levels of an original image to obtain an enhanced image; extracting texture features and edge features of the enhanced image; constructing the texture features and edge features into a homogeneity feature, and mapping the enhanced image into a homogeneity space according to the homogeneity feature to obtain a feature image; extracting a text region from the feature image with a text region detector; and extracting and recognizing characters from the text region of the enhanced image. The invention effectively improves the accuracy and robustness of text detection.
Description
Technical field
The invention belongs to the technical fields of pattern recognition and computer vision, and in particular relates to a method and apparatus for extracting text from an image.
Background art
With the development of multimedia information retrieval, the Internet, and 3G streaming-media technology, images and video have become the mainstream carriers of multimedia information exchange and services. Text information in images and video is therefore increasingly important for representing and retrieving massive amounts of information; automatic detection and extraction of text in images is the first step toward text-based image retrieval and image-sensitivity screening.
Text regions in an image have features that clearly distinguish them from non-text regions: they contain rich edges and distinctive textures; they usually consist of one or more lines of characters arranged horizontally or vertically; and the characters have consistent color and strong contrast against the background. These features can be used to discriminate text regions from non-text regions. The main approach to detecting, extracting, and recognizing text in images is to locate candidate text regions using such features and prior rules, enhance the image quality of those regions, separate text from background by binarization, and finally extract and recognize the characters in the text regions with OCR software.
Text region extraction methods fall mainly into two classes: region-based methods and texture-based methods. Region-based methods detect text regions using the color and gray-level difference between text and background. They adopt a bottom-up strategy: the image is first divided into many subimages, text regions are then determined according to the structure of the subimage information, and features such as character size, character aspect ratio, and text-line projection are used for further screening before the text regions are finally determined. These methods are insensitive to character size and font and run quickly.
According to how the subimage information is structured, region-based methods can be further divided into connected-component methods and edge-detection methods. Connected-component methods assume that the characters in the image have consistent color: color clustering is used to determine candidate character regions, and heuristic rules then screen those regions. Edge-detection methods exploit the relatively high contrast between text and background: edges are detected first, morphological operators connect the edges into character regions, and heuristic rules perform the final screening.
Texture-based methods treat character regions as a special texture and use the different texture characteristics of text and background regions to detect, extract, and recognize text. Typically, a pixel window is defined and slid over the image with a fixed step to examine the texture of each small region; a trained classifier judges whether the current region is text, and all text regions are finally merged into candidate text regions, on which character extraction and recognition are carried out. Under complex backgrounds, texture-based methods are usually more robust than connected-component methods and generalize better.
In the method for prior art, based on the method in zone when image background complexity or picture quality are relatively poor, very difficult extraction is connected domain accurately, the formulation of the heuristic rule that is adopted when the screening of character area depends on priori in addition, and these prioris generally are difficult to obtain exactly, and it is rigidity that a lot of threshold values are established a capital really, causes the robustness of algorithm poor.
Though the method versatility based on texture is better, calculation of complex, computing cost height, and it is to literal size and font sensitivity relatively causes the bearing accuracy of the versatility of sorter and character area lower.When comprising the periodic structure texture of similar literal in the image background, these class methods also can be met difficulty.
Summary of the invention
The technical problem to be solved by the invention is to provide a method and device for extracting text from an image, so as to improve the accuracy and robustness of text detection.
To solve the above technical problem, the invention provides the following technical solution:
A method for extracting text from an image comprises:
performing nonlinear dynamic compression of the gray levels of an original image to obtain an enhanced image;
extracting texture features and edge features of said enhanced image;
constructing said texture features and edge features into a homogeneity feature, and mapping said enhanced image into a homogeneity space according to said homogeneity feature to obtain a feature image;
extracting a text region from said feature image with a text region detector;
extracting and recognizing characters from said text region of said enhanced image.
The above method, wherein performing nonlinear dynamic compression of the gray levels of the original image comprises:
transforming the gray levels of said original image with an equalization transform function, and constructing the equalized gray-level histogram;
performing nonlinear dynamic compression of the gray levels in said histogram with a monotonically increasing S-shaped function.
The above method, wherein said equalization transform function is:

k = INT[ (L−1) · Σ_{j=0}^{k*} P(j) ]

where INT[·] denotes rounding, P(j) is the gray-level histogram of the original image, k* = 0, 1, …, L−1 is a gray level of the original image, k = 0, 1, …, L−1 is the gray level after equalization, and L is the number of gray levels of the original image.
The above method, wherein said monotonically increasing S-shaped function is parameterized as follows: α is the gray level corresponding to the first peak of the equalized gray-level histogram, γ is the gray level corresponding to the last peak of the equalized gray-level histogram, and β is a parameter value determined by the maximum-information-entropy criterion.
The above method, wherein wavelet packet decomposition is used to extract the texture features of said enhanced image.
The above method, wherein the Sobel operator is used to extract the edge features of said enhanced image.
The above method, wherein said homogeneity feature is constructed by normalizing said texture features and edge features respectively.
The above method, further comprising:
obtaining said text region detector by training on sample images in said homogeneity space.
The above method, wherein obtaining said text region detector by training on sample images in said homogeneity space comprises:
performing nonlinear dynamic compression of the gray levels of a sample image to obtain an enhanced image of said sample image;
extracting texture features and edge features of the enhanced image of said sample image;
constructing said texture features and edge features into a homogeneity feature, and assembling the homogeneity features of all pixels of the enhanced image of said sample image into a feature vector;
constructing a weak classifier for each feature in said feature vector;
selecting a plurality of weak classifiers from all the weak classifiers with the AdaBoost method, and determining the weight of each weak classifier on the principle of complementary performance;
integrating said plurality of weak classifiers into one strong classifier according to said weights, thereby obtaining said text region detector.
The above method, wherein extracting and recognizing characters from said text region of said enhanced image comprises:
extracting edge information of said text region with the Canny operator;
applying a threshold to the horizontal projection of said edge information to separate text-line images;
applying a vertical projection to the edges of said text-line images to separate character images;
constructing the gray-level histogram of said character images;
determining the optimal segmentation threshold of said histogram with the Otsu method, binarizing said character images according to said optimal segmentation threshold, and removing residual background by connected-component analysis to obtain background-free character images;
recognizing said background-free character images.
The above method, wherein recognizing said background-free character images comprises:
magnifying said background-free character images with a B-spline interpolation function;
performing character recognition on the magnified character images with OCR software.
A device for extracting text from an image comprises:
an image enhancement unit, configured to perform nonlinear dynamic compression of the gray levels of an original image to obtain an enhanced image;
a feature extraction unit, configured to extract texture features and edge features of said enhanced image;
a homogeneity mapping unit, configured to construct said texture features and edge features into a homogeneity feature, and to map said enhanced image into a homogeneity space according to said homogeneity feature to obtain a feature image;
a text region extraction unit, configured to extract a text region from said feature image with a text region detector;
a character extraction and recognition unit, configured to extract and recognize characters from said text region of said enhanced image.
The above device, wherein said image enhancement unit comprises:
an equalization transform subunit, configured to transform the gray levels of said original image with an equalization transform function and construct the equalized gray-level histogram;
a nonlinear dynamic compression subunit, configured to perform nonlinear dynamic compression of the gray levels in said histogram with a monotonically increasing S-shaped function.
The above device, wherein said equalization transform function is:

k = INT[ (L−1) · Σ_{j=0}^{k*} P(j) ]

where INT[·] denotes rounding, P(j) is the gray-level histogram of the original image, k* = 0, 1, …, L−1 is a gray level of the original image, k = 0, 1, …, L−1 is the gray level after equalization, and L is the number of gray levels of the original image.
The above device, wherein said monotonically increasing S-shaped function is parameterized as follows: α is the gray level corresponding to the first peak of the equalized gray-level histogram, γ is the gray level corresponding to the last peak of the equalized gray-level histogram, and β is a parameter value determined by the maximum-information-entropy criterion.
The above device, wherein said feature extraction unit comprises:
a texture feature extraction subunit, configured to extract the texture features of said enhanced image by wavelet packet decomposition.
The above device, wherein said feature extraction unit comprises:
an edge feature extraction subunit, configured to extract the edge features of said enhanced image with the Sobel operator.
The above device, wherein said homogeneity mapping unit comprises:
a homogeneity feature construction subunit, configured to construct said homogeneity feature by normalizing said texture features and edge features respectively.
The above device, further comprising:
a training unit, configured to obtain said text region detector by training on sample images in said homogeneity space.
The above device, wherein said training unit is further configured to:
perform nonlinear dynamic compression of the gray levels of a sample image to obtain an enhanced image of said sample image;
extract texture features and edge features of the enhanced image of said sample image;
construct said texture features and edge features into a homogeneity feature, and assemble the homogeneity features of all pixels of the enhanced image of said sample image into a feature vector;
construct a weak classifier for each feature in said feature vector;
select a plurality of weak classifiers from all the weak classifiers with the AdaBoost method, and determine the weight of each weak classifier from its performance, on the principle of complementary performance;
integrate said plurality of weak classifiers into one strong classifier according to said weights, thereby obtaining said text region detector.
The above device, wherein said character extraction and recognition unit comprises:
an edge information extraction subunit, configured to extract edge information of said text region with the Canny operator;
a horizontal projection subunit, configured to apply a threshold to the horizontal projection of said edge information to separate text-line images;
a vertical projection subunit, configured to apply a vertical projection to the edges of said text-line images to separate character images;
a histogram construction subunit, configured to construct the gray-level histogram of said character images;
a binarization subunit, configured to determine the optimal segmentation threshold of said histogram with the Otsu method, binarize said character images according to said optimal segmentation threshold, and remove residual background by connected-component analysis to obtain background-free character images;
a character recognition subunit, configured to recognize said background-free character images.
The above device, wherein said character recognition subunit is further configured to:
magnify said background-free character images with a B-spline interpolation function;
perform character recognition on the magnified character images with OCR software.
By performing nonlinear dynamic compression of the gray levels of the original image, the embodiments of the invention improve image contrast and make the features of the text regions in the image more prominent. By constructing a homogeneity feature from the texture and edge information of the image and mapping the image into the homogeneity space, non-text information is suppressed while text information is emphasized. Further, a feature classifier can be trained in the homogeneity space of the image to finely separate text from non-text regions. The technical solution of the embodiments can therefore significantly improve the accuracy and robustness of text detection.
Description of drawings
Fig. 1 is a flow chart of the method for extracting text from an image according to an embodiment of the invention;
Fig. 2 is a schematic diagram of the monotonically increasing S-shaped function in an embodiment of the invention;
Fig. 3 is a structural diagram of the device for extracting text from an image according to an embodiment of the invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the invention clearer, the invention is described below with reference to the accompanying drawings and specific embodiments.
Referring to Fig. 1, the method for detecting text in an image according to an embodiment of the invention mainly comprises the following steps:
Step 101: perform nonlinear dynamic compression of the gray levels of the original image to obtain an enhanced image.
During generation, transmission, or conversion, an image is subject to various influences that degrade its quality. Nonlinear dynamic compression of the gray levels of the original image fully emphasizes attribute features such as edges and texture while suppressing useless information, improving the usefulness of the image and providing more distinct features for the subsequent text region extraction. The concrete steps are as follows:
(1a) Adjust the gray-level histogram of the original image with the equalization transform function

k = INT[ (L−1) · Σ_{j=0}^{k*} P(j) ]

to widen the tonal range of the original image and increase image clarity, where INT[·] denotes rounding, P(j) is the gray-level histogram of the original image, k* = 0, 1, …, L−1 is a gray level of the original image, k = 0, 1, …, L−1 is the gray level after equalization, and L is the number of gray levels of the original image;
(1b) Process the equalized image, selecting a monotonically increasing S-shaped function, shown in Fig. 2, as the nonlinear dynamic compression transform function. Each parameter of said function is determined as follows:
(1b1) α is set to the gray level corresponding to the first peak of the equalized gray-level histogram, and γ to the gray level corresponding to the last peak; obviously α < γ;
(1b2) Determine the value of the parameter β by the maximum-information-entropy criterion. Let P(k) be the gray-level histogram of the original image after equalization, defined by P(k) = n_k / n, k = 0, 1, …, L−1, where n_k is the number of pixels with gray value k and n is the total number of pixels of the image; P(k) is thus the probability distribution of gray level k. Choose a gray threshold v that divides the image into two regions. The gray values of one region are 0 … v, and its entropy is

H_1(v) = − Σ_{k=0}^{v} [P(k)/P_v] · ln [P(k)/P_v]

The gray values of the other region are v+1 … L−1, and its entropy is

H_2(v) = − Σ_{k=v+1}^{L−1} [P(k)/(1−P_v)] · ln [P(k)/(1−P_v)]

where P_v = Σ_{k=0}^{v} P(k). The total entropy is thus H(v) = H_1(v) + H_2(v). Information theory tells us that the two classes of data are best separated when the entropy is maximal, so the v that maximizes the total entropy H(v) is the optimal threshold:

v* = argmax_v [H_1(v) + H_2(v)]

Setting β = v* completes the setting of the β parameter.
By using a monotonically increasing S-shaped function as the nonlinear dynamic compression transform to dynamically compress the image, the present embodiment fully emphasizes attribute features such as edges and texture and greatly improves the performance of the image enhancement and segmentation algorithms.
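As a rough illustration of step 101, the sketch below implements the equalization transform (standard histogram equalization), selects β by the maximum-entropy criterion of step (1b2), and applies an S-shaped remap. The patent's exact S-type formula is not reproduced in this text, so the logistic form and steepness used in `sigmoid_compress` are assumptions, as are all function names.

```python
import numpy as np

def equalize(img, L=256):
    """Histogram-equalize an integer gray image: k = INT[(L-1) * CDF(k*)]."""
    hist = np.bincount(img.ravel(), minlength=L).astype(float)
    p = hist / hist.sum()                      # P(j): normalized histogram
    lut = np.floor((L - 1) * np.cumsum(p)).astype(img.dtype)
    return lut[img]

def max_entropy_threshold(img, L=256):
    """beta = argmax_v H1(v) + H2(v) over the gray-level histogram."""
    hist = np.bincount(img.ravel(), minlength=L).astype(float)
    p = hist / hist.sum()
    P = np.cumsum(p)                           # P_v = sum_{k<=v} P(k)
    best_v, best_h = 0, -np.inf
    for v in range(1, L - 1):
        if P[v] <= 0 or P[v] >= 1:
            continue
        lo = p[:v + 1] / P[v]
        hi = p[v + 1:] / (1 - P[v])
        h = (-np.sum(lo[lo > 0] * np.log(lo[lo > 0]))
             - np.sum(hi[hi > 0] * np.log(hi[hi > 0])))
        if h > best_h:
            best_v, best_h = v, h
    return best_v

def sigmoid_compress(img, alpha, beta, gamma, L=256):
    """Monotonically increasing S-shaped remap (logistic form assumed):
    gray levels are compressed toward [alpha, gamma] with inflection beta."""
    k = img.astype(float)
    slope = 8.0 / max(gamma - alpha, 1)        # assumed steepness
    s = 1.0 / (1.0 + np.exp(-slope * (k - beta)))
    out = alpha + (gamma - alpha) * s
    return np.clip(out, 0, L - 1).astype(img.dtype)
```

In use, one would equalize first, derive β from the equalized histogram, then compress, mirroring steps (1a)–(1b2).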
Step 102: extract the texture features and edge features of said enhanced image.
Homogeneity is closely related to the local information of an image: it reflects the degree of uniformity of an image region and plays an important role in classifying image information. Texture and edge information are chosen here to construct the homogeneity feature of the image. Compared with non-text regions, text regions have richer texture and edge information and poorer uniformity, so their homogeneity is lower; this property can be exploited to detect text regions. In addition, because wavelet packet decomposition can keep decomposing the high-frequency part, it can be used to extract texture feature information. Texture and edge information are extracted as follows:
(2a) Extract the texture feature information of said enhanced image by wavelet packet decomposition:
(2a1) A given function W_0(x) generates a group of orthogonal wavelet bases by the recursions

W_{2m}(x) = √2 · Σ_k h(k) W_m(2x − k)
W_{2m+1}(x) = √2 · Σ_k g(k) W_m(2x − k)

where W_{2m}(x) denotes the scaling functions, W_{2m+1}(x) the wavelet functions, and h(k) and g(k) are the filter coefficients of the orthogonal wavelet. The wavelet packet basis is then W_m(2^n x − k), where n is the scale parameter, k the translation parameter, and m the oscillation parameter, with n, k ∈ Z and m ∈ N;
(2a2) Taking the inner product of two one-dimensional wavelet packet bases along the horizontal or vertical direction yields the two-dimensional filters

h_LL(k,n) = h(k)·h(n),  h_LH(k,n) = h(k)·g(n)
h_HL(k,n) = g(k)·h(n),  h_HH(k,n) = g(k)·g(n)

where the first and second subscripts of a filter indicate a high-pass (H) or low-pass (L) filter in the x and y directions respectively;
(2a3) Apply a two-level wavelet packet decomposition to said enhanced image with said two-dimensional filters, i.e., upsample the impulse responses of said two-dimensional filters, to obtain translation-invariant subimages f(x, y) containing mid-frequency (texture) information, and select a predetermined number (for example, 8) of subimages with the largest variance;
(2a4) Describe the texture features with first-order gray-level statistics: for each subimage, extract an energy feature, an entropy feature, and a mean-deviation feature, computed from p[f(i, j)], the probability of the gray level of point (i, j) within the subimage, and μ(i, j), the gray mean of all pixels in the operation window centered at that point. Computing these three features pointwise over each of the chosen feature subimages yields, at every point of the texture image, a 24-dimensional texture feature vector F_i^m(x, y), i = 1, 2, …, 8, m = 1, 2, 3, where F_i^m(x, y) denotes the m-th feature extracted at point (x, y) in the i-th feature subimage.
(2b) Extract the edge feature information of said enhanced image with the Sobel operator, computing the image edge feature E[I(x, y)].
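Steps (2a) and (2b) can be sketched as follows. For brevity this sketch substitutes a single-level 2D Haar split for the two-level wavelet packet decomposition and computes only the energy and mean-deviation statistics (the window-entropy feature is omitted); all function names are illustrative.

```python
import numpy as np
from scipy.ndimage import uniform_filter, sobel

def haar_subbands(img):
    """One-level 2D Haar decomposition into LL, LH, HL, HH subbands
    (a simplified stand-in for the two-level wavelet packet split)."""
    f = img.astype(float)
    a = f[0::2, 0::2]; b = f[0::2, 1::2]
    c = f[1::2, 0::2]; d = f[1::2, 1::2]
    LL = (a + b + c + d) / 2.0
    LH = (a - b + c - d) / 2.0
    HL = (a + b - c - d) / 2.0
    HH = (a - b - c + d) / 2.0
    return LL, LH, HL, HH

def texture_features(sub, win=5):
    """Pointwise energy and mean-deviation of one subband over a
    win x win window (entropy omitted in this sketch)."""
    energy = uniform_filter(sub ** 2, size=win)
    local_mean = uniform_filter(sub, size=win)
    mean_dev = uniform_filter(np.abs(sub - local_mean), size=win)
    return energy, mean_dev

def edge_feature(img):
    """Sobel gradient magnitude, serving as E[I(x, y)]."""
    f = img.astype(float)
    gx, gy = sobel(f, axis=1), sobel(f, axis=0)
    return np.hypot(gx, gy)
```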
Step 103: construct said texture features and edge features into a homogeneity feature, and map said enhanced image into the homogeneity space according to said homogeneity feature to obtain a feature image.
The texture features and edge features are first normalized respectively.
The homogeneity feature of the image at (x, y) is then constructed as
Y(x, y) = { F*_1(x, y), F*_2(x, y), …, F*_24(x, y), E*[I(x, y)] }
The present embodiment jointly considers the texture information and edge information of the image, choosing both to construct the image homogeneity and obtaining the feature image by mapping. It fully exploits the difference in homogeneity between the text and non-text regions of the image to distinguish them, thereby suppressing non-text information and emphasizing text features.
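Constructing the homogeneity feature of step 103 amounts to normalizing each of the 24 texture maps and the edge map and stacking them per pixel into Y(x, y). A minimal sketch (function names assumed):

```python
import numpy as np

def normalize01(f):
    """Scale one feature map to [0, 1] (a constant map becomes all zeros)."""
    f = f.astype(float)
    span = f.max() - f.min()
    return (f - f.min()) / span if span > 0 else np.zeros_like(f)

def homogeneity_map(texture_maps, edge_map):
    """Stack normalized texture maps F*_1..F*_n and the normalized edge
    map E*[I] into the per-pixel homogeneity feature vector Y(x, y)."""
    feats = [normalize01(f) for f in texture_maps]
    feats.append(normalize01(edge_map))
    return np.stack(feats, axis=-1)    # shape (H, W, n + 1)
```

In the full pipeline the list would hold the 24 wavelet-packet texture maps, giving a 25-dimensional vector per pixel.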
Step 104: extract the text region from said feature image with the text region detector.
The feature image of the image under test is input to the text region detector for detection, which yields said text region.
Said text region detector is obtained by training on sample images in said homogeneity space. The training steps are as follows:
(4a) Let (s_1, z_1), …, (s_n, z_n) denote the text and non-text data: if z_i is 1, then s_i is a text sample; if z_i is 0, then s_i is a non-text sample;
(4b) Apply the processes of steps 101–103 to each text and non-text sample image s_i to obtain the feature image of the sample image in the homogeneity space;
(4c) Assemble the homogeneity features of all pixels in the sample image into a feature vector Q;
(4d) Construct a weak classifier c_j for each feature in the feature vector Q;
(4e) Select the T best-performing weak classifiers by the AdaBoost method, and determine the weight τ_i of each weak classifier, which is inversely related to the error rate of the weak classifier c_i. According to the weights, the T feature weak classifiers are integrated into one strong classifier C, yielding said text region detector. If the result of C(s) is 1, the classifier C judges s to be text; otherwise it judges s to be non-text.
By combining the features of text regions with AdaBoost classification training, the present embodiment obtains a text region detector with finer discrimination ability that better matches the feature description of text; moreover, the computation load is greatly reduced, improving detection efficiency.
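Steps (4a)–(4e) can be sketched with threshold stumps as the per-feature weak classifiers. The patent states only that the weight τ_i is inversely related to the error rate of c_i; the standard AdaBoost log-odds weight used below is one concrete choice, and the exhaustive stump search is an illustrative simplification.

```python
import numpy as np

def train_adaboost(X, z, T=10):
    """Discrete AdaBoost over per-feature threshold stumps.
    X: (n, d) feature vectors; z: labels, 1 = text, 0 = non-text."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)                     # sample weights
    y = np.where(z == 1, 1.0, -1.0)
    stumps = []
    for _ in range(T):
        best = None
        for j in range(d):                      # one weak classifier per feature
            for thr in np.unique(X[:, j]):
                for sign in (1.0, -1.0):
                    pred = sign * np.where(X[:, j] > thr, 1.0, -1.0)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, sign)
        err, j, thr, sign = best
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)   # weight falls as error rises
        pred = sign * np.where(X[:, j] > thr, 1.0, -1.0)
        w *= np.exp(-alpha * y * pred)          # re-weight misclassified samples
        w /= w.sum()
        stumps.append((alpha, j, thr, sign))
    return stumps

def strong_classify(stumps, X):
    """C(s): weighted vote of the selected weak classifiers; 1 = text."""
    score = np.zeros(len(X))
    for alpha, j, thr, sign in stumps:
        score += alpha * sign * np.where(X[:, j] > thr, 1.0, -1.0)
    return (score > 0).astype(int)
```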
Step 105: extract and recognize characters from said text region of said enhanced image.
In the extracted text region, the complexity of the character background makes it difficult to separate the characters from the background with a simple threshold method. The background of an individual character, however, is relatively uniform; therefore the character region is first decomposed into character images, and binarization is then performed on each character image. The concrete steps are as follows:
(4a) Extract edge information of said text region with the Canny operator;
(4b) Apply a threshold to the horizontal projection of said edge information to separate text-line images;
(4c) Apply a vertical projection to the edges of said text-line images to separate character images;
(4d) Construct the gray-level histogram of said character images;
(4e) Determine the optimal segmentation threshold of said histogram with the Otsu method, binarize said character images according to said optimal segmentation threshold, and remove residual background by connected-component analysis to obtain background-free character images;
(4f) Recognize the background-free character images, which specifically comprises:
(4f1) magnifying the background-free character images with a B-spline interpolation function to improve their resolution;
(4f2) performing character recognition on the magnified character images with OCR software.
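The projection-based segmentation and Otsu thresholding of steps (4a)–(4e) can be sketched as below. Canny extraction, connected-component cleanup, and OCR are left out; a simple gray mask stands in for the edge map, and all names are assumptions.

```python
import numpy as np

def otsu_threshold(img, L=256):
    """Otsu's optimal threshold: maximize between-class variance."""
    hist = np.bincount(img.ravel(), minlength=L).astype(float)
    p = hist / hist.sum()
    mu_total = np.dot(np.arange(L), p)
    best_t, best_var = 0, -1.0
    w0, mu0 = 0.0, 0.0
    for t in range(L - 1):
        w0 += p[t]
        mu0 += t * p[t]
        w1 = 1.0 - w0
        if w0 == 0 or w1 == 0:
            continue
        var_between = (mu_total * w0 - mu0) ** 2 / (w0 * w1)
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

def split_by_projection(mask, axis):
    """Cut a binary mask into runs of consecutive non-empty rows
    (axis=1: horizontal projection) or columns (axis=0: vertical)."""
    on = mask.sum(axis=axis) > 0
    runs, start = [], None
    for i, v in enumerate(on):
        if v and start is None:
            start = i
        elif not v and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(on)))
    return runs
```

A text region would first be split into line images by `split_by_projection(..., axis=1)`, each line into character images by `axis=0`, and each character binarized at `otsu_threshold`.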
Referring to Fig. 3, the device for extracting text from an image according to an embodiment of the invention comprises an image enhancement unit, a feature extraction unit, a homogeneity mapping unit, a text region extraction unit, a character extraction and recognition unit, and a training unit, wherein:
the image enhancement unit is configured to perform nonlinear dynamic compression of the gray levels of the original image to obtain an enhanced image. Specifically, said image enhancement unit comprises:
an equalization transform subunit, configured to transform the gray levels of said original image with the equalization transform function and construct the equalized gray-level histogram, said equalization transform function being

k = INT[ (L−1) · Σ_{j=0}^{k*} P(j) ]

where INT[·] denotes rounding, P(j) is the gray-level histogram of the original image, k* = 0, 1, …, L−1 is a gray level of the original image, k = 0, 1, …, L−1 is the gray level after equalization, and L is the number of gray levels of the original image; and
a nonlinear dynamic compression subunit, configured to perform nonlinear dynamic compression of the gray levels in said histogram with the monotonically increasing S-shaped function, whose parameter α is the gray level corresponding to the first peak of the equalized gray-level histogram, γ the gray level corresponding to the last peak, and β a parameter value determined by the maximum-information-entropy criterion.
The feature extraction unit is configured to extract the texture features and edge features of said enhanced image. Specifically, said feature extraction unit comprises:
a texture feature extraction subunit, configured to extract the texture features of said enhanced image by wavelet packet decomposition; and
an edge feature extraction subunit, configured to extract the edge features of said enhanced image with the Sobel operator.
The homogeneity mapping unit is configured to construct said texture features and edge features into a homogeneity feature, and to map said enhanced image into the homogeneity space according to said homogeneity feature to obtain a feature image. Said homogeneity mapping unit includes a homogeneity feature construction subunit (not shown), configured to construct said homogeneity feature by normalizing said texture features and edge features respectively.
The text region extraction unit is configured to extract the text region from said feature image with the text region detector.
The training unit is configured to obtain said text region detector by training on sample images in said homogeneity space. Specifically, said training unit trains the text region detector as follows:
Sample image is carried out the Nonlinear Dynamic compression of gray level, obtain the enhancing image of described sample image;
Extract the enhancing image texture features and the edge feature of described sample image;
The enhancing image texture features and the edge feature of described sample image are configured to the homogeney feature, and the homogeney feature of all pixels of the enhancing image of described sample image is formed an eigenvector;
Be respectively the Weak Classifier of each feature construction in the described eigenvector;
Utilize the AdaBoost method from all Weak Classifiers, to select a plurality of Weak Classifiers, and determine the weights of each Weak Classifier according to the principle of performance complement;
According to described weights described a plurality of Weak Classifiers are integrated into a strong classifier, obtain described text filed detecting device.
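The training steps above can be sketched with standard discrete AdaBoost over one-feature threshold stumps (one weak classifier per feature). Realizing the patent's "performance complementarity" selection as ordinary weighted-error minimization is an assumption:

```python
import numpy as np

def train_adaboost_stumps(X, y, rounds=10):
    """Discrete AdaBoost over single-feature threshold stumps.
    X: (n_samples, n_features), y: labels in {-1, +1}.
    Returns a list of (feature, threshold, polarity, alpha) tuples."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)  # uniform sample weights
    ensemble = []
    for _ in range(rounds):
        best = None
        for f in range(d):
            for thr in np.unique(X[:, f]):
                for pol in (1, -1):
                    pred = np.where(pol * (X[:, f] - thr) >= 0, 1, -1)
                    err = w[pred != y].sum()  # weighted error
                    if best is None or err < best[0]:
                        best = (err, f, thr, pol, pred)
        err, f, thr, pol, pred = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * np.log((1 - err) / err)  # classifier weight
        w *= np.exp(-alpha * y * pred)         # re-weight samples
        w /= w.sum()
        ensemble.append((f, thr, pol, alpha))
    return ensemble

def strong_classify(ensemble, X):
    """Weighted vote of the selected weak classifiers (the strong classifier)."""
    score = np.zeros(len(X))
    for f, thr, pol, alpha in ensemble:
        score += alpha * np.where(pol * (X[:, f] - thr) >= 0, 1, -1)
    return np.where(score >= 0, 1, -1)
```

In the patent's setting, each row of `X` would be the homogeneity feature vector of a pixel from a sample image, labeled text (+1) or non-text (−1).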
The character extraction and recognition unit is configured to extract and recognize characters from the text regions of the enhanced image. Specifically, the character extraction and recognition unit comprises:
an edge information extraction subunit, configured to extract edge information of the text regions using the Canny operator;
a horizontal projection subunit, configured to perform horizontal projection on the edge information and then apply a threshold to separate text line images;
a vertical projection subunit, configured to perform vertical projection on the edges of the text line images to separate character images;
a histogram construction subunit, configured to construct the gray-level histogram of the character images;
a binarization subunit, configured to determine the optimal segmentation threshold of the gray-level histogram of the character image using the Otsu method, binarize the character image according to the optimal segmentation threshold, and remove residual background using a connected component analysis method to obtain a background-removed character image;
a character recognition subunit, configured to recognize the background-removed character image, specifically by: magnifying the background-removed character image using a B-spline interpolation function, and then performing character recognition on the magnified character image using OCR software.
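Two of the subunits above are easy to make concrete in plain NumPy: the Otsu optimal threshold for the binarization subunit, and a projection-profile splitter for the horizontal projection subunit:

```python
import numpy as np

def otsu_threshold(img, L=256):
    """Otsu's method: choose the threshold that maximizes the between-class
    variance of the gray-level histogram."""
    hist = np.bincount(img.ravel(), minlength=L).astype(float)
    p = hist / hist.sum()
    mu_total = np.dot(np.arange(L), p)
    best_t, best_var = 0, -1.0
    w0, mu0 = 0.0, 0.0
    for t in range(L - 1):
        w0 += p[t]        # class-0 probability mass
        mu0 += t * p[t]   # class-0 (unnormalized) mean
        w1 = 1.0 - w0
        if w0 == 0 or w1 == 0:
            continue
        var_between = (mu_total * w0 - mu0) ** 2 / (w0 * w1)
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def split_rows(edge_map, thresh=0):
    """Horizontal projection: rows whose edge-pixel count exceeds `thresh`
    belong to text lines; returns (start, stop) row intervals."""
    proj = (edge_map > 0).sum(axis=1)
    on = proj > thresh
    bounds = np.flatnonzero(np.diff(np.r_[0, on.astype(int), 0]))
    return list(zip(bounds[::2], bounds[1::2]))
```

The same `split_rows` logic applied to the transposed text line edge map gives the vertical projection that isolates individual characters.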
In summary, the embodiments of the invention improve image contrast by performing nonlinear dynamic compression of gray levels on the original image, making the features of text regions in the image more prominent. By extracting the texture and edge information of the image to construct a homogeneity feature and mapping the image into a homogeneity space, non-text information is suppressed while text information is highlighted. Furthermore, a feature classifier can be trained in the homogeneity space of the image to finely separate text regions from non-text regions. Therefore, the technical solution of the embodiments of the invention can significantly improve the accuracy and robustness of text detection.
Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solution of the invention. Those of ordinary skill in the art should understand that the technical solution of the invention may be modified or equivalently replaced without departing from its spirit and scope, and all such modifications shall be encompassed within the scope of the claims of the invention.
Claims (22)
1. A method for extracting text from an image, characterized in that the method comprises:
performing nonlinear dynamic compression of gray levels on an original image to obtain an enhanced image;
extracting texture features and edge features of the enhanced image;
combining the texture features and edge features into a homogeneity feature, and mapping the enhanced image into a homogeneity space according to the homogeneity feature to obtain a feature image;
extracting text regions from the feature image using a text region detector;
extracting and recognizing characters from the text regions of the enhanced image.
2. The method of claim 1, characterized in that performing nonlinear dynamic compression of gray levels on the original image comprises:
performing an equalization transform on the gray levels of the original image using an equalization transform function, and constructing the equalized gray-level histogram;
performing nonlinear dynamic compression on the gray levels of the gray-level histogram using a monotonically increasing S-type function.
3. The method of claim 2, characterized in that the equalization transform function is:
where P(j) is the gray-level histogram of the original image, k* is a gray level of the original image with k* = 0, 1, …, L−1, k is the gray level after the equalization transform with k = 0, 1, …, L−1, and L is the number of gray levels of the original image.
4. The method of claim 3, characterized in that the monotonically increasing S-type function is:
where α is the gray level corresponding to the first peak of the equalized gray-level histogram, γ is the gray level corresponding to the last peak of the equalized gray-level histogram, and β is a parameter value determined by the maximum information entropy criterion.
5. The method of claim 1, characterized in that:
the texture features of the enhanced image are extracted using wavelet packet decomposition.
6. The method of claim 1, characterized in that:
the edge features of the enhanced image are extracted using the Sobel operator.
7. The method of claim 1, characterized in that:
the homogeneity feature is constructed by normalizing the texture features and edge features respectively.
8. The method of claim 1, characterized in that the method further comprises:
obtaining the text region detector by training on sample images in the homogeneity space.
9. The method of claim 8, characterized in that obtaining the text region detector by training on sample images in the homogeneity space comprises:
performing nonlinear dynamic compression of gray levels on a sample image to obtain an enhanced image of the sample image;
extracting texture features and edge features of the enhanced image of the sample image;
combining the texture features and edge features of the enhanced image of the sample image into homogeneity features, and forming the homogeneity features of all pixels of the enhanced image of the sample image into a feature vector;
constructing a weak classifier for each feature in the feature vector;
selecting a plurality of weak classifiers from all the weak classifiers using the AdaBoost method, and determining the weight of each weak classifier according to the principle of performance complementarity;
integrating the plurality of weak classifiers into a strong classifier according to the weights, thereby obtaining the text region detector.
10. The method of claim 1, characterized in that extracting and recognizing characters from the text regions of the enhanced image comprises:
extracting edge information of the text regions using the Canny operator;
performing horizontal projection on the edge information and then applying a threshold to separate text line images;
performing vertical projection on the edges of the text line images to separate character images;
constructing the gray-level histogram of the character images;
determining the optimal segmentation threshold of the gray-level histogram of the character image using the Otsu method, binarizing the character image according to the optimal segmentation threshold, and removing residual background using a connected component analysis method to obtain a background-removed character image;
recognizing the background-removed character image.
11. The method of claim 10, wherein recognizing the background-removed character image comprises:
magnifying the background-removed character image using a B-spline interpolation function;
performing character recognition on the magnified character image using OCR software.
12. A device for extracting text from an image, characterized in that the device comprises:
an image enhancement unit, configured to perform nonlinear dynamic compression of gray levels on an original image to obtain an enhanced image;
a feature extraction unit, configured to extract texture features and edge features of the enhanced image;
a homogeneity mapping unit, configured to combine the texture features and edge features into a homogeneity feature, and to map the enhanced image into a homogeneity space according to the homogeneity feature to obtain a feature image;
a text region extraction unit, configured to extract text regions from the feature image using a text region detector;
a character extraction and recognition unit, configured to extract and recognize characters from the text regions of the enhanced image.
13. The device of claim 12, characterized in that the image enhancement unit comprises:
an equalization transform subunit, configured to perform an equalization transform on the gray levels of the original image using an equalization transform function, and to construct the equalized gray-level histogram;
a nonlinear dynamic compression subunit, configured to perform nonlinear dynamic compression on the gray levels of the gray-level histogram using a monotonically increasing S-type function.
14. The device of claim 13, characterized in that the equalization transform function is:
where P(j) is the gray-level histogram of the original image, k* is a gray level of the original image with k* = 0, 1, …, L−1, k is the gray level after the equalization transform with k = 0, 1, …, L−1, and L is the number of gray levels of the original image.
15. The device of claim 14, characterized in that the monotonically increasing S-type function is:
where α is the gray level corresponding to the first peak of the equalized gray-level histogram, γ is the gray level corresponding to the last peak of the equalized gray-level histogram, and β is a parameter value determined by the maximum information entropy criterion.
16. The device of claim 12, characterized in that the feature extraction unit comprises:
a texture feature extraction subunit, configured to extract the texture features of the enhanced image using wavelet packet decomposition.
17. The device of claim 12, characterized in that the feature extraction unit comprises:
an edge feature extraction subunit, configured to extract the edge features of the enhanced image using the Sobel operator.
18. The device of claim 12, characterized in that the homogeneity mapping unit comprises:
a homogeneity feature construction subunit, configured to construct the homogeneity feature by normalizing the texture features and edge features respectively.
19. The device of claim 12, characterized in that the device further comprises:
a training unit, configured to obtain the text region detector by training on sample images in the homogeneity space.
20. The device of claim 19, characterized in that the training unit is further configured to:
perform nonlinear dynamic compression of gray levels on a sample image to obtain an enhanced image of the sample image;
extract texture features and edge features of the enhanced image of the sample image;
combine the texture features and edge features of the enhanced image of the sample image into homogeneity features, and form the homogeneity features of all pixels of the enhanced image of the sample image into a feature vector;
construct a weak classifier for each feature in the feature vector;
select a plurality of weak classifiers from all the weak classifiers using the AdaBoost method, and determine the weight of each weak classifier according to the principle of performance complementarity;
integrate the plurality of weak classifiers into a strong classifier according to the weights, thereby obtaining the text region detector.
21. The device of claim 12, characterized in that the character extraction and recognition unit comprises:
an edge information extraction subunit, configured to extract edge information of the text regions using the Canny operator;
a horizontal projection subunit, configured to perform horizontal projection on the edge information and then apply a threshold to separate text line images;
a vertical projection subunit, configured to perform vertical projection on the edges of the text line images to separate character images;
a histogram construction subunit, configured to construct the gray-level histogram of the character images;
a binarization subunit, configured to determine the optimal segmentation threshold of the gray-level histogram of the character image using the Otsu method, binarize the character image according to the optimal segmentation threshold, and remove residual background using a connected component analysis method to obtain a background-removed character image;
a character recognition subunit, configured to recognize the background-removed character image.
22. The device of claim 21, wherein the character recognition subunit is further configured to:
magnify the background-removed character image using a B-spline interpolation function;
perform character recognition on the magnified character image using OCR software.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 200910241565 CN102081731B (en) | 2009-11-26 | 2009-11-26 | Method and device for extracting text from image |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 200910241565 CN102081731B (en) | 2009-11-26 | 2009-11-26 | Method and device for extracting text from image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102081731A true CN102081731A (en) | 2011-06-01 |
CN102081731B CN102081731B (en) | 2013-01-23 |
Family
ID=44087687
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN 200910241565 Active CN102081731B (en) | 2009-11-26 | 2009-11-26 | Method and device for extracting text from image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102081731B (en) |
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102411707A (en) * | 2011-10-31 | 2012-04-11 | 世纪龙信息网络有限责任公司 | Method and device for identifying text in picture |
CN102957963A (en) * | 2011-08-17 | 2013-03-06 | 浪潮乐金数字移动通信有限公司 | Method, device and mobile terminal for recognizing information |
CN103425973A (en) * | 2012-05-25 | 2013-12-04 | 夏普株式会社 | Method and apparatus for performing enhancement processing on text-containing image, and video display device |
CN103425974A (en) * | 2012-05-15 | 2013-12-04 | 富士施乐株式会社 | Appratus and method for processing images |
CN103620339A (en) * | 2011-06-30 | 2014-03-05 | 株式会社明电舍 | Trolley wire abrasion measuring apparatus using image process |
CN103632159A (en) * | 2012-08-23 | 2014-03-12 | 阿里巴巴集团控股有限公司 | Method and system for training classifier and detecting text area in image |
CN105868757A (en) * | 2016-03-25 | 2016-08-17 | 上海珍岛信息技术有限公司 | Character positioning method and device in image text |
CN106709490A (en) * | 2015-07-31 | 2017-05-24 | 腾讯科技(深圳)有限公司 | Character recognition method and device |
CN108038495A (en) * | 2017-12-04 | 2018-05-15 | 昆明理工大学 | A kind of incompleteness Chinese characters recognition method |
CN108765520A (en) * | 2018-05-18 | 2018-11-06 | 腾讯科技(深圳)有限公司 | Rendering intent and device, storage medium, the electronic device of text message |
CN109460768A (en) * | 2018-11-15 | 2019-03-12 | 东北大学 | A kind of text detection and minimizing technology for histopathology micro-image |
CN109919146A (en) * | 2019-02-02 | 2019-06-21 | 上海兑观信息科技技术有限公司 | Picture character recognition methods, device and platform |
CN111046736A (en) * | 2019-11-14 | 2020-04-21 | 贝壳技术有限公司 | Method, device and storage medium for extracting text information |
CN111368837A (en) * | 2018-12-25 | 2020-07-03 | 中移(杭州)信息技术有限公司 | Image quality evaluation method and device, electronic equipment and storage medium |
CN112837313A (en) * | 2021-03-05 | 2021-05-25 | 云南电网有限责任公司电力科学研究院 | Image segmentation method for foreign matters in power transmission line |
CN113159035A (en) * | 2021-05-10 | 2021-07-23 | 北京世纪好未来教育科技有限公司 | Image processing method, device, equipment and storage medium |
CN114359905A (en) * | 2022-01-06 | 2022-04-15 | 北京百度网讯科技有限公司 | Text recognition method and device, electronic equipment and storage medium |
CN114897773A (en) * | 2022-03-31 | 2022-08-12 | 海门王巢家具制造有限公司 | Distorted wood detection method and system based on image processing |
CN116682112A (en) * | 2023-07-28 | 2023-09-01 | 青岛虹竹生物科技有限公司 | Polysaccharide test data storage and digitizing method |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2843220B1 (en) * | 2002-07-31 | 2005-02-18 | Lyon Ecole Centrale | "METHOD AND SYSTEM FOR AUTOMATIC LOCATION OF TEXT AREAS IN AN IMAGE" |
CN101031035A (en) * | 2006-03-03 | 2007-09-05 | 广州市纽帝亚资讯科技有限公司 | Method for cutting news video unit automatically based on video sequence analysis |
- 2009-11-26: CN application 200910241565 filed; granted as patent CN102081731B (status: Active)
Cited By (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103620339A (en) * | 2011-06-30 | 2014-03-05 | 株式会社明电舍 | Trolley wire abrasion measuring apparatus using image process |
CN103620339B (en) * | 2011-06-30 | 2016-06-01 | 株式会社明电舍 | Trolley wire abrasion determinator based on image procossing |
CN102957963A (en) * | 2011-08-17 | 2013-03-06 | 浪潮乐金数字移动通信有限公司 | Method, device and mobile terminal for recognizing information |
CN102957963B (en) * | 2011-08-17 | 2017-11-07 | 浪潮乐金数字移动通信有限公司 | A kind of information identifying method, device and mobile terminal |
CN102411707A (en) * | 2011-10-31 | 2012-04-11 | 世纪龙信息网络有限责任公司 | Method and device for identifying text in picture |
CN103425974B (en) * | 2012-05-15 | 2017-09-15 | 富士施乐株式会社 | Image processing apparatus and image processing method |
CN103425974A (en) * | 2012-05-15 | 2013-12-04 | 富士施乐株式会社 | Appratus and method for processing images |
CN103425973B (en) * | 2012-05-25 | 2019-05-31 | 夏普株式会社 | The method, apparatus and video display apparatus of enhancing processing are carried out to the image containing text |
CN103425973A (en) * | 2012-05-25 | 2013-12-04 | 夏普株式会社 | Method and apparatus for performing enhancement processing on text-containing image, and video display device |
CN103632159A (en) * | 2012-08-23 | 2014-03-12 | 阿里巴巴集团控股有限公司 | Method and system for training classifier and detecting text area in image |
CN103632159B (en) * | 2012-08-23 | 2017-05-03 | 阿里巴巴集团控股有限公司 | Method and system for training classifier and detecting text area in image |
CN106709490A (en) * | 2015-07-31 | 2017-05-24 | 腾讯科技(深圳)有限公司 | Character recognition method and device |
CN105868757A (en) * | 2016-03-25 | 2016-08-17 | 上海珍岛信息技术有限公司 | Character positioning method and device in image text |
CN108038495A (en) * | 2017-12-04 | 2018-05-15 | 昆明理工大学 | A kind of incompleteness Chinese characters recognition method |
CN108038495B (en) * | 2017-12-04 | 2021-08-20 | 昆明理工大学 | Incomplete Chinese character recognition method |
CN108765520B (en) * | 2018-05-18 | 2020-07-28 | 腾讯科技(深圳)有限公司 | Text information rendering method and device, storage medium and electronic device |
CN108765520A (en) * | 2018-05-18 | 2018-11-06 | 腾讯科技(深圳)有限公司 | Rendering intent and device, storage medium, the electronic device of text message |
CN109460768A (en) * | 2018-11-15 | 2019-03-12 | 东北大学 | A kind of text detection and minimizing technology for histopathology micro-image |
CN109460768B (en) * | 2018-11-15 | 2021-09-21 | 东北大学 | Text detection and removal method for histopathology microscopic image |
CN111368837A (en) * | 2018-12-25 | 2020-07-03 | 中移(杭州)信息技术有限公司 | Image quality evaluation method and device, electronic equipment and storage medium |
CN111368837B (en) * | 2018-12-25 | 2023-12-05 | 中移(杭州)信息技术有限公司 | Image quality evaluation method and device, electronic equipment and storage medium |
CN109919146A (en) * | 2019-02-02 | 2019-06-21 | 上海兑观信息科技技术有限公司 | Picture character recognition methods, device and platform |
CN111046736A (en) * | 2019-11-14 | 2020-04-21 | 贝壳技术有限公司 | Method, device and storage medium for extracting text information |
CN112837313A (en) * | 2021-03-05 | 2021-05-25 | 云南电网有限责任公司电力科学研究院 | Image segmentation method for foreign matters in power transmission line |
CN113159035A (en) * | 2021-05-10 | 2021-07-23 | 北京世纪好未来教育科技有限公司 | Image processing method, device, equipment and storage medium |
CN114359905A (en) * | 2022-01-06 | 2022-04-15 | 北京百度网讯科技有限公司 | Text recognition method and device, electronic equipment and storage medium |
JP2022172292A (en) * | 2022-01-06 | 2022-11-15 | ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド | Text recognition method, device, electronic apparatus, storage medium and computer program |
JP7418517B2 (en) | 2022-01-06 | 2024-01-19 | ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド | Text recognition methods, devices, electronic devices, storage media and computer programs |
CN114897773A (en) * | 2022-03-31 | 2022-08-12 | 海门王巢家具制造有限公司 | Distorted wood detection method and system based on image processing |
CN114897773B (en) * | 2022-03-31 | 2024-01-05 | 上海途巽通讯科技有限公司 | Method and system for detecting distorted wood based on image processing |
CN116682112A (en) * | 2023-07-28 | 2023-09-01 | 青岛虹竹生物科技有限公司 | Polysaccharide test data storage and digitizing method |
CN116682112B (en) * | 2023-07-28 | 2023-10-17 | 青岛虹竹生物科技有限公司 | Polysaccharide test data storage and digitizing method |
Also Published As
Publication number | Publication date |
---|---|
CN102081731B (en) | 2013-01-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102081731B (en) | Method and device for extracting text from image | |
CN101122953B (en) | Picture words segmentation method | |
CN102208023B (en) | Method for recognizing and designing video captions based on edge information and distribution entropy | |
Bataineh et al. | An adaptive local binarization method for document images based on a novel thresholding method and dynamic windows | |
CN100527156C (en) | Picture words detecting method | |
Gllavata et al. | A robust algorithm for text detection in images | |
CN102663377B (en) | Character recognition method based on template matching | |
CN101453575B (en) | Video subtitle information extracting method | |
US8947736B2 (en) | Method for binarizing scanned document images containing gray or light colored text printed with halftone pattern | |
CN104200209B (en) | A kind of pictograph detection method | |
CN101593276B (en) | Video OCR image-text separation method and system | |
CN104361336A (en) | Character recognition method for underwater video images | |
CN102915438A (en) | Method and device for extracting video subtitles | |
CN103136523A (en) | Arbitrary direction text line detection method in natural image | |
CN101615252A (en) | A kind of method for extracting text information from adaptive images | |
CN105205488A (en) | Harris angular point and stroke width based text region detection method | |
CN103699895A (en) | Method for detecting and extracting text in video | |
CN103310211A (en) | Filling mark recognition method based on image processing | |
CN103729856B (en) | A kind of Fabric Defects Inspection detection method utilizing S-transformation signal extraction | |
US6532302B2 (en) | Multiple size reductions for image segmentation | |
Garlapati et al. | A system for handwritten and printed text classification | |
Forczmański et al. | Stamps detection and classification using simple features ensemble | |
CN104834891A (en) | Method and system for filtering Chinese character image type spam | |
Grover et al. | Text extraction from document images using edge information | |
Darma et al. | Segmentation of balinese script on lontar manuscripts using projection profile |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |