CN102081731A - Method and device for extracting text from image - Google Patents


Info

Publication number
CN102081731A
Authority
CN
China
Prior art keywords
image
feature
homogeneity
carried out
utilize
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2009102415658A
Other languages
Chinese (zh)
Other versions
CN102081731B (en)
Inventor
舒波
孔轶
陈东明
李英
黄昭文
李志锋
吕汉鑫
黄克书
林茂
陈涛
雷志勇
余士韬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Group Guangdong Co Ltd
Original Assignee
China Mobile Group Guangdong Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Group Guangdong Co Ltd filed Critical China Mobile Group Guangdong Co Ltd
Priority to CN200910241565 (granted as CN102081731B)
Publication of CN102081731A
Application granted
Publication of CN102081731B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention provides a method and device for extracting text from an image. The method comprises the following steps: applying nonlinear dynamic compression to the gray levels of an original image to obtain an enhanced image; extracting texture features and edge features of the enhanced image; combining the texture features and edge features into a homogeneity feature, and mapping the enhanced image into the homogeneity space according to that feature to obtain a feature image; extracting text regions from the feature image with a text-region detector; and extracting and recognizing characters from the text regions of the enhanced image. The invention effectively improves the accuracy and robustness of text detection.

Description

Method and apparatus for extracting text from an image
Technical field
The invention belongs to the technical fields of pattern recognition and computer vision, and in particular relates to a method and apparatus for extracting text from an image.
Background technology
With the development of multimedia information retrieval, the Internet and 3G streaming-media technology, images and video have become the mainstream carriers of multimedia information exchange and services. Text embedded in images and video is therefore increasingly important for representing and retrieving massive amounts of information, and the automatic detection and extraction of text in images is the first step towards text-based image retrieval and image-sensitivity screening.
Text regions in an image have features that clearly distinguish them from non-text regions: they contain rich edges and distinctive texture, they usually consist of one or more lines of characters arranged horizontally or vertically, and the characters have consistent color and strong contrast against the background. These features can be used to discriminate text regions from non-text regions. The main approach to detecting, extracting and recognizing text in images is to locate candidate text regions using such features and rules as prior knowledge, enhance the image quality of those regions, separate text from background by binarization, and finally extract and recognize the characters with OCR software.
Text-region extraction methods fall into two broad classes: region-based and texture-based. Region-based methods use the color and gray-level differences between text and background as features for text detection. They adopt a bottom-up strategy: the image is first divided into many sub-images, candidate text regions are then determined from the structure of those sub-images, and features such as character size, character aspect ratio and text-line projection are used for further filtering before the text regions are finally determined. These methods are insensitive to character size and font and run quickly.
Depending on how the sub-image structure is built, region-based methods can be further divided into connected-component methods and edge-detection methods. Connected-component methods assume that the characters in an image have consistent color; they use color clustering to determine candidate character regions and then filter those regions with heuristic rules. Edge-detection methods exploit the relatively high contrast between text and background: edges are detected first, connected into character regions with morphological operators, and finally filtered with heuristic rules.
Texture-based methods treat character regions as a special kind of texture and use the different texture characteristics of text and background regions to detect, extract and recognize text. Typically a pixel window is defined and slid across the image with a fixed step, the texture of each small region is examined, and a trained classifier decides whether the current region is text. All text sub-regions are finally merged into candidate text regions, on which character extraction and recognition are performed. Under complex backgrounds, texture-based methods are usually more robust than connected-component methods and generalize better.
In the method for prior art, based on the method in zone when image background complexity or picture quality are relatively poor, very difficult extraction is connected domain accurately, the formulation of the heuristic rule that is adopted when the screening of character area depends on priori in addition, and these prioris generally are difficult to obtain exactly, and it is rigidity that a lot of threshold values are established a capital really, causes the robustness of algorithm poor.
Texture-based methods generalize better, but their computation is complex and costly, and they are relatively sensitive to character size and font, which lowers the generality of the classifier and the localization accuracy of the character regions. These methods also run into difficulty when the image background contains periodic textures similar to characters.
Summary of the invention
The technical problem to be solved by the invention is to provide a method and apparatus for extracting text from an image, so as to improve the accuracy and robustness of text detection.
To solve the above technical problem, the invention provides the following technical solutions:
A method for extracting text from an image comprises:
applying nonlinear dynamic compression to the gray levels of an original image to obtain an enhanced image;
extracting texture features and edge features of the enhanced image;
combining the texture features and edge features into a homogeneity feature, and mapping the enhanced image into the homogeneity space according to that feature to obtain a feature image;
extracting text regions from the feature image with a text-region detector;
extracting and recognizing characters from the text regions of the enhanced image.
In the above method, the nonlinear dynamic compression of the gray levels of the original image comprises:
transforming the gray levels of the original image with an equalization transform function and constructing the equalized gray-level histogram;
applying nonlinear dynamic compression to the gray levels of that histogram with a monotonically increasing S-type function.
In the above method, the equalization transform function is:

$$k = \mathrm{INT}\Big[(L-1)\sum_{j=0}^{k^*} P(j) + 0.5\Big]$$

where P(j) is the normalized gray-level histogram of the original image, k* = 0, 1, …, L−1 is a gray level of the original image, k = 0, 1, …, L−1 is the corresponding equalized gray level, and L is the number of gray levels of the original image.
In the above method, the monotonically increasing S-type function is:

$$S(k;\alpha,\beta,\gamma)=\begin{cases}0, & k\le\alpha\\ 2\left(\frac{k-\alpha}{\gamma-\alpha}\right)^{2}, & \alpha<k\le\beta\\ 1-2\left(\frac{k-\gamma}{\gamma-\alpha}\right)^{2}, & \beta<k<\gamma\\ 1, & k\ge\gamma\end{cases}$$

where α is the gray level of the first peak of the equalized gray-level histogram, γ is the gray level of its last peak, and β is a parameter value determined by the maximum-information-entropy criterion.
In the above method, wavelet packet decomposition is used to extract the texture features of the enhanced image.
In the above method, the Sobel operator is used to extract the edge features of the enhanced image.
In the above method, the homogeneity feature is constructed by normalizing the texture features and the edge features separately.
The above method further comprises:
obtaining the text-region detector by training on sample images in the homogeneity space.
In the above method, obtaining the text-region detector by training on sample images in the homogeneity space comprises:
applying nonlinear dynamic compression to the gray levels of a sample image to obtain an enhanced sample image;
extracting texture features and edge features of the enhanced sample image;
combining those texture features and edge features into a homogeneity feature, and assembling the homogeneity features of all pixels of the enhanced sample image into a feature vector;
constructing a weak classifier for each feature in the feature vector;
selecting a plurality of weak classifiers from all the weak classifiers with the AdaBoost method, and determining the weight of each weak classifier on the principle of performance complementarity;
combining the selected weak classifiers into one strong classifier according to those weights to obtain the text-region detector.
In the above method, extracting and recognizing characters from the text regions of the enhanced image comprises:
extracting the edge information of the text regions with the Canny operator;
projecting the edge information horizontally and thresholding the projection to separate out text-line images;
projecting the edges of each text-line image vertically to separate out character images;
constructing the gray-level histogram of each character image;
determining the optimal segmentation threshold of that histogram with the Otsu method, binarizing the character image with the optimal segmentation threshold, and removing residual background with connected-component analysis to obtain a background-removed character image;
recognizing the background-removed character image.
In the above method, recognizing the background-removed character image comprises:
enlarging the background-removed character image with a B-spline interpolation function;
performing character recognition on the enlarged character image with OCR software.
A device for extracting text from an image comprises:
an image enhancement unit for applying nonlinear dynamic compression to the gray levels of an original image to obtain an enhanced image;
a feature extraction unit for extracting texture features and edge features of the enhanced image;
a homogeneity mapping unit for combining the texture features and edge features into a homogeneity feature and mapping the enhanced image into the homogeneity space according to that feature to obtain a feature image;
a text-region extraction unit for extracting text regions from the feature image with a text-region detector;
a character extraction and recognition unit for extracting and recognizing characters from the text regions of the enhanced image.
In the above device, the image enhancement unit comprises:
an equalization transform subunit for transforming the gray levels of the original image with an equalization transform function and constructing the equalized gray-level histogram;
a nonlinear dynamic compression subunit for applying nonlinear dynamic compression to the gray levels of that histogram with a monotonically increasing S-type function.
In the above device, the equalization transform function is:

$$k = \mathrm{INT}\Big[(L-1)\sum_{j=0}^{k^*} P(j) + 0.5\Big]$$

where P(j) is the normalized gray-level histogram of the original image, k* = 0, 1, …, L−1 is a gray level of the original image, k = 0, 1, …, L−1 is the corresponding equalized gray level, and L is the number of gray levels of the original image.
In the above device, the monotonically increasing S-type function is:

$$S(k;\alpha,\beta,\gamma)=\begin{cases}0, & k\le\alpha\\ 2\left(\frac{k-\alpha}{\gamma-\alpha}\right)^{2}, & \alpha<k\le\beta\\ 1-2\left(\frac{k-\gamma}{\gamma-\alpha}\right)^{2}, & \beta<k<\gamma\\ 1, & k\ge\gamma\end{cases}$$

where α is the gray level of the first peak of the equalized gray-level histogram, γ is the gray level of its last peak, and β is a parameter value determined by the maximum-information-entropy criterion.
In the above device, the feature extraction unit comprises:
a texture feature extraction subunit for extracting the texture features of the enhanced image by wavelet packet decomposition;
an edge feature extraction subunit for extracting the edge features of the enhanced image with the Sobel operator.
In the above device, the homogeneity mapping unit comprises:
a homogeneity feature construction subunit for constructing the homogeneity feature by normalizing the texture features and the edge features separately.
The above device further comprises:
a training unit for obtaining the text-region detector by training on sample images in the homogeneity space.
In the above device, the training unit is further used to:
apply nonlinear dynamic compression to the gray levels of a sample image to obtain an enhanced sample image;
extract texture features and edge features of the enhanced sample image;
combine those texture features and edge features into a homogeneity feature, and assemble the homogeneity features of all pixels of the enhanced sample image into a feature vector;
construct a weak classifier for each feature in the feature vector;
select a plurality of weak classifiers from all the weak classifiers with the AdaBoost method, and determine their weights from the performance of each weak classifier on the principle of performance complementarity;
combine the selected weak classifiers into one strong classifier according to those weights to obtain the text-region detector.
In the above device, the character extraction and recognition unit comprises:
an edge information extraction subunit for extracting the edge information of the text regions with the Canny operator;
a horizontal projection subunit for projecting the edge information horizontally and thresholding the projection to separate out text-line images;
a vertical projection subunit for projecting the edges of each text-line image vertically to separate out character images;
a histogram construction subunit for constructing the gray-level histogram of each character image;
a binarization subunit for determining the optimal segmentation threshold of that histogram with the Otsu method, binarizing the character image with the optimal segmentation threshold, and removing residual background with connected-component analysis to obtain a background-removed character image;
a character recognition subunit for recognizing the background-removed character image.
In the above device, the character recognition subunit is further used to:
enlarge the background-removed character image with a B-spline interpolation function;
perform character recognition on the enlarged character image with OCR software.
By applying nonlinear dynamic compression to the gray levels of the original image, the embodiments of the invention improve the contrast of the image and make the features of its text regions more prominent. By constructing a homogeneity feature from the texture and edge information of the image and mapping the image into the homogeneity space, non-text information is suppressed while text information is highlighted. Furthermore, a feature classifier can be trained in the homogeneity space of the image to finely separate text regions from non-text regions. The technical solutions of the embodiments can therefore significantly improve the accuracy and robustness of text detection.
Description of drawings
Fig. 1 is a flowchart of the method for extracting text from an image according to an embodiment of the invention;
Fig. 2 is a schematic diagram of the monotonically increasing S-type function in an embodiment of the invention;
Fig. 3 is a structural diagram of the device for extracting text from an image according to an embodiment of the invention.
Embodiment
To make the purpose, technical solutions and advantages of the invention clearer, the invention is described below with reference to the accompanying drawings and specific embodiments.
With reference to Fig. 1, the method for detecting text in an image according to an embodiment of the invention mainly comprises the following steps:
Step 101: apply nonlinear dynamic compression to the gray levels of the original image to obtain an enhanced image.
During generation, transmission or conversion an image is subject to various influences that degrade its quality. Applying nonlinear dynamic compression to the gray levels of the original image fully accentuates attributes such as edges and texture, suppresses useless information, improves the usability of the image, and provides more distinct features for the subsequent text-region extraction. The concrete steps are as follows:
(1a) Adjust the gray-level histogram of the original image by introducing the equalization transform function

$$k = \mathrm{INT}\Big[(L-1)\sum_{j=0}^{k^*} P(j) + 0.5\Big]$$

to adjust the gray-level range of the original image and increase its clarity, where INT[·] is the rounding operator, P(j) is the normalized gray-level histogram of the original image, k* = 0, 1, …, L−1 is a gray level of the original image, k = 0, 1, …, L−1 is the corresponding equalized gray level, and L is the number of gray levels of the original image.
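The equalization transform of step (1a) can be sketched in Python (a minimal NumPy-only illustration; the function name and the 8-bit test values are assumptions for illustration, not part of the patent):

```python
import numpy as np

def equalize_gray_levels(img, L=256):
    """Map each gray level k* to k = INT[(L-1) * sum_{j<=k*} P(j) + 0.5],
    following the equalization transform above, and return the equalized
    image together with its gray-level histogram."""
    hist = np.bincount(img.ravel(), minlength=L).astype(np.float64)
    P = hist / hist.sum()                     # normalized histogram P(j)
    cdf = np.cumsum(P)                        # sum_{j=0}^{k*} P(j)
    mapping = np.floor((L - 1) * cdf + 0.5).astype(np.uint8)  # INT[...]
    eq = mapping[img]                         # apply the k* -> k lookup
    eq_hist = np.bincount(eq.ravel(), minlength=L)
    return eq, eq_hist
```

The lookup-table form makes the transform O(L) to build and O(1) per pixel to apply.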
(1b) Process the equalized image by choosing a monotonically increasing S-type function as the nonlinear dynamic compression transform, as shown in Fig. 2, of the form:

$$S(k;\alpha,\beta,\gamma)=\begin{cases}0, & k\le\alpha\\ 2\left(\frac{k-\alpha}{\gamma-\alpha}\right)^{2}, & \alpha<k\le\beta\\ 1-2\left(\frac{k-\gamma}{\gamma-\alpha}\right)^{2}, & \beta<k<\gamma\\ 1, & k\ge\gamma\end{cases}$$
The parameters of the monotonically increasing S-type function are determined as follows:
(1b1) Set α to the gray level of the first peak of the equalized gray-level histogram and γ to the gray level of its last peak; clearly α < γ.
(1b2) Determine β by the maximum-information-entropy criterion. Let P(k) be the gray-level histogram of the original image after equalization, defined by P(k) = n_k / n, k = 0, 1, …, L−1, so that P(k) is the statistical probability of gray level k, where n_k is the number of pixels with gray value k and n is the total number of pixels. Choose a gray threshold v that divides the image into two regions. The region with gray values 0 to v has entropy

$$H_1(v) = -\sum_{k=0}^{v}\frac{P(k)}{P(v)}\ln\frac{P(k)}{P(v)}$$

and the region with gray values v+1 to L−1 has entropy

$$H_2(v) = -\sum_{k=v+1}^{L-1}\frac{P(k)}{1-P(v)}\ln\frac{P(k)}{1-P(v)}$$

where $P(v)=\sum_{k=0}^{v}P(k)$, giving the total entropy H(v) = H_1(v) + H_2(v). Information theory tells us that two classes of data are best separated when the entropy is maximal, so the v that maximizes the total entropy H(v) is the optimal threshold, i.e. $v^{*}=\arg\max_{v}\,[H_1(v)+H_2(v)]$. Setting β = v* completes the setting of the parameter β.
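Steps (1b1)–(1b2) can be sketched as follows (a simplified NumPy-only illustration; the function names are assumptions, and the peak detection for α and γ is left to the caller since the patent does not specify it):

```python
import numpy as np

def max_entropy_threshold(P):
    """Return v* = argmax_v H1(v) + H2(v) over the normalized
    histogram P(k), as in the maximum-entropy criterion above."""
    L = len(P)
    best_v, best_H = 0, -np.inf
    for v in range(L - 1):
        Pv = P[:v + 1].sum()
        if Pv <= 0 or Pv >= 1:            # degenerate split, skip
            continue
        p1 = P[:v + 1] / Pv               # P(k)/P(v) for the low region
        p2 = P[v + 1:] / (1 - Pv)         # P(k)/(1-P(v)) for the high region
        H = -(p1[p1 > 0] * np.log(p1[p1 > 0])).sum() \
            - (p2[p2 > 0] * np.log(p2[p2 > 0])).sum()
        if H > best_H:
            best_v, best_H = v, H
    return best_v

def s_compress(k, alpha, beta, gamma):
    """Piecewise monotonically increasing S-type function S(k; a, b, g)."""
    k = np.asarray(k, dtype=np.float64)
    out = np.zeros_like(k)
    mid1 = (k > alpha) & (k <= beta)
    mid2 = (k > beta) & (k < gamma)
    out[mid1] = 2 * ((k[mid1] - alpha) / (gamma - alpha)) ** 2
    out[mid2] = 1 - 2 * ((k[mid2] - gamma) / (gamma - alpha)) ** 2
    out[k >= gamma] = 1.0
    return out
```

In practice one would compute α and γ from the histogram peaks, set β = `max_entropy_threshold(P)`, and apply `s_compress` to every equalized gray level.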
By using a monotonically increasing S-type function as the nonlinear dynamic compression transform, the present embodiment fully accentuates attributes of the image such as edges and texture, and greatly improves the performance of the image enhancement and segmentation algorithms.
Step 102: extract texture features and edge features of the enhanced image.
Homogeneity is closely related to the local information of an image: it reflects the degree of uniformity of an image region and plays an important role in classifying image information. The texture and edge information of the image are chosen to compose its homogeneity feature. Compared with the non-text regions of an image, text regions have richer texture and edge information and poorer uniformity, hence lower homogeneity, and this property can be used to detect them. In addition, because wavelet packet decomposition can keep decomposing the high-frequency parts of a signal, it can be used to extract the texture information of the image. The texture and edge information are extracted as follows:
(2a) Extract the texture features of the enhanced image with wavelet packet decomposition.
(2a1) A given function W_0(x) generates a group of orthogonal wavelet bases:

$$W_{2m}(x)=\sqrt{2}\sum_{k}h(k)\,W_{m}(2x-k)$$
$$W_{2m+1}(x)=\sqrt{2}\sum_{k}g(k)\,W_{m}(2x-k)$$

where W_{2m}(x) denotes the scaling function, W_{2m+1}(x) denotes the wavelet function, and h(k) and g(k) are the filter coefficients of the orthogonal wavelet. The wavelet packet basis is then W_m(2^n x − k), where n is the scale parameter, k is the translation parameter, m is the oscillation parameter, n, k ∈ Z and m ∈ N.
(2a2) Taking the inner product of two one-dimensional wavelet packet bases along the horizontal or vertical direction gives the two-dimensional filters:

h_LL(k,n) = h(k)·h(n),  h_LH(k,n) = h(k)·g(n)
h_HL(k,n) = g(k)·h(n),  h_HH(k,n) = g(k)·g(n)

where the first and second subscripts of a filter indicate whether the high-pass (H) or low-pass (L) filter is taken in the x and y directions respectively.
(2a3) Apply a two-level wavelet packet decomposition to the enhanced image with these two-dimensional filters, upsampling the impulse responses of the filters, to obtain a number of translation-invariant sub-images f(x, y) containing the mid-frequency (texture) information; among them, choose a predetermined number (for example, 8) of sub-images with the largest variance.
(2a4) Describe the texture features with first-order gray-level statistics: for each sub-image, extract an energy feature, an entropy feature and a mean-deviation feature, defined respectively as:

$$F_1(x,y)=\frac{1}{255^{2}}\sum_{i=x-n}^{x+n}\sum_{j=y-n}^{y+n}f(i,j)^{2}\,p[f(i,j)]$$

$$F_2(x,y)=-\frac{1}{\log 255}\sum_{i=x-n}^{x+n}\sum_{j=y-n}^{y+n}p[f(i,j)]\log\{p[f(i,j)]\}$$

$$F_3(x,y)=\frac{1}{(2n+1)^{2}}\sum_{i=x-n}^{x+n}\sum_{j=y-n}^{y+n}\left|f(i,j)-\bar{f}(i,j)\right|$$

where p[f(i,j)] is the probability of the gray level at point (i, j) within the sub-image, and $\bar{f}(i,j)$ is the mean gray level of all pixels in the operation window centred at point (i, j). Computing these three features pointwise over each of the chosen feature sub-images yields, for every point of the texture image, a 24-dimensional texture feature vector $F_i^{m}(x,y)$, i = 1, 2, …, 8, m = 1, 2, 3, where $F_i^{m}(x,y)$ denotes the m-th feature extracted at point (x, y) in the i-th feature sub-image.
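The three first-order statistics of step (2a4) can be sketched for a single sub-image and window as follows (a minimal NumPy illustration; the function name is an assumption, and estimating p[f(i,j)] from the whole sub-image's histogram is one reasonable reading of the definition):

```python
import numpy as np

def window_features(f, x, y, n=2, levels=256):
    """Energy F1, entropy F2 and mean-deviation F3 of the (2n+1)^2
    window centred at (x, y) in sub-image f, following the first-order
    gray-level statistics above."""
    # probability of each gray level, estimated over the whole sub-image
    hist = np.bincount(f.ravel(), minlength=levels) / f.size
    win = f[x - n:x + n + 1, y - n:y + n + 1].astype(np.float64)
    p = hist[win.astype(int)]                       # p[f(i, j)] per pixel
    F1 = (win ** 2 * p).sum() / 255.0 ** 2          # energy
    pos = p[p > 0]
    F2 = -(pos * np.log(pos)).sum() / np.log(255.0) # entropy
    F3 = np.abs(win - win.mean()).sum() / (2 * n + 1) ** 2  # mean deviation
    return F1, F2, F3
```

Evaluating the three statistics at every pixel of each of the 8 chosen sub-images gives the 24-dimensional texture vector per point.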
(2b) Extract the edge feature information of the enhanced image with the Sobel operator, computing the edge feature E[I(x, y)].
Step 103: combine the texture features and edge features into a homogeneity feature, and map the enhanced image into the homogeneity space according to that feature to obtain a feature image.
Normalize the texture features and edge features separately:

$$F_t^{*}(x,y)=\frac{F_t(x,y)-F_t^{\min}(x,y)}{F_t^{\max}(x,y)-F_t^{\min}(x,y)},\qquad E^{*}[I(x,y)]=\frac{E[I(x,y)]-E_{\min}[I(x,y)]}{E_{\max}[I(x,y)]-E_{\min}[I(x,y)]}$$

Then construct the homogeneity feature of the image at (x, y):

$$Y(x,y)=\{F_1^{*}(x,y),F_2^{*}(x,y),\ldots,F_{24}^{*}(x,y),E^{*}[I(x,y)]\}$$
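The normalization and the per-pixel feature vector Y(x, y) can be sketched as (a minimal NumPy illustration; the function names and array layout are assumptions):

```python
import numpy as np

def minmax(a):
    """Min-max normalize a feature map to [0, 1]."""
    return (a - a.min()) / (a.max() - a.min())

def homogeneity_vector(texture_maps, edge_map):
    """Stack the 24 normalized texture maps and the normalized edge
    map into Y(x, y): one 25-dimensional vector per pixel."""
    maps = [minmax(t) for t in texture_maps] + [minmax(edge_map)]
    return np.stack(maps, axis=-1)   # shape (H, W, 25)
```

The resulting (H, W, 25) array is the feature image of the enhanced image in the homogeneity space.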
The present embodiment takes both the texture information and the edge information of the image into account, choosing them to compose the homogeneity of the image and obtaining the feature image by mapping. It makes full use of the difference in homogeneity between text and non-text regions to distinguish them, thereby suppressing non-text information and highlighting the features of text regions.
Step 104: extract text regions from the feature image with the text-region detector.
Inputting the feature image of the image under detection into the text-region detector yields the text regions.
The text-region detector is obtained by training on sample images in the homogeneity space, as follows:
(4a) Let (s_1, z_1), …, (s_n, z_n) denote the text and non-text data: if z_i is 1, s_i is a text sample; if z_i is 0, s_i is a non-text sample.
(4b) Apply the processes of steps 101–103 to each text and non-text sample image s_i to obtain the feature image of the sample in the homogeneity space.
(4c) Assemble the homogeneity features of all pixels of the sample image into a feature vector Q.
(4d) Construct a weak classifier c_j for each feature in Q.
(4e) Select the T best-performing weak classifiers with the AdaBoost method and determine the weight τ_i of each weak classifier, which is inversely related to the error rate of c_i. According to these weights, combine the T feature weak classifiers into one strong classifier C, which is the text-region detector:

$$C(s)=\begin{cases}1, & \sum_{i=1}^{T}\tau_{i}\,c_{i}(s)\ \ge\ \frac{1}{2}\sum_{i=1}^{T}\tau_{i}\\ 0, & \text{otherwise}\end{cases}$$

If the result of C(s) is 1, the classifier judges s to be text; otherwise it judges s to be non-text.
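The training procedure (4a)–(4e) can be sketched with a small hand-rolled discrete AdaBoost over single-feature threshold stumps (a simplified illustration: the stump form, the weight formula τ = ½·ln((1−ε)/ε) and all function names are common AdaBoost conventions assumed here, not taken from the patent):

```python
import numpy as np

def train_stump(X, z, w):
    """Best single-feature threshold stump under sample weights w.
    Returns (feature, threshold, polarity, weighted_error)."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = (pol * (X[:, j] - t) > 0).astype(int)
                err = w[pred != z].sum()
                if best is None or err < best[3]:
                    best = (j, t, pol, err)
    return best

def adaboost(X, z, T=5):
    """Select T weak classifiers and weight each by tau, which grows
    as its weighted error rate shrinks."""
    n = len(z)
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(T):
        j, t, pol, err = train_stump(X, z, w)
        err = max(err, 1e-10)
        tau = 0.5 * np.log((1 - err) / err)   # small error -> large weight
        pred = (pol * (X[:, j] - t) > 0).astype(int)
        w *= np.exp(-tau * np.where(pred == z, 1, -1))
        w /= w.sum()                          # re-normalize sample weights
        ensemble.append((j, t, pol, tau))
    return ensemble

def strong_classify(ensemble, x):
    """C(x) = 1 iff the tau-weighted votes reach half the total weight."""
    votes = sum(tau * ((pol * (x[j] - t)) > 0) for j, t, pol, tau in ensemble)
    total = sum(tau for _, _, _, tau in ensemble)
    return int(votes >= 0.5 * total)
```

Here each stump plays the role of a per-feature weak classifier c_j, and `strong_classify` is the weighted-majority strong classifier C.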
By combining the features of text regions with AdaBoost classifier training, the present embodiment obtains a text-region detector with finer discriminating ability that better matches the characteristics of text, while greatly reducing the amount of computation and improving detection efficiency.
Step 105: extract and recognize characters from the text regions of the enhanced image.
Because of the complexity of the character background in the extracted text regions, a simple threshold method cannot separate the characters from the background. The background of an individual character, however, is relatively uniform, so the character region is first decomposed into individual character regions and binarization is then performed on each character image. The concrete steps are as follows:
(5a) Extract the edge information of the text regions with the Canny operator.
(5b) Project the edge information horizontally and threshold the projection to separate out text-line images.
(5c) Project the edges of each text-line image vertically to separate out character images.
(5d) Construct the gray-level histogram of each character image.
(5e) Determine the optimal segmentation threshold of that histogram with the Otsu method, binarize the character image with the optimal segmentation threshold, and remove residual background with connected-component analysis to obtain a background-removed character image.
(5f) Recognize the background-removed character image, specifically:
(5f1) enlarge the background-removed character image with a B-spline interpolation function to improve its resolution;
(5f2) perform character recognition on the enlarged character image with OCR software.
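The projection segmentation and Otsu binarization steps above can be sketched in pure NumPy (a simplified illustration: the Canny edge extraction, connected-component cleanup, B-spline enlargement and OCR are omitted, and the function names and run-splitting rule are assumptions):

```python
import numpy as np

def projection_runs(profile, thresh):
    """Split a projection profile (e.g. row sums of an edge map) into
    runs whose value exceeds thresh; returns (start, end) pairs with
    end exclusive. Used for both text-line and character separation."""
    mask = profile > thresh
    runs, start = [], None
    for i, m in enumerate(mask):
        if m and start is None:
            start = i
        elif not m and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(mask)))
    return runs

def otsu_threshold(img, levels=256):
    """Otsu's optimal threshold: maximize the between-class variance
    of the gray-level histogram."""
    hist = np.bincount(img.ravel(), minlength=levels).astype(np.float64)
    p = hist / hist.sum()
    cum_p = np.cumsum(p)
    cum_mean = np.cumsum(p * np.arange(levels))
    mu = cum_mean[-1]
    best_t, best_var = 0, -1.0
    for t in range(levels - 1):
        w0 = cum_p[t]
        if w0 <= 0 or w0 >= 1:
            continue
        mu0 = cum_mean[t] / w0
        mu1 = (mu - cum_mean[t]) / (1 - w0)
        var = w0 * (1 - w0) * (mu0 - mu1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t
```

A text-line image would be cut at the horizontal runs of the edge map, each line cut again at vertical runs, and each resulting character image binarized with `img > otsu_threshold(img)`.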
With reference to Fig. 3, the device that extracts text from image of the embodiment of the invention comprises: image enhancing unit, feature extraction unit, homogeney map unit, text filed extraction unit, character extraction and recognition unit, training unit.Wherein:
The image enhancement unit is used to perform nonlinear dynamic compression of the gray levels of the original image, obtaining an enhanced image. Specifically, the image enhancement unit comprises:
an equalization transform subunit, used to apply the equalization transform function to the gray levels of the original image and to construct the equalized gray-level histogram;
The equalization transform function is:
$$k = \mathrm{INT}\!\left[(L-1)\cdot\sum_{j=0}^{k^{*}} P(j) + 0.5\right]$$
where $P(j)$ is the gray-level histogram of the original image, $k^{*}$ is an original-image gray level ($k^{*}=0,1,\ldots,L-1$), $k$ is the gray level after the equalization transform ($k=0,1,\ldots,L-1$), and $L$ is the number of gray levels of the original image.
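The equalization transform above can be sketched as follows. The function name is ours; the mapping follows $k = \mathrm{INT}[(L-1)\cdot\sum_{j=0}^{k^*}P(j)+0.5]$, with $P$ taken as the normalized histogram of the input image.

```python
import numpy as np

def equalize_gray_levels(image, L=256):
    """Histogram equalization per the transform
    k = INT[(L-1) * sum_{j=0}^{k*} P(j) + 0.5],
    where P is the normalized gray-level histogram of the image."""
    hist = np.bincount(image.ravel(), minlength=L).astype(float)
    P = hist / hist.sum()                 # normalized histogram P(j)
    cdf = np.cumsum(P)                    # sum_{j=0}^{k*} P(j)
    mapping = np.floor((L - 1) * cdf + 0.5).astype(np.uint8)  # INT[...]
    return mapping[image]                 # apply the lookup table
```

Because the mapping is built from the cumulative histogram, it is monotonically non-decreasing, so the relative ordering of gray levels is preserved while contrast is stretched.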
a nonlinear dynamic compression subunit, used to apply a monotonically increasing S-type function to the gray levels of the equalized histogram, performing the nonlinear dynamic compression.
The monotonically increasing S-type function is:
$$S(k;\alpha,\beta,\gamma)=\begin{cases}0, & k\le\alpha\\[4pt] 2\left(\dfrac{k-\alpha}{\gamma-\alpha}\right)^{2}, & \alpha<k\le\beta\\[4pt] 1-2\left(\dfrac{k-\gamma}{\gamma-\alpha}\right)^{2}, & \beta<k<\gamma\\[4pt] 1, & k\ge\gamma\end{cases}$$
where $\alpha$ is the gray level at the first peak of the equalized histogram, $\gamma$ is the gray level at the last peak of the equalized histogram, and $\beta$ is a parameter value determined with the maximum information entropy criterion.
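A sketch of the S-type compression function above, assuming $\alpha$ and $\gamma$ are the first- and last-peak gray levels and $\beta$ has already been chosen by the entropy criterion (the criterion itself is not shown):

```python
import numpy as np

def s_compress(k, alpha, beta, gamma):
    """Monotonically increasing S-type function S(k; alpha, beta, gamma)
    used for nonlinear dynamic compression of equalized gray levels.
    Output is in [0, 1]; levels below alpha are clipped to 0 and
    levels above gamma to 1."""
    k = np.atleast_1d(np.asarray(k, dtype=float))
    out = np.empty_like(k)
    out[k <= alpha] = 0.0
    mid1 = (k > alpha) & (k <= beta)
    out[mid1] = 2.0 * ((k[mid1] - alpha) / (gamma - alpha)) ** 2
    mid2 = (k > beta) & (k < gamma)
    out[mid2] = 1.0 - 2.0 * ((k[mid2] - gamma) / (gamma - alpha)) ** 2
    out[k >= gamma] = 1.0
    return out
```

With $\beta$ at the midpoint of $[\alpha, \gamma]$ the two quadratic branches meet continuously at $S(\beta) = 1/2$, which is why the midpoint is a common default before the entropy criterion refines $\beta$.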
The feature extraction unit is used to extract the texture features and edge features of the enhanced image. Specifically, the feature extraction unit comprises:
a texture feature extraction subunit, used to extract the texture features of the enhanced image with wavelet packet decomposition;
an edge feature extraction subunit, used to extract the edge features of the enhanced image with the Sobel operator.
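A minimal Sobel edge-strength map, as one plausible reading of the edge feature extraction subunit. The $|G_x| + |G_y|$ magnitude is our choice; the patent does not specify the magnitude formula.

```python
import numpy as np

def sobel_edge_strength(gray):
    """Edge feature map via the Sobel operator: per-pixel gradient
    magnitude |Gx| + |Gy| from the two 3x3 Sobel kernels, with
    edge-replication padding at the borders."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    padded = np.pad(gray.astype(float), 1, mode="edge")
    h, w = gray.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(3):            # accumulate the 3x3 correlation
        for j in range(3):
            window = padded[i:i + h, j:j + w]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    return np.abs(gx) + np.abs(gy)
```

On a vertical step edge the response is strong along the step and zero in the flat regions, which is what makes the map usable as a text-stroke edge feature.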
The homogeneity mapping unit is used to construct a homogeneity feature from the texture features and edge features, and to map the enhanced image into the homogeneity space according to the homogeneity feature, obtaining a feature image. The homogeneity mapping unit includes a homogeneity feature construction subunit (not shown in the figure), used to construct the homogeneity feature by normalizing the texture features and edge features respectively.
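The homogeneity feature construction can be illustrated as below. The text states only that the texture and edge feature maps are normalized respectively; how they are then combined into one per-pixel value is not specified, so the unweighted mean used here is purely an assumption, and the function names are ours.

```python
import numpy as np

def homogeneity_map(texture, edge):
    """Combine per-pixel texture and edge feature maps into a single
    homogeneity feature image: min-max normalize each map to [0, 1],
    then average them (the averaging rule is an assumption)."""
    def normalize(f):
        f = f.astype(float)
        span = f.max() - f.min()
        return (f - f.min()) / span if span > 0 else np.zeros_like(f)
    return 0.5 * (normalize(texture) + normalize(edge))
```

Because both inputs are rescaled to [0, 1] before combination, neither feature dominates merely by having a larger numeric range, which is the usual motivation for normalizing the features "respectively".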
The text region extraction unit is used to extract text regions from the feature image with the text region detector.
The training unit is used to obtain the text region detector by training on sample images in the homogeneity space. Specifically, the training unit trains the text region detector as follows:
perform nonlinear dynamic compression of gray levels on a sample image, obtaining an enhanced image of the sample image;
extract the texture features and edge features of the enhanced sample image;
construct the texture features and edge features of the enhanced sample image into a homogeneity feature, and assemble the homogeneity features of all pixels of the enhanced sample image into a feature vector;
construct a weak classifier for each feature in the feature vector;
select a plurality of weak classifiers from all the weak classifiers with the AdaBoost method, and determine the weight of each weak classifier on the principle of complementary performance;
integrate the selected weak classifiers into one strong classifier according to the weights, obtaining the text region detector.
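The training steps above can be sketched with decision-stump weak classifiers (one candidate threshold classifier per feature) combined by AdaBoost into a weighted-vote strong classifier. This is a generic AdaBoost sketch, not the patent's exact procedure: the "complementary performance" weight rule is read here as the standard AdaBoost weight $\alpha = \tfrac{1}{2}\ln\frac{1-\varepsilon}{\varepsilon}$, and all names are ours. Labels are +1 (text) / -1 (non-text).

```python
import numpy as np

def train_adaboost(X, y, n_rounds=10):
    """AdaBoost over per-feature threshold ('stump') weak classifiers.
    Each round picks the stump with lowest weighted error, assigns it
    weight alpha, and reweights the training samples."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)            # sample weights
    stumps = []                        # (feature, threshold, polarity, alpha)
    for _ in range(n_rounds):
        best = None
        for f in range(d):             # one weak classifier per feature
            for thr in np.unique(X[:, f]):
                for pol in (1, -1):
                    pred = np.where(pol * (X[:, f] - thr) >= 0, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, f, thr, pol, pred)
        err, f, thr, pol, pred = best
        err = max(err, 1e-10)          # avoid log(inf) on perfect stumps
        alpha = 0.5 * np.log((1 - err) / err)
        stumps.append((f, thr, pol, alpha))
        w *= np.exp(-alpha * y * pred) # up-weight misclassified samples
        w /= w.sum()
    return stumps

def predict(stumps, X):
    """Strong classifier: sign of the weighted vote of the stumps."""
    score = sum(alpha * np.where(pol * (X[:, f] - thr) >= 0, 1, -1)
                for f, thr, pol, alpha in stumps)
    return np.where(score >= 0, 1, -1)
```

In the patent's setting, each row of `X` would be the homogeneity feature vector of a sample window and `y` whether it contains text.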
The character extraction and recognition unit is used to extract and recognize characters from the text region of the enhanced image. Specifically, the character extraction and recognition unit comprises:
an edge information extraction subunit, used to extract the edge information of the text region with the Canny operator;
a horizontal projection subunit, used to compute the horizontal projection of the edge information and apply a threshold to it, to isolate text-line images;
a vertical projection subunit, used to compute the vertical projection of each text-line edge image, to isolate character images;
a histogram construction subunit, used to construct the gray-level histogram of each character image;
a binarization subunit, used to determine the optimal segmentation threshold of the gray-level histogram of the character image with Otsu's method, binarize the character image according to the optimal segmentation threshold, and remove residual background with connected-component analysis, obtaining a background-free character image;
a character recognition subunit, used to recognize the background-free character image, specifically: enlarge the background-free character image with B-spline interpolation, then perform character recognition on the enlarged image with OCR software.
In summary, the embodiment of the invention improves image contrast by applying nonlinear dynamic compression of gray levels to the original image, making the features of text regions in the image more distinct. Constructing a homogeneity feature from the extracted texture and edge information and mapping the image into the homogeneity space suppresses non-text information while highlighting text information. Further, a feature classifier can be trained in the homogeneity space of the image to finely separate text regions from non-text regions. The technical scheme of the embodiment of the invention can therefore significantly improve the accuracy and robustness of text detection.
Finally, it should be noted that the above embodiments merely illustrate, and do not limit, the technical scheme of the present invention. Those of ordinary skill in the art will understand that the technical scheme of the present invention may be modified or equivalently substituted without departing from its spirit and scope, and all such modifications should be encompassed within the scope of the claims of the present invention.

Claims (22)

1. A method of extracting text from an image, characterized in that it comprises:
performing nonlinear dynamic compression of gray levels on an original image, obtaining an enhanced image;
extracting texture features and edge features of the enhanced image;
constructing the texture features and edge features into a homogeneity feature, and mapping the enhanced image into a homogeneity space according to the homogeneity feature, obtaining a feature image;
extracting a text region from the feature image with a text region detector;
extracting and recognizing characters from the text region of the enhanced image.
2. the method for claim 1 is characterized in that, the described Nonlinear Dynamic compression that original image is carried out gray level comprises:
Utilize the equalization transforming function transformation function that the gray level of described original image is carried out the equalization conversion, and the grey level histogram after the conversion of structure equalization;
Utilize monotone increasing formula S type function that the gray level in the described grey level histogram is carried out the Nonlinear Dynamic compression.
3. The method of claim 2, characterized in that the equalization transform function is:
$$k = \mathrm{INT}\!\left[(L-1)\cdot\sum_{j=0}^{k^{*}} P(j) + 0.5\right]$$
where $P(j)$ is the gray-level histogram of the original image, $k^{*}$ is an original-image gray level ($k^{*}=0,1,\ldots,L-1$), $k$ is the gray level after the equalization transform ($k=0,1,\ldots,L-1$), and $L$ is the number of gray levels of the original image.
4. The method of claim 3, characterized in that the monotonically increasing S-type function is:
$$S(k;\alpha,\beta,\gamma)=\begin{cases}0, & k\le\alpha\\[4pt] 2\left(\dfrac{k-\alpha}{\gamma-\alpha}\right)^{2}, & \alpha<k\le\beta\\[4pt] 1-2\left(\dfrac{k-\gamma}{\gamma-\alpha}\right)^{2}, & \beta<k<\gamma\\[4pt] 1, & k\ge\gamma\end{cases}$$
where $\alpha$ is the gray level at the first peak of the equalized histogram, $\gamma$ is the gray level at the last peak of the equalized histogram, and $\beta$ is a parameter value determined with the maximum information entropy criterion.
5. the method for claim 1 is characterized in that:
Utilize WAVELET PACKET DECOMPOSITION to extract described enhancing image texture features.
6. the method for claim 1 is characterized in that:
Utilize the Sobel operator to extract described enhancing edge of image feature.
7. the method for claim 1 is characterized in that:
By being carried out normalized respectively, described textural characteristics and edge feature construct described homogeney feature.
8. the method for claim 1 is characterized in that, also comprises:
Obtain described text filed detecting device by sample image being trained in described homogeney space.
9. The method of claim 8, characterized in that obtaining the text region detector by training on sample images in the homogeneity space comprises:
performing nonlinear dynamic compression of gray levels on a sample image, obtaining an enhanced image of the sample image;
extracting texture features and edge features of the enhanced sample image;
constructing the texture features and edge features of the enhanced sample image into a homogeneity feature, and assembling the homogeneity features of all pixels of the enhanced sample image into a feature vector;
constructing a weak classifier for each feature in the feature vector;
selecting a plurality of weak classifiers from all the weak classifiers with the AdaBoost method, and determining the weight of each weak classifier on the principle of complementary performance;
integrating the selected weak classifiers into one strong classifier according to the weights, obtaining the text region detector.
10. the method for claim 1 is characterized in that, described extract from described enhancing image described text filed and identification character comprises:
Utilize the described text filed marginal information of Canny operator extraction;
Described marginal information is carried out getting threshold value after the horizontal projection, to isolate the literal line image;
Described literal line edge of image is carried out vertical projection, to isolate character picture;
Construct the grey level histogram of described character picture;
Utilize the Otus method to determine the optimum segmentation threshold value of the grey level histogram of described character picture, there is most segmentation threshold that described character picture is carried out binary conversion treatment according to described, and remove residual background based on the connected domain analytical approach, obtain removing the character picture after the background;
Character picture after the described removal background is discerned.
11. method as claimed in claim 10, described character picture after the described removing background is discerned comprises:
Character picture after utilizing the B-spline interpolation function to described removing background amplifies;
Utilize OCR software that the character picture after amplifying is carried out character recognition.
12. A device for extracting text from an image, characterized in that it comprises:
an image enhancement unit, used to perform nonlinear dynamic compression of gray levels on an original image, obtaining an enhanced image;
a feature extraction unit, used to extract texture features and edge features of the enhanced image;
a homogeneity mapping unit, used to construct the texture features and edge features into a homogeneity feature, and to map the enhanced image into a homogeneity space according to the homogeneity feature, obtaining a feature image;
a text region extraction unit, used to extract a text region from the feature image with a text region detector;
a character extraction and recognition unit, used to extract and recognize characters from the text region of the enhanced image.
13. The device of claim 12, characterized in that the image enhancement unit comprises:
an equalization transform subunit, used to apply an equalization transform function to the gray levels of the original image and to construct the equalized gray-level histogram;
a nonlinear dynamic compression subunit, used to apply a monotonically increasing S-type function to the gray levels of the equalized histogram.
14. The device of claim 13, characterized in that the equalization transform function is:
$$k = \mathrm{INT}\!\left[(L-1)\cdot\sum_{j=0}^{k^{*}} P(j) + 0.5\right]$$
where $P(j)$ is the gray-level histogram of the original image, $k^{*}$ is an original-image gray level ($k^{*}=0,1,\ldots,L-1$), $k$ is the gray level after the equalization transform ($k=0,1,\ldots,L-1$), and $L$ is the number of gray levels of the original image.
15. The device of claim 14, characterized in that the monotonically increasing S-type function is:
$$S(k;\alpha,\beta,\gamma)=\begin{cases}0, & k\le\alpha\\[4pt] 2\left(\dfrac{k-\alpha}{\gamma-\alpha}\right)^{2}, & \alpha<k\le\beta\\[4pt] 1-2\left(\dfrac{k-\gamma}{\gamma-\alpha}\right)^{2}, & \beta<k<\gamma\\[4pt] 1, & k\ge\gamma\end{cases}$$
where $\alpha$ is the gray level at the first peak of the equalized histogram, $\gamma$ is the gray level at the last peak of the equalized histogram, and $\beta$ is a parameter value determined with the maximum information entropy criterion.
16. The device of claim 12, characterized in that the feature extraction unit comprises:
a texture feature extraction subunit, used to extract the texture features of the enhanced image with wavelet packet decomposition.
17. The device of claim 12, characterized in that the feature extraction unit comprises:
an edge feature extraction subunit, used to extract the edge features of the enhanced image with the Sobel operator.
18. The device of claim 12, characterized in that the homogeneity mapping unit comprises:
a homogeneity feature construction subunit, used to construct the homogeneity feature by normalizing the texture features and edge features respectively.
19. The device of claim 12, characterized by further comprising:
a training unit, used to obtain the text region detector by training on sample images in the homogeneity space.
20. The device of claim 19, characterized in that the training unit is further used to:
perform nonlinear dynamic compression of gray levels on a sample image, obtaining an enhanced image of the sample image;
extract texture features and edge features of the enhanced sample image;
construct the texture features and edge features of the enhanced sample image into a homogeneity feature, and assemble the homogeneity features of all pixels of the enhanced sample image into a feature vector;
construct a weak classifier for each feature in the feature vector;
select a plurality of weak classifiers from all the weak classifiers with the AdaBoost method, and determine the weight of each weak classifier on the principle of complementary performance;
integrate the selected weak classifiers into one strong classifier according to the weights, obtaining the text region detector.
21. The device of claim 12, characterized in that the character extraction and recognition unit comprises:
an edge information extraction subunit, used to extract the edge information of the text region with the Canny operator;
a horizontal projection subunit, used to compute the horizontal projection of the edge information and apply a threshold to it, to isolate text-line images;
a vertical projection subunit, used to compute the vertical projection of each text-line edge image, to isolate character images;
a histogram construction subunit, used to construct the gray-level histogram of each character image;
a binarization subunit, used to determine the optimal segmentation threshold of the gray-level histogram of the character image with Otsu's method, binarize the character image according to the optimal segmentation threshold, and remove residual background with connected-component analysis, obtaining a background-free character image;
a character recognition subunit, used to recognize the background-free character image.
22. The device of claim 21, characterized in that the character recognition subunit is further used to:
enlarge the background-free character image with B-spline interpolation;
perform character recognition on the enlarged image with OCR software.
CN 200910241565 (filed 2009-11-26) — Method and device for extracting text from image. Published as CN102081731A (2011-06-01); granted as CN102081731B (2013-01-23). Legal status: Active.

