CN103268481A - Method for extracting text in complex background image - Google Patents


Info

Publication number
CN103268481A
Authority
CN (China)
Prior art keywords
text
value
subgraph
gray
corner point
Prior art date
Legal status
Granted
Application number
CN2013102100404A
Other languages
Chinese (zh)
Other versions
CN103268481B (en)
Inventor
达飞鹏
刘超
饶立
李燕春
吕江昭
王辰星
何学勇
Current Assignee
Beijing Baichi Data Service Co ltd
Original Assignee
Focus Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Focus Technology Co Ltd filed Critical Focus Technology Co Ltd
Priority to CN201310210040.4A priority Critical patent/CN103268481B/en
Publication of CN103268481A publication Critical patent/CN103268481A/en
Application granted granted Critical
Publication of CN103268481B publication Critical patent/CN103268481B/en
Status: Expired - Fee Related

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a method for extracting text from a complex background image. First, the SUSAN operator is used to detect the corner points in the source image; after isolated corners are removed, an integral-projection transform cuts out suspected text regions, and non-text regions are screened out according to prior knowledge. The background complexity of each text region is then judged from its gray-level jump information. When the background is judged complex, the text region is color-clustered with the k-means algorithm, and the class the text belongs to is determined and extracted according to the color information at the position where the corners are densest. When the background is judged simple, the image is binarized with the maximum between-class variance method. Finally, the text region is extracted accurately. The method can locate the text regions in a complex background image and extract the characters after removing the background.

Description

Method for extracting text from a complex background image
Technical field
The invention belongs to the technical field of image processing, and in particular relates to a method for extracting text from a complex background image.
Background art
In recent years, with the development of network and multimedia technology, an Internet culture carried by the network has become the new trend of cultural development. Plain text, digital images, video and other digital information grow at a geometric rate, profoundly affecting people's lives. This mass of data contains not only information useful to people but also ever more pornographic, violent and reactionary content. Detecting such information manually is clearly impractical; computers must be able to detect and identify it automatically. Character recognition technology is now relatively mature, so locating and extracting the text information in complex images and video is of real significance.
Current text localization methods are mainly based on connected components, texture features, edge detection, corner detection, machine learning, or combinations of the above. Each has its own strengths and weaknesses on complex background images, and it is difficult to find a single algorithm that locates text robustly in all kinds of images.
Extracting text from complex background images thus has broad application prospects; at the same time it remains a challenging task that needs further in-depth study.
Summary of the invention
The technical problem to be solved by the invention is to overcome the deficiencies of the prior art; to this end the invention proposes a method for extracting text from a complex background image.
To solve the above technical problem, the technical solution adopted by the invention is as follows. A method for extracting text from a complex background image comprises the following steps:
Step 1: convert the source image src to grayscale with the weighted-mean method, obtaining the gray image Img;
Step 2: detect the corners in the gray image Img, store the corner coordinates in a corner container, and build a corner matrix;
Step 3: remove the isolated corners from the corner matrix;
Step 4: locate the text using the integral-projection transform;
Step 5: screen out and remove the non-text regions;
Step 6: cut the text sub-image out of the source image src and judge whether the background of the text sub-image is complex. If the background is judged complex, execute Step 7; if it is judged simple, execute Step 8;
Step 7: apply color clustering to the complex-background text sub-image, remove its background and then extract the text information; execute Step 9;
Step 8: convert the simple-background text sub-image to grayscale, binarize it with an adaptive threshold-selection algorithm that chooses the foreground/background threshold, and extract the text information of the sub-image;
Step 9: repeat Steps 6 to 9 until the text information of all the text sub-images in the source image src has been extracted.
In Step 1, the source image src is converted to grayscale by the weighted-mean method to obtain the gray image Img; the gray value of every pixel of src is computed as:
Gray = 0.30R + 0.59G + 0.11B
where R, G and B are the three channel values of the pixel in src, and Gray is the gray value of the pixel after conversion.
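The weighted-mean conversion above can be sketched in C++ as follows (a minimal illustration; the interleaved-RGB pixel layout and the function name are assumptions, not part of the patent):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Weighted-mean grayscale conversion: Gray = 0.30R + 0.59G + 0.11B.
// Input is assumed to be interleaved RGB, three bytes per pixel.
std::vector<uint8_t> to_gray(const std::vector<uint8_t>& rgb) {
    std::vector<uint8_t> gray(rgb.size() / 3);
    for (std::size_t i = 0; i < gray.size(); ++i) {
        double g = 0.30 * rgb[3 * i] + 0.59 * rgb[3 * i + 1] + 0.11 * rgb[3 * i + 2];
        gray[i] = static_cast<uint8_t>(g + 0.5);  // round to nearest
    }
    return gray;
}
```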
In Step 2, the corners in the gray image Img are detected, their coordinates are stored in the corner container, and the corner matrix is built. This step detects the corners in Img with the SUSAN operator; the concrete steps are as follows:
Step 2.1: construct a two-dimensional matrix of the same size as the matrix of the gray image Img, denoted the corner matrix C; set every value in C to 0, and construct a corner container V to store the corner coordinates;
Step 2.2: construct an approximately circular template N(x, y) as the SUSAN detection template; N(x, y) contains 37 pixels, and the center of the detection template N(x, y) is the nucleus of the template;
Step 2.3: take any pixel r0 of the gray image Img as the point under test, place the nucleus of the detection template N(x, y) at r0, and compare the gray value of every non-nucleus point within N(x, y) with the gray value at the nucleus; the comparison function is as follows:
C(x, y) = 1, if |f(x, y) − f(x0, y0)| ≤ t;  C(x, y) = 0, otherwise (t being the gray-difference threshold)
where (x0, y0) is the coordinate of the nucleus in the gray image Img, (x, y) is the coordinate of a non-nucleus point within the template N(x, y), f(x0, y0) and f(x, y) are the gray values at the nucleus and at the non-nucleus point respectively, and C(x, y) is the gray comparison result at (x, y); if a non-nucleus point lies outside the image, C(x, y) is set directly to 0;
Step 2.4: compute the total gray-difference function value of the pixel r0, i.e. sum the comparison results of Step 2.3:
S(x0, y0) = Σ C(x, y), the sum taken over all (x, y) ∈ N(x, y) with (x, y) ≠ (x0, y0)
Step 2.5: judge whether the point r0 is a corner according to the SUSAN corner response function, which is as follows:
R(x0, y0) = H − S(x0, y0), if S(x0, y0) < H;  R(x0, y0) = 0, otherwise
where H is the corner response threshold.
Judge whether the corner response of the pixel r0 is the maximum within the 5×5 local region centered on it. If it is the maximum, keep the point: store its coordinate in the corner container V, set the value at the corresponding position of the corner matrix C to 1, and execute Step 2.6. If it is not the maximum, execute Step 2.6 directly;
Step 2.6: traverse the gray image Img, repeating Steps 2.3 to 2.5, to find all the corners in Img.
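Steps 2.2 to 2.5 can be sketched as follows. This is a simplified illustration, not the patented implementation: a 5×5 square window stands in for the 37-pixel circular template, and the gray-difference threshold t and response threshold H are illustrative parameters.

```cpp
#include <cstdlib>
#include <vector>

// Grayscale image stored row-major.
struct Gray { int w, h; std::vector<int> px; };

// USAN area S(x0, y0): number of non-nucleus window pixels whose gray value
// is within t of the nucleus (C(x, y) = 1); points outside the image count 0.
int usan_area(const Gray& img, int x0, int y0, int t) {
    int s = 0;
    for (int dy = -2; dy <= 2; ++dy)
        for (int dx = -2; dx <= 2; ++dx) {
            if (dx == 0 && dy == 0) continue;  // skip the nucleus
            int x = x0 + dx, y = y0 + dy;
            if (x < 0 || x >= img.w || y < 0 || y >= img.h) continue;
            if (std::abs(img.px[y * img.w + x] - img.px[y0 * img.w + x0]) <= t)
                ++s;
        }
    return s;
}

// Corner response R = H - S when S < H, else 0.
int corner_response(const Gray& img, int x0, int y0, int t, int H) {
    int s = usan_area(img, x0, y0, t);
    return s < H ? H - s : 0;
}

// Hypothetical test helper: dark image with a bright square from (x, y) onward.
Gray bright_corner_image(int w, int h, int x, int y) {
    Gray g{w, h, std::vector<int>(w * h, 0)};
    for (int yy = y; yy < h; ++yy)
        for (int xx = x; xx < w; ++xx) g.px[yy * g.w + xx] = 200;
    return g;
}
```

A flat region gives a large USAN area and zero response; the corner of the bright square gives a small USAN area and a positive response.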
In Step 3, the isolated corners are removed from the corner matrix; the concrete steps are as follows:
Step 3.1: take the first corner d0 in V, and count the number Sum of values equal to 1 within the 15×15 rectangular region of the corner matrix C centered on d0;
Step 3.2: compare Sum with the corner threshold T. If Sum is less than T, remove d0 from V, reset the value at the corresponding position of the corner matrix C to 0, and then execute Step 3.3; if Sum is greater than or equal to T, execute Step 3.3 directly;
Step 3.3: judge whether every point in the corner container V has been traversed; if not, return to Step 3.1.
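Steps 3.1 to 3.3 amount to a sequential pruning pass over the corner container; a minimal sketch (the data layout and helper names are assumptions):

```cpp
#include <utility>
#include <vector>

// Count the 1-values of the corner matrix C inside the (2*half+1)^2 window
// centered on (cx, cy); half = 7 gives the 15x15 window of Step 3.1.
int corners_in_window(const std::vector<std::vector<int>>& C,
                      int cx, int cy, int half = 7) {
    int sum = 0, h = (int)C.size(), w = (int)C[0].size();
    for (int y = cy - half; y <= cy + half; ++y)
        for (int x = cx - half; x <= cx + half; ++x)
            if (y >= 0 && y < h && x >= 0 && x < w) sum += C[y][x];
    return sum;
}

// Step 3.2: a corner with fewer than T corners around it (itself included) is
// removed from the container and cleared in the matrix, so that the removal is
// visible to the counts of later corners, as in the sequential algorithm.
void remove_isolated(std::vector<std::vector<int>>& C,
                     std::vector<std::pair<int,int>>& V, int T) {
    std::vector<std::pair<int,int>> kept;
    for (const auto& d : V) {                    // d = (x, y), processed in order
        if (corners_in_window(C, d.first, d.second) >= T)
            kept.push_back(d);
        else
            C[d.second][d.first] = 0;            // clear the removed corner
    }
    V = kept;
}
```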
In Step 4, the text is located with the integral-projection transform; the concrete steps are as follows:
Step 4.1: apply horizontal integral projection to the corner matrix C to obtain a group SHY of y-direction integral projection vectors. Judge from 0 along the y direction: let y0 be any point; if the integral projection vector at y0, compared with all the integral projection vectors within [y0, y0 + Δy], rises or falls in amplitude by more than the amplitude threshold Tm, record the coordinate y0. If the amplitude rises here, record the companion value flag_y0 = 1; if it falls, record flag_y0 = −1. Traverse every point along the y direction, finally obtaining the coordinate group Y(y0, y1, …, yk) and its companion group Flagy(flag_y0, flag_y1, …, flag_yk), where y0 < y1 < … < yk;
Step 4.2: apply vertical integral projection to the corner matrix C to obtain a group SHX of x-direction integral projection vectors. Judge from 0 along the x direction: let x0 be any point; if the integral projection vector at x0, compared with all the integral projection vectors within [x0, x0 + Δx], rises or falls in amplitude by more than the amplitude threshold Tm, record the coordinate x0. If the amplitude rises here, record the companion value flag_x0 = 1; if it falls, record flag_x0 = −1. Traverse every point along the x direction, finally obtaining the coordinate group X(x0, x1, …, xi) and its companion group Flagx(flag_x0, flag_x1, …, flag_xi), where x0 < x1 < … < xi;
Step 4.3: from Y(y0, y1, …, yk), select the first coordinate whose companion value is 1 and denote it ya; then select the nearest coordinate greater than ya whose companion value is −1 and denote it yb; together they form a y-direction locating pair (ya, yb). Continue in turn over the remaining coordinates of Y(y0, y1, …, yk), each time selecting a coordinate whose companion value is 1 and then the nearest greater coordinate whose companion value is −1, until all the y-direction locating pairs have been constructed;
Step 4.4: from X(x0, x1, …, xi), select the first coordinate whose companion value is 1 and denote it xc; then select the nearest coordinate greater than xc whose companion value is −1 and denote it xd; together they form an x-direction locating pair (xc, xd). Continue in turn over the remaining coordinates of X(x0, x1, …, xi) until all the x-direction locating pairs have been constructed;
Step 4.5: build the rectangular region set S; any rectangular region in the set S is constructed as follows:
choose arbitrarily one pair from the y-direction locating pairs and one pair from the x-direction locating pairs, four coordinates in all; these four coordinates define a rectangle. Keeping the center of the rectangle fixed, lengthen each side by 4 pixels to obtain the rectangular region.
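The core of Step 4.1 can be sketched as below. This is a deliberately simplified version: it compares only adjacent projection values, whereas the patent compares amplitudes over a Δy window; the names and the adjacent-row simplification are assumptions.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Horizontal integral projection of the corner matrix: H(y) = sum over x of C(x, y).
std::vector<int> horizontal_projection(const std::vector<std::vector<int>>& C) {
    std::vector<int> H(C.size(), 0);
    for (std::size_t y = 0; y < C.size(); ++y)
        for (int v : C[y]) H[y] += v;
    return H;
}

// Pair each amplitude rise above Tm (companion value 1) with the nearest
// following fall (companion value -1), yielding (ya, yb) text bands.
std::vector<std::pair<int,int>> bands(const std::vector<int>& H, int Tm) {
    std::vector<std::pair<int,int>> out;
    int start = -1;
    for (std::size_t y = 0; y + 1 < H.size(); ++y) {
        if (start < 0 && H[y + 1] - H[y] > Tm) {
            start = (int)y;                        // amplitude rises: flag = 1
        } else if (start >= 0 && H[y] - H[y + 1] > Tm) {
            out.push_back({start, (int)(y + 1)});  // amplitude falls: flag = -1
            start = -1;
        }
    }
    return out;
}
```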
In Step 5, the non-text regions are screened out and removed; the concrete steps are as follows:
Step 5.1: let s0 be any element of the rectangular region set S. Count the number of 1-values of the corner matrix C within the positions of s0; if the count is less than the corner screening threshold Tc, remove s0 and execute Step 5.3; if it is greater than or equal to Tc, enter Step 5.2 directly;
Step 5.2: let T1 be the ratio of the area of s0 to the area of the source image src, and T2 the ratio of the height of s0 to the height of src. Judge whether T1 is greater than the area threshold Ta and whether T2 is greater than the height threshold Tl; if T1 is greater than Ta and T2 is greater than Tl, remove s0 from S and execute Step 5.3; otherwise execute Step 5.3 directly;
Step 5.3: traverse all the elements of the rectangular region set S; the elements that remain form the located text-region set, denoted O. Set a loop flag M whose initial value is the number of elements in O.
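The two screening rules of Steps 5.1 and 5.2 can be combined into one predicate; the Rect type is an assumption, and the defaults Tc = 15, Ta = 0.6, Tl = 0.5 follow the empirical values given in the embodiment:

```cpp
// One candidate rectangle from the set S, in image coordinates (assumed layout).
struct Rect { int x, y, w, h; };

// Steps 5.1 and 5.2 combined: keep a region only if it holds enough corners
// and is not both too large and too tall relative to the source image.
bool keep_region(int corner_count, const Rect& r, int src_w, int src_h,
                 int Tc = 15, double Ta = 0.6, double Tl = 0.5) {
    if (corner_count < Tc) return false;               // Step 5.1: too few corners
    double area_ratio = double(r.w) * r.h / (double(src_w) * src_h);
    double height_ratio = double(r.h) / src_h;
    return !(area_ratio > Ta && height_ratio > Tl);    // Step 5.2: oversized region
}
```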
In Step 6, the text sub-image is cut out of the source image src and the complexity of its background is judged.
This step decreases the loop flag M by 1, then, according to the position information of the M-th rectangular element O_M in O, cuts the text sub-image, denoted P, out of the source image src. P is converted to grayscale and its gray-level jumps are detected to judge whether the background of the text sub-image P is complex. The concrete judging steps are as follows:
Step 6.1: convert the text sub-image P to grayscale with the gray-value formula of Step 1 to obtain the gray sub-image PG, and construct a two-dimensional matrix of the same size as the matrix of P;
Step 6.2: compute the row difference matrix D of the gray sub-image PG; first set all the values of D to 0, then compute:
value = 1, if |g(l, c + 1) − g(l, c)| > Tg;  value = 0, otherwise (Tg being the gray-jump threshold)
where l is the pixel-row index of PG, c is the pixel-column index of PG, g(l, c) is the gray value of PG at that position, and value is the value of the row difference matrix D at the corresponding position;
Step 6.3: let K be the number of 1-values in the row difference matrix D, and divide K by the area of the text sub-image P; if the result is greater than the difference threshold Td, judge that the background of the text sub-image P is complex and skip to Step 7; otherwise execute Step 6.4;
Step 6.4: compute the column difference matrix D1 of the gray sub-image PG; first set all the values of D1 to 0, then compute:
value1 = 1, if |g(l + 1, c) − g(l, c)| > Tg;  value1 = 0, otherwise
where value1 is the value of the column difference matrix D1 at the corresponding position;
Step 6.5: let K1 be the number of 1-values in the column difference matrix D1, and divide K1 by the area of the text sub-image P; if the result is greater than the difference threshold Td, judge that the background of the text sub-image P is complex; if it is not greater than Td, judge that the background is simple.
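Steps 6.2 to 6.5 reduce to a gray-jump density test; below is a sketch under the assumption that a jump is an adjacent-pixel gray difference above a threshold Tg (the exact jump criterion follows the difference-matrix formulas), with Td = 0.15 as in the embodiment:

```cpp
#include <cstdlib>
#include <vector>

// Fraction of positions whose horizontal (rows = true) or vertical gray
// difference exceeds the jump threshold Tg, divided by the sub-image area.
double jump_density(const std::vector<std::vector<int>>& PG, int Tg, bool rows) {
    int jumps = 0, h = (int)PG.size(), w = (int)PG[0].size();
    for (int l = 0; l < h; ++l)
        for (int c = 0; c < w; ++c) {
            if (rows && c + 1 < w && std::abs(PG[l][c + 1] - PG[l][c]) > Tg) ++jumps;
            if (!rows && l + 1 < h && std::abs(PG[l + 1][c] - PG[l][c]) > Tg) ++jumps;
        }
    return double(jumps) / (double(h) * w);
}

// Background is judged complex if either direction's jump density exceeds Td.
bool background_complex(const std::vector<std::vector<int>>& PG, int Tg,
                        double Td = 0.15) {
    return jump_density(PG, Tg, true) > Td || jump_density(PG, Tg, false) > Td;
}
```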
In Step 7, color clustering is applied to the text sub-image whose background was judged complex in Step 6; the background of the text sub-image is then removed and the text information extracted.
The concrete steps are as follows:
Step 7.1: let the length of the text sub-image P be length and its width be width. Traverse P to obtain an RGB vector set A of size length × width, A = {a0, a1, …, ai, …, a_(length×width)}, where ai is the vector formed by the three channel values at any point (xi, yi) of P, i = 1, 2, …, length × width.
Apply color clustering to A: set the iteration counter I = 1, and pick 4 vectors of A at random as the initial cluster centers Zj(I), j = 1, 2, 3, 4. Let Ec(I) be the sum of squared errors of all the vectors of the set A at the I-th iteration, with Ec(0) = 0;
Step 7.2: at the I-th iteration, compute the Euclidean distance D(ai, Zj(I)) between every vector of the RGB vector set A and each cluster center, and assign each data point to the class of the nearest cluster center;
Step 7.3: update the cluster centers according to:
Zj(I + 1) = (1 / n_b) Σ_{t=1…n_b} a_jt
where a_jt is a data point of class j, n_b is the number of data points of class j, and Zj(I + 1) is the updated cluster center;
Step 7.4: compute the sum of squared errors Ec of all the vectors of the RGB vector set A:
Ec(I) = Σ_{j=1…4} Σ_{t=1…n_b} ||a_jt − Zj(I)||²
Compare Ec(I) with the sum of squared errors Ec(I − 1) of the previous iteration. Choose a sufficiently small positive number ξ: if |Ec(I) − Ec(I − 1)| < ξ, the clustering algorithm ends and Step 7.5 is executed; otherwise add 1 to I and return to Step 7.2 for another iteration;
Step 7.5: construct four three-channel two-dimensional cluster matrices Cluster_i, i = 1, 2, 3, 4, of the same size as the matrix of the text sub-image P; set the three-channel pixel value of every pixel of every Cluster_i to (0, 0, 0); for every pixel whose cluster label is i, set the three-channel pixel value at the corresponding position of Cluster_i to (255, 255, 255);
Step 7.6: take the intersection of the horizontal projection vector of maximum amplitude and the vertical projection vector of maximum amplitude obtained by the integral-projection transform of Step 4, and construct a 5×5 rectangular region centered on that intersection. Count the classes to which the three-channel pixel vectors of the points in the region belong, find the class with the most members, thereby determine the class the text belongs to, and save the corresponding cluster image as the extraction image IF.
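The clustering loop of Steps 7.1 to 7.4 is standard k-means with K = 4; a compact sketch (here the initial centers are simply the first K points rather than random picks, and all names are illustrative):

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

using RGB = std::array<double, 3>;

// Squared Euclidean distance between two RGB vectors.
double dist2(const RGB& a, const RGB& b) {
    double s = 0;
    for (int k = 0; k < 3; ++k) s += (a[k] - b[k]) * (a[k] - b[k]);
    return s;
}

// k-means: assign each vector to the nearest center, recompute centers as
// class means, and stop when the sum of squared errors changes by < xi.
std::vector<int> kmeans(const std::vector<RGB>& A, int K, double xi = 1e-3) {
    std::vector<RGB> Z(A.begin(), A.begin() + K);   // initial centers
    std::vector<int> label(A.size(), 0);
    double prevE = 0;
    for (;;) {
        double E = 0;
        for (std::size_t i = 0; i < A.size(); ++i) {  // assignment step
            int best = 0;
            for (int j = 1; j < K; ++j)
                if (dist2(A[i], Z[j]) < dist2(A[i], Z[best])) best = j;
            label[i] = best;
            E += dist2(A[i], Z[best]);
        }
        std::vector<RGB> sum(K, RGB{0, 0, 0});        // update step
        std::vector<int> n(K, 0);
        for (std::size_t i = 0; i < A.size(); ++i) {
            for (int k = 0; k < 3; ++k) sum[label[i]][k] += A[i][k];
            ++n[label[i]];
        }
        for (int j = 0; j < K; ++j)
            if (n[j]) for (int k = 0; k < 3; ++k) Z[j][k] = sum[j][k] / n[j];
        if (std::abs(E - prevE) < xi) return label;   // |Ec(I) - Ec(I-1)| < xi
        prevE = E;
    }
}
```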
In Step 8, the text information is extracted from the text sub-image whose background is simple. The method is as follows: convert the simple-background text sub-image to grayscale with the gray-value formula of Step 1, then binarize it with the adaptive threshold-selection algorithm; the foreground/background threshold u is chosen as follows.
Let:
δ = ω0 · ω1 · (u0 − u1)²
where u is the foreground/background threshold, ω0 is the proportion of foreground pixels in the image, u0 is the mean gray value of the foreground, ω1 is the proportion of background pixels, u1 is the mean gray value of the background, and δ is the between-class variance of foreground and background. Let u run in order from 0 to 255, computing δ for each value; the u at which δ attains its maximum is taken as the foreground/background threshold. After gray conversion, every pixel whose gray value is greater than the foreground/background threshold u is set to 255 and the rest are set to 0, yielding the extraction image IF.
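The exhaustive threshold search of Step 8 (the maximum between-class variance, i.e. Otsu, criterion) can be sketched as:

```cpp
#include <vector>

// For each candidate threshold u, delta = w0 * w1 * (u0 - u1)^2, where pixels
// with gray value > u are foreground; the u maximizing delta is returned.
int otsu_threshold(const std::vector<int>& gray) {
    int hist[256] = {0};
    for (int g : gray) ++hist[g];
    int n = (int)gray.size(), best_u = 0;
    double best_delta = -1.0;
    for (int u = 0; u < 256; ++u) {
        double n0 = 0, sum0 = 0, n1 = 0, sum1 = 0;
        for (int g = 0; g < 256; ++g) {
            if (g > u) { n0 += hist[g]; sum0 += (double)g * hist[g]; }  // foreground
            else       { n1 += hist[g]; sum1 += (double)g * hist[g]; }  // background
        }
        if (n0 == 0 || n1 == 0) continue;       // skip degenerate splits
        double w0 = n0 / n, w1 = n1 / n;
        double u0 = sum0 / n0, u1 = sum1 / n1;
        double delta = w0 * w1 * (u0 - u1) * (u0 - u1);
        if (delta > best_delta) { best_delta = delta; best_u = u; }
    }
    return best_u;
}
```

On a bimodal histogram the maximizing u falls between the two modes, so foreground and background separate cleanly.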
The beneficial effects of the invention are as follows. The invention proposes a method for extracting text from a complex background image. The method locates text regions accurately and completely from the corner-position information detected in the image, improving recall and robustness. It judges the background complexity of each cut-out text region: for regions with complex backgrounds it applies k-means color clustering, then finds the class the text belongs to from the color information of the densest corner area and extracts it; for regions with simple backgrounds it binarizes with the maximum between-class variance method. The background of the located text region is removed cleanly, which facilitates later text recognition and markedly raises the recognition rate.
Description of drawings
Fig. 1 is the overall flow chart of the text extraction method in a complex background image of the present invention.
Fig. 2 is the structural drawing of the approximately circular template N(x, y) used in the present invention.
Fig. 3 is the flow chart of the k-means clustering algorithm used in the present invention.
Embodiment
The method for extracting text from a complex background image proposed by the present invention is described in detail below with reference to the drawings.
All the processes of the method are implemented in the C++ programming language on the VS2010 platform under the Windows operating system. We choose a 512×512 network image containing text as the source image and, taking it as an example, locate and extract the text with the proposed method and check the extraction result. Fig. 1 is the overall flow chart of the method; the concrete steps are as follows:
Step 1: convert the source image src to grayscale with the weighted-mean method to obtain the gray image Img; the gray value of every pixel is computed as:
Gray = 0.30R + 0.59G + 0.11B,
where R, G and B are the three channel values of the pixel in src, and Gray is the gray value of the pixel after conversion;
Step 2: use the SUSAN operator to detect the corners in the gray image Img, store the corner coordinates in the corner container, build the corner matrix and mark the corners at the corresponding positions; the concrete steps are as follows:
Step 2.1: construct a two-dimensional matrix of the same size as the matrix of the image Img, denoted the corner matrix C; set every value in C to 0, and construct a corner container V to store the corner coordinates;
Step 2.2: since an image is made up of pixels, a truly circular template cannot be realized. To make detection respond well, an approximately circular template N(x, y) containing 37 pixels is constructed as the SUSAN detection template; this construction balances running time against detection accuracy. The concrete structure of the template N(x, y) is shown in Fig. 2; the center of the detection template N(x, y) is called the nucleus of the template;
Step 2.3: take any pixel r0 of the gray image Img as the point under test, place the nucleus of the detection template N(x, y) at r0, and compare the gray value of every non-nucleus point within N(x, y) with the gray value at the nucleus; the comparison function is as follows:
C(x, y) = 1, if |f(x, y) − f(x0, y0)| ≤ t;  C(x, y) = 0, otherwise (t being the gray-difference threshold)
where (x0, y0) is the coordinate of the nucleus in the gray image Img, (x, y) is the coordinate of a non-nucleus point within the template N(x, y), f(x0, y0) and f(x, y) are the gray values at the nucleus and at the non-nucleus point respectively, and C(x, y) is the gray comparison result; if a non-nucleus point lies outside the image, C(x, y) is set directly to 0;
Step 2.4: compute the total gray-difference function value of the pixel r0, i.e. sum the comparison results of Step 2.3:
S(x0, y0) = Σ C(x, y), the sum taken over all (x, y) ∈ N(x, y) with (x, y) ≠ (x0, y0);
Step 2.5: judge whether the point r0 is a corner according to the SUSAN corner response function, which is as follows:
R(x0, y0) = H − S(x0, y0), if S(x0, y0) < H;  R(x0, y0) = 0, otherwise
where H is the corner response threshold, set to
H = S_max / 2
where S_max is the maximum value that S(x0, y0) can attain (for this template the maximum is 36).
Judge whether the corner response of the pixel r0 is the maximum within the local region centered on it; to obtain good detection results, repeated experiments led to choosing a 5×5 window for this local region. If the corner response is the maximum, keep the point: store its coordinate in V, set the value at the corresponding position of C to 1, and then enter Step 2.6; otherwise enter Step 2.6 directly;
Step 2.6: traverse the gray image Img, repeating Steps 2.3 to 2.5, to find all the corners in it;
Step 3: remove the isolated corners in C; the concrete steps are as follows:
Step 3.1: in general, a 15×15 region of text contains a certain amount of characters and therefore a certain number of corners. Take the first corner d0 in V and count the number Sum of 1-values within the 15×15 region of C centered on d0;
Step 3.2: compare Sum with the corner threshold T: if Sum is less than T, remove d0 from V, reset the value at the corresponding position of C to 0 and then enter Step 3.3; otherwise enter Step 3.3 directly. From repeated experiments, the empirical value T = 4 is taken here for good effect;
Step 3.3: repeat until every point in the corner container V has been traversed;
Step 4: locate the text with the integral-projection transform; the concrete steps are as follows:
Step 4.1: compute the horizontal integral projection of the corner matrix C according to:
H(y_h) = Σ_{x=0…n−1} F(x, y_h),
where F(x, y_h) is the value of the corner matrix C at (x, y_h), n is the length of the source image src, and H(y_h) is the integral projection vector at y_h along the y direction; this yields a group SHY of y-direction integral projection vectors. Judge from 0 along the y direction: let y0 be any point; if the integral projection vector at y0, compared with all the integral projection vectors within [y0, y0 + Δy], rises or falls in amplitude by more than the amplitude threshold Tm, record the coordinate y0. If the amplitude rises here, record the companion value flag_y0 = 1; if it falls, record flag_y0 = −1. Traverse every point along the y direction, finally obtaining the coordinate group Y(y0, y1, …, yk) and its companion group Flagy(flag_y0, flag_y1, …, flag_yk), where y0 < y1 < … < yk. According to repeated experiments, the empirical values Δy = 8 and Tm = 6 are chosen;
Step 4.2: calculate the vertical integral projection of angle point Matrix C, computing formula is as follows:
V ( x v ) = Σ y = 0 m - 1 F ( x v , y ) ,
F (x in the formula v, be that the angle point Matrix C is at (x y) v, value y), m is the width of source images src, V (x v) x on the expression x direction vThe integral projection vector at place obtains one group of x direction integral projection vector group SHX; Since 0 judgement, establish x on the x direction 0For more arbitrarily, if from x 0Place's integral projection vector and x 0All integral projection vectors are compared in+△ the x, and amplitude rises or descends and exceeds amplitude thresholds Tm, measuring point x 0Coordinate, if herein amplitude rises, note x 0The value of following flag X0=1, if amplitude descends, note x 0The value of following flag X0=-1, have a few on the traversal x direction, finally obtain set of coordinates X (x 0, x 1..., x i) and its follow set of coordinates Flagx (flag X0, flag X1..., flag Xi), x wherein 0<x 1<...<x i, according to repeatedly experiment, choose empirical value △ x=8, Tm=6;
Step 4.3: from Y(y0, y1, …, yk), select the first coordinate whose companion value is 1 and denote it ya; then select the nearest coordinate greater than ya whose companion value is −1 and denote it yb; together they form a y-direction locating pair (ya, yb). Continue over Y(y0, y1, …, yk), selecting all the y-direction locating pairs that satisfy the above condition;
Step 4.4: from X(x0, x1, …, xi), select the first coordinate whose companion value is 1 and denote it xc; then select the nearest coordinate greater than xc whose companion value is −1 and denote it xd; together they form an x-direction locating pair (xc, xd). By the same method, continue in turn over the remaining coordinates of X(x0, x1, …, xi) until all the x-direction locating pairs have been constructed;
Step 4.5: build the rectangular region set S; any rectangular region in the set S is constructed as follows:
choose arbitrarily one pair from the y-direction locating pairs and one pair from the x-direction locating pairs, four coordinates in all; these four coordinates define a rectangle. Keeping the center of the rectangle fixed, lengthen each side by 4 pixels to obtain the rectangular region;
Step 5: use prior knowledge to judge the elements of S further, screening out and removing the non-text regions; the concrete steps are as follows:
Step 5.1: let s0 be any element of S. Judge whether the number of 1-values of the corner matrix C within the positions of s0 is less than the corner screening threshold Tc; if so, remove s0 and enter Step 5.3; otherwise enter Step 5.2 directly. Tc = 15 is chosen here;
Step 5.2: judge whether the ratios of the area and height of s0 to the area and height of the source image src are greater than the area threshold Ta and the height threshold Tl respectively; if both are, remove s0 and enter Step 5.3; otherwise enter Step 5.3 directly. Ta = 0.6 and Tl = 0.5 are chosen here; these thresholds are empirical values obtained from statistics over a large number of network pictures and from experiments;
Step 5.3: traverse all the elements of S; the elements that remain form the located text-region set, denoted O. Set a loop flag M whose initial value is the number of elements in O;
Step 6: the sign that will circulate M subtracts 1, again according to M rectangular element O among the O MPositional information, from source images src, intercept out the text subgraph and be designated as P, will be behind the text subgraph P gray processing detect its Gray Level Jump situation, judge whereby whether its background complicated, concrete determining step is as follows:
Step 6.1: convert the text subimage P to grayscale using the formula of step 1, obtaining the gray subimage PG. Construct a two-dimensional matrix of the same size as the matrix of P, denote it the row difference matrix D, and set every value in D to 0;
Step 6.2: compute the row difference matrix D of the gray subimage PG according to the following rule:
[formula given as an image in the original: value is set to 1 at positions where the gray level jumps between adjacent pixels along the row direction of PG, and remains 0 otherwise]
where l is the pixel row index of PG, c is the pixel column index of PG, g(l, c) is the gray value of PG at that position, and value is the value of the row difference matrix D at the corresponding position;
Step 6.3: count the number K of 1s in the row difference matrix D and divide K by the area of the text subimage P. If the ratio is greater than the difference threshold Td, judge the background of P to be complex and skip to step 7; otherwise execute step 6.4. Based on repeated experiments, Td = 0.15;
Step 6.4: compute the column difference matrix D1 of the gray subimage PG, first setting every value in D1 to 0, according to the following rule:
[formula given as an image in the original: value1 is set to 1 at positions where the gray level jumps between adjacent pixels along the column direction of PG, and remains 0 otherwise]
where value1 is the value of the column difference matrix D1 at the corresponding position;
Step 6.5: let K1 be the number of 1s in the column difference matrix D1. Divide K1 by the area of the text subimage P; if the result is greater than the difference threshold Td, judge the background of P to be complex and skip to step 7, otherwise execute step 6.6;
Step 6.6: denote the text subimage P the preselected text image PF and skip to step 8;
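Steps 6.1 to 6.5 can be sketched as follows. Td = 0.15 is the patent's threshold; the gray-jump threshold (40 here) is an illustrative assumption, since the patent gives that formula only as an image.

```python
def is_complex_background(gray, Td=0.15, jump=40):
    """Steps 6.2-6.5 sketch: count gray-level jumps between adjacent pixels
    along rows and along columns, and compare each count with Td times the
    image area."""
    rows, cols = len(gray), len(gray[0])
    area = rows * cols
    # row difference matrix: jumps between horizontally adjacent pixels
    k_row = sum(1 for l in range(rows) for c in range(cols - 1)
                if abs(gray[l][c + 1] - gray[l][c]) > jump)
    if k_row / area > Td:
        return True
    # column difference matrix: jumps between vertically adjacent pixels
    k_col = sum(1 for l in range(rows - 1) for c in range(cols)
                if abs(gray[l + 1][c] - gray[l][c]) > jump)
    return k_col / area > Td

flat = [[128] * 10 for _ in range(10)]
print(is_complex_background(flat))  # no jumps -> False
```

A uniform patch produces no jumps and is classified as a simple background, while a highly textured patch exceeds the 15% ratio and is routed to the clustering branch.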
Step 7: for a text subimage P whose background was judged complex in step 6, perform color clustering with the K-means clustering algorithm. Since the text in an image must remain readable, the background and the text have a certain contrast and the background cannot be arbitrarily complex, so the colors in the image are divided into 4 classes; the text information is then extracted after removing the background of P. The flow of the K-means clustering algorithm is shown in Figure 3, and the concrete steps are as follows:
Step 7.1: let the length of the text subimage P be length and its width be width. The three-channel pixel value at any point (x0, y0) of P forms a vector a0; traversing P yields an RGB vector group A of size length*width. Take A as the data set for color clustering, set the iteration counter I = 1, and randomly pick 4 vectors from A as the initial cluster centres Z_j(I), j = 1, 2, 3, 4;
Step 7.2: in the I-th iteration, compute the Euclidean distance D(a_i, Z_j(I)) between every datum in the RGB vector group A and each cluster centre, i = 1, 2, ..., length*width, j = 1, 2, 3, 4, and assign each datum to the class whose cluster centre is nearest;
Step 7.3: update each cluster centre according to the following formula:
Z_j(I+1) = (1/n_b) * Σ_{t=1}^{n_b} a_jt,
where a_jt denotes the data in class j, n_b is the number of data in class j, and Z_j(I+1) is the updated cluster centre;
Step 7.4: compute the sum of squared errors E_c over all vectors in the RGB vector group A:
E_c(I) = Σ_{j=1}^{4} Σ_{t=1}^{n_b} || a_jt − Z_j(I) ||²,
Compare E_c(I) with the next sum of squared errors E_c(I+1). Choose a sufficiently small positive number ξ; if |E_c(I) − E_c(I+1)| < ξ the clustering algorithm terminates, otherwise increment I by 1 and return to step 7.2 for another iteration;
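Steps 7.1 to 7.4 can be sketched as a plain K-means loop over RGB triples with the patent's stopping rule |E_c(I) − E_c(I+1)| < ξ. The values of ξ and the random seed are illustrative choices, not from the patent.

```python
import random

def kmeans_rgb(pixels, k=4, xi=1.0, seed=0):
    """K-means on a list of (R, G, B) tuples; stops when the sum of
    squared errors changes by less than xi between iterations."""
    rng = random.Random(seed)
    centres = rng.sample(pixels, k)          # step 7.1: random initial centres
    prev_sse = None
    while True:
        clusters = [[] for _ in range(k)]
        sse = 0.0
        for p in pixels:
            # step 7.2: assign each pixel to the nearest centre
            d = [sum((a - b) ** 2 for a, b in zip(p, z)) for z in centres]
            j = d.index(min(d))
            clusters[j].append(p)
            sse += d[j]
        # step 7.3: move each centre to the mean of its cluster
        for j, members in enumerate(clusters):
            if members:
                centres[j] = tuple(sum(ch) / len(members)
                                   for ch in zip(*members))
        # step 7.4: stop once the SSE has essentially stabilised
        if prev_sse is not None and abs(prev_sse - sse) < xi:
            return centres, clusters
        prev_sse = sse
```

With k = 4 this matches the patent's choice of four color classes; the sketch keeps a cluster's old centre if it ends up empty, a detail the patent does not specify.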
Step 7.5: construct four three-channel two-dimensional cluster matrices Cluster_i, i = 1, 2, 3, 4, of the same size as the matrix of the text subimage P. Set the three-channel pixel value of every pixel in each Cluster_i to (0, 0, 0); for every pixel whose cluster label is i, set the three-channel pixel value at the corresponding position in Cluster_i to (255, 255, 255);
Step 7.6: take the intersection of the position of maximum horizontal projection amplitude and the position of maximum vertical projection amplitude obtained by the integral projection transform of step 4, and construct a 5*5 rectangular region centred on this intersection. According to experiment and statistics, such a region contains a large number of corners and text strokes. Count the class membership of the three-channel pixel-value vectors of the points in this region and find the class with the most members; this determines the class the text belongs to. Save the corresponding cluster matrix as the extracted image IF, and go to step 9;
Step 8: convert the preselected text image PF to grayscale using the gray-value formula of step 1, then binarize it with an adaptive threshold-selection algorithm. The foreground/background separation threshold u is chosen as follows:
Let:
δ = ω0·ω1·(u0 − u1)²
where u is the foreground/background separation threshold, ω0 is the proportion of the image occupied by foreground pixels, u0 is the mean foreground gray value, ω1 is the proportion occupied by background pixels, u1 is the mean background gray value, and δ is the between-class variance of foreground and background. Try u in order from 0 to 255, computing δ for each; the value of u at which δ attains its maximum is taken as the foreground/background separation threshold. In the grayscale image, pixels whose gray value exceeds the threshold u are set to 255 and the rest to 0, yielding the extracted image IF;
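The threshold search of step 8 (Otsu's maximum between-class variance method) can be sketched as follows; a minimal version working on a flat list of gray values, with invented names.

```python
def otsu_threshold(gray_values):
    """Scan u = 0..255 and return the u that maximises the between-class
    variance delta = w0 * w1 * (u0 - u1)**2."""
    n = len(gray_values)
    hist = [0] * 256
    for v in gray_values:
        hist[v] += 1
    best_u, best_delta = 0, -1.0
    for u in range(256):
        w0 = sum(hist[:u + 1]) / n   # proportion of pixels at or below u
        w1 = 1.0 - w0                # proportion of pixels above u
        if w0 == 0 or w1 == 0:
            continue                 # one class empty: variance undefined
        u0 = sum(v * hist[v] for v in range(u + 1)) / (w0 * n)
        u1 = sum(v * hist[v] for v in range(u + 1, 256)) / (w1 * n)
        delta = w0 * w1 * (u0 - u1) ** 2
        if delta > best_delta:
            best_u, best_delta = u, delta
    return best_u
```

Binarization then sets every pixel above the returned threshold to 255 and the rest to 0, as the step describes.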
Step 9: check the loop flag M. If M > 0, skip to step 6 to perform the accurate extraction of the text in the next text region; if M = 0, the accurate extraction of the text of the source image src is complete.

Claims (9)

1. A method for extracting text from a complex background image, characterized in that its steps are as follows:
Step 1: convert the source image src to grayscale with the weighted-mean method, obtaining the gray image Img;
Step 2: detect the corners in the gray image Img, deposit the corner coordinates in a corner container, and construct the corner matrix;
Step 3: remove the isolated corners from the corner matrix;
Step 4: locate the text using the integral projection transform;
Step 5: screen out and remove non-text regions;
Step 6: crop the text subimage from the source image src and judge whether the background of the text subimage is complex; if the background is judged complex, execute step 7; if it is judged not complex, execute step 8;
Step 7: perform color clustering on the text subimage with the complex background, then extract the text information after removing the background of the text subimage; execute step 9;
Step 8: convert the text subimage with the non-complex background to grayscale, binarize it with an adaptive threshold-selection algorithm choosing a foreground/background separation threshold, and extract the text information of the text subimage;
Step 9: execute steps 6 to 9 in a loop until the text information of all text subimages in the source image src has been accurately extracted.
2. The method for extracting text from a complex background image according to claim 1, characterized in that in step 1, the source image src is converted to grayscale with the weighted-mean method to obtain the gray image Img, the gray value of each point of src being computed as:
Gray = 0.30R + 0.59G + 0.11B
where R, G and B are the three-channel pixel values of the point in the source image src, and Gray is the gray value of the point after grayscale conversion.
3. The method for extracting text from a complex background image according to claim 2, characterized in that in step 2, the corners in the gray image Img are detected, the corner coordinates are deposited in the corner container, and the corner matrix is constructed; this step detects the corners in Img with the SUSAN operator, and its concrete steps are as follows:
Step 2.1: construct a two-dimensional matrix of the same size as the matrix of the gray image Img, denote it the corner matrix C, set every value in C to 0, and construct the corner container V to store corner coordinates;
Step 2.2: construct an approximately circular SUSAN detection template N(x, y) containing 37 pixels; the centre of the template N(x, y) is its nucleus;
Step 2.3: choose any pixel r0 of the gray image Img as the point under test, place the nucleus of the template N(x, y) at r0, and compare the gray value of every non-nucleus point inside N(x, y) with that of the nucleus; the comparison function (given as an image in the original; the standard SUSAN similarity test) is:
C(x, y) = 1 if |f(x, y) − f(x0, y0)| ≤ t, and C(x, y) = 0 otherwise,
where (x0, y0) is the coordinate of the nucleus in the gray image Img, (x, y) is the coordinate of a non-nucleus point inside the template N(x, y), f(x0, y0) and f(x, y) are the gray values at the nucleus and at the non-nucleus point respectively, t is the gray-difference threshold, and C(x, y) is the gray comparison result at (x, y); if a non-nucleus point lies outside the image, C(x, y) is set directly to 0;
Step 2.4: compute the total gray-difference function value of the pixel r0, i.e. sum the comparison results of step 2.3:
S(x0, y0) = Σ_{(x,y)∈N(x,y), (x,y)≠(x0,y0)} C(x, y)
Step 2.5: judge from the SUSAN corner response function whether the point r0 is a corner; the corner response function (given as an image in the original; the standard SUSAN response) is:
R(x0, y0) = h − S(x0, y0) if S(x0, y0) < h, and R(x0, y0) = 0 otherwise,
where h is the corner response threshold;
judge whether the corner response of the pixel r0 is the maximum within the 5*5 local region centred on it; if it is, keep the point, deposit its coordinates in the corner container V, set the value at the corresponding position in the corner matrix C to 1, and execute step 2.6; if it is not the maximum, execute step 2.6 directly;
Step 2.6: traverse the gray image Img, repeating steps 2.3 to 2.5, to find all corners in the gray image Img.
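Steps 2.1 to 2.6 can be sketched as follows. The 37-pixel circular mask matches the claim; the thresholds t = 25 and h = 18 are illustrative assumptions, since the claim leaves them unspecified.

```python
def susan_corners(img, t=25, h=18):
    """SUSAN corner sketch: for each pixel, count template neighbours whose
    gray value is within t of the nucleus; respond with h - count when the
    count is below h; keep only 5x5 local maxima of the response."""
    rows, cols = len(img), len(img[0])
    # offsets of the 37-pixel circular template (row widths 3,5,7,7,7,5,3)
    mask = [(dy, dx) for dy in range(-3, 4) for dx in range(-3, 4)
            if abs(dx) <= (1 if abs(dy) == 3 else 2 if abs(dy) == 2 else 3)]
    resp = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            s = 0
            for dy, dx in mask:
                if (dy, dx) == (0, 0):
                    continue
                yy, xx = y + dy, x + dx
                # points outside the image contribute 0 (step 2.3)
                if 0 <= yy < rows and 0 <= xx < cols:
                    if abs(img[yy][xx] - img[y][x]) <= t:
                        s += 1
            # response is large when the similar (USAN) area is small
            resp[y][x] = h - s if s < h else 0
    # non-maximum suppression in a 5x5 neighbourhood (step 2.5)
    corners = []
    for y in range(rows):
        for x in range(cols):
            if resp[y][x] <= 0:
                continue
            neigh = [resp[j][i] for j in range(max(0, y - 2), min(rows, y + 3))
                     for i in range(max(0, x - 2), min(cols, x + 3))]
            if resp[y][x] == max(neigh):
                corners.append((y, x))
    return corners
```

On a synthetic image containing a single bright quadrant, the quadrant's corner pixel produces a small USAN area and therefore a high response.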
4. The method for extracting text from a complex background image according to claim 3, characterized in that in step 3, the isolated corners are removed from the corner matrix; the concrete steps are as follows:
Step 3.1: take the first untraversed corner d0 in V and count the number Sum of positions with value 1 in the corner matrix C inside the 15*15 rectangular region centred on d0;
Step 3.2: compare Sum with the corner threshold T; if Sum is less than T, remove d0 from V, reset the value at the corresponding position in the corner matrix C to 0, and execute step 3.3; if Sum is greater than or equal to T, execute step 3.3 directly;
Step 3.3: judge whether all points in the corner container V have been traversed; if not, go back to step 3.1.
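The isolated-corner removal of claim 4 can be sketched as follows, working directly on the corner matrix; the corner threshold value T = 3 is an illustrative assumption (the claim names T but does not fix it).

```python
def remove_isolated(corner_matrix, T=3):
    """A corner (value 1) is kept only if the 15x15 window centred on it
    contains at least T corners in the original matrix; otherwise it is
    reset to 0 in the returned copy."""
    rows, cols = len(corner_matrix), len(corner_matrix[0])
    out = [row[:] for row in corner_matrix]
    for y in range(rows):
        for x in range(cols):
            if corner_matrix[y][x] != 1:
                continue
            total = sum(corner_matrix[j][i]
                        for j in range(max(0, y - 7), min(rows, y + 8))
                        for i in range(max(0, x - 7), min(cols, x + 8)))
            if total < T:
                out[y][x] = 0  # isolated: unlikely to belong to text
    return out
```

Counting in the original matrix rather than the partially edited copy makes the result independent of traversal order, which the claim leaves unspecified.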
5. The method for extracting text from a complex background image according to claim 4, characterized in that in step 4, the text is located using the integral projection transform; the concrete steps are as follows:
Step 4.1: project the corner matrix C horizontally by integral projection, obtaining a group of y-direction integral projection vectors SHY. Judge along the y direction starting from 0: let y0 be any point; if, comparing the projection vector at y0 with all projection vectors within y0 + Δy, the amplitude rises or falls by more than the amplitude threshold Tm, record the coordinate of y0; if the amplitude rises there, record the accompanying value flag_y0 = 1, and if it falls, record flag_y0 = −1. Traverse all points along the y direction to finally obtain the coordinate group Y(y0, y1, ..., yk) and its accompanying flag group Flagy(flag_y0, flag_y1, ..., flag_yk), where y0 < y1 < ... < yk;
Step 4.2: project the corner matrix C vertically by integral projection, obtaining a group of x-direction integral projection vectors SHX. Judge along the x direction starting from 0: let x0 be any point; if, comparing the projection vector at x0 with all projection vectors within x0 + Δx, the amplitude rises or falls by more than the amplitude threshold Tm, record the coordinate of x0; if the amplitude rises there, record the accompanying value flag_x0 = 1, and if it falls, record flag_x0 = −1. Traverse all points along the x direction to finally obtain the coordinate group X(x0, x1, ..., xi) and its accompanying flag group Flagx(flag_x0, flag_x1, ..., flag_xi), where x0 < x1 < ... < xi;
Step 4.3: from Y(y0, y1, ..., yk), select the first coordinate whose accompanying value is 1 and denote it ya, then select the nearest larger coordinate whose accompanying value is −1 and denote it yb; together they form a y-direction coordinate pair (ya, yb). Continue selecting, in turn, coordinates with accompanying value 1 from the remaining coordinates of Y and, for each, the nearest larger coordinate with accompanying value −1, until all y-direction coordinate pairs have been constructed;
Step 4.4: from X(x0, x1, ..., xi), select the first coordinate whose accompanying value is 1 and denote it xc, then select the nearest larger coordinate whose accompanying value is −1 and denote it xd; together they form an x-direction coordinate pair (xc, xd). Continue selecting, in turn, coordinates with accompanying value 1 from the remaining coordinates of X and, for each, the nearest larger coordinate with accompanying value −1, until all x-direction coordinate pairs have been constructed;
Step 4.5: build the rectangular-region set S; any rectangular region in S is constructed as follows:
choose one coordinate pair from the y-direction coordinate pairs and one from the x-direction coordinate pairs, four coordinates in total; these four coordinates define a rectangle; keeping the centre of this rectangle fixed, lengthen each side by 4 pixels to obtain a rectangular region.
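The rise/fall scan and pairing of steps 4.1 to 4.4 can be sketched on a one-dimensional projection profile. This is a simplification: it flags rises and falls between adjacent samples rather than over a window Δy, and the default Tm = 5 is illustrative.

```python
def projection_intervals(profile, Tm=5):
    """Flag positions where the projection amplitude rises (+1) or falls
    (-1) by more than Tm, then pair each rise with the nearest following
    fall, as in steps 4.3/4.4."""
    flags = []
    for i in range(len(profile) - 1):
        if profile[i + 1] - profile[i] > Tm:
            flags.append((i, 1))    # amplitude rises: an interval may start
        elif profile[i] - profile[i + 1] > Tm:
            flags.append((i, -1))   # amplitude falls: an interval may end
    intervals, start = [], None
    for pos, f in flags:
        if f == 1 and start is None:
            start = pos
        elif f == -1 and start is not None:
            intervals.append((start, pos))
            start = None
    return intervals
```

Running the same scan on the horizontal and on the vertical projection of the corner matrix yields the y-direction and x-direction coordinate pairs that step 4.5 combines into rectangles.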
6. The method for extracting text from a complex background image according to claim 5, characterized in that in step 5, non-text regions are screened out and removed; the concrete steps are as follows:
Step 5.1: let s0 be any element of the rectangular-region set S. Count the positions inside s0 whose value in the corner matrix C is 1; if this number is less than the corner screening threshold Tc, remove s0 and execute step 5.3; if it is greater than or equal to Tc, go directly to step 5.2;
Step 5.2: let T1 be the ratio of the area of s0 to the area of the source image src, and T2 the ratio of the height of s0 to the height of src; judge whether T1 is greater than the area threshold Ta and T2 is greater than the height threshold Tl; if T1 > Ta and T2 > Tl, remove s0 from S and execute step 5.3, otherwise execute step 5.3 directly;
Step 5.3: traverse all elements of the rectangular-region set S; the remaining elements form the located text-region set, denoted O. Define a loop flag M whose initial value is the number of elements in O.
7. The method for extracting text from a complex background image according to claim 6, characterized in that in step 6, the text subimage is cropped from the source image src and the background of the text subimage is judged complex or not;
this step decrements the loop flag M by 1, then, using the position information of the M-th rectangular element O_M in O, crops the text subimage from the source image src and denotes it P; P is converted to grayscale and its gray-level jumps are examined to judge whether the background of the text subimage P is complex. The concrete judgment steps are as follows:
Step 6.1: convert the text subimage P to grayscale using the gray-value formula of step 1 to obtain the gray subimage PG, and construct a two-dimensional matrix of the same size as the matrix of P;
Step 6.2: compute the row difference matrix D of the gray subimage PG, first setting every value in D to 0, according to the following rule:
[formula given as an image in the original: value is set to 1 at positions where the gray level jumps between adjacent pixels along the row direction of PG, and remains 0 otherwise]
where l is the pixel row index of PG, c is the pixel column index of PG, g(l, c) is the gray value of PG at that position, and value is the value of the row difference matrix D at the corresponding position;
Step 6.3: let K be the number of 1s in the row difference matrix D, and divide K by the area of the text subimage P; if the result is greater than the difference threshold Td, judge the background of P to be complex and skip to step 7, otherwise execute step 6.4;
Step 6.4: compute the column difference matrix D1 of the gray subimage PG, first setting every value in D1 to 0, according to the following rule:
[formula given as an image in the original: value1 is set to 1 at positions where the gray level jumps between adjacent pixels along the column direction of PG, and remains 0 otherwise]
where value1 is the value of the column difference matrix D1 at the corresponding position;
Step 6.5: let K1 be the number of 1s in the column difference matrix D1, and divide K1 by the area of the text subimage P; if the result is greater than the difference threshold Td, judge the background of the text subimage P to be complex; if the result is not greater than Td, judge the background of the text subimage P to be not complex.
8. The method for extracting text from a complex background image according to claim 7, characterized in that in step 7, for a text subimage whose background was judged complex in step 6, color clustering is performed and the text information is then extracted after removing the background of the text subimage;
the concrete steps are as follows:
Step 7.1: let the length of the text subimage P be length and its width be width; traversing P yields an RGB vector set A of size length*width, A = {a0, a1, ..., ai, ..., a_length*width}, where ai is the vector formed by the three-channel pixel value at any point (xi, yi) of P, i = 1, 2, ..., length*width;
perform color clustering on A: set the iteration counter I = 1 and randomly pick 4 vectors from A as the initial cluster centres Z_j(I), j = 1, 2, 3, 4; let E_c(I) be the sum of squared errors over all vectors of A at the I-th iteration, with E_c(0) = 0;
Step 7.2: in the I-th iteration, compute the Euclidean distance D(a_i, Z_j(I)) between every vector datum in the RGB vector group A and each cluster centre, and assign each datum to the class whose cluster centre is nearest;
Step 7.3: update each cluster centre according to the following formula:
Z_j(I+1) = (1/n_b) * Σ_{t=1}^{n_b} a_jt
where a_jt denotes the data in class j, n_b is the number of data in class j, and Z_j(I+1) is the updated cluster centre;
Step 7.4: compute the sum of squared errors E_c over all vectors of the RGB vector set A:
E_c(I) = Σ_{j=1}^{4} Σ_{t=1}^{n_b} || a_jt − Z_j(I) ||²
Compare E_c(I) with the sum of squared errors E_c(I−1) of the previous iteration; choose a sufficiently small positive number ξ; if |E_c(I) − E_c(I−1)| < ξ the clustering algorithm terminates and step 7.5 is executed, otherwise increment I by 1 and return to step 7.2 for another iteration;
Step 7.5: construct four three-channel two-dimensional cluster matrices Cluster_i, i = 1, 2, 3, 4, of the same size as the matrix of the text subimage P; set the three-channel pixel value of every pixel in each Cluster_i to (0, 0, 0), and for every pixel whose cluster label is i, set the three-channel pixel value at the corresponding position in Cluster_i to (255, 255, 255);
Step 7.6: take the intersection of the position of maximum horizontal projection amplitude and the position of maximum vertical projection amplitude obtained by the integral projection transform of step 4, construct a 5*5 rectangular region centred on this intersection, count the class membership of the three-channel pixel-value vectors of the points in this region, find the class with the most members and thereby determine the class the text belongs to, and save the corresponding cluster matrix as the extracted image IF.
9. The method for extracting text from a complex background image according to claim 8, characterized in that in step 8, for a text subimage whose background is not complex, the text information of the text subimage is extracted as follows: convert the text subimage to grayscale using the gray-value formula of step 1, then binarize it with an adaptive threshold-selection algorithm; the foreground/background separation threshold u is chosen as follows:
Let:
δ = ω0·ω1·(u0 − u1)²
where u is the foreground/background separation threshold, ω0 is the proportion of the image occupied by foreground pixels, u0 is the mean foreground gray value, ω1 is the proportion occupied by background pixels, u1 is the mean background gray value, and δ is the between-class variance of foreground and background; try u in order from 0 to 255, computing δ for each, and take the value of u at which δ attains its maximum as the foreground/background separation threshold; in the grayscale image, set pixels whose gray value exceeds the threshold u to 255 and the rest to 0, obtaining the extracted image IF.
CN201310210040.4A 2013-05-29 2013-05-29 A kind of Text Extraction in complex background image Expired - Fee Related CN103268481B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310210040.4A CN103268481B (en) 2013-05-29 2013-05-29 A kind of Text Extraction in complex background image

Publications (2)

Publication Number Publication Date
CN103268481A true CN103268481A (en) 2013-08-28
CN103268481B CN103268481B (en) 2016-06-29

Family

ID=49012108

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310210040.4A Expired - Fee Related CN103268481B (en) 2013-05-29 2013-05-29 A kind of Text Extraction in complex background image

Country Status (1)

Country Link
CN (1) CN103268481B (en)

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104182750A (en) * 2014-07-14 2014-12-03 上海交通大学 Extremum connected domain based Chinese character detection method in natural scene image
WO2015007168A1 (en) * 2013-07-16 2015-01-22 Tencent Technology (Shenzhen) Company Limited Character recognition method and device
CN104573675A (en) * 2015-01-29 2015-04-29 百度在线网络技术(北京)有限公司 Operating image displaying method and device
CN105023014A (en) * 2015-08-21 2015-11-04 马鞍山市安工大工业技术研究院有限公司 Method for extracting tower target in unmanned aerial vehicle routing inspection power transmission line image
CN105120128A (en) * 2015-06-15 2015-12-02 广东光阵光电科技有限公司 High-pixel document photographing method and device
CN106022246A (en) * 2016-05-16 2016-10-12 浙江大学 Difference-based patterned-background print character extraction system and method
CN106407969A (en) * 2016-08-30 2017-02-15 杭州电子科技大学 Robust complex background video text positioning and extracting method
CN107302718A (en) * 2017-08-17 2017-10-27 河南科技大学 A kind of video caption area positioning method based on Corner Detection
CN108154188A (en) * 2018-01-08 2018-06-12 天津大学 Complex Background work Text Extraction based on FCM
CN108347643A (en) * 2018-03-05 2018-07-31 成都索贝数码科技股份有限公司 A kind of implementation method of the subtitle superposition sectional drawing based on deep learning
CN109409377A (en) * 2018-12-03 2019-03-01 龙马智芯(珠海横琴)科技有限公司 The detection method and device of text in image
CN109753797A (en) * 2018-12-10 2019-05-14 中国科学院计算技术研究所 For the intensive subgraph detection method and system of streaming figure
CN110135426A (en) * 2018-02-09 2019-08-16 北京世纪好未来教育科技有限公司 Sample mask method and computer storage medium
CN110321892A (en) * 2019-06-04 2019-10-11 腾讯科技(深圳)有限公司 A kind of picture screening technique, device and electronic equipment
CN110956203A (en) * 2019-11-14 2020-04-03 江苏大学 Static night scene light detection method based on light element distribution matching
CN111091123A (en) * 2019-12-02 2020-05-01 上海眼控科技股份有限公司 Text region detection method and equipment
CN111104936A (en) * 2019-11-19 2020-05-05 泰康保险集团股份有限公司 Text image recognition method, device, equipment and storage medium
CN111680685A (en) * 2020-04-14 2020-09-18 上海高仙自动化科技发展有限公司 Image-based positioning method and device, electronic equipment and storage medium
CN112967191A (en) * 2021-02-19 2021-06-15 泰康保险集团股份有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN113961175A (en) * 2021-12-16 2022-01-21 中电云数智科技有限公司 Method for processing picture text and computer readable storage medium
CN116269467A (en) * 2023-05-19 2023-06-23 中国人民解放军总医院第八医学中心 Information acquisition system before debridement of wounded patient

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120134588A1 (en) * 2010-11-29 2012-05-31 Microsoft Corporation Rectification of characters and text as transform invariant low-rank textures
CN103020618A (en) * 2011-12-19 2013-04-03 北京捷成世纪科技股份有限公司 Detection method and detection system for video image text


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
SHI JIANYONG ET AL: "An Edge-based Approach for Video Text Extraction", 《INTERNATIONAL CONFERENCE ON COMPUTER TECHNOLOGY AND DEVELOPMENT》, 31 December 2009 (2009-12-31), pages 331 - 335 *
STEPHEN M. SMITH ET AL: "SUSAN-A New Approach to Low Level Image Processing", 《INTERNATIONAL JOURNAL OF COMPUTER VISION》, vol. 23, no. 1, 31 December 1997 (1997-12-31), pages 45 - 78 *
Tang Weiwei: "Research on Methods for Locating and Extracting Text in Video Images", China Master's Theses Full-text Database, Information Science and Technology, no. 1, 15 January 2010 (2010-01-15), pages 138 - 358 *
Liao Jia et al.: "A Fast and Simple Method for Locating Text in Color Images", Computer Knowledge and Technology, vol. 6, no. 28, 31 October 2010 (2010-10-31), pages 8075 - 8077 *

Cited By (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015007168A1 (en) * 2013-07-16 2015-01-22 Tencent Technology (Shenzhen) Company Limited Character recognition method and device
US9349062B2 (en) 2013-07-16 2016-05-24 Tencent Technology (Shenzhen) Company Limited Character recognition method and device
CN104182750A (en) * 2014-07-14 2014-12-03 上海交通大学 Extremum connected domain based Chinese character detection method in natural scene image
CN104182750B (en) * 2014-07-14 2017-08-01 上海交通大学 A kind of Chinese detection method based on extreme value connected domain in natural scene image
CN104573675B (en) * 2015-01-29 2018-10-09 作业帮教育科技(北京)有限公司 The methods of exhibiting and device of operation image
CN104573675A (en) * 2015-01-29 2015-04-29 百度在线网络技术(北京)有限公司 Operating image displaying method and device
CN105120128A (en) * 2015-06-15 2015-12-02 广东光阵光电科技有限公司 High-pixel document photographing method and device
CN105023014A (en) * 2015-08-21 2015-11-04 马鞍山市安工大工业技术研究院有限公司 Method for extracting tower target in unmanned aerial vehicle routing inspection power transmission line image
CN105023014B (en) * 2015-08-21 2018-11-23 马鞍山市安工大工业技术研究院有限公司 A kind of shaft tower target extraction method in unmanned plane inspection transmission line of electricity image
CN106022246B (en) * 2016-05-16 2019-05-21 浙江大学 A kind of decorative pattern background printed matter Word Input system and method based on difference
CN106022246A (en) * 2016-05-16 2016-10-12 浙江大学 Difference-based patterned-background print character extraction system and method
CN106407969A (en) * 2016-08-30 2017-02-15 杭州电子科技大学 Robust complex background video text positioning and extracting method
CN107302718A (en) * 2017-08-17 2017-10-27 河南科技大学 A kind of video caption area positioning method based on Corner Detection
CN107302718B (en) * 2017-08-17 2019-12-10 河南科技大学 video subtitle area positioning method based on angular point detection
CN108154188A (en) * 2018-01-08 2018-06-12 天津大学 Complex Background work Text Extraction based on FCM
CN108154188B (en) * 2018-01-08 2021-11-19 天津大学 FCM-based artificial text extraction method under complex background
CN110135426A (en) * 2018-02-09 2019-08-16 北京世纪好未来教育科技有限公司 Sample mask method and computer storage medium
CN108347643A (en) * 2018-03-05 2018-07-31 成都索贝数码科技股份有限公司 A kind of implementation method of the subtitle superposition sectional drawing based on deep learning
CN108347643B (en) * 2018-03-05 2020-09-15 成都索贝数码科技股份有限公司 Subtitle superposition screenshot realization method based on deep learning
CN109409377A (en) * 2018-12-03 2019-03-01 龙马智芯(珠海横琴)科技有限公司 The detection method and device of text in image
CN109753797B (en) * 2018-12-10 2020-11-03 中国科学院计算技术研究所 Dense subgraph detection method and system for stream graph
CN109753797A (en) * 2018-12-10 2019-05-14 中国科学院计算技术研究所 For the intensive subgraph detection method and system of streaming figure
CN110321892A (en) * 2019-06-04 2019-10-11 腾讯科技(深圳)有限公司 A kind of picture screening technique, device and electronic equipment
CN110321892B (en) * 2019-06-04 2022-12-13 腾讯科技(深圳)有限公司 Picture screening method and device and electronic equipment
CN110956203A (en) * 2019-11-14 2020-04-03 江苏大学 Static night scene light detection method based on light element distribution matching
CN111104936A (en) * 2019-11-19 2020-05-05 泰康保险集团股份有限公司 Text image recognition method, device, equipment and storage medium
CN111091123A (en) * 2019-12-02 2020-05-01 上海眼控科技股份有限公司 Text region detection method and equipment
CN111680685A (en) * 2020-04-14 2020-09-18 上海高仙自动化科技发展有限公司 Image-based positioning method and device, electronic equipment and storage medium
CN111680685B (en) * 2020-04-14 2023-06-06 上海高仙自动化科技发展有限公司 Positioning method and device based on image, electronic equipment and storage medium
CN112967191A (en) * 2021-02-19 2021-06-15 泰康保险集团股份有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN112967191B (en) * 2021-02-19 2023-12-05 泰康保险集团股份有限公司 Image processing method, device, electronic equipment and storage medium
CN113961175A (en) * 2021-12-16 2022-01-21 中电云数智科技有限公司 Method for processing picture text and computer readable storage medium
CN113961175B (en) * 2021-12-16 2022-05-20 中电云数智科技有限公司 Method for processing picture text and computer readable storage medium
CN116269467A (en) * 2023-05-19 2023-06-23 中国人民解放军总医院第八医学中心 Information acquisition system before debridement of wounded patient
CN116269467B (en) * 2023-05-19 2023-07-28 中国人民解放军总医院第八医学中心 Information acquisition system before debridement of wounded patient

Also Published As

Publication number Publication date
CN103268481B (en) 2016-06-29

Similar Documents

Publication Publication Date Title
CN103268481A (en) Method for extracting text in complex background image
CN113160192B (en) Vision-based appearance defect detection method and device for snow-grooming vehicles under complex backgrounds
Li et al. A spatial clustering method with edge weighting for image segmentation
CN110232713B (en) Image target positioning correction method and related equipment
CN105957059B (en) Electronic component missing detection method and system
CN102096821B (en) Number plate identification method in strong-interference environments based on complex network theory
CN103218605B (en) Fast human-eye positioning method based on integral projection and edge detection
CN103150549B (en) Road tunnel fire detection method based on early-stage smoke motion features
CN107103317A (en) Blurred license plate image recognition algorithm based on image registration and blind deconvolution
CN107273896A (en) License plate detection and recognition method based on image recognition
CN104766344B (en) Vehicle detection method based on moving-edge extraction
CN104182985B (en) Remote sensing image change detection method
CN106709530A (en) License plate recognition method based on video
CN107452035B (en) Method and apparatus for analyzing lane line image and computer readable medium thereof
CN104123529A (en) Human hand detection method and system thereof
CN109919002A (en) Yellow no-stopping line recognition method and device, computer equipment, and storage medium
CN104282026A (en) Distribution uniformity assessment method based on watershed algorithm and minimum spanning tree
CN111738252B (en) Method, device and computer system for text line detection in images
Cuevas et al. White blood cell segmentation by circle detection using electromagnetism‐like optimization
CN103065118A (en) Image blurring detection method and device
CN105678737A (en) Digital image corner point detection method based on Radon transform
CN106355744A (en) Image identification method and device
CN111860531A (en) Fugitive dust pollution identification method based on image processing
Varjas et al. Car recognition from frontal images in mobile environment
CN103049749A (en) Method for human-body re-identification under grid occlusion

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20191219

Address after: No.01, room 1701, floor 14, building 6, yard 50, Xisanhuan North Road, Haidian District, Beijing 100048

Patentee after: Beijing baichi Data Service Co.,Ltd.

Address before: 12F, Block A, Spark Road Software Building, Hi-Tech Development Zone, Nanjing 210003, Jiangsu

Patentee before: FOCUS TECHNOLOGY Co.,Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160629
