CN108154151A

CN108154151A - A fast multi-oriented text line detection method

Info

Publication number: CN108154151A
Application number: CN201711385007.XA
Authority: CN
Inventors: 方承志; 樊梦雅; 黄梅玲; 顾子超
Original assignee: Nanjing University of Posts and Telecommunications
Current assignee: Nanjing University of Posts and Telecommunications
Priority date: 2017-12-20
Filing date: 2017-12-20
Publication date: 2018-06-12
Anticipated expiration: 2037-12-20
Also published as: CN108154151B

Abstract

The invention discloses a fast multi-oriented text line detection method. First, candidate connected regions are extracted from the natural scene image to be detected using the MSER algorithm. Next, a connected-region pairing algorithm is applied to the candidate connected regions to obtain candidate character regions, which are grouped by a linking rule; candidate text lines are then obtained using a lost-character recovery algorithm. Finally, text lines are separated from non-text lines by applying the AdaBoost algorithm to the features of the candidate multi-oriented text lines. By processing the candidate connected regions obtained by MSER with the connected-region pairing algorithm, the invention reduces computational complexity and speeds up scene-text extraction. Applying the AdaBoost algorithm to features extracted from the candidate multi-oriented text lines can greatly improve the accuracy of text detection.

Description

A fast multi-oriented text line detection method

Technical field

The invention belongs to the technical field of image processing and relates in particular to a fast multi-oriented text line detection method.

Background art

With the spread of smartphones and portable cameras, the number of images keeps growing. Text detection in natural images has a wide range of applications, such as robot navigation, human-computer interaction and image retrieval. Text detection in documents has made great progress and is widely used. However, because text in natural scene images varies widely in appearance and the backgrounds are complex, detecting text in natural scenes remains a challenging task.

Existing text detection methods fall roughly into three classes: texture-based methods, connected-region-based methods and hybrid methods. Most of them focus on detecting horizontal or near-horizontal text. This severely limits their applicability to images shot at arbitrary angles with mobile devices, and their performance usually drops sharply on multi-oriented text images. Moreover, most methods have high time complexity and low detection speed. How to locate multi-oriented text regions quickly and accurately in complex natural scenes is therefore a current research focus and difficulty for natural scene text detection in the image field.

Summary of the invention

The object of the invention is to overcome the deficiencies of the prior art by providing a fast multi-oriented text line detection method. The method can detect scene text lines that are arbitrarily oriented or curved, which improves the accuracy of text detection in natural scene images and facilitates subsequent recognition. It also greatly reduces the amount of computation and raises detection speed, and it is simple and effective.

To solve the above technical problem, the technical solution adopted by the invention is a fast multi-oriented text line detection method comprising the following basic steps:

Step 1: Extract candidate connected regions from the natural scene image to be detected using the MSER (maximally stable extremal regions) algorithm;

Step 2: Apply the connected-region pairing algorithm to the candidate connected regions to obtain candidate character regions, group them by a linking rule, and obtain candidate text lines using the lost-character recovery algorithm;

Step 3: Classify text lines and non-text lines by applying the AdaBoost algorithm to the features of the candidate multi-oriented text lines.

Further, in Step 2, the connected-region pairing algorithm proceeds as follows. In each candidate connected region, find every edge pixel s and its gradient direction ds. If s lies on a stroke edge, ds must be approximately perpendicular to the stroke direction. Search along the ray r = s + n*ds (n >= 0) for the corresponding edge pixel t on the opposite side. If the directions satisfy dt = -ds ± π/8, that is, dt lies within π/8 of the reverse of ds, edge pixels s and t are considered roughly opposite. If no matching t is found for s, or if ds and dt are not roughly opposite, the ray r is discarded. If a suitable t is found, s and t are both assigned the stroke width value ||s-t||, unless a pixel has already been assigned a smaller stroke width value. Computing the stroke width values of all edge pixels yields the candidate character regions, which are then passed through probability filtering to exclude regions that are clearly not characters.

Further, the probability filtering proceeds as follows. Collect the stroke width value of every edge pixel and build a histogram: the horizontal axis shows the distinct stroke width values, and the vertical axis shows the normalized number of edge pixels having each stroke width value. The stroke widths of text are largely uniform, so the distribution is concentrated and sharply peaked; the stroke widths of non-text vary widely, so the distribution is spread out and flat. Compute the mean E(N_s) and the standard deviation STD(N_s) of the normalized edge pixel counts over the stroke width values, and from them the change rate C(N_s) = STD(N_s)/E(N_s). If C(N_s) < 0.3, the region is judged to be a candidate character region and is retained; otherwise it is judged to be a non-character region and is filtered out. Removing the regions whose edge pixel counts differ widely across stroke widths excludes background regions that are clearly not characters.

Further, in Step 2, the linking rule is used to detect candidate text lines: a connected component algorithm merges adjacent candidate character regions into candidate text lines. Specifically, the edge pixels of the candidate character regions are scanned, and if the ratio of the average stroke widths of the edge pixels of two adjacent candidate character regions is less than 2, the two are treated as the same region and merged into a candidate text line.

Further, after the connected component algorithm, each connected region is labeled as one of two kinds: a candidate character region block or a region in which no character was detected. If a region in which no text was detected satisfies the following restrictive conditions, the lost-character recovery algorithm can be applied to it:

(1) its stroke width value equals that of the nearest candidate character region block;

(2) the height ratio of the region blocks is less than 2.0.

Further, the lost-character recovery algorithm of Step 2 proceeds as follows:

(1) Compute the midpoints of the candidate character region block and the undetected block, and the minimum distance between them. Once the midpoints and the minimum distance are known, compute the angle θ' as follows:

δ=(y₂-y₁)/(x₂-x₁), θ'=arctan δ

where (x₁,y₁) and (x₂,y₂) are the midpoints of the candidate character region block and the undetected block, respectively;

(2) Using the detected angle θ', the undetected block finds the nearest candidate character region block in the vertical or horizontal direction. The detected angle is added to the gradient direction θ of each pixel of the undetected block, and the resulting angle (θ+θ') becomes the new gradient direction of the undetected block;

(3) Based on similarity to the candidate characters in the candidate character region block, namely the stroke width value and the height ratio of the region blocks, obtain the candidate characters in the undetected block, and merge the candidate character region blocks and the recovered character blocks into a new group of candidate text lines.

Further, the features of the candidate multi-oriented text line include:

(1) the change rate of the stroke width ratio of the characters in the candidate multi-oriented text line;

(2) the change rate of the distance between characters in the candidate multi-oriented text line;

(3) the change rate of the character color in the candidate multi-oriented text line;

(4) the change rate of the character width-to-height ratio in the candidate multi-oriented text line;

(5) the change rate of the character area in the candidate multi-oriented text line;

(6) the change rate of the character edge density in the candidate multi-oriented text line;

(7) the change rate of the character pixel occupancy ratio in the candidate multi-oriented text line.

Here, A_i and A_{i+1} are the average stroke widths of two adjacent candidate regions; (x_i, y_i) and (x_{i+1}, y_{i+1}) are the center coordinates of the characters in two adjacent candidate regions; Y_i and Y_{i+1} are the average colors of two adjacent candidate regions; W_i and H_i are the width and height of a candidate region; Area(B_i) and Area(B_{i+1}) are the areas of two adjacent candidate regions; |cc_i| and S(cc_i) are the number of edge pixels contained in a candidate region and the area of that region; T₁ and T₂ are the number of pixels in a candidate region and in its bounding rectangle; N is the number of candidate character regions in the candidate text line; U and C are the mean and the change rate of a variable.

Compared with the prior art, the invention has the following beneficial effects:

(1) The invention applies the connected-region pairing algorithm to the candidate connected regions obtained by MSER to extract candidate character regions. Only edge pixels need to be processed, which reduces computational complexity and speeds up scene-text extraction.

(2) The invention links candidate characters into candidate text lines with the connected component algorithm and then uses the lost-character recovery algorithm to search for candidate character regions around the detected candidate text lines. Compared with the conventional approach of searching around individual characters, this saves a great deal of time and therefore speeds up text detection.

(3) Because the lost-character recovery algorithm searches around the candidate text lines already found, it can extend the region of a candidate text line, so arbitrarily oriented and curved scene text lines can be detected.

(4) The invention applies the AdaBoost algorithm to features extracted from the candidate multi-oriented text lines. AdaBoost is a general-purpose algorithm used in many fields. In text detection, the features fed to AdaBoost are usually extracted from single characters; here they are extracted from text lines, because a candidate text line made up of several characters carries more discriminative information than a single candidate character region and is easier to separate from non-text. The accuracy of text detection can therefore be greatly improved.

Description of the drawings

Fig. 1 is a flow diagram of the fast multi-oriented text line detection method of the invention;

Fig. 2 shows the candidate connected regions extracted by MSER in the invention;

Fig. 3 is a flow diagram of the multi-oriented text line detection of the invention;

Fig. 4 illustrates the recovery of undetected characters in the invention: (a) candidate character region blocks and regions with no detected character in curved text; (b) the computation of δ, θ and θ'.

Detailed description of the embodiments

The technical solution of the invention is described in further detail below with reference to the drawings. The examples given serve only to explain the invention and are not intended to limit its scope.

As shown in Fig. 1, this embodiment provides a fast multi-oriented text line detection method whose flow can be divided into the following steps:

Step 1: First, extract candidate connected regions from the natural scene image to be detected using the MSER algorithm.

Applying the MSER algorithm to the natural scene image to be detected in Step 1 yields the candidate connected regions shown in Fig. 2.

The process of Step 2 is shown in Fig. 3. In Step 2, the connected-region pairing algorithm is applied to the candidate connected regions obtained in Step 1, as follows. In each candidate connected region, find every edge pixel s and its gradient direction ds. If s lies on a stroke edge, ds must be approximately perpendicular to the stroke direction. Search along the ray r = s + n*ds (n >= 0) for the corresponding edge pixel t on the opposite side. If the directions satisfy dt = -ds ± π/8, that is, dt lies within π/8 of the reverse of ds, edge pixels s and t are considered roughly opposite. If no matching t is found for s, or if ds and dt are not roughly opposite, the ray r is discarded. If a suitable t is found, s and t are both assigned the stroke width value ||s-t|| (the Euclidean distance), unless a pixel has already been assigned a smaller stroke width value. Computing the stroke width values of all edge pixels yields the candidate character regions. The candidate character regions are then passed through probability filtering to exclude regions that are clearly not characters.
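The pairing step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the input format (a dictionary `edges` mapping pixel coordinates to gradient angles), the `max_steps` ray limit, and the rounding of ray positions to the pixel grid are all hypothetical choices, and each gradient is assumed to point across the stroke toward the opposite edge.

```python
import math

def stroke_widths(edges, max_steps=50, tol=math.pi / 8):
    """edges maps (x, y) -> gradient angle in radians (assumed input format).
    Returns a dict mapping paired edge pixels to their stroke width."""
    widths = {}
    for s, ds in edges.items():
        dx, dy = math.cos(ds), math.sin(ds)
        for n in range(1, max_steps + 1):
            # Walk along the ray r = s + n*ds, rounded to the pixel grid.
            t = (round(s[0] + n * dx), round(s[1] + n * dy))
            if t == s or t not in edges:
                continue
            # Absolute angle between dt and the reverse of ds.
            diff = abs((edges[t] - ds) % (2 * math.pi) - math.pi)
            if diff <= tol:
                w = math.hypot(t[0] - s[0], t[1] - s[1])  # ||s - t||
                for p in (s, t):
                    # Keep the smaller value if the pixel already has one.
                    widths[p] = min(w, widths.get(p, w))
            break  # the first edge pixel hit ends the ray either way
    return widths
```

Pixels whose ray finds no roughly opposite partner simply receive no stroke width, which corresponds to discarding the ray.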

The probability filtering of Step 2 proceeds as follows. Collect the stroke width value of every edge pixel and build a histogram: the horizontal axis shows the distinct stroke width values, and the vertical axis shows the normalized number of edge pixels having each stroke width value. The stroke widths of text are largely uniform, so the distribution is concentrated and sharply peaked; the stroke widths of non-text vary widely, so the distribution is spread out and flat. Compute the mean E(N_s) and the standard deviation STD(N_s) of the normalized edge pixel counts over the stroke width values, and from them the change rate C(N_s) = STD(N_s)/E(N_s). If C(N_s) < 0.3, the region is judged to be a candidate character region and is retained; otherwise it is judged to be a non-character region and is filtered out. Removing the regions whose edge pixel counts differ widely across stroke widths excludes background regions, such as leaves, that are clearly not characters.
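The filter can be illustrated with a small sketch. The text does not spell out how E and STD are computed from the histogram, so the code below assumes one reading: the normalized counts are treated as a distribution over stroke widths, and E and STD are the mean and standard deviation of that distribution, so that a sharply peaked histogram gives a small C. The function names are hypothetical; only the threshold 0.3 comes from the text.

```python
from collections import Counter
import math

def stroke_width_variation(widths):
    """Return C = STD / E for the stroke widths of one candidate region.
    Assumes a non-empty list of positive stroke width values."""
    counts = Counter(widths)
    total = sum(counts.values())
    # Normalized edge pixel count for each distinct stroke width value.
    hist = {w: c / total for w, c in counts.items()}
    mean = sum(w * p for w, p in hist.items())
    std = math.sqrt(sum(p * (w - mean) ** 2 for w, p in hist.items()))
    return std / mean

def is_candidate_character(widths, threshold=0.3):
    """Keep the region only if its stroke widths are concentrated."""
    if not widths:
        return False
    return stroke_width_variation(widths) < threshold
```

Under this reading, a region whose edge pixels nearly all share one stroke width passes the filter, while a region with widely scattered widths is rejected.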

The linking rule of Step 2 is used to detect candidate text lines: a connected component algorithm merges adjacent candidate character regions into candidate text lines. Specifically, the edge pixels of the candidate character regions are scanned, and if the ratio of the average stroke widths of the edge pixels of two adjacent candidate character regions is less than 2, the two are treated as the same region and merged into a candidate text line.
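A minimal sketch of the linking rule follows. The text does not define when two regions count as adjacent, so the sketch assumes a simple center-distance test with a hypothetical `max_gap` parameter; the region tuple layout and the union-find grouping are likewise illustrative choices. Only the stroke width ratio bound of 2 comes from the text.

```python
import math

def group_into_lines(regions, max_gap=40.0, max_ratio=2.0):
    """regions: list of (cx, cy, mean_stroke_width) tuples (assumed format).
    Returns groups of region indices, one group per candidate text line."""
    parent = list(range(len(regions)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, (xi, yi, swi) in enumerate(regions):
        for j in range(i + 1, len(regions)):
            xj, yj, swj = regions[j]
            # Adjacency test: an assumption, not specified in the text.
            adjacent = math.hypot(xi - xj, yi - yj) <= max_gap
            ratio = max(swi, swj) / min(swi, swj)
            if adjacent and ratio < max_ratio:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(regions)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Because merging is transitive, a chain of neighboring characters ends up in one group even when its two ends are far apart, which is what allows long or curved lines to form.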

After the connected component algorithm in Step 2, each connected region is labeled as one of two kinds: a candidate character region block or a region in which no character was detected, as shown in Fig. 4(a). If a region in which no text was detected satisfies the following restrictive conditions, the lost-character recovery algorithm can be applied to it:

(1) its stroke width value equals that of the nearest candidate character region block;

(2) the height ratio of the region blocks is less than 2.0.
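The two conditions amount to a simple eligibility check, sketched below. The pair layout `(stroke_width, height)` and the tolerance `sw_tol` are assumptions made for illustration: the text requires equal stroke widths, and a small tolerance is used here because measured widths are rarely identical.

```python
def may_recover(block, nearest, sw_tol=0.5, max_height_ratio=2.0):
    """block and nearest are (stroke_width, height) pairs (assumed format).
    Returns True if the undetected block may enter lost-character recovery."""
    sw_b, h_b = block
    sw_n, h_n = nearest
    # Condition (1): same stroke width, within an assumed tolerance.
    same_stroke = abs(sw_b - sw_n) <= sw_tol
    # Condition (2): height ratio of the two blocks below 2.0.
    height_ratio = max(h_b, h_n) / min(h_b, h_n)
    return same_stroke and height_ratio < max_height_ratio
```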

The lost-character recovery algorithm of Step 2 proceeds as follows:

(1) Compute the midpoints of the candidate character region block and the undetected block, and the minimum distance between them. Once the midpoints and the minimum distance are known, compute the angle θ' as follows:

δ=(y₂-y₁)/(x₂-x₁), θ'=arctan δ

where (x₁,y₁) and (x₂,y₂) are the midpoints of the candidate character region block and the undetected block, respectively.

(2) Then, using the detected angle θ', the undetected block finds the nearest candidate character region block in the vertical or horizontal direction. The detected angle is added to the gradient direction θ of each pixel of the undetected block, and the resulting angle (θ+θ') becomes the new gradient direction of the undetected block, as shown in Fig. 4(b).

(3) Finally, based on similarity to the candidate characters in the candidate character region block, namely the stroke width value and the height ratio of the region blocks, obtain the candidate characters in the undetected block, and merge the candidate character region blocks and the recovered character blocks into a new group of candidate text lines.
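The angle computation in (1) and the gradient update in (2) can be sketched as follows. The function names are hypothetical, and two details are assumptions that the text does not cover: vertically stacked midpoints (where δ is undefined) are mapped to ±π/2, and the updated directions are wrapped back into the range [-π, π].

```python
import math

def line_angle(p1, p2):
    """θ' = arctan δ with δ = (y2 - y1) / (x2 - x1), for midpoints p1, p2."""
    (x1, y1), (x2, y2) = p1, p2
    if x2 == x1:
        # δ is undefined here; ±π/2 is an assumed convention.
        return math.copysign(math.pi / 2, y2 - y1)
    return math.atan((y2 - y1) / (x2 - x1))

def rotate_gradients(gradients, theta_prime):
    """Add θ' to each pixel's gradient direction θ, wrapped to [-π, π]."""
    return [
        math.atan2(math.sin(t + theta_prime), math.cos(t + theta_prime))
        for t in gradients
    ]
```

For example, midpoints at (0, 0) and (10, 10) give θ' = π/4, and every gradient direction in the undetected block is then rotated by that amount before the stroke width comparison in (3).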

The candidate text lines produced by Step 2 are not formed by first obtaining characters and then merging them; the candidate text line regions are obtained directly. The candidate text lines may therefore still contain non-text lines, and the same candidate character region may be assigned to more than one candidate text line. To obtain the text line regions more accurately, the non-text lines must be rejected, so the AdaBoost algorithm is used to separate text lines from non-text lines. The AdaBoost algorithm of Step 3 is a general-purpose algorithm that combines multiple weak classifiers into one strong classifier; the key is how the features are chosen. The features of the candidate multi-oriented text line include:

(1) the change rate of the stroke width ratio of the characters in the candidate multi-oriented text line;

(2) the change rate of the distance between characters in the candidate multi-oriented text line;

(3) the change rate of the character color in the candidate multi-oriented text line;

(4) the change rate of the character width-to-height ratio in the candidate multi-oriented text line;

(5) the change rate of the character area in the candidate multi-oriented text line;

(6) the change rate of the character edge density in the candidate multi-oriented text line;

(7) the change rate of the character pixel occupancy ratio in the candidate multi-oriented text line.

Here, A_i and A_{i+1} are the average stroke widths of two adjacent candidate regions; (x_i, y_i) and (x_{i+1}, y_{i+1}) are the center coordinates of the characters in two adjacent candidate regions; Y_i and Y_{i+1} are the average colors of two adjacent candidate regions; W_i and H_i are the width and height of a candidate region; Area(B_i) and Area(B_{i+1}) are the areas of two adjacent candidate regions; |cc_i| and S(cc_i) are the number of edge pixels contained in a candidate region and the area of that region; T₁ and T₂ are the number of pixels in a candidate region and in its bounding rectangle; N is the number of candidate character regions in the candidate text line; U and C are the mean and the change rate of a variable. The characters in a text line resemble one another and differ little. A candidate text line whose variables all have small change rates is therefore judged to be a text line; conversely, a candidate text line whose change rates are too large is judged to be a non-text line and is filtered out.
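The feature formulas themselves are not reproduced in the text, so the following sketch is only a plausible illustration of three of the seven features. It assumes that each change rate C is the standard deviation of a per-character quantity divided by its mean U, mirroring the C = STD/E form used in the probability filter, and that the stroke width and distance features are built from adjacent pairs of characters. The dictionary keys and function names are hypothetical.

```python
import math
from statistics import mean, pstdev

def change_rate(values):
    """C = STD / U for one quantity measured along a candidate text line."""
    u = mean(values)
    return pstdev(values) / u if u else 0.0

def line_features(chars):
    """chars: at least two per-character dicts ordered along the line, with
    keys 'stroke' (mean stroke width), 'center' (x, y), 'w' and 'h'."""
    pairs = list(zip(chars, chars[1:]))
    # Ratio of average stroke widths of adjacent regions (A_i, A_{i+1}).
    stroke_ratios = [a["stroke"] / b["stroke"] for a, b in pairs]
    # Distance between the centers of adjacent regions.
    distances = [
        math.hypot(a["center"][0] - b["center"][0],
                   a["center"][1] - b["center"][1])
        for a, b in pairs
    ]
    # Width-to-height ratio W_i / H_i of each region.
    aspects = [c["w"] / c["h"] for c in chars]
    return [change_rate(stroke_ratios), change_rate(distances),
            change_rate(aspects)]
```

A regularly spaced line of similar characters yields change rates near zero, while a group of unrelated background regions yields large ones; feature vectors of this kind are what the classifier in Step 3 would receive.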

The core idea of the AdaBoost algorithm of Step 3 is to build a strong classifier from weak classifiers trained on differently weighted distributions of the samples. The main procedure is as follows:

First step: Initialize the training sample weights W_i = 1/7, i = 1, 2, ..., 7;

Second step: For m = 1, 2, ..., M, repeat the following:

(1) Construct the weak classifier f_m(x) ∈ {+1, -1};

(2) Compute the classification error rate e_m and let c_m = ln((1-e_m)/e_m);

(3) Update the weights W_i = W_i exp[c_m h(y_i ≠ f_m(x_i))], where h(·) is 1 when its argument holds and 0 otherwise, then normalize W_i so that the weights sum to 1;

Third step: Build the strong classifier.
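The procedure above matches the shape of standard discrete AdaBoost, which can be sketched as follows. This is an illustration under stated assumptions rather than the patent's implementation: the weak classifiers are taken to be single-feature threshold stumps, the initial weights are 1/n for n samples rather than the fixed 1/7 given above, and the strong classifier is taken to be the sign of the c_m-weighted vote, which is the standard form but is not spelled out in the text.

```python
import math

def train_adaboost(X, y, rounds=10):
    """X: list of feature vectors; y: labels in {+1, -1}.
    Returns a list of (c_m, (feature, threshold, sign)) weak classifiers."""
    n = len(X)
    w = [1.0 / n] * n  # first step: uniform initial weights
    model = []
    for _ in range(rounds):
        # (1) Pick the threshold stump with the least weighted error.
        best = None
        for f in range(len(X[0])):
            for thr in sorted({x[f] for x in X}):
                for sign in (1, -1):
                    pred = [sign if x[f] <= thr else -sign for x in X]
                    err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                    if best is None or err < best[0]:
                        best = (err, (f, thr, sign), pred)
        err, stump, pred = best
        # (2) c_m = ln((1 - e_m) / e_m), with e_m clamped away from 0 and 1.
        err = min(max(err, 1e-10), 1 - 1e-10)
        c = math.log((1 - err) / err)
        model.append((c, stump))
        # (3) Raise the weights of misclassified samples, then normalize.
        w = [wi * math.exp(c * (p != yi)) for wi, p, yi in zip(w, pred, y)]
        total = sum(w)
        w = [wi / total for wi in w]
        if err <= 1e-10:
            break  # a perfect weak classifier leaves nothing to correct
    return model

def predict(model, x):
    """Third step (assumed form): sign of the weighted vote."""
    score = sum(c * (s if x[f] <= thr else -s) for c, (f, thr, s) in model)
    return 1 if score >= 0 else -1
```

In this setting, each training sample would be the vector of change rates of one candidate text line, labeled +1 for a text line and -1 for a non-text line.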

The preferred embodiments of the invention have been described in detail above. The invention is not, however, limited to these particular embodiments; modifications and equivalents made by those skilled in the art within the scope of the claims shall fall within the scope of protection of the invention.

Claims

1. A fast multi-oriented text line detection method, characterized by comprising the following basic steps:

Step 1: Extract candidate connected regions from the natural scene image to be detected using the MSER algorithm;

Step 2: Apply the connected-region pairing algorithm to the candidate connected regions to obtain candidate character regions, group them by a linking rule, and obtain candidate text lines using the lost-character recovery algorithm;

Step 3: Classify text lines and non-text lines by applying the AdaBoost algorithm to the features of the candidate multi-oriented text lines.

2. The fast multi-oriented text line detection method of claim 1, characterized in that the connected-region pairing algorithm of Step 2 proceeds as follows: in each candidate connected region, find every edge pixel s and its gradient direction ds; if s lies on a stroke edge, ds must be approximately perpendicular to the stroke direction; search along the ray r = s + n*ds (n >= 0) for the corresponding edge pixel t on the opposite side; if the directions satisfy dt = -ds ± π/8, edge pixels s and t are considered roughly opposite; if no matching t is found for s, or if ds and dt are not roughly opposite, the ray r is discarded; if a suitable t is found, s and t are both assigned the stroke width value ||s-t||, unless a pixel has already been assigned a smaller stroke width value; computing the stroke width values of all edge pixels yields the candidate character regions, which are passed through probability filtering to exclude regions that are clearly not characters.

3. The fast multi-oriented text line detection method of claim 2, characterized in that the probability filtering proceeds as follows: collect the stroke width value of every edge pixel and build a histogram whose horizontal axis shows the distinct stroke width values and whose vertical axis shows the normalized number of edge pixels having each stroke width value; the stroke widths of text are largely uniform, so the distribution is concentrated and sharply peaked, whereas the stroke widths of non-text vary widely, so the distribution is spread out and flat; compute the mean E(N_s) and the standard deviation STD(N_s) of the normalized edge pixel counts over the stroke width values, and from them the change rate C(N_s) = STD(N_s)/E(N_s); if C(N_s) < 0.3, the region is judged to be a candidate character region and is retained, otherwise it is judged to be a non-character region and is filtered out; removing the regions whose edge pixel counts differ widely across stroke widths excludes background regions that are clearly not characters.

4. The fast multi-oriented text line detection method of claim 1, characterized in that in Step 2 the linking rule is used to detect candidate text lines, a connected component algorithm merging adjacent candidate character regions into candidate text lines as follows: the edge pixels of the candidate character regions are scanned, and if the ratio of the average stroke widths of the edge pixels of two adjacent candidate character regions is less than 2, the two are treated as the same region and merged into a candidate text line.

5. The fast multi-oriented text line detection method of claim 4, characterized in that after the connected component algorithm, each connected region is labeled as one of two kinds: a candidate character region block or a region in which no character was detected; if a region in which no text was detected satisfies the following restrictive conditions, the lost-character recovery algorithm can be applied to it:

(1) its stroke width value equals that of the nearest candidate character region block;

(2) the height ratio of the region blocks is less than 2.0.

6. The fast multi-oriented text line detection method of claim 1, characterized in that the lost-character recovery algorithm of Step 2 proceeds as follows:

(1) Compute the midpoints of the candidate character region block and the undetected block, and the minimum distance between them; once the midpoints and the minimum distance are known, compute the angle θ' as follows:

δ=(y₂-y₁)/(x₂-x₁), θ'=arctan δ

where (x₁,y₁) and (x₂,y₂) are the midpoints of the candidate character region block and the undetected block, respectively;

(2) Using the detected angle θ', the undetected block finds the nearest candidate character region block in the vertical or horizontal direction; the detected angle is added to the gradient direction θ of each pixel of the undetected block, and the resulting angle (θ+θ') becomes the new gradient direction of the undetected block;

(3) Based on similarity to the candidate characters in the candidate character region block, namely the stroke width value and the height ratio of the region blocks, obtain the candidate characters in the undetected block, and merge the candidate character region blocks and the recovered character blocks into a new group of candidate text lines.

7. The fast multi-oriented text line detection method of claim 1, characterized in that the features of the candidate multi-oriented text line include:

(1) the change rate of the stroke width ratio of the characters in the candidate multi-oriented text line;

(2) the change rate of the distance between characters in the candidate multi-oriented text line;

(3) the change rate of the character color in the candidate multi-oriented text line;

(4) the change rate of the character width-to-height ratio in the candidate multi-oriented text line;

(5) the change rate of the character area in the candidate multi-oriented text line;

(6) the change rate of the character edge density in the candidate multi-oriented text line;

(7) the change rate of the character pixel occupancy ratio in the candidate multi-oriented text line;

where A_i and A_{i+1} are the average stroke widths of two adjacent candidate regions; (x_i, y_i) and (x_{i+1}, y_{i+1}) are the center coordinates of the characters in two adjacent candidate regions; Y_i and Y_{i+1} are the average colors of two adjacent candidate regions; W_i and H_i are the width and height of a candidate region; Area(B_i) and Area(B_{i+1}) are the areas of two adjacent candidate regions; |cc_i| and S(cc_i) are the number of edge pixels contained in a candidate region and the area of that region; T₁ and T₂ are the number of pixels in a candidate region and in its bounding rectangle; N is the number of candidate character regions in the candidate text line; U and C are the mean and the change rate of a variable.