CN103136523A - Arbitrary direction text line detection method in natural image - Google Patents

Arbitrary direction text line detection method in natural image Download PDF

Info

Publication number
CN103136523A
CN103136523A CN2012105060724A CN201210506072A
Authority
CN
China
Prior art keywords
text
line
theta
candidate
connected region
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012105060724A
Other languages
Chinese (zh)
Other versions
CN103136523B (en)
Inventor
魏宝刚
庄越挺
袁杰
张引
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN201210506072.4A priority Critical patent/CN103136523B/en
Publication of CN103136523A publication Critical patent/CN103136523A/en
Application granted granted Critical
Publication of CN103136523B publication Critical patent/CN103136523B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Abstract

The invention discloses a method for detecting text lines of arbitrary direction in natural images. The method comprises the following steps: (1) candidate text regions are detected with a constrained maximally stable extremal region (MSER) detection method, and a composite similarity between region pairs is obtained by combining region size, absolute distance, relative distance, and the geometric and color similarities defined by contextual information; (2) a similarity-based candidate text line identification method first finds three regions to serve as seed regions of a candidate text line and then expands them to all regions in the line; and (3) non-text lines are removed with a filter based on morphological skeleton features; the filter classifies with a sparse classifier whose feature vectors are taken from the morphological skeleton features of all regions on the candidate text line. The method can detect text of arbitrary direction in natural images, and because the classifier is built on features internal to the regions, it achieves better recognition accuracy.

Description

Method for detecting text lines of arbitrary direction in natural images
Technical field
The present invention relates to a method for detecting text lines of arbitrary direction in natural images; the method detects scene text of any direction in a natural image for use in OCR recognition, and belongs to the field of computer image processing.
Background technology
With the development of the multimedia and electronics industries, ever more image information is produced, and organizing and retrieving it effectively has become a difficult problem. Many image documents contain textual information, such as book covers, road signs, and buildings (which carry name information), and this text is closely related to the image content. If such text can be detected and recognized effectively, it can be used to organize and retrieve image documents, which has strong practical value.
Text detection methods can be divided into three kinds: gradient-based methods, color-clustering-based methods, and texture-based methods. Gradient-based methods assume that text edges are strong relative to the background, so a pixel with a larger gradient magnitude is more likely to be text. A method published in IEEE Transactions on Image Processing (vol. 20, no. 9, 2011) detects text strokes by searching the image edges for point pairs with approximately opposite gradient directions along the stroke path, then uses clustering and other heuristic rules to group the strokes into text lines. The shortcoming of gradient-based methods is that they become unreliable when the background region also contains much edge information. Texture-based methods extract texture features with Gabor filters, wavelet transforms, or the fast Fourier transform (FFT), and then detect caption regions with machine-learning methods such as neural networks or SVM classifiers. A method published in the Proceedings of the IEEE International Conference on Communication Technology (ICCT, 2008, pp. 722-725) locates large-font text with the HAAR wavelet transform by merging the wavelet coefficients of four small blocks into one large block, and then refines the result with morphological dilation and a neural network. Texture-based methods cannot detect text regions of arbitrary direction. A method published in the Proceedings of the ACM International Multimedia Conference and Exhibition (MM, 2007, pp. 847-850) uses color clustering to remove noise and adaptively selects the best color plane for binarization according to the text contrast on each abstracted color plane. Color-clustering methods assume that the text color within a video frame is uniform, but this assumption fails in most cases, so their applicability is limited. Because detection with a single kind of feature is unsatisfactory, many methods combine several of the above features.
All of these text detection methods are good attempts, but natural scene text may have weak contrast with the background and arbitrary direction and position, which makes these methods perform poorly on text of arbitrary direction in natural images.
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art and provide a method for detecting text of arbitrary direction in natural images.
The method for detecting text lines of arbitrary direction in a natural image comprises the following steps:
(1) detecting candidate text regions with a constrained maximally stable extremal region (MSER) detection method, then combining region size, absolute distance, relative distance, and the geometric similarity between regions defined by contextual information, together with color similarity, to obtain the composite similarity between region pairs;
(2) adopting a similarity-based candidate text line identification method that first finds three regions to serve as seed regions of a candidate text line, then expands them to all regions of the line;
(3) adopting a filter based on morphological skeleton features to remove non-text lines; the filter classifies with a sparse classifier whose feature vectors are taken from the morphological skeleton features of all regions on the candidate text line.
The step of detecting candidate text regions with the constrained MSER detection method and obtaining the composite similarity between region pairs is as follows: first, all maximally stable extremal regions (MSERs), computed as proposed in "Robust Wide Baseline Stereo from Maximally Stable Extremal Regions" in the Proceedings of the British Machine Vision Conference 2002, are detected as candidate text regions; then the edge information of the image is extracted with the Canny operator, and these edge lines are used as constraint lines for the MSERs when collecting connected regions. During collection, a pixel may only connect to the pixels in the four directions above, below, left, and right of it, which prevents the pixels on the two sides of an edge from being joined. After all connected regions are collected, the geometric similarity between any pair of regions is defined with the following five steps:
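The 4-connectivity rule above can be sketched as a small flood-fill labelling routine. This is a hypothetical illustration, not the patent's code; the function name `collect_regions` and the binary-mask input format are assumptions:

```python
from collections import deque

def collect_regions(mask):
    """Collect connected regions using 4-connectivity only, so pixels that
    touch merely diagonally (e.g. across an edge line) are NOT joined."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    regions = []
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] and not labels[sy][sx]:
                region = []
                queue = deque([(sy, sx)])
                labels[sy][sx] = len(regions) + 1
                while queue:
                    y, x = queue.popleft()
                    region.append((y, x))
                    # only the up/down/left/right neighbours are considered
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = len(regions) + 1
                            queue.append((ny, nx))
                regions.append(region)
    return regions
```

With this rule, two foreground pixels that touch only at a corner fall into two separate regions, which is exactly the behaviour the edge constraint relies on.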
Step 1: Let CC_i and CC_j be connected regions. Their normalized absolute distance is defined as:

$$dis^1_{ij} = \frac{|c^i_x - c^j_x| + k_1\,|c^i_y - c^j_y|}{k_1 h_{im} + w_{im} - \left(k_1 (h_i + h_j) + w_i + w_j\right)/2} \qquad (1)$$

where $(c^i_x, c^i_y)$ and $(c^j_x, c^j_y)$ are the horizontal and vertical coordinates of the center points of CC_i and CC_j, $h_i, h_j, w_i, w_j$ are the heights and widths of CC_i and CC_j, $h_{im}, w_{im}$ are the height and width of the current image, and $k_1$ is a constant controlling the relative contribution of horizontal and vertical distance, set to 2. $dis^1_{ij}$ takes values from 0 to 1.
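Equation (1) can be computed directly from the region and image geometry. The following is a minimal sketch; the function name and argument layout are assumptions for illustration:

```python
def normalized_abs_distance(ci, cj, hw_i, hw_j, img_hw, k1=2.0):
    """Normalized absolute distance of eq. (1) between regions CC_i and CC_j.

    ci, cj   : (cx, cy) center points
    hw_i/j   : (h, w) height and width of each region
    img_hw   : (h_im, w_im) height and width of the image
    """
    (cxi, cyi), (cxj, cyj) = ci, cj
    hi, wi = hw_i
    hj, wj = hw_j
    him, wim = img_hw
    # numerator: weighted L1 distance between the centers
    num = abs(cxi - cxj) + k1 * abs(cyi - cyj)
    # denominator: image extent minus the average region extent, for normalization
    den = k1 * him + wim - (k1 * (hi + hj) + wi + wj) / 2
    return num / den
```

Two coincident centers give distance 0; the denominator normalizes the value by the image size so that it stays in [0, 1].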
Step 2: To enlarge the distance differences between different region pairs, the distance measure in Equation 1 is further modified to:

$$dis^2_{ij} = dis^1_{ij} \cdot \aleph_i(j) \qquad (2)$$

where $\aleph_i(j)$ denotes the rank of $dis^1_{ij}$ when the center-point distances from all regions to CC_i are sorted.
Step 3: Equation 2 is further revised to:

$$dis^3_{ij} = \min_{p_k}\left\{ dis^2_{i p_k j},\ |p_k| \in \{0, 1, 2, \ldots, n-2\} \right\} \qquad (3)$$

where $p_k$ denotes a path from CC_i to CC_j whose length is between 0 and n-2, so that $dis^3_{ij}$ is the shortest path distance from CC_i to CC_j; it can be obtained with the Floyd algorithm or any algorithm of identical function.
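Since eq. (3) is a shortest-path minimization over the pairwise $dis^2$ values, the Floyd-Warshall algorithm mentioned above computes it for all pairs at once. A minimal sketch, with the function name assumed for illustration:

```python
def relative_distance(dis2):
    """All-pairs shortest path (Floyd-Warshall) over the dis2 matrix,
    yielding the relative distance dis3 of eq. (3)."""
    n = len(dis2)
    d = [row[:] for row in dis2]          # copy so the input is untouched
    for k in range(n):                    # allow paths through intermediate k
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d
```

For example, with direct distances 5 between regions 0 and 2 but 1 between each of (0,1) and (1,2), the path through region 1 shortens $dis^3_{02}$ to 2.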
Step 4: The shape distance between two regions is defined as:

$$dis^4_{ij} = \frac{\max(h_i, h_j)\cdot\max(w_i, w_j)}{\min(h_i, h_j)\cdot\min(w_i, w_j)} \qquad (4)$$

where $h_i, h_j, w_i, w_j$ are the heights and widths of CC_i and CC_j.

Step 5: The geometric similarity of connected regions CC_i and CC_j is:

$$simi_{geometry}(i,j) = \exp\left(-\max(dis^3_{ij}, dis^3_{ji}) \cdot dis^4_{ij}\right) \qquad (5)$$
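Equations (4) and (5) combine into a small pair of functions. This is a sketch under the definitions above; the function names are assumptions:

```python
import math

def shape_distance(hw_i, hw_j):
    """Shape distance of eq. (4): ratio of the larger to the smaller extents.
    Equals 1.0 for identically sized regions and grows with size mismatch."""
    hi, wi = hw_i
    hj, wj = hw_j
    return (max(hi, hj) * max(wi, wj)) / (min(hi, hj) * min(wi, wj))

def geometric_similarity(d3_ij, d3_ji, hw_i, hw_j):
    """Geometric similarity of eq. (5): exponential decay in the product of
    the symmetrized relative distance and the shape distance."""
    return math.exp(-max(d3_ij, d3_ji) * shape_distance(hw_i, hw_j))
```

Identically sized regions at zero relative distance thus get similarity exp(0) = 1, and the similarity decays toward 0 as distance or shape mismatch grows.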
The color similarity of connected regions CC_i and CC_j is computed as follows. First the image is converted from the RGB color space to the HSV color space, and the H, S, V components are quantized into 8, 3, and 3 levels respectively, so the color histogram has 72 dimensions. Let the color feature vectors of CC_i and CC_j be $C_i = [C_{i,1}, C_{i,2}, \ldots, C_{i,t}, \ldots, C_{i,n}]$ and $C_j = [C_{j,1}, C_{j,2}, \ldots, C_{j,t}, \ldots, C_{j,n}]$; the color similarity is:

$$simi_{color}(i,j) = \sum_{t=1}^{n} \min(C_{i,t}, C_{j,t}) \qquad (6)$$

with n = 72.

Finally, the composite similarity of the two connected regions, combining geometric similarity and color similarity, is:

$$simi(i,j) = \left(simi_{geometry}(i,j) + simi_{color}(i,j)\right)/2 \qquad (7)$$
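The 8/3/3 HSV quantization and the histogram-intersection similarity of eqs. (6)-(7) can be sketched as below. The function names, the bin-index layout, and the assumption that the per-region histograms are normalized to sum to 1 are illustrative choices, not taken from the patent:

```python
import numpy as np

def hsv_histogram(h, s, v):
    """72-bin color histogram: h, s, v are flat arrays of a region's pixels
    scaled to [0, 1); H is quantized to 8 levels, S and V to 3 levels each."""
    hq = np.minimum((np.asarray(h) * 8).astype(int), 7)
    sq = np.minimum((np.asarray(s) * 3).astype(int), 2)
    vq = np.minimum((np.asarray(v) * 3).astype(int), 2)
    idx = hq * 9 + sq * 3 + vq            # 8 * 3 * 3 = 72 bins
    hist = np.bincount(idx, minlength=72).astype(float)
    return hist / hist.sum()              # normalize so intersection is in [0, 1]

def color_similarity(hist_i, hist_j):
    """Histogram intersection of eq. (6)."""
    return float(np.minimum(hist_i, hist_j).sum())

def combined_similarity(simi_geo, simi_col):
    """Composite similarity of eq. (7)."""
    return (simi_geo + simi_col) / 2
```

With normalized histograms, a region compared with itself has color similarity 1, and the composite similarity is simply the mean of the two component similarities.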
The similarity-based candidate text line identification method, which first finds three regions as seed regions of a candidate text line and then expands them to all regions of the line, generates candidate text lines on the basis of a sibling judgment between pairs of connected regions.
(1) Sibling judgment:
The sibling judgment decides whether two regions are sufficiently similar and adjacent. If two regions are not siblings, they cannot be merged into the same text line. The following three restrictive conditions determine whether two connected regions are siblings:
a) the ratios of the heights and of the widths of two adjacent regions should lie between two thresholds T_1 and T_2;
b) the distance between two connected regions should not exceed T_3 times the height or width of the larger region;
c) two adjacent characters should have similar colors, so their color similarity should be greater than a threshold T_4.
Formally:

$$S_{ij} = S^1_{ij} \wedge S^2_{ij} \wedge S^3_{ij} \qquad (8)$$

$S_{ij}$ indicates whether connected regions CC_i and CC_j are similar regions; if its value is 1 they are similar and may belong to the same text line, otherwise they cannot belong to the same text line. $S^1_{ij}, S^2_{ij}, S^3_{ij}$ denote the three restrictive conditions above, and T_1, T_2, T_3, T_4 are set to 2, 4, 3, and 0.4 respectively.
The refined judgment of condition (a) uses the ratios

$$h_r = \max(h_i, h_j)/\min(h_i, h_j), \qquad w_r = \max(w_i, w_j)/\min(w_i, w_j) \qquad (9)$$

and Equation 10 sets $S^1_{ij}$ to 1 when these ratios fall within the thresholds T_1 and T_2, with the comparison depending on the line orientation; in Equation 10, θ denotes the angle between the line connecting the center points of CC_i and CC_j and the positive direction of the X axis.
The refined judgment of condition (b) is:

$$S^2_{ij} = \begin{cases} 1 & dis_{ij} \le T_3 \cdot \max(h_i, h_j),\ |\tan\theta| > 1 \\ 1 & dis_{ij} \le T_3 \cdot \max(w_i, w_j),\ |\tan\theta| \le 1 \\ 0 & \text{otherwise} \end{cases} \qquad (11)$$
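The distance test of eq. (11) switches between the height and width bound depending on whether the connecting line is closer to vertical or horizontal. A minimal sketch of that one condition (the function name is an assumption; the other two sibling conditions are omitted):

```python
import math

def distance_ok(dis_ij, theta, hw_i, hw_j, t3=3.0):
    """Refined condition (b) per eq. (11): for steep connecting lines
    (|tan theta| > 1) the gap is bounded by t3 times the larger height,
    otherwise by t3 times the larger width."""
    hi, wi = hw_i
    hj, wj = hw_j
    if abs(math.tan(theta)) > 1:          # roughly vertical text line
        return 1 if dis_ij <= t3 * max(hi, hj) else 0
    return 1 if dis_ij <= t3 * max(wi, wj) else 0
```

For a horizontal pair (theta near 0) the width of the larger character sets the allowed gap, which matches the intuition that inter-character spacing scales with character width.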
(2) Candidate text line generation:
To produce a candidate text line, three seed connected regions are found first and then expanded to include more connected regions, as follows:
Step 1: Let UL_cc denote the set of all connected regions not yet assigned to a text line; it is initialized to the set of all connected regions, and a flag bit, initialized to 0, is set for each region. For each connected region in UL_cc, compute the similarity simi(i, *) between it and every other connected region, take the two largest similarities, and record their sum as partSimi(CC_i); then sort all partSimi values in descending order.
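The partSimi score and the descending ordering of Step 1 can be sketched as follows; the function names and the similarity-matrix input format are assumptions for illustration:

```python
def part_simi(simi_row, self_idx):
    """Sum of the two largest similarities between one region and all others
    (the partSimi value of Step 1)."""
    others = [s for t, s in enumerate(simi_row) if t != self_idx]
    top_two = sorted(others, reverse=True)[:2]
    return sum(top_two)

def seed_order(simi_matrix):
    """Region indices sorted by descending partSimi: the order in which
    regions are considered as text line seeds."""
    n = len(simi_matrix)
    return sorted(range(n), key=lambda i: -part_simi(simi_matrix[i], i))
```

Regions with two strong neighbors rank first, so dense character clusters are tried as seeds before isolated regions.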
Step 2: For any three connected regions CC_i ∈ UL_cc, CC_j ∈ UL_cc, and CC_k ∈ UL_cc satisfying

$$S_{ij} = 1 \ \wedge\ S_{jk} = 1 \ \wedge\ partSimi(CC_k) \le partSimi(CC_i) \ \wedge\ partSimi(CC_j) \le partSimi(CC_i),$$

compute the angle difference $\Delta\theta_{ijk}$ between the vectors $v(c_i c_j)$ and $v(c_j c_k)$ (Equation 12), where $v(c_i c_j)$ and $v(c_j c_k)$ denote the vectors $c_i c_j$ and $c_j c_k$. Compute $\Delta\theta_{jik}$ and $\Delta\theta_{ikj}$ in the same way. If the angle differences are small enough that the three regions are nearly collinear, produce a new text line L_t, record its elements $S_{cc}(L_t) = \{CC_i, CC_j, CC_k\}$, take the average angle of $c_i c_j$ and $c_j c_k$ as the inclination angle of the current text line, and remove CC_i, CC_j, and CC_k from the set UL_cc. These three connected regions serve as the seed elements of the current text line.
The average inclination angle $\bar\theta$ of any two line segments $c_i c_j$ and $c_m c_n$ is computed as:

$$\bar\theta = \begin{cases} \dfrac{\theta_{ij} + \theta_{mn} + \pi}{2} & \text{if } \theta_{ij}\cdot\theta_{mn} \le 0 \ \wedge\ \max\{|\theta_{ij}|, |\theta_{mn}|\} \ge \dfrac{\pi}{4} \\[2ex] \dfrac{\theta_{ij} + \theta_{mn}}{2} & \text{otherwise} \end{cases} \qquad (13)$$

In the equation above, $\theta_{ij}$ and $\theta_{mn}$ take values in $(-\pi/2, \pi/2]$.
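Equation (13) handles the wrap-around at the vertical direction: two nearly vertical segments with angles of opposite sign should average to a vertical angle, not to zero. A minimal sketch of the formula (function name assumed):

```python
import math

def average_tilt(t_ij, t_mn):
    """Average inclination angle of eq. (13) for two segment angles in
    (-pi/2, pi/2]. The first branch fires when the two angles straddle
    the vertical direction, so the average wraps around pi/2 correctly."""
    if t_ij * t_mn <= 0 and max(abs(t_ij), abs(t_mn)) >= math.pi / 4:
        return (t_ij + t_mn + math.pi) / 2
    return (t_ij + t_mn) / 2
```

For example, segments at -1.5 rad and +1.5 rad (both nearly vertical) average to pi/2 rather than to the meaningless value 0.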
Step 3: For any remaining connected region CC_m in UL_cc, compute its similarity to the current line L_t, sort all these similarities in descending order, and take connected regions CC_t from UL_cc in that order. If the following three conditions are satisfied, CC_t is added to $S_{cc}(L_t)$:
a) among the K nearest neighbors of CC_t there is at least one CC_k ∈ $S_{cc}(L_t)$ that is a sibling of CC_t and for which the angle difference between the line connecting its center point with that of CC_t and the average inclination angle $\bar\theta$ of the current text line L_t is less than a threshold T_5.
The angle difference between any line segment $c_i c_j$ and the average inclination angle $\bar\theta$ is computed as:

$$\Delta\theta = \min\left\{ |\theta_{ij} - \bar\theta|,\ \pi - |\theta_{ij} - \bar\theta| \right\} \qquad (14)$$
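Equation (14) measures angular deviation on a half-circle, so an angle just below +pi/2 and one just above -pi/2 are treated as close. A one-line sketch (function name assumed):

```python
import math

def angle_difference(theta_ij, theta_bar):
    """Angle difference of eq. (14): deviation between a segment angle and
    the line's average inclination, wrapped onto [0, pi/2]."""
    d = abs(theta_ij - theta_bar)
    return min(d, math.pi - d)
```

Thus a segment at pi/2 - 0.1 deviates from an average angle of -pi/2 + 0.1 by only 0.2, not by nearly pi.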
The value of T_5 is determined by Equation 15, where $d_{ij}$ is the distance between the connected-region center points $c_i$ and $c_j$, and $\bar d_l$ is the mean distance between adjacent center points after the centers of all connected regions on line l are arranged from left to right or from top to bottom;
b) CC_t is also in the set formed by the K nearest-neighbor connected regions of CC_k;
c) the distance between the center point of CC_t and the current line L_t is less than a threshold T_6.
K is set to 3, and T_6 is determined by:

$$T_6 = \begin{cases} k' \cdot h_t & |\tan\theta| \le 1 \\ k' \cdot w_t & |\tan\theta| > 1 \end{cases} \qquad (16)$$

where $h_t$ and $w_t$ are the height and width of CC_t, θ is the angle between the positive direction of the X axis and the line connecting the center points of CC_t and CC_k, and k′ = 1/3.
If the current connected region is added to $S_{cc}(L_t)$, update the set UL_cc and the average angle of the current line, and repeat this process until all elements in UL_cc have been processed; then repeat Steps 1 to 3 to search for another group of candidate text line seeds until no text line seed remains.
Described employing is removed non-line of text based on the filtrator of morphology framework characteristic, and this filtrator uses a sparse sorter to filter, and the required proper vector of sorter is taken from the morphology framework characteristic step of All Ranges on candidate's line of text and is:
Step 1: prepare training sample, take English as example, prepare 26 letters and the 0-9 binary map form of the different fonts of totally 10 numerals, each portion of roman and italic, respectively with these binary map 90-degree rotations, 180 degree, 270 the degree, with postrotational binary map also as the positive example training sample; Prepare again in addition the non-text image of same sample number as the counter-example training sample;
Step 2: for each binary map, be S with its minimum area-encasing rectangle size conversion rh* S rw, and max (S rh, S rw)=S rg, S herein rg=32, the length on the limit that soon connected region will be larger becomes S rgAnd the constant rate that keeps height and width extracts the skeleton of connected region and is amplified to equally S rh* S rw, then, the skeleton of the amplification that extracts is carried out skeletal extraction again, and with center and square area center-aligned, final, square block is converted into the vector of 32 * 32=1024 dimension, as the input vector of sparse filtrator;
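The aspect-preserving resize of Step 2 (larger side scaled to $S_{rg}$ = 32) reduces to a small size computation. A sketch with an assumed function name; the rounding convention is an illustrative choice:

```python
def target_size(h, w, s_rg=32):
    """Target (S_rh, S_rw) of Step 2: scale so the larger side becomes s_rg
    while keeping the height-to-width ratio constant."""
    if h >= w:
        return s_rg, max(1, round(w * s_rg / h))
    return max(1, round(h * s_rg / w)), s_rg
```

A 64 x 16 region thus becomes 32 x 8 before skeleton extraction, and the skeleton is later centered in a 32 x 32 block to form the 1024-dimensional input vector.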
Step 3: Train with the FISHER classifier proposed in the Proceedings of the IEEE International Conference on Computer Vision 2011, pages 543-550, obtaining a trained text-region classifier Classifier.
Step 4: For a candidate text line $L_t$ of any inclination direction $\bar\theta$, first rotate it by the angle $\theta_r$ so as to bring it to the horizontal or vertical direction. $\theta_r$ is defined as:

$$\theta_r = \begin{cases} -\bar\theta & |\bar\theta| \le \dfrac{\pi}{4} \\[1ex] \operatorname{sign}(\bar\theta) \cdot \left(\dfrac{\pi}{2} - |\bar\theta|\right) & |\bar\theta| > \dfrac{\pi}{4} \end{cases} \qquad (17)$$

where $\bar\theta$ denotes the average inclination angle appearing in Equation 14.
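Equation (17) rotates shallow lines to horizontal and steep lines to vertical, whichever is the smaller rotation. A minimal sketch (function name assumed):

```python
import math

def rotation_angle(theta_bar):
    """Rotation angle of eq. (17): lines within pi/4 of horizontal are
    rotated to horizontal; steeper lines are rotated to vertical."""
    if abs(theta_bar) <= math.pi / 4:
        return -theta_bar
    return math.copysign(1.0, theta_bar) * (math.pi / 2 - abs(theta_bar))
```

So a line tilted by pi/6 is rotated by -pi/6 (to horizontal), while one tilted by pi/3 is rotated by +pi/6 (to vertical); the magnitude of the rotation never exceeds pi/4.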
Step 5: For a candidate text line $L_t$ and its constituent connected regions $y^t_i$, let $C(y^t_i)$ denote the classifier label of the feature vector of element $y^t_i$. The label of the whole text line is then defined as:

$$C(L_t) = \begin{cases} 1 & \sum_i C(y^t_i) \ge C_T \\ 0 & \text{otherwise} \end{cases} \qquad (18)$$

$$C_T = k_2 \cdot n \qquad (19)$$

where $k_2$ is a control parameter and n is the number of connected regions of the current text line $L_t$; $k_2$ is set to 0.7. A label of 1 indicates that the current line is a text line and it is kept; otherwise it is discarded.
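The voting rule of eqs. (18)-(19) keeps a line only when at least 70% of its regions are classified as text. A minimal sketch, assuming per-region labels of 0 or 1 (function name assumed):

```python
def line_label(region_labels, k2=0.7):
    """Text line label of eqs. (18)-(19): 1 (kept as text) when the number
    of regions classified as text reaches k2 * n, else 0 (discarded)."""
    n = len(region_labels)
    c_t = k2 * n                       # threshold C_T of eq. (19)
    return 1 if sum(region_labels) >= c_t else 0
```

A four-region line with three text votes (3 >= 2.8) is kept, while one with a single text vote is discarded.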
Compared with the prior art, the invention has the following beneficial effects:
1) The text detection algorithm of the invention is more robust to text size, and works well on large text, which conventional methods detect poorly.
2) The text detection algorithm of the invention overcomes the limitation of commonly used detection algorithms that can only detect horizontal or vertical text; it can detect scene text of any direction.
3) The caption extraction algorithm of the invention overcomes the low precision of commonly used detection algorithms; because the FISHER sparse classifier learns the internal features of the text, its detection precision is greatly improved.
Description of drawings
Fig. 1 is the flow block diagram of the method for detecting text lines of arbitrary direction in natural images;
Fig. 2 (a) is an original natural image to be detected;
Fig. 2 (b) shows all binarized MSER regions detected by the invention;
Fig. 2 (c) shows the candidate text line seed detection result of the invention;
Fig. 2 (d) shows the candidate text line detection result of the invention;
Fig. 3 (a) shows a candidate text line detected by the invention;
Fig. 3 (b) shows the candidate text line after transformation;
Fig. 3 (c) shows the skeleton map of the candidate text line after transformation;
Fig. 4 (a) shows a binarized MSER region detected by the invention;
Fig. 4 (b) shows the result of the first region enlargement;
Fig. 4 (c) shows the result of the first skeleton extraction;
Fig. 4 (d) shows the result of the second skeleton enlargement;
Fig. 4 (e) shows the result of the second skeleton extraction, used to provide the input vector to the FISHER classifier;
Fig. 5 shows examples of arbitrary-direction text line detection in natural images by the invention.
Embodiment
For a better understanding of the technical scheme of the invention, the invention is further described below with reference to Fig. 1, which depicts the natural image text recognition framework of the invention.
In natural image, any direction text line detection method comprises the following steps:
(1) the maximum stable extreme regions detection method with belt restraining detects the text filed of candidate, then calmodulin binding domain CaM is big or small, absolute distance, relative distance, geometric similarity degree between the contextual information defined range, and be combined with color similarity obtain the zone pair between synthetic similarity;
(2) employing based on candidate's line of text recognition methods of similarity, at first finds Three regions as the seed region of candidate's line of text, then expands to the All Ranges of this row;
(3) employing is removed non-line of text based on the filtrator of morphology framework characteristic, and this filtrator uses a sparse sorter to filter, and the required proper vector of sorter is taken from the morphology framework characteristic of All Ranges on candidate's line of text.
described maximum stable extreme regions detection method with belt restraining detects the text filed of candidate, then calmodulin binding domain CaM is big or small, absolute distance, relative distance, geometric similarity degree between the contextual information defined range, and be combined with color similarity obtain the zone pair between synthetic similarity step be: at first, it is text filed as the candidate that detection calculates in Britain's machine vision all MSER maximum stable extreme regions that propose in 2002 nd Annual Meeting collection Robust Wide Baseline Stereofrom Maximally Stable Extremal Regions one literary compositions, then use the marginal information of canny operator extraction image, with the constrained line of these edge lines as MSERs, collect connected region, in collection process, a pixel can only be connected to the pixel of its four direction up and down, two pixels that can prevent like this edge pixel side couple together, after collecting all connected regions, use the geometric similarity degree between following 5 steps definition arbitrary region pair:
Step 1: be provided with connected region CC iAnd CC j, its standardization absolute distance is defined as follows:
dis ij 1 = | C x i - C x j | + k 1 * | C y i - C y j | k 1 h im + w im - ( k 1 * ( h i + h j ) + w i + w j ) / 2 - - - 1
Wherein
Figure BDA00002496154500092
Represent respectively CC iAnd CC jCentral point horizontal and vertical coordinate, h i, h j, w i, w jRepresent respectively CC iAnd CC jHeight and width, h im, w imThe height and width that represent respectively present image, k 1Be a constant of controlling horizontal range and vertical range contribution proportion, its value is set as 2.
Figure BDA00002496154500093
Value from 0 to 1.
Step 2: in order to enlarge the difference of the right distance of different CC, the distance metric in formula 1 further is modified as following expression:
dis ij 2 = dis ij 1 · ℵ i ( j ) - - - 2
Wherein
Figure BDA00002496154500095
Expression
Figure BDA00002496154500096
Arrive CC at all iThe central point distance
Figure BDA00002496154500097
The sequence number number.
Step 3: formula 2 further is revised as:
dis ij 3 = min p k { dis ip k j 2 , | p k | ∈ { 0,1,2 , . . . . . . , n - 2 } } - - - 3
P wherein kExpression is from CC iTo CC jA paths, and length is between 0 to n-2,
Figure BDA00002496154500099
Expression CC iTo CC jBetween shortest path, can obtain by the algorithm of Floyd algorithm or other identity functions:
Step 4: defining two interregional shape distances is:
dis ij 4 = max ( h i , h j ) · max ( w i , w j ) min ( h i , h j ) · min ( w i , w j ) - - - 4
H wherein i, h j, w i, w jRepresent respectively CC iAnd CC jHeight and width.
Step 5: connected region CC iAnd CC jThe geometric similarity degree be:
simi geometry ( i , j ) = exp ( - max ( dis ij 3 , dis ji 3 ) · dis ij 4 ) - - - 5
Connected region CC iAnd CC jColor similarity be:
At first with image by the RGB color space conversion to the hsv color space, H, S, V component are quantized into respectively 8,3,3 grades.Color histogram is 72 dimensions like this.Suppose CC iAnd CC jColor feature vector be respectively C i=[C I, 1, C i,2..., C I, t..., C i,n] and C j=[C J, 1, C J, 2..., C J, t..., C j,n], color similarity is:
simi color ( i , j ) = Σ t = 1 n min ( C i , t , C j , t ) - - - 6
N gets 72;
Finally, the synthetic similarity of two connected region synthetic geometry similarities and color similarity is:
simi(i,j)=(simi geometry(i,j)+simi color(i,j))/27。
Described employing is based on candidate's line of text recognition methods of similarity, at first find Three regions as the seed region of candidate's line of text, the All Ranges step that then expands to this row is: carry out candidate's line of text and generate on the basis of judging based on the brother between connected region pair;
(1) brother judges:
Whether enough the brother judges two zones similar and neighbour.If two zones are not the brothers, they can not be merged into the one text row, define following three restrictive conditions and judge whether two connected regions are brothers:
A) ratio of the height and width of two adjacent areas should be at two threshold value T 1And T 2Between;
B) distance between two connected regions should be greater than T 3Multiply by the high or wide of larger zone;
C) two adjacent characters should have similar color characteristic, so their color similarity should be greater than a threshold value T 4
Formalization representation is as follows:
S ij = S ij 1 ^ S ij 2 ^ S ij 3 - - - 8
S ijExpression connected region CC iAnd CC jWhether, if its value be 1, be similar zone, they may belong to the one text row, otherwise can not belong to the one text row if being similar zone,
Figure BDA00002496154500103
Represent respectively three above-mentioned restrictive conditions, T 1, T 2, T 3, T 4Be set as respectively 2,4,3,0.4,
The refinement of condition 1 judge as shown in the formula:
h r=max(h i,h j)/min(h i,h j)
w r=max(w i,w j)/min(w i,w j 9
In formula 10, θ represents connected region CC iAnd CC jAngle between the line of central point and X-axis positive dirction;
The refinement of condition 2 is judged as follows:
S ij 2 = 1 dis ij ≤ T 3 · max ( h i , h j ) | tgθ | > 1 1 dis ij ≤ T 3 · max ( w i , w j ) | tgθ | ≤ 1 0 others - - - 11
(2) candidate's line of text generates:
In order to produce candidate's line of text, at first find three seed connected regions, then expand to and comprise more connected region, step is as follows:
Step 1: make UL ccRepresent all the current set of not determining the connected region composition of line of text, at first initialization set UL ccBe the set that all connected regions form, for each region division respective flag position, and be initialized as 0.To UL ccIn each connected region calculate similarity simi (i, *) between other connected region of it and all, then take out two maximum similarities and obtain they and, be designated as partSimi (CC i), then all partSimi values are by descending sort;
Step 2: for the S that satisfies condition ij=1 ∧ S jk=1partSimi (CC k)≤partSimi (CC i) ∧ partSimi (CC j)≤partSimi (CC i), any three connected region CC i∈ UL cc, C C j∈ UL cc, and CC k∈ UL cc, use following formula to calculate differential seat angle Δ θ ijk,
Figure BDA00002496154500112
V (c wherein ic j) and v (c jc k) represent respectively vectorial c ic jAnd c jc k
Use the same method and calculate Δ θ jikWith Δ θ ikjIf
Figure BDA00002496154500113
Produce a new line of text L t, record its element S cc(L t)={ CC i, CC j, CC kAnd calculate c ic jAnd c jc kAverage angle as the angle of inclination of current text line, and with CC i, CC j, and CC kFrom set UL ccIn remove.These three connected regions are just as the kind daughter element of current text line;
Any two line segment c ic jAnd c mc nThe average tilt angle
Figure BDA00002496154500121
Be calculated as follows:
θ ‾ = θ ij + θ mn + π 2 if θ ij · θ mn ≤ 0 ^ max { | θ ij | , | θ mn | } ≥ π 4 θ ij + θ mn 2 otherwise - - - 13
In top equation, θ ijAnd θ mnSpan is
Figure BDA00002496154500123
Step 3: for UL ccIn remaining arbitrary connected region CC m, use following formula to calculate it to working as front L tBetween similarity:
Figure BDA00002496154500124
And all simi are pressed descending sort, from UL ccMiddle order is got a connected region CC tIf following 3 conditions satisfy, CC tBe added to S cc(L t) in:
A) at CC tK nearest-neighbors in have at least a CC k∈ S cc(L t), it is not only CC tThe brother and also connect it and CC tThe line of central point and current text line L tThe average tilt angle
Figure BDA00002496154500125
Between differential seat angle less than threshold values T 5
Arbitrary line segment c ic jWith average tilt n angle be
Figure BDA00002496154500126
Line between differential seat angle be calculated as follows:
Δθ = min { | - θ ij - θ ‾ | , π - | - θ ij - θ ‾ | } - - - 14
And T 5Value determined by following formula:
Figure BDA00002496154500128
D wherein ijExpression connected region central point c iAnd c jBetween distance and
Figure BDA00002496154500129
The center of upper all connected regions of expression line l is by the mean value of adjacent center point distance after from left to right or from top to bottom arranging;
B) CC tAlso at CC kThe set that forms of K arest neighbors connected region in;
C) CC tCentral point with as front L tBetween distance less than threshold values T 6.
It is 3, T that K is set 6Determined by following formula:
T 6 = k ′ · h t | tgθ | ≤ 1 k ′ · w t | tgθ | > 1 - - - 16
In following formula, h tAnd w tRepresent respectively CC tHeight and width, θ is the X-axis positive dirction and be connected CC tAnd CC kAngle between the central point line, and k '=1/3;
If current connected region is added to S cc(L t) in, upgrade set UL ccAverage angle with when the front repeats this process until UL ccTill middle all elements is all processed, then again repeating step 1 to another group candidate's line of text seed of step 3 search until there is not any line of text seed.
The step of removing non-text lines with a filter based on morphological skeleton features, where the filter uses a sparse classifier whose feature vectors are taken from the morphological skeleton features of all regions on a candidate text line, is as follows:
Step 1: Prepare training samples. Taking English as an example, prepare binary maps of the 26 letters and the 10 digits 0-9 in different fonts, one copy each in regular and italic style. Rotate each binary map by 90, 180, and 270 degrees, and use the rotated maps as positive training samples as well. In addition, prepare the same number of non-text images as negative training samples;
Step 2: For each binary map, resize its minimum bounding rectangle to S_rh × S_rw with max(S_rh, S_rw) = S_rg, where S_rg = 32; that is, scale the longer side of the connected region to S_rg while keeping the height-to-width ratio constant. Extract the skeleton of the connected region and scale it to S_rh × S_rw in the same way; then extract the skeleton of this scaled skeleton again and align its center with the center of a square block. Finally, convert the square block into a vector of 32 × 32 = 1024 dimensions as the input vector of the sparse filter;
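The resize-and-center part of this step can be sketched as follows (an illustrative reconstruction, not code from the patent; the function name is invented, and skeleton extraction itself, e.g. via skimage.morphology.skeletonize, is omitted):

```python
import numpy as np

S_RG = 32  # side length of the square block (S_rg in the patent)

def normalize_glyph(mask: np.ndarray, s_rg: int = S_RG) -> np.ndarray:
    """Scale a binary glyph so its longer side equals s_rg (keeping the
    aspect ratio), center it in an s_rg x s_rg block, and flatten it
    into an s_rg*s_rg-dimensional feature vector."""
    h, w = mask.shape
    scale = s_rg / max(h, w)
    nh, nw = max(1, round(h * scale)), max(1, round(w * scale))
    # nearest-neighbour resampling with index maps (no external deps)
    rows = (np.arange(nh) * h / nh).astype(int)
    cols = (np.arange(nw) * w / nw).astype(int)
    resized = mask[np.ix_(rows, cols)]
    block = np.zeros((s_rg, s_rg), dtype=mask.dtype)
    r0, c0 = (s_rg - nh) // 2, (s_rg - nw) // 2
    block[r0:r0 + nh, c0:c0 + nw] = resized
    return block.reshape(-1)  # 32*32 = 1024 dimensions
```

In the full pipeline this would be applied to the re-extracted skeleton, not the raw region mask.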
Step 3: Train with the FISHER classifier proposed on pages 543-550 of the IEEE International Conference on Computer Vision 2011 proceedings, obtaining the trained text-region classifier Classifier;
Step 4: For a candidate text line L_t of arbitrary tilt direction θ̄, first rotate it by an angle θ_r so as to bring it to the horizontal or vertical direction; θ_r is defined as:
θ_r = −θ̄                      if |θ̄| ≤ π/4
θ_r = sign(θ̄) · (π/2 − |θ̄|)   if |θ̄| > π/4    (17)
where θ̄ is the average tilt angle appearing in Equation 14.
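A minimal sketch of Equation 17 in Python (illustrative only; the function name is invented):

```python
import math

def rotation_angle(theta_bar: float) -> float:
    # Equation 17: rotate a line with average tilt theta_bar to the
    # horizontal axis when |theta_bar| <= pi/4, otherwise to the vertical.
    if abs(theta_bar) <= math.pi / 4:
        return -theta_bar
    return math.copysign(math.pi / 2 - abs(theta_bar), theta_bar)
```

Rotating by θ_r then maps a nearly horizontal line to angle 0 and a nearly vertical one to ±π/2.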
Step 5: For a candidate text line L_t composed of connected regions S_cc(L_t) = {y_1^t, …, y_n^t}, let C(y_i^t) denote the label of the feature vector of element y_i^t. The label of the whole text line is then defined as:
C(L_t) = 1  if Σ_i C(y_i^t) ≥ C_T, 0 otherwise    (18)

C_T = k2 · n    (19)
where k2 is a control parameter set to 0.7 and n is the number of connected regions of the current text line L_t. A label of 0 indicates that the current line is a text line, which is kept; otherwise the line is discarded.
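Equations 18 and 19 amount to a simple voting rule over the per-region labels; a sketch (illustrative, function name invented):

```python
def line_label(labels, k2=0.7):
    # Equations 18-19: labels[i] is C(y_i^t) for the i-th connected
    # region; the line gets label 1 when the sum reaches C_T = k2 * n.
    c_t = k2 * len(labels)
    return 1 if sum(labels) >= c_t else 0
```

With k2 = 0.7, a line of four regions needs a label sum of at least 2.8, i.e. three of its four regions, to receive label 1.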
Embodiment
As shown in Figures 2, 3, and 4, an example of the recognition process for the captions contained in a natural image is given. The concrete steps of this example are described in detail below in conjunction with the method of the invention:
For a given natural image, shown in Figure 2(a):
(1) All candidate text regions are obtained with the constrained maximally stable extremal region (MSER) detection method of claim 2; the result is shown in Figure 2(b);
(2) Combining the similarity definition of claim 2 with the sibling test of claim 3, seed connected regions are detected as in claim 3; the resulting candidate text-line seed regions are shown in Figure 2(c);
(3) Using the candidate text-line expansion method of claim 3, all candidate text lines are obtained, as shown in Figure 2(d);
(4) Each candidate text line obtained in the previous step is transformed into a horizontal or vertical candidate text line with the rotation transform of claim 4; the result is shown in Figure 3(b), and the corresponding skeleton structure in Figure 3(c);
(5) For each connected region in each horizontal or vertical candidate text line obtained in the previous step, shown in Figure 4(a), features are extracted with the feature-extraction method of claim 4; the intermediate results are shown in Figures 4(b), (c), (d), and (e). The sparse classifier of claim 4 then classifies each candidate line, and candidate lines whose classification result is non-text are discarded.
Figure 5 shows detection results for text lines of arbitrary direction in several natural images; the detected text regions are drawn as red or blue blocks. As the figures show, the method detects text regions of arbitrary direction in natural images well, and the detection results reach good precision.

Claims (4)

1. A method for detecting text lines of arbitrary direction in a natural image, characterized by comprising the following steps:
(1) detecting candidate text regions with a constrained maximally stable extremal region detection method, then defining the geometric similarity between region pairs from region size, absolute distance, relative distance, and contextual information, and combining it with color similarity to obtain the composite similarity between region pairs;
(2) using a similarity-based candidate text-line recognition method that first finds three regions as the seed regions of a candidate text line and then expands to all regions of that line;
(3) removing non-text lines with a filter based on morphological skeleton features, the filter using a sparse classifier whose feature vectors are taken from the morphological skeleton features of all regions on a candidate text line.
2. The method for detecting text lines of arbitrary direction in a natural image according to claim 1, characterized in that the step of detecting candidate text regions with the constrained maximally stable extremal region detection method, then defining the geometric similarity between region pairs from region size, absolute distance, relative distance, and contextual information, and combining it with color similarity to obtain the composite similarity between region pairs, is as follows: first, all maximally stable extremal regions (MSERs), computed as proposed in "Robust Wide Baseline Stereo from Maximally Stable Extremal Regions" in the proceedings of the British Machine Vision Conference 2002, are detected as candidate text regions; then the edge information of the image is extracted with the Canny operator, and these edge lines serve as constraint lines for the MSERs when collecting connected regions. During collection, a pixel may only be connected to pixels in the four directions up, down, left, and right, which prevents pixels on the two sides of an edge from being connected together. After all connected regions are collected, the geometric similarity between any pair of regions is defined in the following five steps:
Step 1: Given connected regions CC_i and CC_j, their normalized absolute distance is defined as:
dis1_ij = ( |Cx_i − Cx_j| + k1 · |Cy_i − Cy_j| ) / ( k1 · h_im + w_im − (k1 · (h_i + h_j) + w_i + w_j) / 2 )    (1)
where (Cx_i, Cy_i) and (Cx_j, Cy_j) are the horizontal and vertical center coordinates of CC_i and CC_j, h_i, h_j, w_i, w_j are the heights and widths of CC_i and CC_j, h_im and w_im are the height and width of the current image, and k1 is a constant controlling the relative contribution of horizontal and vertical distance, set to 2. dis1_ij takes values from 0 to 1;
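Equation 1 can be sketched directly in code (illustrative only; the function and argument names are invented):

```python
def norm_abs_distance(ci, cj, hi, wi, hj, wj, h_im, w_im, k1=2.0):
    # Equation 1: centers ci, cj are (x, y) pairs; hi/wi etc. are region
    # heights and widths; h_im/w_im are the image height and width.
    num = abs(ci[0] - cj[0]) + k1 * abs(ci[1] - cj[1])
    den = k1 * h_im + w_im - (k1 * (hi + hj) + wi + wj) / 2.0
    return num / den
```

The denominator shrinks as the two regions grow relative to the image, so the same center offset counts for more between large regions.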
Step 2: To enlarge the distance gap between different CC pairs, the distance measure of Equation 1 is further modified as:
dis2_ij = dis1_ij · ℵ_i(j)    (2)
where ℵ_i(j) is the rank of CC_j when all regions are ordered by center-point distance to CC_i;
Step 3: Equation 2 is further revised as:
dis3_ij = min over paths p_k of { dis2_{i p_k j} }, |p_k| ∈ {0, 1, 2, …, n−2}    (3)
where p_k denotes a path from CC_i to CC_j of length between 0 and n−2, so dis3_ij is the shortest-path distance from CC_i to CC_j; it can be obtained with the Floyd algorithm or any algorithm of equivalent function;
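This step can be sketched with Floyd-Warshall over the dis2 matrix (illustrative; assumes dis2 is a square matrix with zero diagonal):

```python
import numpy as np

def contextual_distance(dis2: np.ndarray) -> np.ndarray:
    # Equation 3: dis3[i, j] is the minimum, over all paths from CC_i to
    # CC_j, of the accumulated dis2 weights (Floyd-Warshall relaxation).
    d = dis2.astype(float).copy()
    for k in range(d.shape[0]):
        d = np.minimum(d, d[:, k:k + 1] + d[k:k + 1, :])
    return d
```

Two regions far apart in Equation 2 thus become close if a chain of intermediate regions (e.g. the characters between them) connects them cheaply.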
Step 4: The shape distance between two regions is defined as:
dis4_ij = ( max(h_i, h_j) · max(w_i, w_j) ) / ( min(h_i, h_j) · min(w_i, w_j) )    (4)
where h_i, h_j, w_i, w_j are the heights and widths of CC_i and CC_j;
Step 5: The geometric similarity of connected regions CC_i and CC_j is:
simi_geometry(i, j) = exp( −max(dis3_ij, dis3_ji) · dis4_ij )    (5)
The color similarity of connected regions CC_i and CC_j is computed as follows. First the image is converted from the RGB color space to the HSV color space, and the H, S, and V components are quantized into 8, 3, and 3 levels respectively, giving a 72-dimensional color histogram. Let the color feature vectors of CC_i and CC_j be C_i = [C_{i,1}, C_{i,2}, …, C_{i,t}, …, C_{i,n}] and C_j = [C_{j,1}, C_{j,2}, …, C_{j,t}, …, C_{j,n}]; the color similarity is:
simi_color(i, j) = Σ_{t=1..n} min(C_{i,t}, C_{j,t})    (6)
where n is 72;
Finally, the composite similarity of the two connected regions, combining geometric similarity and color similarity, is:

simi(i, j) = ( simi_geometry(i, j) + simi_color(i, j) ) / 2    (7)
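The 72-bin histogram construction and the intersection of Equation 6 can be sketched as follows (an illustrative reconstruction with invented function names; it assumes H, S, V values already scaled into [0, 1) and histograms normalized to sum to 1):

```python
import numpy as np

def hsv_histogram(h, s, v):
    # Quantize H into 8 levels and S, V into 3 levels each, then build a
    # normalized 8*3*3 = 72-bin color histogram over the region's pixels.
    hq = np.minimum((np.asarray(h) * 8).astype(int), 7)
    sq = np.minimum((np.asarray(s) * 3).astype(int), 2)
    vq = np.minimum((np.asarray(v) * 3).astype(int), 2)
    bins = hq * 9 + sq * 3 + vq
    hist = np.bincount(bins.ravel(), minlength=72).astype(float)
    return hist / hist.sum()

def simi_color(c_i, c_j):
    # Equation 6: histogram intersection of two color feature vectors.
    return float(np.minimum(c_i, c_j).sum())
```

For normalized histograms the intersection is 1 for identical color distributions and 0 for disjoint ones, matching the [0, 1] range of the geometric similarity.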
3. The method for detecting text lines of arbitrary direction in a natural image according to claim 1, characterized in that the similarity-based candidate text-line recognition method, which first finds three regions as the seed regions of a candidate text line and then expands to all regions of the line, is as follows: candidate text-line generation is carried out on the basis of a sibling test between pairs of connected regions;
(1) Sibling test:
The sibling test decides whether two regions are sufficiently similar and sufficiently close; if two regions are not siblings, they cannot be merged into the same text line. The following three constraints decide whether two connected regions are siblings:
a) the height ratio and the width ratio of two adjacent regions should lie between two thresholds T1 and T2;
b) the distance between two connected regions should be no greater than T3 times the height or width of the larger region;
c) two adjacent characters should have similar color features, so their color similarity should be greater than a threshold T4.
This is formalized as:
S_ij = S1_ij ∧ S2_ij ∧ S3_ij    (8)
where S_ij indicates whether connected regions CC_i and CC_j are similar regions: a value of 1 means they are similar and may belong to the same text line, otherwise they cannot belong to the same text line; S1_ij, S2_ij, and S3_ij correspond to the three constraints above, and T1, T2, T3, T4 are set to 2, 4, 3, and 0.4 respectively.
The refined test for condition a) is:
h_r = max(h_i, h_j) / min(h_i, h_j)
w_r = max(w_i, w_j) / min(w_i, w_j)    (9)

with S1_ij then defined from h_r and w_r via the thresholds T1 and T2 (Equation 10, which survives only as an image in the original). In Equation 10, θ is the angle between the line joining the centers of CC_i and CC_j and the positive X-axis;
The refined test for condition b) is:
S2_ij = 1  if dis_ij ≤ T3 · max(h_i, h_j) and |tan θ| > 1
S2_ij = 1  if dis_ij ≤ T3 · max(w_i, w_j) and |tan θ| ≤ 1
S2_ij = 0  otherwise    (11)
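Conditions b) and c) of the sibling test can be sketched as follows (illustrative; condition a)'s refined form, Equation 10, survives only as an image in the original and is therefore omitted here):

```python
import math

T3, T4 = 3.0, 0.4  # thresholds from the claim

def sibling_s2(dis_ij, hi, hj, wi, wj, theta):
    # Equation 11: theta is the angle between the centers' connecting
    # line and the positive X-axis; |tan theta| > 1 means a roughly
    # vertical pair, so the distance is bounded by the larger height.
    if abs(math.tan(theta)) > 1:
        return dis_ij <= T3 * max(hi, hj)
    return dis_ij <= T3 * max(wi, wj)

def sibling_s3(color_sim):
    # Condition c): adjacent characters should have similar colors.
    return color_sim > T4
```

The overall sibling decision S_ij would then be the conjunction of S1, S2, and S3 as in Equation 8.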
(2) Candidate text-line generation:
To produce a candidate text line, three seed connected regions are found first and then expanded to include more connected regions, as follows:
Step 1: Let UL_cc be the set of all connected regions not yet assigned to a text line; initialize UL_cc to the set of all connected regions, set a flag for each region, and initialize it to 0. For each connected region in UL_cc compute the similarity simi(i, *) between it and every other connected region, then take the two largest similarities and sum them, denoted partSimi(CC_i); then sort all partSimi values in descending order;
Step 2: For any three connected regions CC_i ∈ UL_cc, CC_j ∈ UL_cc, and CC_k ∈ UL_cc satisfying s_ij = 1 ∧ s_ik = 1 ∧ partSimi(CC_k) ≤ partSimi(CC_i) ∧ partSimi(CC_j) ≤ partSimi(CC_i), compute the angle difference Δθ_ijk with Equation 12 (which survives only as an image in the original), in which v(c_i c_j) and v(c_j c_k) denote the vectors c_i c_j and c_j c_k;
Compute Δθ_jik and Δθ_ikj in the same way. If these angle differences satisfy the threshold condition (given only as an image in the original), create a new text line L_t, record its elements S_cc(L_t) = {CC_i, CC_j, CC_k}, take the average angle of c_i c_j and c_j c_k as the tilt angle of the current text line, and remove CC_i, CC_j, and CC_k from the set UL_cc. These three connected regions serve as the seed elements of the current text line;
The average tilt angle θ̄ of any two segments c_i c_j and c_m c_n is computed as:
θ̄ = (θ_ij + θ_mn + π) / 2   if θ_ij · θ_mn ≤ 0 ∧ max{ |θ_ij|, |θ_mn| } ≥ π/4
θ̄ = (θ_ij + θ_mn) / 2        otherwise    (13)
In the equation above, θ_ij and θ_mn take values in (−π/2, π/2];
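Equation 13 can be sketched as (illustrative; the function name is invented):

```python
import math

def average_tilt(theta_ij, theta_mn):
    # Equation 13: both angles lie in (-pi/2, pi/2]. The +pi branch
    # handles pairs straddling the vertical direction, e.g. -80 and +80
    # degrees, whose average should be vertical rather than horizontal.
    if theta_ij * theta_mn <= 0 and max(abs(theta_ij), abs(theta_mn)) >= math.pi / 4:
        return (theta_ij + theta_mn + math.pi) / 2
    return (theta_ij + theta_mn) / 2
```

Without the special case, the naive mean of −80° and +80° would be 0°, the exact opposite of the line's true near-vertical orientation.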
Step 3: For each remaining connected region CC_m in UL_cc, compute its similarity to the current line L_t (the formula survives only as an image in the original); sort all simi values in descending order, and take connected regions CC_t from UL_cc in that order. If the following three conditions hold, CC_t is added to S_cc(L_t):
a) Among the K nearest neighbors of CC_t there is at least one CC_k ∈ S_cc(L_t) that is a sibling of CC_t and for which the angle difference between the line joining its center to the center of CC_t and the average tilt angle θ̄ of the current text line L_t is less than a threshold T5.
The angle difference between any segment c_i c_j, with angle θ_ij, and the average tilt angle θ̄ is computed as:

Δθ = min{ |θ_ij − θ̄|, π − |θ_ij − θ̄| }    (14)
The threshold T5 is determined by a formula (Equation 15, which survives only as an image in the original) involving d_ij, the distance between the connected-region centers c_i and c_j, and the mean distance between adjacent centers after the centers of all connected regions on line l are ordered from left to right or from top to bottom;
b) CC_t is also among the set formed by the K nearest-neighbor connected regions of CC_k;
c) the distance between the center of CC_t and the current line L_t is less than a threshold T6. K is set to 3, and T6 is determined by:

T6 = k′ · h_t  if |tan θ| ≤ 1
T6 = k′ · w_t  if |tan θ| > 1    (16)

where h_t and w_t are the height and width of CC_t, θ is the angle between the positive X-axis and the line joining the centers of CC_t and CC_k, and k′ = 1/3;
If the current connected region is added to S_cc(L_t), the set UL_cc and the average tilt angle of the current line are updated. This process repeats until all elements of UL_cc have been processed; Steps 1 to 3 are then repeated to search for another group of candidate text-line seeds, until no text-line seed remains.
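The wrapped angle difference of Equation 14, used by condition a) above, can be sketched as (illustrative; the function name is invented):

```python
import math

def angle_diff(theta_ij, theta_bar):
    # Equation 14: difference between a segment angle and the average
    # tilt, wrapped so that directions pi apart are treated as equal.
    d = abs(theta_ij - theta_bar)
    return min(d, math.pi - d)
```

The wrap matters near the vertical: angles just below +π/2 and just above −π/2 describe almost the same direction and should compare as nearly equal.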
4. The method for detecting text lines in a natural image according to claim 1, characterized in that the step of removing non-text lines with the filter based on morphological skeleton features, the filter using a sparse classifier whose feature vectors are taken from the morphological skeleton features of all regions on a candidate text line, is as follows:
Step 1: Prepare training samples. Taking English as an example, prepare binary maps of the 26 letters and the 10 digits 0-9 in different fonts, one copy each in regular and italic style. Rotate each binary map by 90, 180, and 270 degrees, and use the rotated maps as positive training samples as well. In addition, prepare the same number of non-text images as negative training samples;
Step 2: For each binary map, resize its minimum bounding rectangle to S_rh × S_rw with max(S_rh, S_rw) = S_rg, where S_rg = 32; that is, scale the longer side of the connected region to S_rg while keeping the height-to-width ratio constant. Extract the skeleton of the connected region and scale it to S_rh × S_rw in the same way; then extract the skeleton of this scaled skeleton again and align its center with the center of a square block. Finally, convert the square block into a vector of 32 × 32 = 1024 dimensions as the input vector of the sparse filter;
Step 3: Train with the FISHER classifier proposed on pages 543-550 of the IEEE International Conference on Computer Vision 2011 proceedings, obtaining the trained text-region classifier Classifier;
Step 4: For a candidate text line L_t of arbitrary tilt direction θ̄, first rotate it by an angle θ_r so as to bring it to the horizontal or vertical direction; θ_r is defined as:
θ_r = −θ̄                      if |θ̄| ≤ π/4
θ_r = sign(θ̄) · (π/2 − |θ̄|)   if |θ̄| > π/4    (17)
where θ̄ is the average tilt angle appearing in Equation 14.
Step 5: For a candidate text line L_t composed of connected regions S_cc(L_t) = {y_1^t, …, y_n^t}, let C(y_i^t) denote the label of the feature vector of element y_i^t. The label of the whole text line is then defined as:
C(L_t) = 1  if Σ_i C(y_i^t) ≥ C_T, 0 otherwise    (18)

C_T = k2 · n    (19)
where k2 is a control parameter set to 0.7 and n is the number of connected regions of the current text line L_t. A label of 0 indicates that the current line is a text line, which is kept; otherwise the line is discarded.
CN201210506072.4A 2012-11-29 2012-11-29 Any direction text line detection method in a kind of natural image Expired - Fee Related CN103136523B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210506072.4A CN103136523B (en) 2012-11-29 2012-11-29 Any direction text line detection method in a kind of natural image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210506072.4A CN103136523B (en) 2012-11-29 2012-11-29 Any direction text line detection method in a kind of natural image

Publications (2)

Publication Number Publication Date
CN103136523A true CN103136523A (en) 2013-06-05
CN103136523B CN103136523B (en) 2016-06-29

Family

ID=48496331

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210506072.4A Expired - Fee Related CN103136523B (en) 2012-11-29 2012-11-29 Any direction text line detection method in a kind of natural image

Country Status (1)

Country Link
CN (1) CN103136523B (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1512439A (en) * 2002-12-26 2004-07-14 Fujitsu Ltd Video text processor
US20060062460A1 (en) * 2004-08-10 2006-03-23 Fujitsu Limited Character recognition apparatus and method for recognizing characters in an image
CN102208023A (en) * 2011-01-23 2011-10-05 浙江大学 Method for recognizing and designing video captions based on edge information and distribution entropy
CN102542268A (en) * 2011-12-29 2012-07-04 中国科学院自动化研究所 Method for detecting and positioning text area in video


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JIE YUAN ET AL.: "A New Video Text Detection Method", 《JCDL '11 PROCEEDINGS OF THE 11TH ANNUAL INTERNATIONAL ACM/IEEE JOINT CONFERENCE ON DIGITAL LIBRARIES》 *
ZHANG Yin et al.: "A New Text Extraction Method for Color Images and Video", 《Journal of Computer-Aided Design & Computer Graphics》 *

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106796647B (en) * 2014-09-05 2018-09-14 北京市商汤科技开发有限公司 Scene text detecting system and method
CN106796647A (en) * 2014-09-05 2017-05-31 北京市商汤科技开发有限公司 Scene text detecting system and method
CN105678207A (en) * 2014-11-19 2016-06-15 富士通株式会社 Device and method for identifying content of target nameplate image from given image
CN104778470A (en) * 2015-03-12 2015-07-15 浙江大学 Character detection and recognition method based on component tree and Hough forest
CN105005764B (en) * 2015-06-29 2018-02-13 东南大学 The multi-direction Method for text detection of natural scene
CN105005764A (en) * 2015-06-29 2015-10-28 东南大学 Multi-direction text detection method of natural scene
CN105825216A (en) * 2016-03-17 2016-08-03 中国科学院信息工程研究所 Method of locating text in complex background image
CN107368826A (en) * 2016-05-13 2017-11-21 佳能株式会社 Method and apparatus for text detection
CN107368830A (en) * 2016-05-13 2017-11-21 佳能株式会社 Method for text detection and device and text recognition system
CN107368830B (en) * 2016-05-13 2021-11-09 佳能株式会社 Text detection method and device and text recognition system
CN107688807A (en) * 2016-08-05 2018-02-13 腾讯科技(深圳)有限公司 Image processing method and image processing apparatus
CN107784316A (en) * 2016-08-26 2018-03-09 阿里巴巴集团控股有限公司 A kind of image-recognizing method, device, system and computing device
CN106503732A (en) * 2016-10-13 2017-03-15 北京云江科技有限公司 Text image and the sorting technique and categorizing system of non-textual image
CN106503732B (en) * 2016-10-13 2019-07-19 北京云江科技有限公司 The classification method and categorizing system of text image and non-textual image
CN108399419A (en) * 2018-01-25 2018-08-14 华南理工大学 Chinese text recognition methods in natural scene image based on two-dimentional Recursive Networks
CN108399419B (en) * 2018-01-25 2021-02-19 华南理工大学 Method for recognizing Chinese text in natural scene image based on two-dimensional recursive network
CN108288061A (en) * 2018-03-02 2018-07-17 哈尔滨理工大学 A method of based on the quick positioning tilt texts in natural scene of MSER
CN108875744A (en) * 2018-03-05 2018-11-23 南京理工大学 Multi-oriented text lines detection method based on rectangle frame coordinate transform
CN109284751A (en) * 2018-10-31 2019-01-29 河南科技大学 The non-textual filtering method of text location based on spectrum analysis and SVM
CN111325210A (en) * 2018-12-14 2020-06-23 北京京东尚科信息技术有限公司 Method and apparatus for outputting information
CN109934229A (en) * 2019-03-28 2019-06-25 网易有道信息技术(北京)有限公司 Image processing method, device, medium and calculating equipment
CN110059600A (en) * 2019-04-09 2019-07-26 杭州视氪科技有限公司 A kind of single line text recognition methods based on direction gesture
CN110059600B (en) * 2019-04-09 2021-07-06 杭州视氪科技有限公司 Single-line character recognition method based on pointing gesture
CN110211048A (en) * 2019-05-28 2019-09-06 湖北华中电力科技开发有限责任公司 A kind of complicated archival image Slant Rectify method based on convolutional neural networks
CN112560599A (en) * 2020-12-02 2021-03-26 上海眼控科技股份有限公司 Text recognition method and device, computer equipment and storage medium
CN112883974A (en) * 2021-05-06 2021-06-01 江西省江咨金发数据科技发展有限公司 Electronic letter identification system based on image verification

Also Published As

Publication number Publication date
CN103136523B (en) 2016-06-29

Similar Documents

Publication Publication Date Title
CN103136523B (en) Any direction text line detection method in a kind of natural image
Nikolaou et al. Segmentation of historical machine-printed documents using adaptive run length smoothing and skeleton segmentation paths
Antonacopoulos et al. ICDAR2015 competition on recognition of documents with complex layouts-RDCL2015
JP5492205B2 (en) Segment print pages into articles
CN102081731B (en) Method and device for extracting text from image
US8462394B2 (en) Document type classification for scanned bitmaps
Alberti et al. Labeling, cutting, grouping: an efficient text line segmentation method for medieval manuscripts
CN101122952A (en) Picture words detecting method
CN103530600A (en) License plate recognition method and system under complicated illumination
CN108154151B (en) Rapid multi-direction text line detection method
Rigaud et al. Automatic text localisation in scanned comic books
Shivakumara et al. Gradient-angular-features for word-wise video script identification
Forczmański et al. Stamps detection and classification using simple features ensemble
Unar et al. Artificial Urdu text detection and localization from individual video frames
Mullick et al. An efficient line segmentation approach for handwritten Bangla document image
Kunishige et al. Scenery character detection with environmental context
CN107368826B (en) Method and apparatus for text detection
Melinda et al. Parameter-free table detection method
Zhan et al. A robust split-and-merge text segmentation approach for images
Shelke et al. A novel multistage classification and wavelet based kernel generation for handwritten marathi compound character recognition
Ziaratban et al. An adaptive script-independent block-based text line extraction
Lue et al. A novel character segmentation method for text images captured by cameras
Phan et al. Text detection in natural scenes using gradient vector flow-guided symmetry
Ahmed et al. Enhancing the character segmentation accuracy of bangla ocr using bpnn
Nguyen et al. An effective method for text line segmentation in historical document images

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160629

Termination date: 20191129