CN106874905B - A method for natural scene text detection based on self-learning color clustering - Google Patents

A method for natural scene text detection based on self-learning color clustering

Info

Publication number
CN106874905B
CN106874905B (application CN201710021572.1A)
Authority
CN
China
Prior art keywords
character
hierarchical clustering
text
color
sample
Prior art date
Legal status
Expired - Fee Related
Application number
CN201710021572.1A
Other languages
Chinese (zh)
Other versions
CN106874905A (en)
Inventor
郭建京
邹北骥
吴慧
杨文君
徐子雯
Current Assignee
Central South University
Original Assignee
Central South University
Priority date
Filing date
Publication date
Application filed by Central South University
Priority to CN201710021572.1A
Publication of CN106874905A
Application granted
Publication of CN106874905B
Expired - Fee Related (current legal status)
Anticipated expiration

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/231 Hierarchical techniques, i.e. dividing or merging pattern sets so as to obtain a dendrogram

Abstract

The present invention provides a method for natural scene text detection based on self-learning color clustering. First, hierarchical clustering is combined with a parameter self-learning strategy to design an adaptive color clustering method that extracts candidate characters from an image; because the method automatically learns the weights and threshold for each image, it achieves a good character recall rate. Then, an Adaboost classifier is trained to build a character verification model that removes non-text components. Finally, the remaining characters are merged into text lines, and the text detection result is obtained after post-processing. Compared with traditional methods, this method achieves a higher text detection recall rate and more accurate detection results.

Description

A method for natural scene text detection based on self-learning color clustering
Technical field
The invention belongs to the field of pattern recognition and relates to a natural scene text detection method based on self-learning color clustering.
Background technique
Text in natural scene images carries a large amount of useful information, and extracting it is an important prerequisite for image content analysis and understanding; it can be widely applied to fields such as license plate detection, autonomous driving, content-based image retrieval, text recognition on mobile phones and robot navigation. However, natural scene text detection is affected by many factors that increase its difficulty. These factors fall into three main categories:
Complex image background: images may be acquired in arbitrary scenes, their colors vary from image to image, and they contain many interfering objects such as leaves, bricks, railings and tiles, which easily cause detection errors.
Diverse text: the size and style of text in natural scene images vary widely, and characters may be distorted and tilted to different degrees.
Varying degrees of interference: natural scene images are usually captured outdoors and are therefore affected by different degrees of illumination, shadow, resolution and shooting angle.
To overcome these factors and improve the accuracy of text detection, researchers have proposed a large number of natural scene text detection methods, which fall into two main categories: sliding-window-based methods and connected-component-based methods.
Sliding-window-based methods usually scan the original image with multi-scale sliding windows to extract candidate regions, and then verify the candidate regions with machine learning using color, gradient and texture features to obtain the detection result. Because text sizes vary widely, these methods must scan the image with windows at many scales, which is time-consuming and produces an excessive number of candidate regions, increasing the difficulty of the subsequent text verification.
Connected-component-based methods are currently the more popular text detection methods. They are further divided into three sub-tasks: (1) candidate character extraction, (2) candidate character verification, and (3) text region analysis.
Candidate character extraction usually assumes that the pixels of a text character share features such as gray-level consistency, color consistency and stroke width uniformity; pixels with similar features are extracted and grouped into candidate character connected components.
Candidate character verification usually analyzes the characters and the background regions, extracts features that easily distinguish text from background, and applies machine learning to verify the candidate character connected components and remove non-text components.
Text region analysis post-processes the characters retained after verification. Typically, the spatial position, color and texture features of the character connected components are analyzed, characters that are close in position, color and texture are merged into text lines, and the text lines are then segmented and verified with heuristic rules and machine learning to obtain the final detection result.
Because natural scene image backgrounds are complex and changeable, text color, font and size vary widely, and images are affected by different degrees of illumination, shadow and shooting angle, effectively extracting candidate characters from backgrounds of varying complexity is the key problem of connected-component-based text detection methods.
Summary of the invention
The present invention provides a text detection method based on self-learning color clustering to overcome the problems of the existing text detection methods described above. The method combines hierarchical clustering with a parameter self-learning strategy to realize an adaptive color clustering algorithm, constructs color layers, extracts the connected components in the color layers as candidate characters, and thereby locates the text in the image.
A natural scene text detection method based on self-learning color clustering comprises the following steps:
Step 1: project the R, G, B color values of every pixel of the image I to be detected into a three-dimensional color space and divide the color space into equally sized cubes; each cube serves as a basic unit of hierarchical clustering.
Take the color mean of all pixels in each cube as the feature c of the corresponding hierarchical clustering basic unit:
c = (μ(r), μ(g), μ(b)), where μ(r), μ(g) and μ(b) are the mean R, G and B values of all pixels in the basic unit.
Step 2: initialize the feature weight vector w of the hierarchical clustering basic units, w = (w_r, w_g, w_b, w_θ);
where w_r, w_g and w_b are the color distance weights of the R, G and B components of the basic unit pixels, and w_θ is the clustering threshold.
Step 3: using the feature weight vector w, compute the color distance between every pair of hierarchical clustering basic units:
d_ij = w_r·|μ_i(r) - μ_j(r)| + w_g·|μ_i(g) - μ_j(g)| + w_b·|μ_i(b) - μ_j(b)|
where μ_i and μ_j denote the features of the i-th and j-th hierarchical clustering basic units.
Step 4: merge the two hierarchical clustering basic units with the smallest color distance into a new basic unit, compute the feature c of the new unit, and record the merge to build the corresponding hierarchical clustering tree; return to step 3 until only one basic unit remains.
Step 5: construct the feature vectors of the positive and negative samples.
According to the clustering threshold w_θ, split the hierarchical clustering tree built in step 4 to obtain a hierarchical clustering forest. The color distance between any two initial hierarchical clustering basic units under the same subtree of the forest is taken as the feature vector of a positive sample, and the color distance between any two initial basic units under different subtrees is taken as the feature vector of a negative sample.
Step 6: with the current value of the feature weight vector w, apply an activation function to the positive and negative sample feature vectors built in step 5 to predict the sample classes. Using the predicted values and the true class labels of the samples, construct the likelihood function of the weight vector w and obtain a new feature weight vector w by maximizing the likelihood. If the updated w makes the maximum of the likelihood function converge, rebuild the hierarchical clustering forest with the new feature weight vector w; otherwise return to step 3.
To simplify solving the likelihood function, take the logarithm of both sides to obtain the log-likelihood function:
l(w) = Σ_{i=1..n} [ y^(i)·log h_w(x^(i)) + (1 - y^(i))·log(1 - h_w(x^(i))) ]
The log-likelihood l(w) is maximized with stochastic gradient ascent to solve for the weight vector w.
Step 7: for each subtree of the hierarchical clustering forest obtained in step 6, merge the pixels of all the initial hierarchical clustering units it contains to construct the corresponding color layer.
Step 8: extract the connected components from each color layer as candidate characters, screen the candidate characters with a classifier, merge the screened characters into text lines, and split the text lines into words to obtain the text detection result.
Further, the activation function used in step 6 is the logistic regression function:
h_w(x) = 1 / (1 + e^(-w·x))
where h_w(x) is the prediction for the input vector x; x is composed of the feature vector of a positive or negative sample and the intercept term -1, x = (|μ_i(r) - μ_j(r)|, |μ_i(g) - μ_j(g)|, |μ_i(b) - μ_j(b)|, -1).
Further, the likelihood function of the weight vector w constructed in step 6 is as follows:
L(w) = Π_{i=1..n} p(y^(i) | x^(i); w) = Π_{i=1..n} h_w(x^(i))^(y^(i)) · (1 - h_w(x^(i)))^(1 - y^(i))
where p(y^(i) | x^(i); w) is the probability of the label y^(i) given the input x^(i) under the parameter w, x^(i) and y^(i) are the input vector and the true class label of the i-th sample, and n is the total number of samples; y^(i) takes the value 0 or 1, where 0 denotes a negative sample and 1 denotes a positive sample.
Further, the classifier used in step 8 to screen the candidate characters is an Adaboost classifier, trained as follows:
First, steps 1-7 are executed on every image of the training set of the ICDAR2013 database, and candidate characters are extracted from the resulting color layers;
Then, the candidate characters are matched pixel-wise against the ground-truth characters to build the positive and negative training sample sets;
Next, 30000 positive training samples and 30000 negative training samples are randomly selected from these sets as the training set of the Adaboost classifier;
Finally, the geometric features and HOG features of every sample in the training set are extracted and the Adaboost classifier is trained, yielding the Adaboost classifier used to verify candidate characters.
Further, screening the candidate characters with the classifier means extracting the geometric features and HOG features of each candidate character, feeding them into the trained Adaboost classifier for candidate character verification, removing non-text components and retaining the text characters.
Further, merging the screened candidate characters into text lines proceeds as follows:
the verified characters are combined in pairs; a pair whose aspect ratios, horizontal distance and color distance satisfy the following conditions is regarded as a text character pair, and text character pairs that share a connected component are merged to construct the text lines:
|mean(R_1) - mean(R_2)| < 80
where w(·) and h(·) denote the width and height of a character; h_d and v_d denote the horizontal and vertical distances between the centers of the character regions R_1 and R_2; and mean(R) denotes the mean color of the pixels in character region R.
Further, splitting a text line into words means judging the horizontal gap d_h between every two adjacent characters; whenever d_h satisfies the splitting condition defined by α, d̄ and β, the line is split at that gap, yielding the words after division;
where d_h is the horizontal gap between adjacent characters, d̄ is the mean of all character gaps, α is the scaling factor of the average character spacing with a value between 1 and 2, and β is the median of all character gaps.
The median of all character gaps is the middle value obtained after sorting the horizontal gaps of all characters.
Further, words in the divided result that contain only a single character are verified again with the Adaboost classifier, and only words whose score exceeds 0.8 are kept, giving the final text detection result.
Beneficial effects
The present invention provides a method for natural scene text detection based on self-learning color clustering. First, hierarchical clustering is combined with a parameter self-learning strategy to design an adaptive color clustering method that extracts candidate characters from an image; because the method automatically learns the weights and threshold for each image, it achieves a good character recall rate. Then, an Adaboost classifier is trained to build a character verification model that removes non-text components. Finally, the remaining characters are merged into text lines, and the text detection result is obtained after post-processing. Compared with traditional methods, this method achieves a higher text detection recall rate and more accurate detection results.
Brief description of the drawings
Fig. 1 is a flow diagram of the method of the invention;
Fig. 2 shows the text detection process of the method, where (a) is the image to be detected; (b) shows the color layers extracted by the adaptive color clustering method, pixels of the same color layer being shown in the same color; (c) shows the candidate characters extracted from the color layers, each candidate character marked with its own color; (d) shows the result after candidate character verification; (e) shows the text lines constructed by character merging; (f) shows the final text detection result obtained after word division of the text lines;
Fig. 3 is a schematic diagram of the hierarchical clustering and of the construction of positive and negative samples.
Specific embodiment
The present invention is described in more detail below with reference to a specific embodiment.
A method for natural scene text detection based on self-learning color clustering, whose flow is shown in Fig. 1, comprises the following steps, illustrated here by detecting the text in Fig. 2(a):
Step 1: input the image to be detected, denoted image I, as shown in Fig. 2(a);
Step 2: extract the R, G, B values of every pixel of image I and project the pixels into a three-dimensional color space according to their R, G, B values. Divide the color space with a step of 32 into 512 small cubes of size 32 × 32 × 32.
Step 3: take the small cubes that contain pixels as the basic units of hierarchical clustering. For each such cube, compute the mean R, G, B values of its pixels and use c = (μ(r), μ(g), μ(b)) as the feature of the clustering basic unit.
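As an illustration of steps 2-3, the following Python sketch quantizes the pixels of an RGB image into the 512 color cubes and returns the mean color of every occupied cube together with the pixel indices it contains; the function name and the return layout are assumptions made for this example, not part of the patent.

import numpy as np

def color_cube_units(image_rgb, step=32):
    # Flatten the image to an N x 3 array of R, G, B values.
    pixels = image_rgb.reshape(-1, 3).astype(np.float64)
    # Cube index of every pixel along R, G and B (values 0..7 per channel).
    bins = (pixels // step).astype(np.int64)
    keys = bins[:, 0] * 64 + bins[:, 1] * 8 + bins[:, 2]   # 8 x 8 x 8 = 512 cubes
    features, members = [], []
    for key in np.unique(keys):                            # only cubes that contain pixels
        mask = keys == key
        features.append(pixels[mask].mean(axis=0))         # feature c = (mu(r), mu(g), mu(b))
        members.append(np.flatnonzero(mask))               # pixel indices of this basic unit
    return np.array(features), members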
Step 4: initialize the weight vector w = (w_r, w_g, w_b, w_θ), where w_r, w_g and w_b are the color distance weights of μ(r), μ(g) and μ(b) and w_θ is the clustering threshold; w is initialized to (1, 1, 1, 50).
Step 5: build the hierarchical clustering tree;
Step 5.1: using the feature weight vector w of step 4, compute the pairwise color distances between the clustering units of step 3 with formula (1):
d_ij = w_r·|μ_i(r) - μ_j(r)| + w_g·|μ_i(g) - μ_j(g)| + w_b·|μ_i(b) - μ_j(b)|   (1)
where μ(r), μ(g) and μ(b) are the mean R, G, B values of the pixels contained in a hierarchical clustering unit, and μ_i and μ_j denote the features of the i-th and j-th units;
Step 5.2: merge the two clustering units with the smallest color distance into a new clustering unit;
Step 5.3: compute the color feature of the new unit and update its color distances to the other units;
Step 5.4: repeat steps 5.2-5.3 until a single cluster remains, thereby building the hierarchical clustering tree.
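The agglomerative loop of steps 5.1-5.4 could look like the following sketch, which uses the weighted distance of formula (1) and records every merge so that the tree can be cut later; the pixel-count-weighted mean used for the merged feature and the naive pairwise search are simplifications assumed for this example.

import numpy as np

def weighted_distance(ci, cj, w):
    # Formula (1): d_ij = w_r*|dr| + w_g*|dg| + w_b*|db|.
    return float(np.dot(w[:3], np.abs(ci - cj)))

def build_cluster_tree(features, pixel_counts, w):
    # Leaves are numbered 0..n-1; every merge gets a new id and is recorded
    # as (left_id, right_id, distance) so the dendrogram can be rebuilt.
    clusters = {i: (f.copy(), c) for i, (f, c) in enumerate(zip(features, pixel_counts))}
    merges, next_id = [], len(clusters)
    while len(clusters) > 1:
        ids = list(clusters)
        # Steps 5.1/5.3: pairwise distances with the current weight vector.
        d, a, b = min((weighted_distance(clusters[p][0], clusters[q][0], w), p, q)
                      for k, p in enumerate(ids) for q in ids[k + 1:])
        # Step 5.2: merge the closest pair; the new feature is the mean color
        # of all pixels contained in the merged unit.
        (fa, na), (fb, nb) = clusters.pop(a), clusters.pop(b)
        clusters[next_id] = ((fa * na + fb * nb) / (na + nb), na + nb)
        merges.append((a, b, d))
        next_id += 1
    return merges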
Step 6: according to the clustering threshold w_θ, split the hierarchical clustering tree of step 5 into a hierarchical clustering forest, as shown in Fig. 3, where the solid lines mark the cuts that produce the different subtrees and each subtree contains its own unit nodes (dashed boxes). The color distance between any two initial hierarchical clustering basic units under the same subtree is taken as the feature vector of a positive sample, and the color distance between any two initial basic units under different subtrees is taken as the feature vector of a negative sample, thereby constructing the positive and negative sample feature vectors.
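A possible implementation of this step, under the assumption that a merge is kept exactly when its distance is below w_θ, runs a union-find over the merges recorded in step 5 and then forms sample pairs from the resulting leaf partition; the function name and the per-channel absolute color differences used as the pair feature follow the definitions above.

import numpy as np
from itertools import combinations

def split_forest_and_sample(features, merges, w_theta):
    n = len(features)
    parent = list(range(n + len(merges)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Keep only the merges cheaper than the clustering threshold w_theta;
    # the cut merges separate the dendrogram into the subtrees of the forest.
    for new_id, (a, b, d) in enumerate(merges, start=n):
        if d < w_theta:
            parent[find(a)] = parent[find(b)] = new_id
    labels = np.array([find(i) for i in range(n)])

    pos, neg = [], []
    for i, j in combinations(range(n), 2):
        x = np.abs(features[i] - features[j])     # (|dr|, |dg|, |db|)
        (pos if labels[i] == labels[j] else neg).append(x)
    return labels, np.array(pos), np.array(neg)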
Step 7: update the weight vector w.
Step 7.1: with the weight vector w and the sample features extracted in step 6, use the logistic regression function of formula (2) as the activation function to predict the class of each sample:
h_w(x) = 1 / (1 + e^(-w·x))   (2)
where x is the input vector, composed of the sample features and the intercept term -1, i.e. x = (|μ_i(r) - μ_j(r)|, |μ_i(g) - μ_j(g)|, |μ_i(b) - μ_j(b)|, -1), and h_w(x) is the prediction for the sample. y is the true label of the sample, taking the value 0 or 1.
Step 7.2: from the predicted values and the true labels, construct the likelihood function of formula (3):
L(w) = Π_{i=1..n} p(y^(i) | x^(i); w) = Π_{i=1..n} h_w(x^(i))^(y^(i)) · (1 - h_w(x^(i)))^(1 - y^(i))   (3)
where p(y^(i) | x^(i); w) is the probability of the label y^(i) given the input x^(i) under the parameter w, x^(i) and y^(i) are the input vector and the true class label of the i-th sample, and n is the total number of samples; y^(i) is 0 for a negative sample and 1 for a positive sample.
Step 7.3: to simplify solving the likelihood function, take the logarithm of both sides of formula (3) to obtain the log-likelihood of formula (4):
l(w) = Σ_{i=1..n} [ y^(i)·log h_w(x^(i)) + (1 - y^(i))·log(1 - h_w(x^(i))) ]   (4)
Step 7.4: maximize the log-likelihood l(w) with stochastic gradient ascent, thereby solving for the weight vector w. Taking the partial derivative of formula (4) with respect to w_j gives formula (5):
∂l(w)/∂w_j = Σ_{i=1..n} ( y^(i) - h_w(x^(i)) ) · x_j^(i)   (5)
Step 7.5: update the weight vector w according to formula (6), where α is the learning rate of the stochastic gradient ascent, set to 0.011:
w_j := w_j + α·(y - h_w(x))·x_j   (6)
Step 7.6: repeat steps 7.1-7.5 to update the weight vector w until the gradient ∂l(w)/∂w_j is close to 0, at which point the likelihood function is considered to have reached its maximum.
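A minimal sketch of the weight update of step 7, following formulas (2)-(6) as written; the learning rate α = 0.011 comes from the text, while the epoch count, the stopping tolerance and the random seed are assumptions of this example.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def learn_weights(pos, neg, w_init, lr=0.011, epochs=200, tol=1e-4):
    # Each sample is x = (|dr|, |dg|, |db|, -1) with label 1 (positive) or 0 (negative).
    X = np.hstack([np.vstack([pos, neg]), -np.ones((len(pos) + len(neg), 1))])
    y = np.concatenate([np.ones(len(pos)), np.zeros(len(neg))])
    w = np.asarray(w_init, dtype=np.float64).copy()
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        max_grad = 0.0
        for i in rng.permutation(len(X)):
            grad = (y[i] - sigmoid(X[i] @ w)) * X[i]   # formula (5) for one sample
            w += lr * grad                             # formula (6)
            max_grad = max(max_grad, np.abs(grad).max())
        if max_grad < tol:                             # step 7.6: gradient close to 0
            break
    return w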
Step 8: with the weight vector w updated in step 7, repeat steps 5-7 until the maximum of the log-likelihood l(w) of step 7 converges, obtaining the optimal weight vector and constructing the optimal hierarchical clustering tree.
Step 9: according to the final clustering threshold w_θ, split the hierarchical clustering tree to obtain the hierarchical clustering forest.
Step 10: for each subtree of the hierarchical clustering forest obtained in step 9, merge the pixels of all the initial hierarchical clustering units it contains to construct the corresponding color layer; the resulting color layers are shown in Fig. 2(b), where the pixels of the same color layer are marked with the same color.
Step 11: label the connected components in each color layer to obtain the candidate characters; the result is shown in Fig. 2(c), where the pixels of the same character are marked with the same color.
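Steps 10-11 can be sketched as follows, assuming each color layer has already been turned into a boolean pixel mask; the 8-connectivity structure and the minimum component size used to discard noise are assumptions of this example, not values given by the patent.

import numpy as np
from scipy import ndimage

def candidate_characters(layer_masks, min_pixels=10):
    # layer_masks: one boolean image per color layer; every connected
    # component of a layer becomes one candidate character mask.
    candidates = []
    structure = np.ones((3, 3), dtype=bool)        # 8-connectivity
    for mask in layer_masks:
        labels, count = ndimage.label(mask, structure=structure)
        for k in range(1, count + 1):
            component = labels == k
            if component.sum() >= min_pixels:      # drop tiny noise components
                candidates.append(component)
    return candidates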
Step 12: extract the geometric features and HOG features of each candidate character and feed them into the trained Adaboost classifier for candidate character verification, removing non-text components and retaining the text characters; the result is shown in Fig. 2(d).
The Adaboost classifier is trained as follows:
First, steps 1-11 are executed on every image of the training set of the ICDAR2013 database to obtain candidate characters;
Then, each candidate character is matched pixel-wise against the ground-truth characters: if the matched pixels account for 60% or more of the pixels of both the candidate character and the ground-truth character, the match is considered successful and the candidate is taken as a positive sample, otherwise as a negative sample, thereby constructing the sample set.
Finally, 30000 positive samples and 30000 negative samples are randomly selected from the sample set as the training set of the Adaboost classifier. The geometric features and HOG features of every sample in the training set are extracted and the Adaboost classifier is trained, yielding the candidate character verification model.
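The feature extraction and training of step 12 could be sketched as below with scikit-image and scikit-learn. The patent names geometric features and HOG features but does not enumerate the geometric ones, so the aspect ratio and fill ratio used here, as well as the 32 × 32 resize, the HOG cell layout and the number of boosting rounds, are assumptions of this example.

import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.ensemble import AdaBoostClassifier

def character_features(mask):
    # Geometric features plus HOG for one candidate character mask.
    ys, xs = np.nonzero(mask)
    h = ys.max() - ys.min() + 1
    w = xs.max() - xs.min() + 1
    crop = mask[ys.min():ys.max() + 1, xs.min():xs.max() + 1].astype(float)
    geometric = [w / h, mask.sum() / (w * h)]      # aspect ratio, fill ratio (assumed set)
    hog_vec = hog(resize(crop, (32, 32)), orientations=9,
                  pixels_per_cell=(8, 8), cells_per_block=(2, 2))
    return np.concatenate([geometric, hog_vec])

def train_character_classifier(pos_masks, neg_masks):
    # 30000 positive and 30000 negative character masks are assumed as input lists.
    X = np.array([character_features(m) for m in pos_masks + neg_masks])
    y = np.array([1] * len(pos_masks) + [0] * len(neg_masks))
    clf = AdaBoostClassifier(n_estimators=100)     # number of rounds is an assumption
    clf.fit(X, y)
    return clf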
Step 13: merge the characters to construct the text lines; the result is shown in Fig. 2(e).
The characters verified in step 12 are combined in pairs to form character pairs. A pair whose aspect ratios, horizontal distance and color distance satisfy the conditions of formulas (7)-(9) is regarded as a text character pair; text character pairs that share a connected component are merged to construct the text lines.
where w(·) and h(·) denote the width and height of a character, and h_d and v_d denote the horizontal and vertical distances between the centers of the character regions R_1 and R_2.
|mean(R_1) - mean(R_2)| < 80   (9)
where mean(R) denotes the mean color of the pixels in character region R.
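A sketch of the pairing-and-merging logic of step 13 follows. Only the color condition of formula (9) is reproduced in the text above; the height-ratio and horizontal-distance tests below therefore stand in for formulas (7)-(8) with thresholds that are purely assumptions, and the per-channel comparison of the mean colors is likewise an assumed reading of formula (9).

import numpy as np

def is_character_pair(c1, c2, max_ratio=2.0, max_gap=2.0, color_thresh=80):
    # Each character is (height, width, center_x, center_y, mean_rgb).
    h1, w1, cx1, cy1, mean1 = c1
    h2, w2, cx2, cy2, mean2 = c2
    if not (1 / max_ratio < h1 / h2 < max_ratio):          # similar heights (assumed form of (7))
        return False
    if abs(cx1 - cx2) > max_gap * max(w1, w2):             # horizontally close (assumed form of (8))
        return False
    return np.abs(np.asarray(mean1) - np.asarray(mean2)).max() < color_thresh   # formula (9)

def merge_pairs_into_lines(characters):
    # Union characters that form pairs; every resulting group is one text line.
    parent = list(range(len(characters)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(characters)):
        for j in range(i + 1, len(characters)):
            if is_character_pair(characters[i], characters[j]):
                parent[find(i)] = find(j)
    lines = {}
    for i in range(len(characters)):
        lines.setdefault(find(i), []).append(i)
    return list(lines.values())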
Step 14: post-process the text lines to obtain the text detection result, as shown in Fig. 2(f);
Step 14.1: for each text line constructed in step 13, judge the horizontal gap d_h between every two adjacent characters according to formula (10); whenever the condition of formula (10) is met, the line is split at that gap, yielding the words after division;
where d_h is the horizontal gap between adjacent characters, d̄ is the mean of all character gaps, α is the scaling factor of the average character spacing, set to 1.5, and β is the median of all character gaps.
Step 14.2: after the text lines have been split in step 14.1, words that contain only a single character are verified again with the Adaboost classifier of step 12, and only words whose score exceeds 0.8 are kept, giving the final text detection result.
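Step 14 could be sketched as follows. Formula (10) itself is not reproduced in the text, so the split test below, which combines the mean gap scaled by α with the median gap β, is an assumed reading of it; the classifier is the Adaboost model of step 12 and is assumed to expose a probability score, as scikit-learn's AdaBoostClassifier does.

import numpy as np

def split_line_into_words(chars_sorted, classifier=None, alpha=1.5, score_thresh=0.8):
    # chars_sorted: (x_left, x_right, feature_vector) tuples sorted by x_left.
    gaps = [chars_sorted[i + 1][0] - chars_sorted[i][1] for i in range(len(chars_sorted) - 1)]
    if not gaps:
        words = [chars_sorted]
    else:
        mean_gap, beta = np.mean(gaps), np.median(gaps)
        words, current = [], [chars_sorted[0]]
        for ch, gap in zip(chars_sorted[1:], gaps):
            if gap > max(alpha * mean_gap, beta):   # assumed form of formula (10)
                words.append(current)
                current = []
            current.append(ch)
        words.append(current)
    # Step 14.2: keep single-character words only if the classifier score exceeds 0.8.
    if classifier is not None:
        words = [wd for wd in words
                 if len(wd) > 1 or classifier.predict_proba([wd[0][2]])[0, 1] > score_thresh]
    return words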
The above is merely a preferred embodiment of the present invention and is intended to be illustrative rather than restrictive. Those skilled in the art will understand that many modifications can be made to it within the scope of the claims of the present invention, all of which fall within the scope of protection of the present invention.

Claims (4)

1. A natural scene text detection method based on self-learning color clustering, characterized by comprising the following steps:
Step 1: project the R, G, B color values of every pixel of the image I to be detected into a three-dimensional color space and divide the color space into equally sized cubes; each cube serves as a basic unit of hierarchical clustering;
take the color mean of all pixels in each cube as the feature c of the corresponding hierarchical clustering basic unit;
Step 2: initialize the feature weight vector w of the hierarchical clustering basic units, w = (w_r, w_g, w_b, w_θ);
where w_r, w_g and w_b are the color distance weights of the R, G and B components of the basic unit pixels, and w_θ is the clustering threshold;
Step 3: using the feature weight vector, compute the color distance between every pair of hierarchical clustering basic units;
Step 4: merge the two hierarchical clustering basic units with the smallest color distance into a new basic unit, compute the feature c of the new unit, and build the corresponding hierarchical clustering tree from the merges; return to step 3 until only one basic unit remains;
Step 5: construct the feature vectors of the positive and negative samples;
according to the clustering threshold w_θ, split the hierarchical clustering tree built in step 4 to obtain a hierarchical clustering forest; take the color distance between any two initial hierarchical clustering basic units under the same subtree of the forest as the feature vector of a positive sample, and the color distance between any two initial basic units under different subtrees as the feature vector of a negative sample;
Step 6: with the current value of the feature weight vector w, apply an activation function to the positive and negative sample feature vectors built in step 5 to predict the sample classes; using the predicted values and the true class labels of the samples, construct the likelihood function of the weight vector w, and obtain a new feature weight vector w by maximizing the likelihood; if the updated w makes the maximum of the likelihood function converge, rebuild the hierarchical clustering forest with the new feature weight vector w, otherwise return to step 3;
Step 7: for each subtree of the hierarchical clustering forest obtained in step 6, merge the pixels of all the initial hierarchical clustering units it contains to construct the corresponding color layer;
Step 8: extract connected components from each color layer as candidate characters, screen the candidate characters with a classifier, merge the screened characters into text lines, and split the text lines into words to obtain the text detection result;
the classifier used in step 8 to screen the candidate characters is an Adaboost classifier, trained as follows:
first, steps 1-7 are executed on every image of the training set of the ICDAR2013 database, and candidate characters are extracted from the resulting color layers;
then, the candidate characters are matched pixel-wise against the ground-truth characters to build the positive and negative training sample sets;
next, 30000 positive training samples and 30000 negative training samples are randomly selected from these sets as the training set of the Adaboost classifier;
finally, the geometric features and HOG features of every sample in the training set are extracted and the Adaboost classifier is trained, yielding the Adaboost classifier used to verify candidate characters;
screening the candidate characters with the classifier means extracting the geometric features and HOG features of each candidate character, feeding them into the trained Adaboost classifier for candidate character verification, removing non-text components and retaining the text characters;
merging the screened candidate characters into text lines proceeds as follows:
the verified characters are combined in pairs to form character pairs; a pair whose aspect ratios, horizontal distance and color distance satisfy the following conditions is regarded as a text character pair, and text character pairs that share a connected component are merged to construct the text lines:
|mean(R_1) - mean(R_2)| < 80
where w(·) and h(·) denote the width and height of a character; h_d and v_d denote the horizontal and vertical distances between the centers of the character regions R_1 and R_2; mean(R) denotes the mean color of the pixels in character region R;
splitting a text line into words means judging the horizontal gap d_h between two adjacent characters: whenever d_h satisfies the splitting condition defined by α, d̄ and β, the line is split at that gap, yielding the words after division;
where d_h is the horizontal gap between adjacent characters, d̄ is the mean of all character gaps, α is the scaling factor of the average character spacing with a value between 1 and 2, and β is the median of all character gaps.
2. The method according to claim 1, characterized in that the activation function used in step 6 is the logistic regression function:
h_w(x) = 1 / (1 + e^(-w·x))
where h_w(x) is the prediction for the input vector x; x is composed of the feature vector of a positive or negative sample and the intercept term -1, x = (|μ_i(r) - μ_j(r)|, |μ_i(g) - μ_j(g)|, |μ_i(b) - μ_j(b)|, -1).
3. The method according to claim 2, characterized in that the likelihood function of the weight vector w constructed in step 6 is as follows:
L(w) = Π_{i=1..n} p(y^(i) | x^(i); w) = Π_{i=1..n} h_w(x^(i))^(y^(i)) · (1 - h_w(x^(i)))^(1 - y^(i))
where p(y^(i) | x^(i); w) is the probability of the label y^(i) given the input x^(i) under the parameter w, x^(i) and y^(i) are the input vector and the true class label of the i-th sample, and n is the total number of samples; y^(i) takes the value 0 or 1, where 0 denotes a negative sample and 1 denotes a positive sample.
4. The method according to claim 1, characterized in that words in the divided result that contain only a single character are verified with the Adaboost classifier, and only words whose score exceeds 0.8 are kept, giving the final text detection result.
CN201710021572.1A 2017-01-12 2017-01-12 A method for natural scene text detection based on self-learning color clustering Expired - Fee Related CN106874905B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710021572.1A CN106874905B (en) 2017-01-12 2017-01-12 A method for natural scene text detection based on self-learning color clustering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710021572.1A CN106874905B (en) 2017-01-12 2017-01-12 A method for natural scene text detection based on self-learning color clustering

Publications (2)

Publication Number Publication Date
CN106874905A CN106874905A (en) 2017-06-20
CN106874905B true CN106874905B (en) 2019-06-11

Family

ID=59158105

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710021572.1A Expired - Fee Related CN106874905B (en) A method for natural scene text detection based on self-learning color clustering

Country Status (1)

Country Link
CN (1) CN106874905B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106874905B (en) * 2017-01-12 2019-06-11 中南大学 A method for natural scene text detection based on self-learning color clustering
CN108038458B (en) * 2017-12-20 2021-04-09 首都师范大学 Method for automatically acquiring outdoor scene text in video based on characteristic abstract diagram
CN108229386B (en) * 2017-12-29 2021-12-14 百度在线网络技术(北京)有限公司 Method, apparatus, and medium for detecting lane line
CN109582833B (en) * 2018-11-06 2023-09-22 创新先进技术有限公司 Abnormal text detection method and device
CN109558876B (en) * 2018-11-20 2021-11-16 浙江口碑网络技术有限公司 Character recognition processing method and device
CN109598272B (en) * 2019-01-11 2021-08-06 北京字节跳动网络技术有限公司 Character line image recognition method, device, equipment and medium
US11423436B2 (en) * 2019-02-19 2022-08-23 Nec Corporation Interpretable click-through rate prediction through hierarchical attention
CN116468959B (en) * 2023-06-15 2023-09-08 清软微视(杭州)科技有限公司 Industrial defect classification method, device, electronic equipment and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101615252A (en) * 2008-06-25 2009-12-30 中国科学院自动化研究所 A kind of method for extracting text information from adaptive images
JP2014229314A (en) * 2013-05-24 2014-12-08 キヤノン株式会社 Method and device for text detection
CN104809481A (en) * 2015-05-21 2015-07-29 中南大学 Natural scene text detection method based on adaptive color clustering
CN106874905A (en) * 2017-01-12 2017-06-20 中南大学 A kind of method of the natural scene text detection based on self study Color-based clustering

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A novel approach to text detection and extraction from videos by discriminative features and density; WEI Baogang et al.; Chinese Journal of Electronics; 2014-04-20; vol. 23, no. 2; pp. 322-328

Also Published As

Publication number Publication date
CN106874905A (en) 2017-06-20

Similar Documents

Publication Publication Date Title
CN106874905B (en) A method for natural scene text detection based on self-learning color clustering
CN110334765B (en) Remote sensing image classification method based on attention mechanism multi-scale deep learning
CN106997597B (en) It is a kind of based on have supervision conspicuousness detection method for tracking target
CN103514456B (en) Image classification method and device based on compressed sensing multi-core learning
CN104809481B (en) A kind of natural scene Method for text detection based on adaptive Color-based clustering
CN106897670A (en) A kind of express delivery violence sorting recognition methods based on computer vision
CN104573685B (en) A kind of natural scene Method for text detection based on linear structure extraction
CN105447503B (en) Pedestrian detection method based on rarefaction representation LBP and HOG fusion
CN105279519B (en) Remote sensing image Clean water withdraw method and system based on coorinated training semi-supervised learning
CN104504362A (en) Face detection method based on convolutional neural network
CN109961145A (en) A kind of confrontation sample generating method for image recognition category of model boundary sensitivity
CN109583379A (en) A kind of pedestrian's recognition methods again being aligned network based on selective erasing pedestrian
Li et al. A generative/discriminative learning algorithm for image classification
CN105095884B (en) A kind of pedestrian's identifying system and processing method based on random forest support vector machines
CN106909902A (en) A kind of remote sensing target detection method based on the notable model of improved stratification
CN105488809A (en) Indoor scene meaning segmentation method based on RGBD descriptor
CN109543632A (en) A kind of deep layer network pedestrian detection method based on the guidance of shallow-layer Fusion Features
CN111753874A (en) Image scene classification method and system combined with semi-supervised clustering
CN108681735A (en) Optical character recognition method based on convolutional neural networks deep learning model
CN114067444A (en) Face spoofing detection method and system based on meta-pseudo label and illumination invariant feature
CN107239777A (en) A kind of tableware detection and recognition methods based on various visual angles graph model
CN109840512A (en) A kind of Facial action unit recognition methods and identification device
CN106874825A (en) The training method of Face datection, detection method and device
CN106845513A (en) Staff detector and method based on condition random forest
CN109190472A (en) Combine pedestrian's attribute recognition approach of guidance with attribute based on image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190611

Termination date: 20200112