CN104598907B - A method for extracting text data from an image based on a stroke width map - Google Patents

A method for extracting text data from an image based on a stroke width map

Info

Publication number
CN104598907B
CN104598907B CN201310534130.9A CN201310534130A CN104598907B CN 104598907 B CN104598907 B CN 104598907B CN 201310534130 A CN201310534130 A CN 201310534130A CN 104598907 B CN104598907 B CN 104598907B
Authority
CN
China
Prior art keywords
binary image
image
connected component
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310534130.9A
Other languages
Chinese (zh)
Other versions
CN104598907A (en)
Inventor
刘春梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongji University
Original Assignee
Tongji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongji University
Priority to CN201310534130.9A
Publication of CN104598907A
Application granted
Publication of CN104598907B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/28 Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet
    • G06V30/287 Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet of Kanji, Hiragana or Katakana characters

Abstract

The present invention relates to a method for extracting text data from an image based on a stroke width map, comprising: reading in a color image and clustering its colors with a k-means clustering algorithm to obtain a first binary image sequence; applying an edge detection algorithm and morphological connected-component analysis to obtain a second binary image sequence; filtering the merged sequence a first time with a geometric filter to obtain a third binary image sequence; computing the stroke width map of the third binary image sequence and filtering that sequence a second time according to the stroke width map to obtain a fourth binary image sequence; and superimposing all images in the fourth binary image sequence to obtain the extracted text result. Compared with the prior art, the invention adaptively selects the distance measure used by the color clustering algorithm according to the image brightness, and therefore handles degraded images, such as unevenly illuminated ones, better; by improving the traditional stroke width computation, it improves the performance of text extraction.

Description

A method for extracting text data from an image based on a stroke width map
Technical field
The present invention relates to the technical fields of image processing and computer vision, and in particular to a method for extracting text data from an image based on a stroke width map.
Background
Text in an image plays an important role in understanding the image content, and the accuracy of text extraction directly affects the downstream results of automatic text processing systems. Text extraction from images has made great progress in recent years, but it has run into many problems on the way to practical use, such as blurred images, uneven illumination, and complex backgrounds. These problems are the bottleneck that limits the practical application of automatic text extraction from images, and they are also the focus and the difficulty of research in this area.
Over recent decades, many researchers in China and abroad have studied automatic text extraction from images. The methods fall into two classes. The first class is text extraction based on threshold segmentation: the image is binarized with a computed threshold to obtain the text foreground. Common approaches are global thresholding and local thresholding. These methods give fairly good results on good-quality images but often fail on low-quality images and on images with complex backgrounds. The second class is text extraction based on region analysis: candidate foreground regions are extracted, and each region is tested against text features so that non-text regions can be excluded. Commonly used text features include the following: the content of a text region usually has a consistent color, and a text region has a uniform stroke width. This class is more flexible and can handle text extraction under a variety of complex conditions. Region-analysis methods can be further divided into methods based on color clustering and methods based on stroke width information.
Text extraction based on color clustering applies a clustering algorithm to the colors in the image to form a number of regions, then evaluates these regions against text attributes to obtain the text regions. Commonly used clustering algorithms include the k-means algorithm and the ISODATA algorithm. The color space can be chosen according to image quality; commonly used color spaces include RGB and HIS.
Text extraction based on stroke width information makes full use of an important property of text: a text region usually has similar stroke widths, and the widths of its strokes do not differ much. Most methods for extracting stroke width information scan the image in the horizontal and vertical directions; where a pair of abrupt color changes occurs, the distance between the two pixels at which the color changes is taken as the stroke width. This approach is unstable for text extraction under complex conditions, and false extractions or missed extractions are common. Another approach detects text in the image with the stroke width transform operator: from each stroke edge point, a ray is cast along the gradient direction to find the stroke width at that point. This approach cannot compute the stroke width accurately at stroke corners; it yields only approximate stroke width information and has difficulty recovering the true stroke width.
Summary of the invention
The object of the present invention is to overcome the above defects of the prior art by providing a method for extracting text data from an image based on a stroke width map.
The object of the invention can be achieved by the following technical solution: a method for extracting text data from an image based on a stroke width map, characterized in that it comprises the following steps:
S1. Read in a color image I, cluster its colors with a k-means clustering algorithm, extract the connected components from the clustered image, and obtain the binary image corresponding to each connected component, forming the first binary image sequence, where n_c is the number of connected components;
S2. Read in the color image I, perform edge extraction on it using an edge detection algorithm and morphological connected-component analysis, extract the connected components from the edge-extracted image, and obtain the binary image corresponding to each connected component, forming the second binary image sequence, where n_e is the number of connected components;
S3. Merge the first binary image sequence and the second binary image sequence, filter non-text connected components from the merged binary image sequence a first time with a geometric filter, and update the filtered sequence to the third binary image sequence, where n_g is the number of connected components after the first filtering;
S4. Compute the stroke width map corresponding to each binary image in the third binary image sequence; according to the stroke width map, filter non-text connected components from the third binary image sequence a second time to obtain the fourth binary image sequence, where n_s is the number of connected components after the second filtering;
S5. Superimpose all binary images in the fourth binary image sequence into one new binary image I_s; the foreground of I_s is the extracted text result.
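Step S5 can be sketched as follows, assuming that "superimposing" means a pixel-wise logical OR of equally sized binary images stored as lists of rows of 0/1 values. The function name `superimpose` is hypothetical; the patent does not prescribe an implementation.

```python
def superimpose(binary_images):
    """Pixel-wise logical OR of equally sized binary images (rows of 0/1).

    Assumption: "superimposing" in step S5 is read as a logical OR.
    """
    if not binary_images:
        raise ValueError("need at least one binary image")
    height, width = len(binary_images[0]), len(binary_images[0][0])
    result = [[0] * width for _ in range(height)]
    for image in binary_images:
        for y in range(height):
            for x in range(width):
                if image[y][x]:
                    result[y][x] = 1  # foreground in any input stays foreground
    return result
```

For example, two images that each hold one character's pixels combine into a single image whose foreground contains both characters.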
The specific steps of clustering colors with the k-means clustering algorithm in step S1 are as follows:
11) Extract the image I_l corresponding to the color image I on the lightness channel L of the HSL color space, and preset a brightness threshold tr_c, the number of clusters k_E for color clustering based on Euclidean distance, and the number of clusters k_C for color clustering based on cosine similarity;
12) Determine whether I_l > tr_c. If so, perform k-means clustering on the color image I in the RGB color space using cosine similarity, regard each cluster as a foreground image, and obtain k_C binary images; otherwise, perform k-means clustering on the color image I in the RGB color space using Euclidean distance, and obtain k_E binary images.
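A minimal sketch of steps 11) and 12) follows. Several points are assumptions that the patent does not state: the lightness image I_l is reduced to its mean before comparison with tr_c, pixels are given as a flat list of (r, g, b) tuples with channels in [0, 1], cluster centers are seeded deterministically and updated as plain means even under cosine distance, and all function names are hypothetical.

```python
import math

def lightness(rgb):
    """HSL lightness of an (r, g, b) pixel with channels in [0, 1]."""
    return (max(rgb) + min(rgb)) / 2.0

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 - cosine similarity; zero vectors are treated as maximally distant."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)

def choose_metric(pixels, tr_c=0.9, k_e=3, k_c=3):
    """Pick (distance function, cluster count) from the mean lightness.

    Assumption: I_l > tr_c is evaluated on the mean of the lightness image.
    """
    mean_l = sum(lightness(p) for p in pixels) / len(pixels)
    if mean_l > tr_c:
        return cosine_distance, k_c
    return euclidean_distance, k_e

def k_means(pixels, k, distance, iterations=20):
    """Plain k-means with deterministic seeding; returns one label per pixel."""
    centers = [pixels[i * len(pixels) // k] for i in range(k)]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: distance(p, centers[c]))
                  for p in pixels]
        for c in range(k):
            members = [p for p, l in zip(pixels, labels) if l == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return labels

def cluster_masks(pixels, tr_c=0.9, k_e=3, k_c=3):
    """One flat binary mask per cluster, each cluster treated as foreground."""
    distance, k = choose_metric(pixels, tr_c, k_e, k_c)
    labels = k_means(pixels, k, distance)
    return [[1 if l == c else 0 for l in labels] for c in range(k)]
```

Each returned mask is flat, in the same order as the input pixels; reshaping it to the image dimensions gives one binary image per cluster, from which connected components can then be extracted.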
The specific steps of step S2 are as follows:
21) Convert the color image I to grayscale with the weighted average method to obtain the grayscale image I_g;
22) Apply an edge detection operator to the grayscale image I_g to obtain the edge binary image I_e1;
23) Connect broken strokes in the edge binary image I_e1 within a neighborhood; that is, join the break points of the edge image I_e1 using the 8-neighborhood pixel connection operation of binary image morphology, obtaining the edge binary image I_e2;
24) Extract the connected components in the edge binary image I_e2; if a connected component is a closed region, fill it into a solid connected component; regard each connected component as a foreground image and obtain the binary image corresponding to each connected component, forming the second binary image sequence.
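The connected-component extraction used in step 24) (and likewise in step S1) can be sketched as a flood fill that splits one binary image into one mask per component. This sketch assumes 8-connectivity and images stored as lists of rows of 0/1 values; edge detection, break-point joining, and hole filling are not shown, and the function name is hypothetical.

```python
from collections import deque

def connected_component_masks(binary):
    """Split a binary image (rows of 0/1) into one mask per 8-connected component."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    masks = []
    for y0 in range(height):
        for x0 in range(width):
            if not binary[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            mask = [[0] * width for _ in range(height)]
            queue = deque([(y0, x0)])
            seen[y0][x0] = True
            while queue:
                y, x = queue.popleft()
                mask[y][x] = 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            masks.append(mask)
    return masks
```

The returned list plays the role of a binary image sequence: each element has the size of the input image and contains exactly one connected component as foreground.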
In step 21), the color image I is converted to grayscale with the weighted average method to obtain the grayscale image I_g; the gray value of each point in the color image I is computed as:
Gray = 0.2989 × I_R + 0.587 × I_G + 0.114 × I_B,
where I_R, I_G, and I_B are the pixel values of that point in the three channels of the color image I, and Gray is the gray value of that point after conversion.
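The weighted average above translates directly into code. The sketch below assumes pixels are (R, G, B) tuples and images are lists of rows; the function names are hypothetical.

```python
def to_gray(pixel):
    """Weighted-average gray value of one (R, G, B) pixel, as in step 21)."""
    r, g, b = pixel
    return 0.2989 * r + 0.587 * g + 0.114 * b

def gray_image(image):
    """Apply the weighted average to every pixel of an image (rows of RGB tuples)."""
    return [[to_gray(pixel) for pixel in row] for row in image]
```

The three weights sum to 0.9999, so a pure white pixel maps to a value just under the channel maximum.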
The edge detection operator in step 22) is the Canny edge detection operator.
The specific steps of the first filtering of non-text connected components from the merged binary image sequence with the geometric filter in step S3 are as follows:
31) Set the size s_Ih × s_Iw of the image I, the lower limit s_h × s_w on the size of the bounding rectangle of a connected component of a binary image, the maximum size ratio r_I, the lower limit r_b and the upper limit r_t on the aspect ratio, and the limit n_htr on the number of holes contained in a connected component;
32) Determine whether each binary image in the merged sequence satisfies any one of the geometric filter rules; if so, delete the current binary image from the merged binary image sequence. The geometric filter consists of four rules:
R1. Exclude connected components that are too small: if the size of the minimum bounding rectangle of the connected component of the current binary image I_i is smaller than the lower size limit s_h × s_w, the connected component is considered a non-text region;
R2. Exclude connected components that are too large: if the size of the minimum bounding rectangle of the connected component of the current binary image I_i is larger than r_I × s_Ih × s_Iw, that is, the maximum size ratio times the size of the image I, the connected component is considered a non-text region;
R3. Exclude connected components that are too elongated or too narrow: if the aspect ratio of the minimum bounding rectangle of the connected component of the current binary image I_i is less than r_b or greater than r_t, the connected component is considered a non-text region;
R4. Exclude connected components that contain too many holes: if the number of holes in the connected component of the current binary image I_i exceeds n_htr, the connected component is considered a non-text region.
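Rules R1 to R4 can be sketched as a single predicate. The following choices are assumptions, since the patent leaves them open: the bounding rectangle is axis-aligned, "size" is compared as area (height × width), the aspect ratio is height divided by width, and the hole count is supplied by the caller. The function names are hypothetical.

```python
def bounding_box_size(mask):
    """Height and width of the axis-aligned bounding box of the foreground."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    if not rows:
        return 0, 0
    return rows[-1] - rows[0] + 1, cols[-1] - cols[0] + 1

def is_non_text(mask, hole_count, s_h, s_w, r_i, r_b, r_t, n_htr):
    """True if the connected component in `mask` matches any of rules R1 to R4."""
    image_h, image_w = len(mask), len(mask[0])
    box_h, box_w = bounding_box_size(mask)
    if box_h == 0:
        return True                          # empty mask: nothing to keep
    box_area = box_h * box_w                 # assumption: size compared as area
    if box_area < s_h * s_w:                 # R1: too small
        return True
    if box_area > r_i * image_h * image_w:   # R2: too large
        return True
    aspect = box_h / box_w                   # assumption: height / width
    if aspect < r_b or aspect > r_t:         # R3: too elongated or too narrow
        return True
    return hole_count > n_htr                # R4: too many holes
```

Applying the predicate to every image of the merged sequence and keeping those for which it returns False yields the third binary image sequence.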
The specific steps of step S4 are as follows:
41) For the current binary image, compute the shortest distance d_pj from each pixel j in the foreground of the connected component to the edge points p of the connected component, and label pixel j with its nearest edge point p as j_p;
42) Among the foreground pixels j_p that share the same nearest edge point p, take the largest of their shortest distances d_pj to p, d_pj-max = max(d_pj), as the stroke width of those foreground pixels; replace each pixel j_p with d_pj-max to obtain the stroke width map of the current binary image;
43) According to the current stroke width map, compute the stroke standard deviation rate R of the connected component of the current binary image:
$$R = \frac{\sqrt{\frac{1}{n_i}\sum_{j=1}^{n_i}\left(d_{pj\text{-}max} - \frac{1}{n_i}\sum_{j=1}^{n_i} d_{pj\text{-}max}\right)^{2}}}{\frac{1}{n_i}\sum_{j=1}^{n_i} d_{pj\text{-}max}},$$
where n_i is the total number of foreground points of the connected component of the binary image;
44) Determine whether R > tr_r. If so, the stroke width is considered inconsistent and the connected component of the current binary image is a non-text region; delete the current binary image from the third binary image sequence, where tr_r is the preset threshold on the stroke standard deviation rate;
45) Determine whether the current binary image is the last binary image of the third binary image sequence. If not, update i_g = i_g + 1, read the next binary image, and return to step 41); if so, update the third binary image sequence, with the non-text regions excluded, to the fourth binary image sequence and exit the loop, where n_s is the number of connected components.
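Steps 41) to 43) can be sketched with a brute-force nearest-edge search. The following are assumptions not fixed by the patent: an edge point is a foreground pixel with a background or out-of-image 4-neighbor, distances are Euclidean between pixel centers, ties between equally near edge points go to the first one in scan order, and R is reported as 0 when every stroke width is 0. The function names are hypothetical.

```python
import math

def edge_points(mask):
    """Foreground pixels that touch the background or the image border."""
    h, w = len(mask), len(mask[0])
    points = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            neighbours = ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
            if any(not (0 <= ny < h and 0 <= nx < w) or not mask[ny][nx]
                   for ny, nx in neighbours):
                points.append((y, x))
    return points

def stroke_width_map(mask):
    """Map each foreground pixel j_p to d_pj-max of its nearest edge point p."""
    edges = edge_points(mask)
    nearest = {}  # foreground pixel -> (nearest edge point p, distance d_pj)
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                p = min(edges, key=lambda e: (e[0] - y) ** 2 + (e[1] - x) ** 2)
                nearest[(y, x)] = (p, math.hypot(p[0] - y, p[1] - x))
    widest = {}   # edge point p -> largest d_pj among pixels labelled with p
    for p, d in nearest.values():
        widest[p] = max(widest.get(p, 0.0), d)
    return {pixel: widest[p] for pixel, (p, _) in nearest.items()}

def stroke_std_rate(width_map):
    """Standard deviation of the stroke widths divided by their mean (R)."""
    values = list(width_map.values())
    if not values:
        raise ValueError("connected component has no foreground pixels")
    mean = sum(values) / len(values)
    if mean == 0.0:
        return 0.0  # assumption: all-zero widths count as perfectly uniform
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(variance) / mean
```

A component would then be deleted in step 44) when `stroke_std_rate(stroke_width_map(mask))` exceeds the preset threshold tr_r. The brute-force search is quadratic in the number of pixels; a distance transform would be the usual faster substitute.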
Compared with the prior art, the invention adaptively selects the distance measure in the color clustering algorithm according to the image brightness, and can therefore better handle degraded images such as unevenly illuminated ones. In addition, it improves the traditional stroke width computation: among the foreground pixels that share the same nearest edge point, the largest of their shortest distances to that edge point is taken as their stroke width. This allows the stroke width of a connected component to be computed more accurately and thereby improves the performance of text extraction.
Brief description of the drawings
Fig. 1 is a flow chart of the invention;
Fig. 2 shows the stroke width map of the embodiment of the invention;
In Fig. 2: (a) edge map of the connected component; (b) shortest-distance map;
Fig. 3 is a schematic diagram of the text extraction result of the embodiment of the invention.
Detailed description of the embodiments
The invention is described in detail below with reference to the drawings and a specific embodiment. The following is only a preferred embodiment of the invention, given to illustrate it and not to limit the invention or its applications or uses. Other embodiments derived from the invention likewise fall within its scope of technical innovation, and the parameter settings in the scheme do not mean that only the example values may be used.
Embodiment:
An unevenly illuminated, low-quality image I containing English characters is selected. The color feature space is RGB, the brightness threshold is set to tr_c = 0.9, the number of clusters for color clustering based on Euclidean distance is k_E = 3, and the number of clusters for color clustering based on cosine similarity is k_C = 3.
As shown in Figs. 1 to 3, a method for extracting text data from an image based on a stroke width map is characterized in that it comprises the following steps:
S1. Read in a color image I, cluster its colors with a k-means clustering algorithm, extract the connected components from the clustered image, and obtain the binary image corresponding to each connected component, forming the first binary image sequence, where n_c is the number of connected components;
The specific steps of clustering colors with the k-means clustering algorithm in step S1 are as follows:
11) Extract the image I_l corresponding to the color image I on the lightness channel L of the HSL color space, and preset a brightness threshold tr_c, the number of clusters k_E for color clustering based on Euclidean distance, and the number of clusters k_C for color clustering based on cosine similarity;
12) Determine whether I_l > tr_c. If so, perform k-means clustering on the color image I in the RGB color space using cosine similarity, regard each cluster as a foreground image, and obtain k_C binary images; otherwise, perform k-means clustering on the color image I in the RGB color space using Euclidean distance, and obtain k_E binary images.
S2. Read in the color image I, perform edge extraction on it using an edge detection algorithm and morphological connected-component analysis, extract the connected components from the edge-extracted image, and obtain the binary image corresponding to each connected component, forming the second binary image sequence, where n_e is the number of connected components;
The specific steps of step S2 are as follows:
21) Convert the color image I to grayscale with the weighted average method to obtain the grayscale image I_g;
The gray value of each point in the color image I is computed as:
Gray = 0.2989 × I_R + 0.587 × I_G + 0.114 × I_B,
where I_R, I_G, and I_B are the pixel values of that point in the three channels of the color image I, and Gray is the gray value of that point after conversion.
22) Apply the Canny edge detection operator to the grayscale image I_g to obtain the edge binary image I_e1;
23) Connect broken strokes in the edge binary image I_e1 within a neighborhood; that is, join the break points of the edge image I_e1 using the 8-neighborhood pixel connection operation of binary image morphology, obtaining the edge binary image I_e2;
24) Extract the connected components in the edge binary image I_e2; if a connected component is a closed region, fill it into a solid connected component; regard each connected component as a foreground image and obtain the binary image corresponding to each connected component, forming the second binary image sequence.
S3. Merge the first binary image sequence and the second binary image sequence, filter non-text connected components from the merged binary image sequence a first time with a geometric filter, and update the filtered sequence to the third binary image sequence, where n_g is the number of connected components after the first filtering;
The specific steps of the first filtering of non-text connected components from the merged binary image sequence with the geometric filter are as follows:
31) Set the size s_Ih × s_Iw of the image I, the lower limit s_h × s_w on the size of the bounding rectangle of a connected component of a binary image, the maximum size ratio r_I, the lower limit r_b and the upper limit r_t on the aspect ratio, and the limit n_htr on the number of holes contained in a connected component;
32) Determine whether each binary image in the merged sequence satisfies any one of the geometric filter rules; if so, delete the current binary image from the merged binary image sequence. The geometric filter consists of four rules:
R1. Exclude connected components that are too small: if the size of the minimum bounding rectangle of the connected component of the current binary image I_i is smaller than the lower size limit s_h × s_w, the connected component is considered a non-text region;
R2. Exclude connected components that are too large: if the size of the minimum bounding rectangle of the connected component of the current binary image I_i is larger than r_I × s_Ih × s_Iw, that is, the maximum size ratio times the size of the image I, the connected component is considered a non-text region;
R3. Exclude connected components that are too elongated or too narrow: if the aspect ratio of the minimum bounding rectangle of the connected component of the current binary image I_i is less than r_b or greater than r_t, the connected component is considered a non-text region;
R4. Exclude connected components that contain too many holes: if the number of holes in the connected component of the current binary image I_i exceeds n_htr, the connected component is considered a non-text region.
S4. Compute the stroke width map corresponding to each binary image in the third binary image sequence; according to the stroke width map, filter non-text connected components from the third binary image sequence a second time to obtain the fourth binary image sequence, where n_s is the number of connected components after the second filtering;
The specific steps of step S4 are as follows:
41) For the current binary image, compute the shortest distance d_pj from each pixel j in the foreground of the connected component to the edge points p of the connected component, and label pixel j with its nearest edge point p as j_p, as shown in Fig. 2(a);
42) Among the foreground pixels j_p that share the same nearest edge point p, take the largest of their shortest distances d_pj to p, d_pj-max = max(d_pj), as the stroke width of those foreground pixels; replace each pixel j_p with d_pj-max to obtain the stroke width map of the current binary image, as shown in Fig. 2(b);
43) According to the current stroke width map, compute the stroke standard deviation rate R of the connected component of the current binary image:
$$R = \frac{\sqrt{\frac{1}{n_i}\sum_{j=1}^{n_i}\left(d_{pj\text{-}max} - \frac{1}{n_i}\sum_{j=1}^{n_i} d_{pj\text{-}max}\right)^{2}}}{\frac{1}{n_i}\sum_{j=1}^{n_i} d_{pj\text{-}max}},$$
where n_i is the total number of foreground points of the connected component of the binary image;
44) Determine whether R > tr_r. If so, the stroke width is considered inconsistent and the connected component of the current binary image is a non-text region; delete the current binary image from the third binary image sequence, where tr_r is the preset threshold on the stroke standard deviation rate;
45) Determine whether the current binary image is the last binary image of the third binary image sequence. If not, update i_g = i_g + 1, read the next binary image, and return to step 41); if so, update the third binary image sequence, with the non-text regions excluded, to the fourth binary image sequence and exit the loop, where n_s is the number of connected components.
S5. Superimpose all binary images in the fourth binary image sequence into one new binary image I_s; the foreground of I_s is the extracted text result, as shown in Fig. 3.

Claims (7)

1. A method for extracting text data from an image based on a stroke width map, characterized in that it comprises the following steps:
S1. Read in a color image I, cluster its colors with a k-means clustering algorithm, extract the connected components from the clustered image, and obtain the binary image corresponding to each connected component, forming the first binary image sequence, indexed by i_c = 1, …, n_c, where n_c is the number of connected components;
S2. Read in the color image I, perform edge extraction on it using an edge detection algorithm and morphological connected-component analysis, extract the connected components from the edge-extracted image, and obtain the binary image corresponding to each connected component, forming the second binary image sequence, indexed by i_e = 1, …, n_e, where n_e is the number of connected components;
S3. Merge the first binary image sequence and the second binary image sequence, filter non-text connected components from the merged binary image sequence a first time with a geometric filter, and update the filtered sequence to the third binary image sequence, indexed by i_g = 1, …, n_g, where n_g is the number of connected components after the first filtering;
S4. Compute the stroke width map corresponding to each binary image in the third binary image sequence; according to the stroke width map, filter non-text connected components from the third binary image sequence a second time to obtain the fourth binary image sequence, indexed by i_s = 1, …, n_s, where n_s is the number of connected components after the second filtering;
S5. Superimpose all binary images in the fourth binary image sequence into one new binary image I_s; the foreground of I_s is the extracted text result.
2. The method for extracting text data from an image based on a stroke width map according to claim 1, characterized in that the specific steps of clustering colors with the k-means clustering algorithm in step S1 are as follows:
11) Extract the image I_l corresponding to the color image I on the lightness channel L of the HSL color space, and preset a brightness threshold tr_c, the number of clusters k_E for color clustering based on Euclidean distance, and the number of clusters k_C for color clustering based on cosine similarity;
12) Determine whether I_l > tr_c. If so, perform k-means clustering on the color image I in the RGB color space using cosine similarity, regard each cluster as a foreground image, and obtain k_C binary images; otherwise, perform k-means clustering on the color image I in the RGB color space using Euclidean distance, and obtain k_E binary images.
3. The method for extracting text data from an image based on a stroke width map according to claim 1, characterized in that the specific steps of step S2 are as follows:
21) Convert the color image I to grayscale with the weighted average method to obtain the grayscale image I_g;
22) Apply an edge detection operator to the grayscale image I_g to obtain the edge binary image I_e1;
23) Connect broken strokes in the edge binary image I_e1 within a neighborhood; that is, join the break points of the edge image I_e1 using the 8-neighborhood pixel connection operation of binary image morphology, obtaining the edge binary image I_e2;
24) Extract the connected components in the edge binary image I_e2; if a connected component is a closed region, fill it into a solid connected component; regard each connected component as a foreground image and obtain the binary image corresponding to each connected component, forming the second binary image sequence.
4. The method for extracting text data from an image based on a stroke width map according to claim 3, characterized in that, in step 21), the color image I is converted to grayscale with the weighted average method to obtain the grayscale image I_g, and the gray value of each point in the color image I is computed as:
Gray = 0.2989 × I_R + 0.587 × I_G + 0.114 × I_B,
where I_R, I_G, and I_B are the pixel values of that point in the three channels of the color image I, and Gray is the gray value of that point after conversion.
5. The method for extracting text data from an image based on a stroke width map according to claim 3, characterized in that the edge detection operator in step 22) is the Canny edge detection operator.
6. The method for extracting text data from an image based on a stroke width map according to claim 1, characterized in that the specific steps of the first filtering of non-text connected components from the merged binary image sequence with the geometric filter in step S3 are as follows:
31) Set the size s_Ih × s_Iw of the image I, the lower limit s_h × s_w on the size of the bounding rectangle of a connected component of a binary image, the maximum size ratio r_I, the lower limit r_b and the upper limit r_t on the aspect ratio, and the limit n_htr on the number of holes contained in a connected component;
32) Determine whether each binary image in the merged sequence satisfies any one of the geometric filter rules; if so, delete the current binary image from the merged binary image sequence. The geometric filter consists of four rules:
R1. Exclude connected components that are too small: if the size of the minimum bounding rectangle of the connected component of the current binary image I_i is smaller than the lower size limit s_h × s_w, the connected component is considered a non-text region;
R2. Exclude connected components that are too large: if the size of the minimum bounding rectangle of the connected component of the current binary image I_i is larger than r_I × s_Ih × s_Iw, that is, the maximum size ratio times the size of the image I, the connected component is considered a non-text region;
R3. Exclude connected components that are too elongated or too narrow: if the aspect ratio of the minimum bounding rectangle of the connected component of the current binary image I_i is less than r_b or greater than r_t, the connected component is considered a non-text region;
R4. Exclude connected components that contain too many holes: if the number of holes in the connected component of the current binary image I_i exceeds n_htr, the connected component is considered a non-text region.
7. lteral data extracting method in a kind of image based on stroke width figure according to claim 1, its feature exist In, implementation steps S4 is comprised the following steps that,
41) current bianry image is calculatedEach pixel j to connected domain marginal point p beeline d in connected domain prospectpj, And mark pixel j with marginal point pp
42) in the foreground pixel point j with identical marginal point p recentlypTo marginal point p beeline dpjIn choose ultimate range dpj-max=max (dpj) it is used as foreground pixel point jpStroke width, use dpj-maxReplacement pixels point jp, obtain current bianry image Corresponding stroke width figure
43) according to current stroke width figureCalculate current bianry imageThe stroke standard deviation rate R of connected domain:
<mrow> <mi>R</mi> <mo>=</mo> <mfrac> <msqrt> <mrow> <mfrac> <mn>1</mn> <msub> <mi>n</mi> <mi>i</mi> </msub> </mfrac> <msubsup> <mi>&amp;Sigma;</mi> <mrow> <mi>j</mi> <mo>=</mo> <mn>1</mn> </mrow> <msub> <mi>n</mi> <mi>i</mi> </msub> </msubsup> <msup> <mrow> <mo>(</mo> <msub> <mi>d</mi> <mrow> <mi>p</mi> <mi>j</mi> <mo>-</mo> <mi>m</mi> <mi>a</mi> <mi>x</mi> </mrow> </msub> <mo>-</mo> <mfrac> <mn>1</mn> <msub> <mi>n</mi> <mi>i</mi> </msub> </mfrac> <msubsup> <mi>&amp;Sigma;</mi> <mrow> <mi>j</mi> <mo>=</mo> <mn>1</mn> </mrow> <msub> <mi>n</mi> <mi>i</mi> </msub> </msubsup> <msub> <mi>d</mi> <mrow> <mi>p</mi> <mi>j</mi> <mo>-</mo> <mi>m</mi> <mi>a</mi> <mi>x</mi> </mrow> </msub> <mo>)</mo> </mrow> <mn>2</mn> </msup> </mrow> </msqrt> <mrow> <mfrac> <mn>1</mn> <msub> <mi>n</mi> <mi>i</mi> </msub> </mfrac> <msubsup> <mi>&amp;Sigma;</mi> <mrow> <mi>j</mi> <mo>=</mo> <mn>1</mn> </mrow> <msub> <mi>n</mi> <mi>i</mi> </msub> </msubsup> <msub> <mi>d</mi> <mrow> <mi>p</mi> <mi>j</mi> <mo>-</mo> <mi>max</mi> </mrow> </msub> </mrow> </mfrac> <mo>,</mo> </mrow>
where n_i is the total number of foreground points in the connected domain of the binary image;
44) Judge whether R > tr_r; if so, the stroke width is considered inconsistent, the connected domain of the current binary image is a non-text region, and the current binary image is deleted from the third binary image sequence, where tr_r is a preset stroke standard deviation rate threshold;
45) Judge whether the current binary image is the last binary image of the third binary image sequence; if not, update i_g = i_g + 1, read the next binary image, and return to step 41); if so, update the third binary image sequence, with the non-text regions excluded, to the fourth binary image sequence, i_s = 1, ..., n_s, and exit the loop, where n_s is the number of connected domains.
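Steps 41) to 44) can be sketched as follows. This is a brute-force illustration under stated assumptions, not the patented implementation: an edge point is taken to be a foreground pixel with at least one 4-neighbor that is background or outside the grid, distances are Euclidean, ties for the nearest edge point go to the first one in row-major order, and R is defined as 0 when the mean width is 0. The function names are hypothetical, and the threshold tr_r is supplied by the caller.

```python
import math


def edge_points(mask):
    """Foreground pixels with a 4-neighbor that is background or off-grid."""
    h, w = len(mask), len(mask[0])
    pts = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if not (0 <= ny < h and 0 <= nx < w) or not mask[ny][nx]:
                    pts.append((y, x))
                    break
    return pts


def stroke_width_map(mask):
    """Return {pixel j: d_pj-max} for every foreground pixel of the grid."""
    edges = edge_points(mask)
    nearest = {}  # step 41: pixel j -> (nearest edge point p, distance d_pj)
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                p = min(edges, key=lambda e: (e[0] - y) ** 2 + (e[1] - x) ** 2)
                nearest[(y, x)] = (p, math.hypot(p[0] - y, p[1] - x))
    d_max = {}  # step 42: largest d_pj among pixels sharing the same p
    for p, d in nearest.values():
        d_max[p] = max(d_max.get(p, 0.0), d)
    return {j: d_max[p] for j, (p, _) in nearest.items()}


def stroke_std_rate(widths):
    """Step 43: standard deviation of the widths divided by their mean."""
    values = list(widths.values())
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    if mean == 0:
        return 0.0  # assumption: all-zero widths count as perfectly uniform
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(variance) / mean


def is_non_text_by_stroke(mask, tr_r):
    """Step 44: inconsistent stroke width marks the domain as non-text."""
    return stroke_std_rate(stroke_width_map(mask)) > tr_r
```

The nearest-edge search is quadratic in the number of pixels; a distance transform that also returns the index of the nearest edge point would be the usual replacement for larger images.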

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310534130.9A CN104598907B (en) 2013-10-31 2013-10-31 Literal data extracting method in a kind of image based on stroke width figure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310534130.9A CN104598907B (en) 2013-10-31 2013-10-31 Literal data extracting method in a kind of image based on stroke width figure

Publications (2)

Publication Number Publication Date
CN104598907A CN104598907A (en) 2015-05-06
CN104598907B true CN104598907B (en) 2017-12-05

Family

ID=53124680

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310534130.9A Active CN104598907B (en) 2013-10-31 2013-10-31 Literal data extracting method in a kind of image based on stroke width figure

Country Status (1)

Country Link
CN (1) CN104598907B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106326895B (en) * 2015-06-16 2020-07-07 富士通株式会社 Image processing apparatus, image processing method, and program
CN105374015A (en) * 2015-10-27 2016-03-02 湖北工业大学 Binary method for low-quality document image based on local contract and estimation of stroke width
CN106650761A (en) * 2015-10-29 2017-05-10 富士通株式会社 Apparatus and method for automatically classifying stamp engraving modes
CN107845094B (en) * 2017-11-20 2020-06-19 北京小米移动软件有限公司 Image character detection method and device and computer readable storage medium
CN110197180B (en) * 2019-05-30 2022-03-01 新华三技术有限公司 Character defect detection method, device and equipment
CN113822094B (en) * 2020-06-02 2024-01-16 苏州科瓴精密机械科技有限公司 Method, system, robot and storage medium for identifying working position based on image
CN112836541B (en) * 2021-02-03 2022-06-03 华中师范大学 Automatic acquisition and identification method and device for 32-bit bar code of cigarette
CN114782950B (en) * 2022-03-30 2022-10-21 慧之安信息技术股份有限公司 2D image text detection method based on Chinese character stroke characteristics

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101276461A (en) * 2008-03-07 2008-10-01 北京航空航天大学 Method for increasing video text with edge characteristic
CN101593276A (en) * 2008-05-29 2009-12-02 汉王科技股份有限公司 A kind of video OCR image-text separation method and system
CN102915438A (en) * 2012-08-21 2013-02-06 北京捷成世纪科技股份有限公司 Method and device for extracting video subtitles

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
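Natural scene text detection method based on stroke width transform (《基于笔画宽度变换的自然场景文本检测方法》);Song Wen (宋文);Computer Engineering and Applications (《计算机工程与应用》);20130501;Vol. 49 (No. 9);full text *
Double-edge-model based Character Stroke Extraction from Complex Backgrounds;Jing Yu;《International Conference on Pattern Recognition》;20081211;full text *
Video text segmentation algorithm based on stroke extraction and color model (基于笔画提取和颜色模型的视频文字分割算法);Cheng Hao (程豪);Computer Engineering (《计算机工程》);20090228;Vol. 35 (No. 4);full text *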

Also Published As

Publication number Publication date
CN104598907A (en) 2015-05-06

Similar Documents

Publication Publication Date Title
CN104598907B (en) Literal data extracting method in a kind of image based on stroke width figure
CN107578035B (en) Human body contour extraction method based on super-pixel-multi-color space
CN105469113B (en) A kind of skeleton point tracking method and system in two-dimensional video stream
CN104134234B (en) A kind of full automatic three-dimensional scene construction method based on single image
CN106875546B (en) A kind of recognition methods of VAT invoice
CN103927016B (en) Real-time three-dimensional double-hand gesture recognition method and system based on binocular vision
CN105931295B (en) A kind of geologic map Extracting Thematic Information method
CN105205488B (en) Word area detection method based on Harris angle points and stroke width
CN102663354B (en) Face calibration method and system thereof
CN1312625C (en) Character extracting method from complecate background color image based on run-length adjacent map
CN102663382B (en) Video image character recognition method based on submesh characteristic adaptive weighting
CN106846339A (en) A kind of image detecting method and device
CN103473551A (en) Station logo recognition method and system based on SIFT operators
CN104463138B (en) The text positioning method and system of view-based access control model structure attribute
CN105740945A (en) People counting method based on video analysis
CN106228157B (en) Coloured image word paragraph segmentation and recognition methods based on image recognition technology
WO2021098163A1 (en) Corner-based aerial target detection method
WO2020038312A1 (en) Multi-channel tongue body edge detection device and method, and storage medium
CN104021566A (en) GrabCut algorithm-based automatic segmentation method of tongue diagnosis image
CN108196729A (en) A kind of finger tip point rapid detection method based on infrared video
CN106127817A (en) A kind of image binaryzation method based on passage
CN103824078B (en) The many license plate locating methods of complex scene
CN107992856A (en) High score remote sensing building effects detection method under City scenarios
CN108038458B (en) Method for automatically acquiring outdoor scene text in video based on characteristic abstract diagram
CN109948461A (en) A kind of sign language image partition method based on center coordination and range conversion

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant