CN110532537A - Multi-stage text segmentation method based on binary thresholding and projection - Google Patents
- Publication number
- CN110532537A (application CN201910763993.0A)
- Authority
- CN
- China
- Prior art keywords
- text
- scheme
- unit
- character
- width
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/136—Segmentation; Edge detection involving thresholding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Character Input (AREA)
Abstract
The invention discloses a multi-stage text segmentation method based on binary thresholding and projection. The method first detects part of the characters by binary thresholding, then locates the characters precisely by projection, and finally searches the residual image for characters that are less conspicuous and easily missed. Large characters are processed first and small characters afterwards; once part of the text has been processed in an iteration, the detected characters are erased from the image, which simplifies subsequent processing. By combining the strengths of binary thresholding and projection, the invention can accurately segment layouts in which large and small characters are mixed across multiple rows.
Description
Technical field
The present invention relates to the technical field of character recognition, and in particular to a multi-stage text segmentation method based on binary thresholding and projection.
Background art
In typeset text, large and small characters are often mixed across multiple rows, especially when mathematical formulas are laid out. Text segmentation usually relies on binary thresholding and projection. Binary thresholding decomposes the image into discrete foreground and background, but a good threshold is hard to choose, which often leads to errors such as several characters sticking together or a single character breaking into several pieces. Projection cannot separate text that spans multiple rows.
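As an illustration of the thresholding step mentioned above, the following is a minimal sketch of Otsu's method (maximum between-class variance) in plain Python. The function names `otsu_threshold` and `binarize` are hypothetical, and the sketch assumes 8-bit grey levels in which brighter pixels are foreground; for dark ink on light paper the image would need to be inverted first.

```python
def otsu_threshold(pixels):
    """Return the grey level t (0..255) that maximizes between-class variance.

    `pixels` is a flat iterable of integer grey levels in 0..255.
    Pixels <= t form the low class, pixels > t the high class.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the low class
    sum0 = 0  # grey-level sum of the low class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance up to a constant factor 1/total**2.
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image, threshold):
    """Map a 2-D list of grey levels to 1 (foreground) or 0 (background)."""
    return [[1 if p > threshold else 0 for p in row] for row in image]
```

A single global threshold of this kind is exactly what tends to merge touching characters or split one character into pieces, which is the weakness the later projection stages are meant to compensate for.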
Summary of the invention
To solve the above problems, the present invention provides a multi-stage text segmentation method based on binary thresholding and projection. It first detects text by thresholding, then processes large characters, and finally processes small characters. By combining the strengths of binary thresholding and projection, it can accurately segment layouts in which large and small characters are mixed across multiple rows.
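The "large characters first, small characters afterwards" ordering depends on filtering candidate regions and grouping them by height (steps S2 and S3 of the method). The sketch below is only an illustration under stated assumptions: the `Box` type, the numeric limits in `is_candidate`, and the `eps` gap are all hypothetical, and the grouping is a simplified stand-in for density-based clustering (on a single feature, splitting the sorted heights wherever neighbours differ by more than `eps` is what DBSCAN with a minimum cluster size of 1 would produce).

```python
from dataclasses import dataclass


@dataclass
class Box:
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive
    area: int    # number of foreground pixels inside the box

    @property
    def width(self):
        return self.right - self.left

    @property
    def height(self):
        return self.bottom - self.top


def is_candidate(b, min_side=4, max_side=200, max_aspect=4.0, min_fill=0.1):
    """Keep regions whose size, aspect ratio and fill ratio look like text."""
    if not (min_side <= b.width <= max_side and min_side <= b.height <= max_side):
        return False
    aspect = max(b.width, b.height) / min(b.width, b.height)
    fill = b.area / (b.width * b.height)
    return aspect <= max_aspect and fill >= min_fill


def cluster_by_height(boxes, eps=3):
    """Sort by height (descending) and start a new class at every gap > eps."""
    ordered = sorted(boxes, key=lambda b: b.height, reverse=True)
    classes = []
    for b in ordered:
        if classes and classes[-1][-1].height - b.height <= eps:
            classes[-1].append(b)
        else:
            classes.append([b])
    return classes
```

Because the classes come out ordered from tallest to shortest, iterating over them in order gives the large-to-small processing sequence.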
To achieve the above object, the invention adopts the following technical solution:
A multi-stage text segmentation method based on binary thresholding and projection, comprising the following steps:
S1. Compute the binarization threshold of the image using the Otsu method (maximum between-class variance) and convert the image into a binary (black-and-white) image, in which white is the foreground text and black is the background;
S2. Take the foreground regions whose size, aspect ratio and fill ratio fall within the plausible range for text as candidate characters and add them to the character set T;
S3. Sort all candidate characters by height in descending order and group them into K classes using a density-based clustering algorithm;
S4. Process all candidate characters in that descending order according to the following steps:
S4.1. For the current character T_i, take its top position U_p and bottom position D_own as the top and bottom of the current temporary row. In the class K_j to which T_i belongs, find all characters whose centroid is close to that of T_i and whose top and bottom positions both lie between U_p and D_own, and add them to the character set N, i.e. [formula not reproduced]
where N_p is any character in the set N, K_j is the j-th character class, U_Np and D_Np are the top and bottom positions of character N_p, M_Np and M_Ti are the centroids of characters N_p and T_i, and Th0 and Th1 are the tolerance thresholds for the top/bottom positions and the centroid, respectively. Record the minimum character width W_min in the set N;
S4.2. In the other classes, find all characters M that lie between U_Np and D_Np, i.e. [formula not reproduced]
where M_q is any character in the set M, and U_Mq and D_Mq are the top and bottom positions of character M_q;
S4.3. Accumulate all pixels of the image between U_p and D_own along the vertical direction to obtain a horizontal projection profile;
S4.4. After discarding the zero-valued data at the left and right ends of the projection, find the maximum S_max and minimum S_min of the projection over the part that contains characters;
S4.5. Shrink the horizontal extent (L_eft, R_ight) of every character in the set N by one pixel on each side to (L_eft+1, R_ight−1), set all projection values within (L_eft+1, R_ight−1) to S_max, and set the projection values at the positions of all characters in the set M to S_min;
S4.6. In the projection, with the zero-valued data at both ends excluded, find all minima at positions that contain characters;
S4.7. Treat the region around each minimum as a valley region and find its left and right boundaries; the regions between valleys are peak regions. Estimate the likelihood that each valley is an inter-character gap and that each peak is a text unit. Store peak regions whose likelihood exceeds an empirical threshold in the text unit array and the valley regions in the inter-character gap array; merge peak regions whose likelihood is below the empirical threshold into the valley regions on their left and right;
S4.8. Compute the ratio of each text unit's width to the average width of the characters found in step S4.1. Text units wider than a preset multiple of the average character width, and units whose inter-character gap exceeds the maximum character width, are accepted directly as detected characters; all other units are treated as uncertain units. A run of consecutive detected characters with no uncertain unit between them forms a text region; compute the average character width W_c and the average inter-character gap W_b of each text region;
S4.9. Treat each run of consecutive uncertain units as an uncertain region. The L uncertain units U_i, together with the detected character immediately before the region and the one immediately after it, give L+2 units in total, which form the uncertain unit set U. Build an (L+2)×(L+2) matrix from these L+2 units. A point (U_h, U_e) in the matrix (U_h ≤ U_e, e−h ≤ 4) represents the hypothesis that the span starting at the left edge of unit U_h and ending at the right edge of unit U_e forms one character, and its value P_he is the character-formation cost of that span:
$$P_{he} = \lambda_1 \frac{W_{he} - W_c}{W_{he} + W_c} + \lambda_2 \frac{W_{hb} - W_b}{W_{hb} + W_b} + \lambda_3 \frac{W_{eb} - W_b}{W_{he} + W_b}$$
where λ1 to λ3 are weighting coefficients; W_he is the width spanned by units U_h to U_e, i.e. the distance from the left edge of U_h to the right edge of U_e; W_hb is the width of the gap to the left of U_h; and W_eb is the width of the gap to the right of U_e;
Normalize the character-formation costs in the matrix by dividing them by the maximum cost in the matrix, then run dynamic programming within the band of width 4 of the upper-right triangular matrix to find the optimal scheme. The optimal scheme minimizes the average character-formation cost together with the variance of the character widths and the variance of the inter-character gap widths, as follows:
$$Cost = \lambda_4 \, \mathrm{mean}(P_{he}) + \lambda_5 \, \delta_{Wt} + \lambda_6 \, \delta_{Wb}$$
where λ4 to λ6 are weighting coefficients, mean(P_he) is the average character-formation cost of all points in the scheme, δ_Wt is the variance of all character widths in the scheme, and δ_Wb is the variance of all inter-character gap widths in the scheme;
S5. Check whether any other text remains in the image. If a set L of remaining characters exists, process it as follows:
S5.1. Take a character T_l from the set L and decide from its height whether it belongs to an existing character class. If it does, place T_l in that class; if it belongs to no existing class, increase the number of classes by 1 and place T_l in the new class;
S5.2. Repeat step S5.1 for every character in the set L until all have been processed.
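Steps S4.3 to S4.5 can be pictured with a short sketch. It is an illustration only: the function names are hypothetical, character extents are assumed to be half-open column ranges `(left, right)`, and "the part that contains characters" is interpreted here as the span left after trimming the zero columns at both ends.

```python
def column_profile(binary, top, bottom):
    """S4.3: sum the foreground pixels of rows top..bottom-1 in every column."""
    width = len(binary[0])
    return [sum(binary[y][x] for y in range(top, bottom)) for x in range(width)]


def trim_zero_ends(profile):
    """Return (start, end) of the span left after dropping zero columns at both ends."""
    start, end = 0, len(profile)
    while start < end and profile[start] == 0:
        start += 1
    while end > start and profile[end - 1] == 0:
        end -= 1
    return start, end


def mask_known_characters(profile, same_class, other_class):
    """S4.4-S4.5: flatten already-known characters in the profile.

    same_class / other_class are lists of (left, right) column ranges,
    right exclusive. Assumes the profile has at least one non-zero column.
    """
    start, end = trim_zero_ends(profile)
    s_max = max(profile[start:end])
    s_min = min(profile[start:end])
    out = list(profile)
    for left, right in same_class:
        for x in range(left + 1, right - 1):  # shrink by one pixel per side
            out[x] = s_max
    for left, right in other_class:
        for x in range(left, right):
            out[x] = s_min
    return out
```

Raising known same-class characters to S_max and lowering other-class characters to S_min means the minima found afterwards can only come from positions whose status is still undecided.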
Further, in step S4.7, a peak region is required to be wider than W_min, and its average height must exceed the average height of the valley regions by at least H_tmin.
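The peak criterion above can be sketched as follows. This is a simplification, not the procedure of step S4.7 itself: instead of locating local minima and growing valley regions around them, the sketch treats every maximal run of columns at or below an assumed `valley_level` as a valley, and it simply drops rejected peaks rather than merging them into the neighbouring valleys. All names are hypothetical.

```python
def split_runs(profile, valley_level):
    """Split a profile into maximal runs of (kind, start, end), end exclusive."""
    runs = []
    for x, value in enumerate(profile):
        kind = "valley" if value <= valley_level else "peak"
        if runs and runs[-1][0] == kind:
            runs[-1] = (kind, runs[-1][1], x + 1)
        else:
            runs.append((kind, x, x + 1))
    return runs


def mean(values):
    return sum(values) / len(values)


def accept_peaks(profile, runs, w_min, h_tmin):
    """Keep a peak as a text unit if it is wider than w_min and its mean
    height exceeds the mean height of its neighbouring valleys by h_tmin."""
    units = []
    for i, (kind, start, end) in enumerate(runs):
        if kind != "peak":
            continue
        before = runs[i - 1] if i > 0 else None
        after = runs[i + 1] if i + 1 < len(runs) else None
        neighbours = [r for r in (before, after) if r is not None]
        valley_values = [v for _, s, e in neighbours for v in profile[s:e]]
        valley_mean = mean(valley_values) if valley_values else 0.0
        if end - start > w_min and mean(profile[start:end]) - valley_mean >= h_tmin:
            units.append((start, end))
    return units
```

With `w_min` taken from step S4.1, a one-column speck between two characters fails the width test and is not reported as a text unit.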
Further, the dynamic programming proceeds as follows:
(1) Seed generation: take the 4 points of the first row as 4 seeds, i.e. 4 schemes;
(2) Scheme growth: grow every scheme downward; when growing from point (U_h, U_e), add one of the 4 points of row U_{e+1} to the scheme. Each of the n schemes has 4 possible choices when growing, so one growth step turns n schemes into 4n schemes;
(3) Scheme pruning: compute the cost of the 4n schemes and keep the m schemes with the lowest cost as seed schemes. Pruning starts only once a scheme contains more than 3 points, which improves the accuracy of the algorithm;
(4) Repeat steps (2) and (3) until every scheme reaches the last unit of the uncertain unit set U;
(5) Select the scheme with the lowest cost as the optimal scheme and merge units into characters as that scheme prescribes;
(6) For each character found, set the corresponding pixels in the image to background, add the character to the set N, and then add the updated set N to the class K_j.
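The seed, grow and prune procedure of steps (1) to (5) resembles a beam search over ways of grouping consecutive units into characters. The sketch below illustrates that reading under stated assumptions: units are `(left, right)` column ranges, `cell_cost(h, e)` is a caller-supplied stand-in for the normalized P_he, every character spans at most `max_span` = 4 units, all weights default to 1, and the names `scheme_cost` and `beam_search` are hypothetical. It assumes at least one unit.

```python
from statistics import pvariance


def scheme_cost(scheme, units, cell_cost, lam=(1.0, 1.0, 1.0)):
    """Cost = lam4*mean(P_he) + lam5*var(char widths) + lam6*var(gap widths)."""
    p = [cell_cost(h, e) for h, e in scheme]
    widths = [units[e][1] - units[h][0] for h, e in scheme]
    gaps = [units[b[0]][0] - units[a[1]][1] for a, b in zip(scheme, scheme[1:])]
    return (lam[0] * sum(p) / len(p)
            + lam[1] * pvariance(widths)
            + lam[2] * (pvariance(gaps) if gaps else 0.0))


def beam_search(units, cell_cost, max_span=4, beam=3, prune_after=3):
    n = len(units)
    # (1) seeds: every candidate character that starts at the first unit
    active = [[(0, e)] for e in range(min(max_span, n))]
    finished = []
    while active:
        grown = []
        for scheme in active:
            last = scheme[-1][1]
            if last == n - 1:          # (4) scheme has reached the last unit
                finished.append(scheme)
                continue
            h = last + 1
            # (2) growth: the next character starts right after the previous one
            for e in range(h, min(h + max_span, n)):
                grown.append(scheme + [(h, e)])
        # (3) pruning: only once schemes hold more than `prune_after` points
        if grown and len(grown[0]) > prune_after:
            grown.sort(key=lambda s: scheme_cost(s, units, cell_cost))
            grown = grown[:beam]
        active = grown
    # (5) the cheapest complete scheme wins
    return min(finished, key=lambda s: scheme_cost(s, units, cell_cost))
```

Delaying the first pruning keeps short partial schemes alive until their width and gap variances become informative, which is one way to read the remark that late pruning improves accuracy.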
The invention has the following beneficial effects:
By combining the strengths of binary thresholding and projection, the invention can accurately segment layouts in which large and small characters are mixed across multiple rows.
Brief description of the drawings
Fig. 1 is a flowchart of the multi-stage text segmentation method based on binary thresholding and projection according to an embodiment of the present invention.
Fig. 2 shows the character-formation cost matrix constructed in the embodiment of the present invention.
Fig. 3 shows the dynamic programming process in the character-formation cost matrix in the embodiment of the present invention.
Detailed description of the embodiments
The present invention is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand the invention, but do not limit it in any way. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the inventive concept; these all fall within the protection scope of the present invention.
As shown in Fig. 1, an embodiment of the present invention provides a multi-stage text segmentation method based on binary thresholding and projection, comprising the following steps:
S1. Compute the binarization threshold of the image using the Otsu method (maximum between-class variance) and convert the image into a binary (black-and-white) image, in which white is the foreground text and black is the background;
S2. Take the foreground regions whose aspect ratio and size fall within the plausible range for text as candidate characters and add them to the character set T;
S3. Sort all candidate characters by height in descending order and group them into K classes using a density-based clustering algorithm;
S4. Process all candidate characters in that descending order according to the following steps:
S4.1. For the current character T_i, take its top position U_p and bottom position D_own as the top and bottom of the current temporary row. In the class K_j to which T_i belongs, find all characters N whose centroid is close to that of T_i and whose top and bottom positions both lie between U_p and D_own, i.e. [formula not reproduced]
where N_p is any character in the set N, K_j is the j-th character class, U_Np and D_Np are the top and bottom positions of character N_p, M_Np and M_Ti are the centroids of characters N_p and T_i, and Th0 and Th1 are the tolerance thresholds for the top/bottom positions and the centroid, respectively. Record the minimum character width W_min in the set N;
S4.2. In the other classes, find all characters M that lie between U_Np and D_Np, i.e. [formula not reproduced]
where M_q is any character in the set M, and U_Mq and D_Mq are the top and bottom positions of character M_q;
S4.3. Accumulate all pixels of the image between U_p and D_own along the vertical direction to obtain a horizontal projection profile;
S4.4. After discarding the zero-valued data at the left and right ends of the projection, find the maximum S_max and minimum S_min of the projection over the part that contains characters;
S4.5. Shrink the horizontal extent (L_eft, R_ight) of every character in the set N by one pixel on each side to (L_eft+1, R_ight−1), set all projection values within (L_eft+1, R_ight−1) to S_max, and set the projection values at the positions of all characters in the set M to S_min;
S4.6. In the projection, with the zero-valued data at both ends excluded, find all minima at positions that contain characters;
S4.7. Treat the region around each minimum as a valley region and find its left and right boundaries; the regions between valleys are peak regions. Estimate the likelihood that each valley is an inter-character gap and that each peak is a text unit. Store peak regions whose likelihood exceeds an empirical threshold in the text unit array and the valley regions in the inter-character gap array; merge peak regions whose likelihood is below the empirical threshold into the valley regions on their left and right. A peak region is required to be wider than W_min, and its average height must exceed the average height of the valley regions by at least H_tmin;
S4.8. Compute the ratio of each text unit's width to the average width of the characters found in step S4.1. Text units wider than a preset multiple of the average character width, and units whose inter-character gap exceeds the maximum character width, are accepted directly as detected characters; all other units are treated as uncertain units. A run of consecutive detected characters with no uncertain unit between them forms a text region; compute the average character width W_c and the average inter-character gap W_b of each text region;
S4.9. Treat each run of consecutive uncertain units as an uncertain region. The L uncertain units U_i, together with the detected character immediately before the region and the one immediately after it, give L+2 units in total, which form the uncertain unit set U. Build an (L+2)×(L+2) matrix from these L+2 units. A point (U_h, U_e) in the matrix (U_h ≤ U_e, e−h ≤ 4) represents the hypothesis that the span starting at the left edge of unit U_h and ending at the right edge of unit U_e forms one character, and its value P_he is the character-formation cost of that span:
$$P_{he} = \lambda_1 \frac{W_{he} - W_c}{W_{he} + W_c} + \lambda_2 \frac{W_{hb} - W_b}{W_{hb} + W_b} + \lambda_3 \frac{W_{eb} - W_b}{W_{he} + W_b}$$
where λ1 to λ3 are weighting coefficients; W_he is the width spanned by units U_h to U_e, i.e. the distance from the left edge of U_h to the right edge of U_e; W_hb is the width of the gap to the left of U_h; and W_eb is the width of the gap to the right of U_e. This formula expresses how similar the character so formed is to the characters on its left and right.
The character-formation costs in the matrix are normalized (divided by the maximum cost in the matrix). For example, the point marked Δ in Fig. 2 represents the likelihood that U_4 and U_5 merge into one character. If U_h = U_e, unit U_h cannot be merged with the unit to its right and forms a character on its own.
Since rows correspond to the starting unit of a character and columns to its ending unit, the matrix has only an upper-right triangular part. And since a character is split into at most four units horizontally (e−h ≤ 4), the upper-right triangular matrix is further restricted to a band of width 4 along the diagonal.
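The banded cost matrix can be sketched as a dictionary keyed by (h, e). This is an illustration under stated assumptions: units are `(left, right)` column ranges sorted from left to right, the band holds the 4 points e = h .. h+3 of each row, W_c and W_b are assumed positive, the outermost units (which have no neighbouring gap inside the set) fall back to W_b, and the formula is taken literally as written, including the denominator of its third term. The function names are hypothetical.

```python
def formation_cost(w_he, w_hb, w_eb, w_c, w_b, lam=(1.0, 1.0, 1.0)):
    """P_he as written in the text (third denominator kept as W_he + W_b)."""
    return (lam[0] * (w_he - w_c) / (w_he + w_c)
            + lam[1] * (w_hb - w_b) / (w_hb + w_b)
            + lam[2] * (w_eb - w_b) / (w_he + w_b))


def build_cost_matrix(units, w_c, w_b, band=4):
    """Return {(h, e): normalized P_he} for h <= e < h + band."""
    n = len(units)
    costs = {}
    for h in range(n):
        for e in range(h, min(h + band, n)):
            w_he = units[e][1] - units[h][0]
            # Gap to the left of U_h and to the right of U_e; the outermost
            # units have no neighbour in the set, so w_b is used there.
            w_hb = units[h][0] - units[h - 1][1] if h > 0 else w_b
            w_eb = units[e + 1][0] - units[e][1] if e + 1 < n else w_b
            costs[(h, e)] = formation_cost(w_he, w_hb, w_eb, w_c, w_b)
    peak = max(costs.values())
    if peak > 0:  # normalize only when the maximum is usable as a divisor
        costs = {k: v / peak for k, v in costs.items()}
    return costs
```

Storing only the band keeps the table at roughly 4 entries per unit instead of (L+2)² cells. Note that the terms are signed as written, so spans narrower than W_c receive negative costs in this literal reading.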
Dynamic programming is then carried out within the band of width 4 of the upper-right triangular matrix to find the optimal scheme. The optimal scheme minimizes the average character-formation cost together with the variance of the character widths and the variance of the inter-character gap widths, as follows:
$$Cost = \lambda_4 \, \mathrm{mean}(P_{he}) + \lambda_5 \, \delta_{Wt} + \lambda_6 \, \delta_{Wb}$$
where λ4 to λ6 are weighting coefficients, mean(P_he) is the average character-formation cost of all points in the scheme, δ_Wt is the variance of all character widths in the scheme, and δ_Wb is the variance of all inter-character gap widths in the scheme. The dynamic programming proceeds as follows:
(1) Seed generation: take the 4 points of the first row as 4 seeds, i.e. 4 schemes (see Fig. 3(a));
(2) Scheme growth: grow every scheme downward; for example, when growing from point (U_h, U_e), add one of the 4 points of row U_{e+1} to the scheme. Each of the n schemes has 4 possible choices when growing, so one growth step turns n schemes into 4n schemes;
(3) Scheme pruning: compute the cost of the 4n schemes and keep the m schemes with the lowest cost as seed schemes. Pruning starts only once a scheme contains more than 3 points, which improves the accuracy of the algorithm;
(4) Repeat steps (2) and (3) until every scheme reaches the last unit of the uncertain unit set U;
(5) Select the scheme with the lowest cost as the optimal scheme and merge units into characters as that scheme prescribes;
(6) For each character found, set the corresponding pixels in the image to background, add the character to the set N, and then add the updated set N to the class K_j;
S5. Check whether any other text remains in the image. If a set L of remaining characters exists, process it as follows:
S5.1. Take a character T_l from the set L and decide from its height whether it belongs to an existing character class. If it does, place T_l in that class; if it belongs to no existing class, increase the number of classes by 1 and place T_l in the new class;
S5.2. Repeat step S5.1 for every character in the set L until all have been processed.
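Step S5 amounts to assigning each leftover character to the nearest existing height class or opening a new one. A minimal sketch follows; the relative tolerance `tol` and the idea of representing each class by a single reference height are assumptions made for illustration, and the function name is hypothetical.

```python
def assign_to_classes(class_heights, leftover_heights, tol=0.15):
    """Return one class index per leftover character.

    class_heights holds one reference height per existing class and is
    extended in place whenever a leftover fits no existing class.
    """
    labels = []
    for h in leftover_heights:
        best = None  # (relative difference, class index)
        for idx, ref in enumerate(class_heights):
            rel = abs(h - ref) / ref
            if rel <= tol and (best is None or rel < best[0]):
                best = (rel, idx)
        if best is None:
            class_heights.append(h)            # class count grows by 1
            labels.append(len(class_heights) - 1)
        else:
            labels.append(best[1])
    return labels
```

For example, with existing classes of height 40 and 12, a leftover of height 20 matches neither class within 15 percent and therefore opens a third class, which a following leftover of height 21 then joins.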
Specific embodiments of the present invention have been described above. It should be understood that the invention is not limited to the particular embodiments described; those skilled in the art can make various changes or modifications within the scope of the claims without affecting the substance of the invention. Provided there is no conflict, the embodiments of this application and the features within them may be combined with one another in any way.
Claims (3)
1. A multi-stage text segmentation method based on binary thresholding and projection, characterized by comprising:
S1. Computing the binarization threshold of the image using the Otsu method and converting the image into a binary image, in which white is the foreground text and black is the background;
S2. Taking the foreground regions whose size, aspect ratio and fill ratio fall within the plausible range for text as candidate characters and adding them to the character set T;
S3. Sorting all candidate characters by height in descending order and grouping them into K classes using a density-based clustering algorithm;
S4. Processing all candidate characters in that descending order according to the following steps:
S4.1. For the current character T_i, take its top position U_p and bottom position D_own as the top and bottom of the current temporary row. In the class K_j to which T_i belongs, find all characters whose centroid is close to that of T_i and whose top and bottom positions both lie between U_p and D_own, and add them to the character set N, i.e. [formula not reproduced]
where N_p is any character in the set N, K_j is the j-th character class, U_Np and D_Np are the top and bottom positions of character N_p, M_Np and M_Ti are the centroids of characters N_p and T_i, and Th0 and Th1 are the tolerance thresholds for the top/bottom positions and the centroid, respectively. Record the minimum character width W_min in the set N;
S4.2. In the other classes, find all characters M that lie between U_Np and D_Np, i.e. [formula not reproduced]
where M_q is any character in the set M, and U_Mq and D_Mq are the top and bottom positions of character M_q;
S4.3. Accumulate all pixels of the image between U_p and D_own along the vertical direction to obtain a horizontal projection profile;
S4.4. After discarding the zero-valued data at the left and right ends of the projection, find the maximum S_max and minimum S_min of the projection over the part that contains characters;
S4.5. Shrink the horizontal extent (L_eft, R_ight) of every character in the set N by one pixel on each side to (L_eft+1, R_ight−1), set all projection values within (L_eft+1, R_ight−1) to S_max, and set the projection values at the positions of all characters in the set M to S_min;
S4.6. In the projection, with the zero-valued data at both ends excluded, find all minima at positions that contain characters;
S4.7. Treat the region around each minimum as a valley region and find its left and right boundaries; the regions between valleys are peak regions. Estimate the likelihood that each valley is an inter-character gap and that each peak is a text unit. Store peak regions whose likelihood exceeds an empirical threshold in the text unit array and the valley regions in the inter-character gap array; merge peak regions whose likelihood is below the empirical threshold into the valley regions on their left and right;
S4.8. Compute the ratio of each text unit's width to the average width of the characters found in step S4.1. Text units wider than a preset multiple of the average character width, and units whose inter-character gap exceeds the maximum character width, are accepted directly as detected characters; all other units are treated as uncertain units. A run of consecutive detected characters with no uncertain unit between them forms a text region; compute the average character width W_c and the average inter-character gap W_b of each text region;
S4.9. Treat each run of consecutive uncertain units as an uncertain region. The L uncertain units U_i, together with the detected character immediately before the region and the one immediately after it, give L+2 units in total, which form the uncertain unit set U. Build an (L+2)×(L+2) matrix from these L+2 units. A point (U_h, U_e) in the matrix (U_h ≤ U_e, e−h ≤ 4) represents the hypothesis that the span starting at the left edge of unit U_h and ending at the right edge of unit U_e forms one character, and its value P_he is the character-formation cost of that span:
$$P_{he} = \lambda_1 \frac{W_{he} - W_c}{W_{he} + W_c} + \lambda_2 \frac{W_{hb} - W_b}{W_{hb} + W_b} + \lambda_3 \frac{W_{eb} - W_b}{W_{he} + W_b}$$
where λ1 to λ3 are weighting coefficients; W_he is the width spanned by units U_h to U_e, i.e. the distance from the left edge of U_h to the right edge of U_e; W_hb is the width of the gap to the left of U_h; and W_eb is the width of the gap to the right of U_e;
Normalize the character-formation costs in the matrix by dividing them by the maximum cost in the matrix, then run dynamic programming within the band of width 4 of the upper-right triangular matrix to find the optimal scheme. The optimal scheme minimizes the average character-formation cost together with the variance of the character widths and the variance of the inter-character gap widths, as follows:
$$Cost = \lambda_4 \, \mathrm{mean}(P_{he}) + \lambda_5 \, \delta_{Wt} + \lambda_6 \, \delta_{Wb}$$
where λ4 to λ6 are weighting coefficients, mean(P_he) is the average character-formation cost of all points in the scheme, δ_Wt is the variance of all character widths in the scheme, and δ_Wb is the variance of all inter-character gap widths in the scheme;
S5. Check whether any other text remains in the image. If a set L of remaining characters exists, process it as follows:
S5.1. Take a character T_l from the set L and decide from its height whether it belongs to an existing character class. If it does, place T_l in that class; if it belongs to no existing class, increase the number of classes by 1 and place T_l in the new class;
S5.2. Repeat step S5.1 for every character in the set L until all have been processed.
Further, in step S4.7, a peak region is required to be wider than W_min, and its average height must exceed the average height of the valley regions by at least H_tmin.
Further, the dynamic programming proceeds as follows:
(1) Seed generation: take the 4 points of the first row as 4 seeds, i.e. 4 schemes;
(2) Scheme growth: grow every scheme downward; when growing from point (U_h, U_e), add one of the 4 points of row U_{e+1} to the scheme. Each of the n schemes has 4 possible choices when growing, so one growth step turns n schemes into 4n schemes;
(3) Scheme pruning: compute the cost of the 4n schemes and keep the m schemes with the lowest cost as seed schemes. Pruning starts only once a scheme contains more than 3 points, which improves the accuracy of the algorithm;
(4) Repeat steps (2) and (3) until every scheme reaches the last unit of the uncertain unit set U;
(5) Select the scheme with the lowest cost as the optimal scheme and merge units into characters as that scheme prescribes;
(6) For each character found, set the corresponding pixels in the image to background, add the character to the set N, and then add the updated set N to the class K_j.
2. The multi-stage text segmentation method based on binary thresholding and projection according to claim 1, characterized in that in step S4.7 a peak region is required to be wider than W_min, and its average height must exceed the average height of the valley regions by at least H_tmin.
3. The multi-stage text segmentation method based on binary thresholding and projection according to claim 1, characterized in that the dynamic programming proceeds as follows:
(1) Seed generation: take the 4 points of the first row as 4 seeds, i.e. 4 schemes;
(2) Scheme growth: grow every scheme downward; when growing from point (U_h, U_e), add one of the 4 points of row U_{e+1} to the scheme. Each of the n schemes has 4 possible choices when growing, so one growth step turns n schemes into 4n schemes;
(3) Scheme pruning: compute the cost of the 4n schemes and keep the m schemes with the lowest cost as seed schemes. Pruning starts only once a scheme contains more than 3 points, which improves the accuracy of the algorithm;
(4) Repeat steps (2) and (3) until every scheme reaches the last unit of the uncertain unit set U;
(5) Select the scheme with the lowest cost as the optimal scheme and merge units into characters as that scheme prescribes;
(6) For each character found, set the corresponding pixels in the image to background, add the character to the set N, and then add the updated set N to the class K_j.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910763993.0A CN110532537A (en) | 2019-08-19 | 2019-08-19 | Multi-stage text segmentation method based on binary thresholding and projection |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910763993.0A CN110532537A (en) | 2019-08-19 | 2019-08-19 | Multi-stage text segmentation method based on binary thresholding and projection |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110532537A true CN110532537A (en) | 2019-12-03 |
Family
ID=68663815
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910763993.0A Pending CN110532537A (en) | 2019-08-19 | 2019-08-19 | Multi-stage text segmentation method based on binary thresholding and projection |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110532537A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110991440A (en) * | 2019-12-11 | 2020-04-10 | 易诚高科(大连)科技有限公司 | Pixel-driven mobile phone operation interface text detection method |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103020618A (en) * | 2011-12-19 | 2013-04-03 | 北京捷成世纪科技股份有限公司 | Detection method and detection system for video image text |
US9965695B1 (en) * | 2016-12-30 | 2018-05-08 | Konica Minolta Laboratory U.S.A., Inc. | Document image binarization method based on content type separation |
RU2656708C1 (en) * | 2017-06-29 | 2018-06-06 | Самсунг Электроникс Ко., Лтд. | Method for separating texts and illustrations in images of documents using a descriptor of document spectrum and two-level clustering |
-
2019
- 2019-08-19 CN CN201910763993.0A patent/CN110532537A/en active Pending
Non-Patent Citations (3)
Title |
---|
姚正斌, 丁晓青, 刘长松: "基于笔划合并和动态规划的联机汉字切分算法", 清华大学学报(自然科学版), no. 10, 30 October 2004 (2004-10-30), pages 1417 - 1421 * |
杨晓娟;宋凯;: "基于投影法的文档图像分割算法", 成都大学学报(自然科学版), no. 02, 30 June 2009 (2009-06-30), pages 139 - 141 * |
杨玲玲;叶东毅;: "一种基于图像矩和纹理特征的自然场景文本检测算法", 小型微型计算机系统, no. 06, 15 June 2016 (2016-06-15), pages 1313 - 1317 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110991440A (en) * | 2019-12-11 | 2020-04-10 | 易诚高科(大连)科技有限公司 | Pixel-driven mobile phone operation interface text detection method |
CN110991440B (en) * | 2019-12-11 | 2023-10-13 | 易诚高科(大连)科技有限公司 | Pixel-driven mobile phone operation interface text detection method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102509383B (en) | Feature detection and template matching-based mixed number identification method | |
CN101515325B (en) | Character extracting method in digital video based on character segmentation and color cluster | |
CN105261110B (en) | A kind of efficiently DSP paper money number recognition methods | |
CN102663382B (en) | Video image character recognition method based on submesh characteristic adaptive weighting | |
CN104504717B (en) | A kind of image information detecting method and device | |
CN107092871B (en) | Remote sensing image building detection method based on multiple dimensioned multiple features fusion | |
US20090123064A1 (en) | Scanning images for pornography | |
CN102496013A (en) | Chinese character segmentation method for off-line handwritten Chinese character recognition | |
CN104899892B (en) | A kind of quickly star map image asterism extracting method | |
CN107784308A (en) | Conspicuousness object detection method based on the multiple dimensioned full convolutional network of chain type | |
CN104715024A (en) | Multimedia hotspot analysis method | |
CN105513066B (en) | It is a kind of that the generic object detection method merged with super-pixel is chosen based on seed point | |
CN109410238A (en) | A kind of fructus lycii identification method of counting based on PointNet++ network | |
CN103218833A (en) | Edge-reinforced color space maximally stable extremal region detection method | |
CN105118051B (en) | A kind of conspicuousness detection method applied to still image human body segmentation | |
US8195662B2 (en) | Density-based data clustering method | |
CN110532537A (en) | Multi-stage text segmentation method based on binary thresholding and projection | |
CN112991378B (en) | Background separation method based on gray level distribution polarization and homogenization | |
CN108364300A (en) | Vegetables leaf portion disease geo-radar image dividing method, system and computer readable storage medium | |
CN111292347B (en) | Microscopic image anthrax spore density calculation method based on image processing technology | |
CN104616295A (en) | News image horizontal headline caption simply and rapidly positioning method | |
CN108710881A (en) | Neural network model, candidate target region generation method, model training method | |
CN105354845A (en) | Method for semi-supervised detection on changes in remote sensing images | |
CN103942792A (en) | Impurity detecting method in medicine detection robot based on time domain features of sequence images | |
CN106980872A (en) | K arest neighbors sorting techniques based on polling committee |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||