CN110532537A - A multi-stage text segmentation method based on binary thresholding and projection - Google Patents

Publication number
CN110532537A
Authority
CN
China
Prior art keywords
text
scheme
unit
character
width
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910763993.0A
Other languages
Chinese (zh)
Inventor
罗胜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wenzhou University
Original Assignee
Wenzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-08-19
Filing date
2019-08-19
Publication date
2019-12-03
Application filed by Wenzhou University
Priority to CN201910763993.0A
Publication of CN110532537A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/136 Segmentation; Edge detection involving thresholding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Character Input (AREA)

Abstract

The invention discloses a multi-stage text segmentation method based on binary thresholding and projection. Text fragments are first detected by binary thresholding, the text is then located precisely by projection, and finally the residual image is searched for less conspicuous text that is easily missed. Large characters are processed first and small characters afterwards; in each iteration, once part of the text has been processed, the text already detected is erased from the image, which simplifies subsequent processing. The invention combines the advantages of binary thresholding and projection and can accurately segment layouts in which large and small characters are mixed across multiple rows.

Description

A multi-stage text segmentation method based on binary thresholding and projection
Technical field
The present invention relates to the technical field of character recognition, and in particular to a multi-stage text segmentation method based on binary thresholding and projection.
Background art
In typesetting, large and small characters are often mixed across multiple rows, especially when mathematical formulas are typeset. Text segmentation commonly relies on binary thresholding and projection. Binary thresholding decomposes the image into discrete foreground and background, but the threshold is hard to choose, and errors such as several characters sticking together or a single character breaking into several pieces are common. Projection cannot separate text arranged in multiple rows.
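As a point of reference for the thresholding discussed here, and applied later in step S1, the following is a minimal sketch of Otsu's method in plain Python. The names `otsu_threshold` and `binarize` and the list-of-rows image representation are assumptions made for illustration and do not come from the patent. The sketch also assumes bright text on a dark background, so that pixels above the threshold become white foreground; a dark-on-light page would need to be inverted first.

```python
def otsu_threshold(pixels):
    """Return the grey level t in 0..255 that maximises the between-class variance.

    Pixels with value <= t form the dark class, pixels with value > t the bright class.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    n_dark, sum_dark = 0, 0
    for t in range(256):
        n_dark += hist[t]
        sum_dark += t * hist[t]
        n_bright = total - n_dark
        if n_dark == 0 or n_bright == 0:
            continue  # one class is empty, so the variance is undefined here
        mean_dark = sum_dark / n_dark
        mean_bright = (sum_all - sum_dark) / n_bright
        # Between-class variance up to the constant factor 1 / total**2.
        score = n_dark * n_bright * (mean_dark - mean_bright) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def binarize(gray_rows, threshold):
    """Map a 2-D list of grey levels to 1 (white, foreground) or 0 (black, background)."""
    return [[1 if p > threshold else 0 for p in row] for row in gray_rows]


# Tiny example: two dark background pixels and two bright text pixels per row.
image = [[12, 10, 200, 210],
         [11, 13, 205, 198]]
t = otsu_threshold(p for row in image for p in row)
binary = binarize(image, t)
```

When several thresholds give the same score, this sketch keeps the lowest one; that tie-breaking rule is a choice of the sketch, not something the text specifies.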
Summary of the invention
To solve the above problems, the invention provides a multi-stage text segmentation method based on binary thresholding and projection. Text is first detected by thresholding, large characters are then processed, and small characters are handled last. The method combines the advantages of binary thresholding and projection and can accurately segment layouts in which large and small characters are mixed across multiple rows.
To achieve the above object, the invention adopts the following technical solution:
A multi-stage text segmentation method based on binary thresholding and projection, comprising the following steps:
S1. Compute the binarization threshold of the image using Otsu's method (the maximum between-class variance method) and convert the image into a binary (black-and-white) image, in which white is the text in the foreground and black is the background;
S2. Take foreground regions whose size, aspect ratio, and fill ratio fall within the plausible range for text as candidate characters and add them to character set $T$;
S3. Sort all candidate characters by height in descending order and group them into $K$ classes using a density-based clustering algorithm;
S4. Process all candidate characters in descending order by the following steps:
S4.1. For the current character $T_i$, take its topmost position $U_p$ and bottommost position $D_{own}$ as the top and bottom of the current temporary row. In the same class $K_j$, find all characters whose centroids are close to that of $T_i$ and whose topmost and bottommost positions both lie between $U_p$ and $D_{own}$, and add them to character set $N$.
Here, $N_p$ is any character in character set $N$, $K_j$ is the $j$-th character class, $U_{Np}$ and $D_{Np}$ are the topmost and bottommost positions of character $N_p$, $M_{Np}$ and $M_{Ti}$ are the centroids of characters $N_p$ and $T_i$, and $Th_0$ and $Th_1$ are the tolerance thresholds for the top and bottom positions and for the centroid, respectively. Compute the minimum character width $W_{min}$ in character set $N$;
S4.2. Among the characters of the other classes, find all characters lying between $U_{Np}$ and $D_{Np}$ and add them to character set $M$.
Here, $M_q$ is any character in character set $M$, and $U_{Mq}$ and $D_{Mq}$ are the topmost and bottommost positions of character $M_q$;
S4.3. Accumulate all pixels of the image between $U_p$ and $D_{own}$ along the vertical direction to form a horizontal projection profile;
S4.4. After excluding the zero-valued data at the left and right ends of the horizontal projection, find the maximum value $S_{max}$ and the minimum value $S_{min}$ of the projection over the middle part that contains characters;
S4.5. Shrink the horizontal extent ($L_{eft}$, $R_{ight}$) of every character in character set $N$ by one pixel on each side to ($L_{eft}+1$, $R_{ight}-1$), set all values of the horizontal projection within ($L_{eft}+1$, $R_{ight}-1$) to $S_{max}$, and set the projection at the positions of all characters in character set $M$ to $S_{min}$;
S4.6. In the horizontal projection, excluding the zero-valued data at the left and right ends, find all minima at positions that contain characters;
S4.7. Mark the region around each minimum as a valley region and find the left and right boundaries of each valley region; the regions between valley regions are peak regions. Estimate the likelihood that each valley region is an inter-character gap and that each peak region is a text unit. Peak regions whose likelihood exceeds the empirical threshold are stored in the text unit array and the corresponding valley regions in the inter-character gap array; peak regions whose likelihood falls below the empirical threshold are merged into the valley regions on their left and right;
S4.8. Divide the width of each text unit by the average character width from step S4.1. Text units whose width exceeds a preset multiple of the average character width, and units whose inter-character gap exceeds the maximum character width, are taken directly as detected characters; all other units are treated as uncertain units. Each run of consecutive detected characters with no uncertain unit in between is treated as a character region, and the average character width $W_c$ and the average inter-character gap $W_b$ of each character region are computed;
S4.9. Treat each run of consecutive uncertain units as an uncertain region. The $L$ uncertain units $U_i$, together with the detected character immediately before the uncertain region and the detected character immediately after it, give $L+2$ units in total, which form the uncertain unit set $U$. Construct an $(L+2) \times (L+2)$ matrix from these $L+2$ units. A point ($U_h$, $U_e$) in the matrix ($U_h \le U_e$, $e-h \le 4$) represents forming one character from the range that starts at the left edge of unit $U_h$ and ends at the right edge of unit $U_e$; the value $P_{he}$ of the point is the character-formation cost of that range:
$$P_{he} = \lambda_1 \frac{W_{he}-W_c}{W_{he}+W_c} + \lambda_2 \frac{W_{hb}-W_b}{W_{hb}+W_b} + \lambda_3 \frac{W_{eb}-W_b}{W_{he}+W_b}$$
Here, $\lambda_1$ to $\lambda_3$ are weighting coefficients; $W_{he}$ is the width spanned by units $U_h$ and $U_e$, that is, the distance from the left edge of unit $U_h$ to the right edge of unit $U_e$; $W_{hb}$ is the width of the gap to the left of unit $U_h$; and $W_{eb}$ is the width of the gap to the right of unit $U_e$;
Normalize the character-formation costs in the matrix by dividing them by the maximum character-formation cost in the matrix, then perform dynamic programming within the band of width 4 of the upper-right triangular matrix to find the optimal scheme. The optimal scheme has the minimum average character-formation cost together with the minimum variance of character widths and of inter-character gap widths, as in the following formula:
$$Cost = \lambda_4\,\mathrm{mean}(P_{he}) + \lambda_5\,\delta_{Wt} + \lambda_6\,\delta_{Wb}$$
Here, $\lambda_4$ to $\lambda_6$ are weighting coefficients, $\mathrm{mean}(P_{he})$ is the average character-formation cost of all points in the scheme, $\delta_{Wt}$ is the variance of all character widths in the scheme, and $\delta_{Wb}$ is the variance of all inter-character gap widths in the scheme;
S5. Check whether any other text remains in the image; if remaining characters $L$ exist, process them by the following steps:
S5.1. Take a character $T_l$ from character set $L$ and judge by character height whether $T_l$ belongs to an existing character class. If it does, place $T_l$ in the corresponding class; if it belongs to no existing class, increase the number of classes by 1 and place $T_l$ in the new class;
S5.2. Apply step S5.1 iteratively to all characters in character set $L$ until all have been processed.
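Steps S4.3, S4.4, and S4.6 above can be sketched in plain Python as follows. The function names and the list-of-rows binary image are hypothetical, and reading "all minima" in S4.6 as interior local minima (flat runs that are lower than both neighbours) is an assumption of this sketch; the masking of step S4.5 is omitted for brevity.

```python
def column_projection(binary_rows, top, bottom):
    """S4.3: count foreground pixels (value 1) per column over rows top..bottom inclusive."""
    width = len(binary_rows[0])
    return [sum(binary_rows[r][c] for r in range(top, bottom + 1))
            for c in range(width)]


def trim_zero_ends(profile):
    """S4.4: return (start, end) such that profile[start:end] has no zeros at either end."""
    start, end = 0, len(profile)
    while start < end and profile[start] == 0:
        start += 1
    while end > start and profile[end - 1] == 0:
        end -= 1
    return start, end


def valley_runs(profile, start, end):
    """S4.6: return (left, right) index pairs of interior local minima in profile[start:end].

    A run of equal values counts as a minimum only if the columns directly to
    its left and right are both strictly higher.
    """
    runs = []
    i = start
    while i < end:
        j = i
        while j + 1 < end and profile[j + 1] == profile[i]:
            j += 1
        left_higher = i > start and profile[i - 1] > profile[i]
        right_higher = j + 1 < end and profile[j + 1] > profile[i]
        if left_higher and right_higher:
            runs.append((i, j))
        i = j + 1
    return runs


# Two strokes separated by one empty column, with empty margins on both sides.
row_image = [[0, 1, 1, 0, 1, 1, 0],
             [0, 1, 1, 0, 1, 0, 0]]
profile = column_projection(row_image, 0, 1)
start, end = trim_zero_ends(profile)
s_max = max(profile[start:end])
s_min = min(profile[start:end])
valleys = valley_runs(profile, start, end)
```

Each pair in `valleys` is a candidate for the valley regions of step S4.7; the stretches between them are the candidate peak regions.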
Further, in step S4.7, a peak region must be wider than $W_{min}$, and its mean height must exceed the mean height of the valley regions by $H_{tmin}$.
Further, the dynamic programming proceeds as follows:
(1) Seed generation: take the 4 points of the first row as 4 seeds, that is, as 4 schemes;
(2) Scheme growth: each scheme grows downward. When growing downward from point ($U_h$, $U_e$), the 4 points of row $U_{e+1}$ are candidates for addition to the scheme. With $n$ schemes, each having 4 possible choices when growing downward, one growth step turns $n$ schemes into $4n$ schemes;
(3) Scheme pruning: compute the cost of the $4n$ schemes and keep the $m$ schemes with the lowest cost as seed schemes. Pruning starts only once the number of points in a scheme exceeds 3, which improves the accuracy of the algorithm;
(4) Repeat steps (2) and (3) until every scheme reaches the last unit in the uncertain unit set $U$;
(5) Select the scheme with the lowest cost as the optimal scheme and merge units into characters according to the strategy it gives;
(6) For the characters found, set the corresponding character pixels in the image to background, add the characters found to character set $N$, and then add the updated character set $N$ to character class $K_j$.
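Steps (1) to (5) above read like a beam search over the banded cost matrix, and can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the names `scheme_cost` and `beam_search` and the parameters `band`, `beam` (the $m$ of step (3)), and `prune_after` are hypothetical, units are given as (left, right) pixel pairs, the cost matrix `P` is assumed to be precomputed, and `toy_cost` is a stand-in for $P_{he}$ used only to make the example runnable. Step (6), erasing the found characters from the image, is not shown.

```python
from statistics import pvariance


def scheme_cost(scheme, units, P, lam=(1.0, 1.0, 1.0)):
    """Cost = lam4 * mean(P_he) + lam5 * var(character widths) + lam6 * var(gap widths)."""
    widths = [units[e][1] - units[h][0] for h, e in scheme]
    gaps = [units[h2][0] - units[e1][1]
            for (_, e1), (h2, _) in zip(scheme, scheme[1:])]
    mean_p = sum(P[h][e] for h, e in scheme) / len(scheme)
    var_w = pvariance(widths)                 # 0 when the scheme has one character
    var_g = pvariance(gaps) if gaps else 0.0  # a single character has no gaps
    return lam[0] * mean_p + lam[1] * var_w + lam[2] * var_g


def beam_search(units, P, band=4, beam=8, prune_after=3):
    """Return the lowest-cost list of (h, e) points covering units 0..n-1."""
    n = len(units)
    # (1) Seeds: the points of the first row of the band.
    active = [[(0, e)] for e in range(min(band, n))]
    finished = []
    while active:
        grown = []
        for scheme in active:
            e = scheme[-1][1]
            if e == n - 1:               # (4) the scheme reached the last unit
                finished.append(scheme)
                continue
            h = e + 1                    # (2) grow into row e+1 of the matrix
            for e2 in range(h, min(h + band, n)):
                grown.append(scheme + [(h, e2)])
        # (3) Prune only once schemes hold more than `prune_after` points.
        if grown and len(grown[0]) > prune_after:
            grown.sort(key=lambda s: scheme_cost(s, units, P))
            grown = grown[:beam]
        active = grown
    # (5) The cheapest complete scheme is the result.
    return min(finished, key=lambda s: scheme_cost(s, units, P))


# Four fragments that are really two characters of width 10, each split in half.
units = [(0, 4), (6, 10), (14, 18), (20, 24)]
target = 10  # assumed average character width for the toy cost


def toy_cost(h, e):
    w = units[e][1] - units[h][0]
    return abs(w - target) / (w + target)


P = [[toy_cost(h, e) if e >= h else None for e in range(4)] for h in range(4)]
best = beam_search(units, P)
```

In this example `best` pairs the fragments two by two, because that is the only scheme whose characters all match the target width.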
The invention has the following beneficial effects:
The invention combines the advantages of binary thresholding and projection and can accurately segment layouts in which large and small characters are mixed across multiple rows.
Brief description of the drawings
Fig. 1 is a flowchart of the multi-stage text segmentation method based on binary thresholding and projection according to an embodiment of the invention.
Fig. 2 shows the character-formation cost matrix constructed in an embodiment of the invention.
Fig. 3 shows the dynamic programming process in the character-formation cost matrix in an embodiment of the invention.
Detailed description of the embodiments
The invention is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to understand the invention further, but do not limit it in any way. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the inventive concept; these all fall within the scope of protection of the invention.
As shown in Fig. 1, an embodiment of the invention provides a multi-stage text segmentation method based on binary thresholding and projection, comprising the following steps:
S1. Compute the binarization threshold of the image using Otsu's method (the maximum between-class variance method) and convert the image into a binary (black-and-white) image, in which white is the text in the foreground and black is the background;
S2. Take foreground regions whose aspect ratio and size fall within the plausible range for text as candidate characters and add them to character set $T$;
S3. Sort all candidate characters by height in descending order and group them into $K$ classes using a density-based clustering algorithm;
S4. Process all candidate characters in descending order by the following steps:
S4.1. For the current character $T_i$, take its topmost position $U_p$ and bottommost position $D_{own}$ as the top and bottom of the current temporary row. In the same class $K_j$, find all characters whose centroids are close to that of $T_i$ and whose topmost and bottommost positions both lie between $U_p$ and $D_{own}$; these form character set $N$.
Here, $N_p$ is any character in character set $N$, $K_j$ is the $j$-th character class, $U_{Np}$ and $D_{Np}$ are the topmost and bottommost positions of character $N_p$, $M_{Np}$ and $M_{Ti}$ are the centroids of characters $N_p$ and $T_i$, and $Th_0$ and $Th_1$ are the tolerance thresholds for the top and bottom positions and for the centroid, respectively. Compute the minimum character width $W_{min}$ in character set $N$;
S4.2. Among the characters of the other classes, find all characters lying between $U_{Np}$ and $D_{Np}$; these form character set $M$.
Here, $M_q$ is any character in character set $M$, and $U_{Mq}$ and $D_{Mq}$ are the topmost and bottommost positions of character $M_q$;
S4.3. Accumulate all pixels of the image between $U_p$ and $D_{own}$ along the vertical direction to form a horizontal projection profile;
S4.4. After excluding the zero-valued data at the left and right ends of the horizontal projection, find the maximum value $S_{max}$ and the minimum value $S_{min}$ of the projection over the middle part that contains characters;
S4.5. Shrink the horizontal extent ($L_{eft}$, $R_{ight}$) of every character in character set $N$ by one pixel on each side to ($L_{eft}+1$, $R_{ight}-1$), set all values of the horizontal projection within ($L_{eft}+1$, $R_{ight}-1$) to $S_{max}$, and set the projection at the positions of all characters in character set $M$ to $S_{min}$;
S4.6. In the horizontal projection, excluding the zero-valued data at the left and right ends, find all minima at positions that contain characters;
S4.7. Mark the region around each minimum as a valley region and find the left and right boundaries of each valley region; the regions between valley regions are peak regions. Estimate the likelihood that each valley region is an inter-character gap and that each peak region is a text unit. Peak regions whose likelihood exceeds the empirical threshold are stored in the text unit array and the corresponding valley regions in the inter-character gap array; peak regions whose likelihood falls below the empirical threshold are merged into the valley regions on their left and right. A peak region must be wider than $W_{min}$, and its mean height must exceed the mean height of the valley regions by $H_{tmin}$;
S4.8. Divide the width of each text unit by the average character width from step S4.1. Text units whose width exceeds a preset multiple of the average character width, and units whose inter-character gap exceeds the maximum character width, are taken directly as detected characters; all other units are treated as uncertain units. Each run of consecutive detected characters with no uncertain unit in between is treated as a character region, and the average character width $W_c$ and the average inter-character gap $W_b$ of each character region are computed;
S4.9. Treat each run of consecutive uncertain units as an uncertain region. The $L$ uncertain units $U_i$, together with the detected character immediately before the uncertain region and the detected character immediately after it, give $L+2$ units in total, which form the uncertain unit set $U$. Construct an $(L+2) \times (L+2)$ matrix from these $L+2$ units. A point ($U_h$, $U_e$) in the matrix ($U_h \le U_e$, $e-h \le 4$) represents forming one character from the range that starts at the left edge of unit $U_h$ and ends at the right edge of unit $U_e$; the value $P_{he}$ of the point is the character-formation cost of that range:
$$P_{he} = \lambda_1 \frac{W_{he}-W_c}{W_{he}+W_c} + \lambda_2 \frac{W_{hb}-W_b}{W_{hb}+W_b} + \lambda_3 \frac{W_{eb}-W_b}{W_{he}+W_b}$$
Here, $\lambda_1$ to $\lambda_3$ are weighting coefficients; $W_{he}$ is the width spanned by units $U_h$ and $U_e$, that is, the distance from the left edge of unit $U_h$ to the right edge of unit $U_e$; $W_{hb}$ is the width of the gap to the left of unit $U_h$; and $W_{eb}$ is the width of the gap to the right of unit $U_e$. The formula expresses how similar the resulting character is to the characters on its left and right.
The character-formation costs in the matrix are normalized (divided by the maximum character-formation cost in the matrix). For example, the point marked Δ in Fig. 2 represents the possibility of merging $U_4$ and $U_5$ into one character. If $U_h = U_e$, unit $U_h$ is not merged with the unit to its right and forms a character on its own.
Since rows correspond to the starting unit of a character and columns to its ending unit, only the upper-right triangular part of the matrix is populated. Moreover, since a character is split into at most four units in the horizontal direction, $e-h \le 4$, so the upper-right triangular matrix is further restricted to a band of width 4 along the diagonal.
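The banded matrix described above can be sketched in plain Python as follows. The function names, the (left, right) unit representation, and the handling of the outermost gaps (taken equal to $W_b$ when a unit has no neighbour on that side) are assumptions of this sketch. The cost follows the formula as printed, including $W_{he}+W_b$ as the third denominator, and `band=4` is read as "4 points per row", matching the 4 seeds used by the dynamic programming.

```python
def formation_cost(w_he, w_hb, w_eb, w_c, w_b, lam=(1.0, 1.0, 1.0)):
    """P_he as printed in the text; lam holds the weights lambda_1..lambda_3."""
    return (lam[0] * (w_he - w_c) / (w_he + w_c)
            + lam[1] * (w_hb - w_b) / (w_hb + w_b)
            + lam[2] * (w_eb - w_b) / (w_he + w_b))


def build_cost_matrix(units, w_c, w_b, band=4):
    """Return an n x n matrix holding P_he inside the band and None elsewhere.

    `units` is a left-to-right list of (left, right) pixel positions: the
    detected character before the uncertain region, the uncertain units, and
    the detected character after it.
    """
    n = len(units)
    P = [[None] * n for _ in range(n)]
    for h in range(n):                            # row: starting unit
        for e in range(h, min(h + band, n)):      # column: ending unit
            w_he = units[e][1] - units[h][0]
            # Assumption: a missing outer gap is treated as the average gap.
            w_hb = units[h][0] - units[h - 1][1] if h > 0 else w_b
            w_eb = units[e + 1][0] - units[e][1] if e + 1 < n else w_b
            P[h][e] = formation_cost(w_he, w_hb, w_eb, w_c, w_b)
    # Normalise by the largest cost; skipped when no cost is positive.
    peak = max(v for row in P for v in row if v is not None)
    if peak > 0:
        P = [[None if v is None else v / peak for v in row] for row in P]
    return P


units = [(0, 4), (6, 10), (14, 18), (20, 24)]
P = build_cost_matrix(units, w_c=10, w_b=4)
narrow = build_cost_matrix(units, w_c=10, w_b=4, band=2)
```

Because the printed formula is signed, ranges narrower than $W_c$ receive negative costs in this sketch; whether absolute values were intended cannot be determined from the text.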
Dynamic programming is performed within the band of width 4 of the upper-right triangular matrix to find the optimal scheme. The optimal scheme has the minimum average character-formation cost together with the minimum variance of character widths and of inter-character gap widths, as in the following formula:
$$Cost = \lambda_4\,\mathrm{mean}(P_{he}) + \lambda_5\,\delta_{Wt} + \lambda_6\,\delta_{Wb}$$
Here, $\lambda_4$ to $\lambda_6$ are weighting coefficients, $\mathrm{mean}(P_{he})$ is the average character-formation cost of all points in the scheme, $\delta_{Wt}$ is the variance of all character widths in the scheme, and $\delta_{Wb}$ is the variance of all inter-character gap widths in the scheme. The dynamic programming proceeds as follows:
(1) Seed generation: take the 4 points of the first row as 4 seeds, that is, as 4 schemes (as in Fig. 3(a));
(2) Scheme growth: each scheme grows downward. For example, when growing downward from point ($U_h$, $U_e$), the 4 points of row $U_{e+1}$ are candidates for addition to the scheme. With $n$ schemes, each having 4 possible choices when growing downward, one growth step turns $n$ schemes into $4n$ schemes;
(3) Scheme pruning: compute the cost of the $4n$ schemes and keep the $m$ schemes with the lowest cost as seed schemes. Pruning starts only once the number of points in a scheme exceeds 3, which improves the accuracy of the algorithm;
(4) Repeat steps (2) and (3) until every scheme reaches the last unit in the uncertain unit set $U$;
(5) Select the scheme with the lowest cost as the optimal scheme and merge units into characters according to the strategy it gives;
(6) For the characters found, set the corresponding character pixels in the image to background, add the characters found to character set $N$, and then add the updated character set $N$ to character class $K_j$;
S5. Check whether any other text remains in the image; if remaining characters $L$ exist, process them by the following steps:
S5.1. Take a character $T_l$ from character set $L$ and judge by character height whether $T_l$ belongs to an existing character class. If it does, place $T_l$ in the corresponding class; if it belongs to no existing class, increase the number of classes by 1 and place $T_l$ in the new class;
S5.2. Apply step S5.1 iteratively to all characters in character set $L$ until all have been processed.
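Steps S5.1 and S5.2 can be sketched as a simple height-based assignment. The text only says that class membership is judged by character height, so the relative tolerance `tol`, the use of one representative height per class, and the name `assign_by_height` are all assumptions made for illustration.

```python
def assign_by_height(remaining_heights, class_heights, tol=0.15):
    """Assign each remaining character to an existing height class or open a new one.

    `class_heights` holds one representative height per existing class.
    Returns (labels, classes): the class index chosen for each character and
    the possibly extended list of representative heights.
    """
    classes = list(class_heights)  # copy so the caller's list is left untouched
    labels = []
    for height in remaining_heights:
        best = None
        for k, ref in enumerate(classes):
            close_enough = abs(height - ref) <= tol * ref
            if close_enough and (best is None
                                 or abs(height - ref) < abs(height - classes[best])):
                best = k
        if best is None:
            # S5.1: no existing class fits, so the class count grows by one.
            classes.append(height)
            best = len(classes) - 1
        labels.append(best)
    return labels, classes


labels, classes = assign_by_height([31, 12, 20], [30, 20])
```

Here the character of height 12 matches neither existing class within the tolerance, so a third class is opened for it, while the other two join the classes with heights 30 and 20.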
Specific embodiments of the invention have been described above. It should be understood that the invention is not limited to the particular embodiments described; those skilled in the art may make various changes or modifications within the scope of the claims without affecting the substance of the invention. Where no conflict arises, the embodiments of this application and the features in them may be combined with one another arbitrarily.

Claims (3)

1. A multi-stage text segmentation method based on binary thresholding and projection, characterized by comprising:
S1. Compute the binarization threshold of the image using Otsu's method and convert the image into a binary image, in which white is the text in the foreground and black is the background;
S2. Take foreground regions whose size, aspect ratio, and fill ratio fall within the plausible range for text as candidate characters and add them to character set $T$;
S3. Sort all candidate characters by height in descending order and group them into $K$ classes using a density-based clustering algorithm;
S4. Process all candidate characters in descending order by the following steps:
S4.1. For the current character $T_i$, take its topmost position $U_p$ and bottommost position $D_{own}$ as the top and bottom of the current temporary row. In the same class $K_j$, find all characters whose centroids are close to that of $T_i$ and whose topmost and bottommost positions both lie between $U_p$ and $D_{own}$, and add them to character set $N$.
Here, $N_p$ is any character in character set $N$, $K_j$ is the $j$-th character class, $U_{Np}$ and $D_{Np}$ are the topmost and bottommost positions of character $N_p$, $M_{Np}$ and $M_{Ti}$ are the centroids of characters $N_p$ and $T_i$, and $Th_0$ and $Th_1$ are the tolerance thresholds for the top and bottom positions and for the centroid, respectively. Compute the minimum character width $W_{min}$ in character set $N$;
S4.2. Among the characters of the other classes, find all characters lying between $U_{Np}$ and $D_{Np}$ and add them to character set $M$.
Here, $M_q$ is any character in character set $M$, and $U_{Mq}$ and $D_{Mq}$ are the topmost and bottommost positions of character $M_q$;
S4.3. Accumulate all pixels of the image between $U_p$ and $D_{own}$ along the vertical direction to form a horizontal projection profile;
S4.4. After excluding the zero-valued data at the left and right ends of the horizontal projection, find the maximum value $S_{max}$ and the minimum value $S_{min}$ of the projection over the middle part that contains characters;
S4.5. Shrink the horizontal extent ($L_{eft}$, $R_{ight}$) of every character in character set $N$ by one pixel on each side to ($L_{eft}+1$, $R_{ight}-1$), set all values of the horizontal projection within ($L_{eft}+1$, $R_{ight}-1$) to $S_{max}$, and set the projection at the positions of all characters in character set $M$ to $S_{min}$;
S4.6. In the horizontal projection, excluding the zero-valued data at the left and right ends, find all minima at positions that contain characters;
S4.7. Mark the region around each minimum as a valley region and find the left and right boundaries of each valley region; the regions between valley regions are peak regions. Estimate the likelihood that each valley region is an inter-character gap and that each peak region is a text unit. Peak regions whose likelihood exceeds the empirical threshold are stored in the text unit array and the corresponding valley regions in the inter-character gap array; peak regions whose likelihood falls below the empirical threshold are merged into the valley regions on their left and right;
S4.8. Divide the width of each text unit by the average character width from step S4.1. Text units whose width exceeds a preset multiple of the average character width, and units whose inter-character gap exceeds the maximum character width, are taken directly as detected characters; all other units are treated as uncertain units. Each run of consecutive detected characters with no uncertain unit in between is treated as a character region, and the average character width $W_c$ and the average inter-character gap $W_b$ of each character region are computed;
S4.9. Treat each run of consecutive uncertain units as an uncertain region. The $L$ uncertain units $U_i$, together with the detected character immediately before the uncertain region and the detected character immediately after it, give $L+2$ units in total, which form the uncertain unit set $U$. Construct an $(L+2) \times (L+2)$ matrix from these $L+2$ units. A point ($U_h$, $U_e$) in the matrix ($U_h \le U_e$, $e-h \le 4$) represents forming one character from the range that starts at the left edge of unit $U_h$ and ends at the right edge of unit $U_e$; the value $P_{he}$ of the point is the character-formation cost of that range:
$$P_{he} = \lambda_1 \frac{W_{he}-W_c}{W_{he}+W_c} + \lambda_2 \frac{W_{hb}-W_b}{W_{hb}+W_b} + \lambda_3 \frac{W_{eb}-W_b}{W_{he}+W_b}$$
Here, $\lambda_1$ to $\lambda_3$ are weighting coefficients; $W_{he}$ is the width spanned by units $U_h$ and $U_e$, that is, the distance from the left edge of unit $U_h$ to the right edge of unit $U_e$; $W_{hb}$ is the width of the gap to the left of unit $U_h$; and $W_{eb}$ is the width of the gap to the right of unit $U_e$;
Normalize the character-formation costs in the matrix by dividing them by the maximum character-formation cost in the matrix, then perform dynamic programming within the band of width 4 of the upper-right triangular matrix to find the optimal scheme. The optimal scheme has the minimum average character-formation cost together with the minimum variance of character widths and of inter-character gap widths, as in the following formula:
$$Cost = \lambda_4\,\mathrm{mean}(P_{he}) + \lambda_5\,\delta_{Wt} + \lambda_6\,\delta_{Wb}$$
Here, $\lambda_4$ to $\lambda_6$ are weighting coefficients, $\mathrm{mean}(P_{he})$ is the average character-formation cost of all points in the scheme, $\delta_{Wt}$ is the variance of all character widths in the scheme, and $\delta_{Wb}$ is the variance of all inter-character gap widths in the scheme;
S5. Check whether any other text remains in the image; if remaining characters $L$ exist, process them by the following steps:
S5.1. Take a character $T_l$ from character set $L$ and judge by character height whether $T_l$ belongs to an existing character class. If it does, place $T_l$ in the corresponding class; if it belongs to no existing class, increase the number of classes by 1 and place $T_l$ in the new class;
S5.2. Apply step S5.1 iteratively to all characters in character set $L$ until all have been processed.
2. The multi-stage text segmentation method based on binary thresholding and projection according to claim 1, characterized in that: in step S4.7, a peak region must be wider than $W_{min}$, and its mean height must exceed the mean height of the valley regions by $H_{tmin}$.
3. The multi-stage text segmentation method based on binary thresholding and projection according to claim 1, characterized in that the dynamic programming proceeds as follows:
(1) Seed generation: take the 4 points of the first row as 4 seeds, that is, as 4 schemes;
(2) Scheme growth: each scheme grows downward. When growing downward from point ($U_h$, $U_e$), the 4 points of row $U_{e+1}$ are candidates for addition to the scheme. With $n$ schemes, each having 4 possible choices when growing downward, one growth step turns $n$ schemes into $4n$ schemes;
(3) Scheme pruning: compute the cost of the $4n$ schemes and keep the $m$ schemes with the lowest cost as seed schemes. Pruning starts only once the number of points in a scheme exceeds 3, which improves the accuracy of the algorithm;
(4) Repeat steps (2) and (3) until every scheme reaches the last unit in the uncertain unit set $U$;
(5) Select the scheme with the lowest cost as the optimal scheme and merge units into characters according to the strategy it gives;
(6) For the characters found, set the corresponding character pixels in the image to background, add the characters found to character set $N$, and then add the updated character set $N$ to character class $K_j$.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910763993.0A CN110532537A (en) 2019-08-19 2019-08-19 A multi-stage text segmentation method based on binary thresholding and projection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910763993.0A CN110532537A (en) 2019-08-19 2019-08-19 A multi-stage text segmentation method based on binary thresholding and projection

Publications (1)

Publication Number Publication Date
CN110532537A (en) 2019-12-03

Family

ID=68663815

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910763993.0A Pending CN110532537A (en) 2019-08-19 2019-08-19 A multi-stage text segmentation method based on binary thresholding and projection

Country Status (1)

Country Link
CN (1) CN110532537A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103020618A (en) * 2011-12-19 2013-04-03 北京捷成世纪科技股份有限公司 Detection method and detection system for video image text
US9965695B1 (en) * 2016-12-30 2018-05-08 Konica Minolta Laboratory U.S.A., Inc. Document image binarization method based on content type separation
RU2656708C1 (en) * 2017-06-29 2018-06-06 Самсунг Электроникс Ко., Лтд. Method for separating texts and illustrations in images of documents using a descriptor of document spectrum and two-level clustering

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
姚正斌, 丁晓青, 刘长松: "基于笔划合并和动态规划的联机汉字切分算法", 清华大学学报(自然科学版), no. 10, 30 October 2004 (2004-10-30), pages 1417 - 1421 *
杨晓娟;宋凯;: "基于投影法的文档图像分割算法", 成都大学学报(自然科学版), no. 02, 30 June 2009 (2009-06-30), pages 139 - 141 *
杨玲玲;叶东毅;: "一种基于图像矩和纹理特征的自然场景文本检测算法", 小型微型计算机系统, no. 06, 15 June 2016 (2016-06-15), pages 1313 - 1317 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110991440A (en) * 2019-12-11 2020-04-10 易诚高科(大连)科技有限公司 Pixel-driven mobile phone operation interface text detection method
CN110991440B (en) * 2019-12-11 2023-10-13 易诚高科(大连)科技有限公司 Pixel-driven mobile phone operation interface text detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination