CN108647681A - An English text detection method with text orientation correction - Google Patents

An English text detection method with text orientation correction

Info

Publication number
CN108647681A
Authority
CN
China
Prior art keywords
text
region
preliminary
text region
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810429149.XA
Other languages
Chinese (zh)
Other versions
CN108647681B (en)
Inventor
代劲
王族
尹航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN201810429149.XA priority Critical patent/CN108647681B/en
Publication of CN108647681A publication Critical patent/CN108647681A/en
Application granted granted Critical
Publication of CN108647681B publication Critical patent/CN108647681B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/60 - Type of objects
    • G06V20/62 - Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/63 - Scene text, e.g. street names
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition

Abstract

The invention belongs to the technical field of image processing, and specifically relates to an English text detection method with text orientation correction. The method includes: performing maximally stable extremal region (MSER) detection on each channel of an English text image to obtain candidate text regions; building a classifier based on a convolutional neural network model to filter out false candidate text regions and obtain preliminary text regions; grouping the preliminary text regions with a two-layer text grouping algorithm; and performing orientation correction on the grouped preliminary text regions to obtain the corrected text. The invention uses an enhanced multi-channel MSER model to obtain finer text regions, and introduces a parallel SPP-CNN classifier to better distinguish text regions from non-text regions; the classifier can handle images of arbitrary size and extract pooled features at multiple scales, so that more features can be learned from the multi-layer spatial information of the source image. The invention can handle slightly slanted scene text.

Description

An English text detection method with text orientation correction
Technical field
The invention belongs to the technical field of image processing, and specifically relates to an English text detection method with text orientation correction.
Background technology
Text in natural scene images carries accurate and rich information, which is of great significance for image analysis, image-based translation, image search and so on. Over the past 20 years, researchers have proposed a number of methods for detecting text in natural scene images. There are many content-based multimedia understanding applications, such as automatic visual classification, image retrieval, assisted navigation, multilingual translation, object recognition and consumer-oriented applications.
Scene text detection faces several critical issues: (1) text in document images has regular fonts, similar colors, uniform size and even distribution, whereas text in natural scenes, even within the same scene, may have different fonts, colors, scales and orientations; (2) the background of a natural scene image may be extremely complex, and signs, fences, bricks and grass are hard to distinguish from real text, which easily causes confusion and errors; (3) there are other disturbing factors in scene text images, such as non-uniform illumination, blur and translucency.
Researchers have proposed many methods to detect text in natural scene images; there are two main categories.
Texture-based methods treat text as a specific type of texture and use texture properties, such as local intensity, filter responses and wavelet coefficients, to distinguish the text regions and non-text regions of an image. These methods are usually computationally expensive, because all positions and scales have to be scanned. In addition, they mainly handle horizontal text and are very sensitive to rotation and scaling.
Component-based methods treat text as connected components: text is first extracted by various means (such as color clustering or extremal region extraction), and non-text components are then filtered out with manually designed rules or automatically trained classifiers. Component-based methods are generally more efficient, because the number of components to be processed is relatively small, and they are insensitive to rotation, scaling and font. A conventional method for detecting candidate text regions (Candidate Text Region, CTR) is the maximally stable extremal region (Maximally Stable Extremal Regions, MSER) algorithm, which is highly robust to affine variation of the image and can efficiently extract the text regions in an image; a later improvement of the MSER extraction algorithm reduced its time complexity to linear time.
These methods distinguish text regions from non-text regions according to rules or features. Although they can detect text, they lack correction for English text and their performance on slanted text is poor; the detected text can be seriously split apart because of the slant of the words.
Summary of the invention
In view of this, the present invention proposes an English text detection method with text orientation correction, which can effectively detect text and correct the detected slanted text. It specifically includes the following steps:
S1. Perform maximally stable extremal region detection on each channel of the sharpened English text image, and extract MSERs from the image as text candidates to obtain candidate text regions;
S2. Build a classifier based on a convolutional neural network model and extract the features of the candidate text regions; use a softmax function to divide the candidate text regions into a text class and a non-text class according to these features; filter out the non-text regions to obtain preliminary text regions, i.e., the detected English text;
S3. Group the preliminary text regions using a two-layer text grouping algorithm;
S4. Perform orientation correction on the grouped preliminary text regions to realize the correction of the English text.
Further, the channels include: the red channel, green channel, blue channel, hue channel, saturation channel, value (lightness) channel and gray channel.
Further, building the classifier based on a convolutional neural network model and extracting the features of the candidate text regions includes: obtaining a first feature of a candidate text region through the five-layer architecture of the classifier and a second feature of the candidate text region through a cross-layer, where the five-layer architecture comprises a first convolutional layer, a max-pooling layer, a second convolutional layer, a pyramid pooling layer and a fully connected layer connected in sequence, and the cross-layer connects the first convolutional layer directly to the fully connected layer.
Further, the candidate text regions are filtered a first time with the first convolution kernel in the first layer; the filtered candidate text regions are max-pooled in the second layer; the max-pooled candidate text regions are filtered a second time with the second convolution kernel in the third layer; pyramid pooling is applied to the twice-filtered candidate text regions in the fourth layer; and the pyramid-pooled candidate text regions are fully connected in the fifth layer, yielding the first feature of the candidate text regions.
Further, using the manually added features, the candidate text regions are filtered a first time with the first convolution kernel; the filtered candidate text regions are fully connected according to the manually added features, yielding the second feature of the candidate text regions.
Further, the manually added features include: the aspect ratio, compactness, stroke-width-to-area ratio, local contrast and boundary key points.
Further, the local contrast lc is calculated from the red, green and blue channel values of the pixels in the MSER region,
where lc denotes the local contrast; R_i, G_i and B_i denote the i-th pixel of the red, green and blue channels respectively; n denotes the total number of pixels in the MSER region; and k denotes the number of boundary key points.
Further, the boundary key points are obtained as follows:
build a binary image; iterate over all pixels of the binary image; compute the contour points; and compress the contour points with the Douglas-Peucker algorithm to obtain the boundary key points, which specifically includes:
the gray value of pixels belonging to the maximally stable extremal region is set to 255; the gray value of pixels outside the maximally stable extremal region but inside its minimum bounding rectangle is set to 0; if the pixel value p(x, y) = 255 and one of p(x+1, y), p(x-1, y), p(x, y+1), p(x, y-1) has value 0, then pixel (x, y) is a contour point; the contour points are compressed with the Douglas-Peucker algorithm, and the contour points remaining after compression are the boundary key points.
Further, grouping the preliminary text regions using the two-layer text grouping algorithm includes: performing vertical grouping and horizontal grouping on the preliminary text regions respectively.
The vertical grouping specifically includes the following:
obtain the minimum Y-axis coordinate b_n of the pixels with value 255 in the n-th preliminary text region; obtain the maximum Y-axis coordinate t_{n+1} of the pixels with value 255 in the (n+1)-th preliminary text region; obtain the height h_{n+1} of the (n+1)-th preliminary text region;
compute the height difference d_{n,n+1}; if the height difference d_{n,n+1} is greater than a height threshold, the two preliminary text regions are divided into the same class, i.e., they belong to the same text line; if the height difference d_{n,n+1} is less than or equal to the height threshold, the two preliminary text regions are not the same class, the (n+1)-th preliminary text region is regarded as a new class, and a new text line is split off in the Y-axis direction.
The horizontal grouping specifically includes the following steps:
obtain the distance difference Δd along the X-axis between two adjacent preliminary text regions in the same text line; the distance difference Δd includes the distance d_1 between letters in the same word and the distance d_2 between words;
according to a coefficient that denotes the average width of all letters in the text line, separate words with a width threshold;
obtain the spacing ratio d_h; if the ratio d_h is less than the width threshold, the two adjacent preliminary text regions belong to the same class, i.e., the same word; if the ratio d_h is greater than or equal to the width threshold, the two adjacent preliminary text regions do not belong to the same class, i.e., the two regions do not belong to the same word, and the latter preliminary text region is taken as the beginning of a new word.
Further, performing orientation correction on the grouped preliminary text regions includes:
S401. rotate the grouped preliminary text regions clockwise by α degrees using the coordinate rotation formula; set the initial values i = 1 and α = -30°;
S402. filter the wrongly introduced group boxes through a model matching process, and obtain the i-th candidate corrected text region;
S403. when i < 6, set i = i + 1 and α = α + 10°, and return to step S401; when i = 6, superimpose the 1st to 6th candidate corrected texts to obtain the final corrected text.
Further, the coordinate rotation formula is:
x' = x cos θ + y sin θ
y' = y cos θ - x sin θ
where x denotes the abscissa of a pixel; y denotes the ordinate of the pixel; θ denotes the rotation angle threshold; x' denotes the abscissa of the pixel after rotation; and y' denotes the ordinate of the pixel after rotation.
The group boxes include slanted group boxes and long-interval group boxes; a slanted group box contains only one letter; the letters of a long-interval group box are located at the two ends.
Beneficial effects of the invention: an enhanced multi-channel MSER model is used, detecting MSERs in the R, G, B, H, S, V and gray channels to obtain finer candidate text regions. A parallel SPP-CNN (spatial pyramid pooling convolutional neural network) classifier is introduced to better distinguish text regions from non-text regions; this classifier can handle images of arbitrary size and extract pooled features at multiple scales, so that more features can be learned from the multi-layer spatial information of the source image (the English text image). The wrongly introduced group boxes are filtered through a model matching process. The invention can handle slanted scene text and realizes the correction of English text.
Description of the drawings
Fig. 1 is the flow chart of the present invention;
Fig. 2 is the architecture diagram of the cross-layer SPP-CNN algorithm used by the present invention;
Fig. 3 is a schematic diagram of how SPP works in the prior art;
Fig. 4 is the architecture diagram of the text grouping of the present invention;
Fig. 5 is a schematic diagram of the text line constraints of the present invention;
Fig. 6 shows the preliminary text regions obtained by the present invention without orientation correction;
Fig. 7 shows the final corrected text regions of the present invention after orientation correction;
Fig. 8 is the direction rotation model of the present invention;
Fig. 9 is the matching model of the group boxes of the present invention;
Fig. 10 shows the detection results of the present invention for different rotations;
Fig. 11 shows example cases of the detection results of the present invention.
Detailed description of the embodiments
To make the purpose, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them.
The present invention provides an English text detection method with text orientation correction, as shown in Fig. 1, which includes the following steps:
S1. Perform maximally stable extremal region detection on each channel of the sharpened English text image, and extract MSERs from the image as text candidates to obtain candidate text regions;
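As an illustration of step S1, the following is a minimal Python/OpenCV sketch of multi-channel MSER detection; the sharpening kernel and the default MSER parameters are assumptions rather than the exact settings of the invention.

import cv2
import numpy as np

def detect_candidate_regions(bgr_image):
    """Run MSER on the R, G, B, H, S, V and gray channels of a sharpened image
    and return candidate text regions as bounding boxes (x, y, w, h).
    The sharpening kernel and MSER settings below are illustrative assumptions."""
    # Simple Laplacian-style sharpening as an assumed preprocessing step.
    kernel = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=np.float32)
    sharp = cv2.filter2D(bgr_image, -1, kernel)

    b, g, r = cv2.split(sharp)
    h, s, v = cv2.split(cv2.cvtColor(sharp, cv2.COLOR_BGR2HSV))
    gray = cv2.cvtColor(sharp, cv2.COLOR_BGR2GRAY)

    mser = cv2.MSER_create()
    boxes = []
    for channel in (r, g, b, h, s, v, gray):
        regions, _ = mser.detectRegions(channel)
        for pts in regions:
            boxes.append(cv2.boundingRect(pts.reshape(-1, 1, 2)))
    return boxes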
S2. Build a classifier based on a convolutional neural network (Convolutional Neural Networks, CNN) model, extract the features of the candidate text regions, and use a softmax function to divide the candidate text regions into a text class and a non-text class according to these features; filter out the non-text regions to obtain preliminary text regions, i.e., the detected English text;
S3. Group the preliminary text regions using a two-layer text grouping algorithm;
S4. Perform orientation correction on the grouped preliminary text regions to realize the correction of the English text and obtain the corrected text, which is the corrected English text.
Preferably, the five-layer architecture of the CNN model used by the present invention is shown in Fig. 2:
The first layer uses a first convolution kernel of size 7 × 7 × 5, i.e., a kernel with length 7, width 7 and depth 5;
The second layer uses 5 × 5 × 5 max pooling, i.e., max pooling with length 5, width 5 and depth 5;
The third layer uses a second convolution kernel of size 5 × 3 × 5, i.e., a kernel with length 5, width 3 and depth 5;
The fourth layer uses SPP pooling. Fig. 3 is a schematic diagram of how SPP works: the same image is pooled over a 3 × 3 grid (that is, a pooling grid with length 3 and width 3), which divides it into 9 blocks, a 2 × 2 grid, which divides it into 4 blocks, and a 1 × 1 grid, which divides it into 1 block; the maximum value of each block is computed to obtain the output neurons, so that an image of arbitrary size is converted into a 14-dimensional feature of fixed size. It can be understood that the present invention may design pyramids of different sizes, increase the number of pyramid levels or change the size of the grid division.
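The fixed-length output described above can be sketched as follows in NumPy for a single feature map; this illustrative version max-pools over 3 × 3, 2 × 2 and 1 × 1 grids to produce 9 + 4 + 1 = 14 values regardless of the input size, and is not the classifier's actual implementation.

import numpy as np

def spatial_pyramid_pool(feature_map, levels=(3, 2, 1)):
    """Max-pool a 2-D feature map over 3x3, 2x2 and 1x1 grids and return a
    fixed-length vector (14 values for levels (3, 2, 1)) for any input size."""
    h, w = feature_map.shape
    pooled = []
    for n in levels:
        # Bin boundaries that cover the whole map, even when h or w is not divisible by n.
        ys = np.linspace(0, h, n + 1, dtype=int)
        xs = np.linspace(0, w, n + 1, dtype=int)
        for i in range(n):
            for j in range(n):
                block = feature_map[ys[i]:max(ys[i + 1], ys[i] + 1),
                                    xs[j]:max(xs[j + 1], xs[j] + 1)]
                pooled.append(block.max())
    return np.array(pooled)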
The fifth layer uses a fully connected layer. This specifically includes:
The candidate text regions are filtered a first time with the first convolution kernel in the first layer; the filtered candidate text regions are max-pooled in the second layer; the max-pooled candidate text regions are filtered a second time with the second convolution kernel in the third layer; pyramid pooling is applied to the twice-filtered candidate text regions in the fourth layer; and the pyramid-pooled candidate text regions are fully connected in the fifth layer, extracting the first feature of the candidate text regions;
Using the manually added features, the candidate text regions are filtered a first time with the first convolution kernel; the filtered candidate text regions are fully connected according to the manually added features, extracting the second feature of the candidate text regions.
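A minimal PyTorch sketch in the spirit of the parallel SPP-CNN with a cross-layer for the manually added features is given below; the channel counts, kernel sizes and the point at which the handcrafted features are concatenated are assumptions, not the exact network of Fig. 2.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SPPTextClassifier(nn.Module):
    """Illustrative SPP-CNN text/non-text classifier with five handcrafted
    features concatenated before the fully connected layer (assumed layout)."""
    def __init__(self, in_channels=3, num_handcrafted=5):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, 16, kernel_size=7, padding=3)
        self.pool = nn.MaxPool2d(kernel_size=5, stride=2, padding=2)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=(5, 3), padding=(2, 1))
        # SPP over 3x3, 2x2 and 1x1 grids gives 14 values per channel.
        self.fc = nn.Linear(32 * 14 + num_handcrafted, 2)  # text / non-text

    def spp(self, x):
        pooled = [F.adaptive_max_pool2d(x, n).flatten(1) for n in (3, 2, 1)]
        return torch.cat(pooled, dim=1)

    def forward(self, image, handcrafted):
        x = F.relu(self.conv1(image))
        x = self.pool(x)
        x = F.relu(self.conv2(x))
        x = self.spp(x)                          # fixed length for any input size
        x = torch.cat([x, handcrafted], dim=1)   # cross-layer: manually added features
        return F.softmax(self.fc(x), dim=1)

Here handcrafted would carry the aspect ratio, compactness, stroke-width-to-area ratio, local contrast and boundary key point count of each candidate region.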
Preferably, the manually designed features are embedded into the whole CNN through the cross-layer. The cross-layer only works between the first layer and the fifth layer; the features used in the cross-layer, i.e., the manually added features, include:
the aspect ratio, the compactness, the stroke-width-to-area ratio, the local contrast lc and the boundary key points k,
where w and h respectively denote the width and height (in pixels) of the minimum bounding rectangle of the maximally stable extremal region; a denotes the area of the minimum bounding rectangle of the maximally stable extremal region (the number of pixels in the region); and p denotes the number of boundary points of the minimum bounding rectangle of the maximally stable extremal region, which is represented in the present invention by the number of boundary key points k.
The local contrast lc is obtained from the red, green and blue values of the pixels in the MSER region,
where lc denotes the local contrast; R_i, G_i and B_i denote the i-th pixel of the red, green and blue channels respectively; n denotes the total number of pixels in the MSER region; and k denotes the number of boundary key points.
By connecting the boundary key points in order, the original region can be approximately restored, i.e., the preliminary text regions are obtained.
The calculation process of k is as follows:
build a binary image; iterate over all pixels of the binary image; compute the contour points; and compress the contour points with the Douglas-Peucker algorithm, the contour points remaining after compression being the boundary key points. This specifically includes:
the gray value of pixels belonging to the maximally stable extremal region is set to 255; the gray value of pixels outside the maximally stable extremal region but inside its minimum bounding rectangle is set to 0; if the pixel value p(x, y) = 255 and one of p(x+1, y), p(x-1, y), p(x, y+1), p(x, y-1) has value 0, then pixel (x, y) is a contour point; the contour points are compressed with the Douglas-Peucker algorithm, and the contour points remaining after compression are the boundary key points.
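The contour detection and Douglas-Peucker compression described above can be sketched as follows; cv2.approxPolyDP implements the Douglas-Peucker algorithm, cv2.findContours stands in for the explicit 4-neighbour scan, and the epsilon tolerance is an assumed value since none is stated in the text.

import cv2
import numpy as np

def boundary_key_points(mser_mask, epsilon=2.0):
    """Boundary key points of an MSER region.

    mser_mask: uint8 image in which pixels of the MSER are 255 and pixels of
    its minimum bounding rectangle outside the MSER are 0.
    epsilon: Douglas-Peucker tolerance in pixels (assumed value)."""
    contours, _ = cv2.findContours(mser_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    if not contours:
        return np.empty((0, 2), dtype=np.int32)
    contour = max(contours, key=cv2.contourArea)           # main outline of the region
    key_points = cv2.approxPolyDP(contour, epsilon, True)  # Douglas-Peucker compression
    return key_points.reshape(-1, 2)                       # k = len(result)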
Optionally, the classification of the final features is obtained with the softmax classification function.
After the preliminary text regions are grouped with the two-layer text grouping algorithm, low-slant orientation correction is carried out, as shown in Fig. 4. The process is divided into three parts: vertical grouping, horizontal grouping and orientation correction.
The main steps of the vertical grouping are as follows:
obtain the minimum Y-axis coordinate b_n of the pixels with value 255 in the n-th preliminary text region; obtain the maximum Y-axis coordinate t_{n+1} of the pixels with value 255 in the (n+1)-th preliminary text region; obtain the height h_{n+1} of the (n+1)-th preliminary text region, as shown in Fig. 5;
compute the height difference d_{n,n+1}; if the height difference d_{n,n+1} is greater than the height threshold, the two preliminary text regions are divided into the same class, i.e., they belong to the same text line; if the height difference d_{n,n+1} is less than or equal to the height threshold, the two preliminary text regions are not the same class; the (n+1)-th preliminary text region is regarded as a new text line, and the new text line is split off in the Y-axis direction;
in the present invention, the height threshold is set to 0.62.
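A minimal sketch of the vertical grouping is shown below. The exact height-difference formula is not reproduced in this text, so the value of d below, built from b_n, t_{n+1} and h_{n+1}, is only an assumption consistent with the stated comparison logic (values above the 0.62 threshold mean the regions share a text line).

def vertical_group(regions, height_threshold=0.62):
    """Group preliminary text regions into text lines.

    Each region is a dict with b = minimum Y of its 255-pixels, t = maximum Y
    of its 255-pixels and h = its height. The definition of d is an assumption."""
    lines = []
    for n, region in enumerate(regions):
        if n == 0:
            lines.append([region])
            continue
        prev = regions[n - 1]
        d = (region['t'] - prev['b']) / float(region['h'])   # assumed d_{n,n+1}
        if d > height_threshold:
            lines[-1].append(region)    # same class: same text line
        else:
            lines.append([region])      # new text line split off along Y
    return lines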
The main steps of the horizontal grouping are as follows:
obtain the distance difference Δd along the X-axis between two adjacent preliminary text regions in the same text line; the distance difference Δd includes the distance d_1 between letters in the same word and the distance d_2 between words;
according to a coefficient that denotes the average width of all letters in the text line, separate words with a width threshold;
obtain the spacing ratio d_h; if the ratio d_h is less than the width threshold, the two adjacent preliminary text regions belong to the same class, i.e., the same word; if the ratio d_h is greater than or equal to the width threshold, the two adjacent preliminary text regions do not belong to the same class, i.e., they do not belong to the same word, and the latter preliminary text region is taken as the beginning of a new word.
The coefficient was obtained experimentally from the ICDAR 2013 training set, which contains 229 pictures and 1226 words.
The width threshold of the present invention is set to 2.33.
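A minimal sketch of the horizontal grouping is shown below; the spacing ratio d_h is assumed to be the gap between adjacent regions divided by the average letter width of the line, since the exact formulas are not reproduced in this text.

def horizontal_group(line_regions, width_threshold=2.33):
    """Split one text line into words by the gap between adjacent regions.

    Each region is a dict with x_min, x_max and width. The definition of the
    spacing ratio d_h is an assumption."""
    line_regions = sorted(line_regions, key=lambda r: r['x_min'])
    mean_width = sum(r['width'] for r in line_regions) / float(len(line_regions))
    words = [[line_regions[0]]]
    for prev, cur in zip(line_regions, line_regions[1:]):
        gap = cur['x_min'] - prev['x_max']     # distance difference between neighbours
        d_h = gap / mean_width                 # assumed spacing ratio
        if d_h < width_threshold:
            words[-1].append(cur)              # same word
        else:
            words.append([cur])                # beginning of a new word
    return words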
The steps of the low-slant orientation correction are as follows:
Fig. 6 shows the preliminary text regions; it can be seen that words are seriously split apart because of the slant, and "ne1Wor" is regarded as a word of the same line. Experiments show that words with a slight slant within 10 degrees can be grouped correctly, so a coordinate-axis rotation strategy is adopted, which yields the final corrected text shown in Fig. 7.
Due to the rotation of the coordinate axes, the group box "wordline1" is grouped correctly, but the wrongly introduced group box "wordline2" is not corrected correctly, so the algorithm is improved with a rotation convergence strategy:
performing orientation correction on the grouped text regions includes:
S401. rotate the grouped preliminary text regions clockwise by α degrees using the coordinate rotation formula; set the initial values i = 1 and α = -30°;
S402. filter the wrongly introduced group boxes through a model matching process, and obtain the i-th candidate corrected text region;
S403. when i < 6, set i = i + 1 and α = α + 10°, and return to step S401; when i = 6, superimpose the 1st to 6th candidate corrected texts to obtain the final corrected text.
Specifically:
the grouped preliminary text regions are rotated clockwise or counterclockwise by multiples of ten degrees using the coordinate rotation formula, and the initial value i = 1 is set;
in the present invention, rotations of 10, 20 and 30 degrees clockwise and counterclockwise are used respectively, as shown in Fig. 8;
the wrongly introduced group boxes are filtered through the model matching process, and the i-th candidate corrected text region is obtained;
the i-th candidate corrected text is rotated clockwise or counterclockwise by a multiple of ten degrees, the wrongly introduced group boxes are filtered through model matching, and the (i+1)-th candidate corrected text is obtained;
when i < 6, i = i + 1 and the process returns to step S401; when i = 6, the 1st to 6th candidate corrected texts are superimposed to obtain the final corrected text, as shown in Fig. 9.
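Steps S401 to S403 can be sketched as follows. The coordinate rotation formula is taken from the description; filter_group_boxes is a caller-supplied stand-in for the model matching process, which is not given in code form in this text, and the loop follows S401-S403 literally (initial α = -30°, six iterations in 10° steps).

import numpy as np

def rotate_points(points, theta_degrees):
    """Coordinate rotation formula of the description:
    x' = x*cos(theta) + y*sin(theta), y' = y*cos(theta) - x*sin(theta)."""
    theta = np.radians(theta_degrees)
    x, y = points[:, 0], points[:, 1]
    return np.stack([x * np.cos(theta) + y * np.sin(theta),
                     y * np.cos(theta) - x * np.sin(theta)], axis=1)

def direction_correction(grouped_regions, filter_group_boxes):
    """Rotate the grouped regions, filter wrongly introduced group boxes after
    each rotation and superimpose the six candidate corrected texts."""
    candidates = []
    alpha = -30
    for i in range(1, 7):                     # i = 1 .. 6
        rotated = [rotate_points(region, alpha) for region in grouped_regions]
        candidates.append(filter_group_boxes(rotated))
        alpha += 10
    final = []                                # superimpose (union) the candidates
    for candidate in candidates:
        final.extend(candidate)
    return final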
As another implementation, in step S403 i is not limited to 6 and may also be any one of 5, 7 or 8.
It can be understood that the model matching process matches the rotated preliminary text region images against the corresponding English text in the training set: if a rotated image overlaps a training set image, the overlapping part is kept; another rotation is then applied to the preliminary text regions, and if the rotated image overlaps another training set image, the overlapping part is kept again; finally, all the overlapping parts are superimposed to obtain the final corrected text.
Fig. 9(a) shows a model called the "slanted group box", which describes the case where each box contains only one letter. This wrongly introduced group box mainly occurs when the text slants in a single direction.
Fig. 9(b) shows a model called the "long-interval group box". This wrongly introduced group box indicates that the letters are located at the two ends of each box with a very large gap between them. This case occurs when texts arranged in different directions are too close to each other.
The rotation increment and the number of rotations are important factors in the detection results. To balance performance and time complexity, the maximum rotation angle is set to 30 degrees. As shown in Fig. 10, the rotation increment ranges from 1 to 15 degrees; the smaller the increment, the more rotations are needed to reach the maximum angle. The experimental results show that when the increment reaches 10 degrees, the three indicators precision, recall and f-measure reach their peak, so 10 degrees is the final rotation angle threshold of the proposed method.
In the present invention, in order to verify the correctness and validity of the proposed algorithm, comparative experiments were carried out on the ICDAR 2011 and ICDAR 2013 datasets. The ICDAR 2011 test set contains 255 images and the ICDAR 2013 test set contains 233 images. Each image corresponds to a txt file that records the coordinates of the text to be detected.
The evaluation of the detection results mainly computes the overlap between the detected corrected text regions and the actual text regions. For each rectangle to be evaluated, the maximum matching value is used:
m(r; R) = max{ m(r, r') | r' ∈ R }
where r denotes a corrected text region, r' denotes an actual text region, A(r) denotes the rectangular area of the corrected text region r, and R denotes the set of matched regions. The maximum area match is obtained, and then the precision, recall and f-measure are computed,
where E denotes the set of detected corrected text regions and T denotes the set of ground-truth rectangles to be evaluated. The f-measure is a combination of precision and recall; their relative weight is controlled by the parameter α, which is usually set to 0.5 so that precision and recall have the same weight.
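The evaluation can be sketched as follows with rectangles given as (x1, y1, x2, y2). The per-pair match m(r, r') is assumed to be the standard ICDAR area ratio 2·area(r ∩ r') / (area(r) + area(r')), since the pairwise formula is not reproduced in this text.

def rect_area(r):
    """Area of an axis-aligned rectangle (x1, y1, x2, y2)."""
    return max(0, r[2] - r[0]) * max(0, r[3] - r[1])

def rect_intersection(a, b):
    """Intersection area of two rectangles."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    return max(0, x2 - x1) * max(0, y2 - y1)

def match_score(r, others):
    """m(r; R) = max{ m(r, r') | r' in R } with the assumed pairwise match."""
    best = 0.0
    for r2 in others:
        denom = rect_area(r) + rect_area(r2)
        if denom > 0:
            best = max(best, 2.0 * rect_intersection(r, r2) / denom)
    return best

def precision_recall_f(E, T, alpha=0.5):
    """Precision, recall and f-measure of detected regions E against ground truth T."""
    precision = sum(match_score(r, T) for r in E) / max(len(E), 1)
    recall = sum(match_score(r, E) for r in T) / max(len(T), 1)
    if precision == 0 or recall == 0:
        return precision, recall, 0.0
    return precision, recall, 1.0 / (alpha / precision + (1.0 - alpha) / recall)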
In the present invention, several comparative experiments demonstrate that the proposed method can extract more text regions.
Table 1. Extraction results of different MSER methods
According to Table 1 (only the letter-level performance is considered, not the final word-level result), after the Laplacian and multi-channel preprocessing, more text regions can be extracted (the recall increases), but more non-text regions are also extracted (the precision decreases).
To illustrate the validity of the method used in the present invention, it is quantitatively compared with existing text detection methods. The training set is generated from ICDAR 2011 and ICDAR 2013 with the multi-channel MSER and contains 44,908 English text images and 56,139 non-English-text images; 25% of the training set is used as the validation set, and through the training process the accuracy reaches 96%. The SPP-CNN is trained with cross validation and stochastic gradient descent (SGD). Five methods are compared in the experiments on ICDAR 2011 and ICDAR 2013. Fig. 11 shows English text images detected by the present invention; it can be seen that the present invention can effectively detect English text and realize the correction.
As can be seen from Table 2 and Table 3, the method of the present invention outperforms the prior art in recall and f-measure.
Table 2. Results on ICDAR 2011 scene text detection
Table 3. Results on ICDAR 2013 scene text detection
The methods compared in Table 2 and Table 3 correspond respectively to the following documents:
[1] Liu Z, Li Y, Qi X, et al. Method for unconstrained text detection in natural scene image[J]. IET Computer Vision, 2017, 11(7): 596-604.
[2] Wu H, Zou B, Zhao Y Q, et al. Natural scene text detection by multi-scale adaptive color clustering and non-text filtering[J]. Neurocomputing, 2016, 214: 1011-1025.
[3] Yu C, Song Y, Zhang Y. Scene text localization using edge analysis and feature pool[J]. Neurocomputing, 2015, 175: 652-661.
[4] Yao Li, Wenjing Jia, Chunhua Shen, et al. Characterness: An Indicator of Text in the Wild[J]. IEEE Transactions on Image Processing: a publication of the IEEE Signal Processing Society, 2014, 23(4): 1666-1677.
[5] Tian C, Xia Y, Zhang X, et al. Natural Scene Text Detection with MC-MR Candidate Extraction and Coarse-to-Fine Filtering[J]. Neurocomputing, 2017.
[6] Zhu A, Gao R, Uchida S. Could scene context be beneficial for scene text detection[J]. Pattern Recognition, 2016, 58: 204-215.
[7] Neumann L, Matas J. Efficient scene text localization and recognition with local character refinement[C]// International Conference on Document Analysis and Recognition. IEEE, 2015: 746-750.
[8] Gomez L, Karatzas D. A fast hierarchical method for multi-script and arbitrary oriented scene text extraction[J]. 2014, 19(4): 1-15.
The present invention can detect slightly slanted text and English text with different fonts or sizes; Fig. 11 shows successful detection cases.
Those of ordinary skill in the art can understand that all or part of the steps in the various methods of the above embodiments can be completed by a program instructing the relevant hardware; the program can be stored in a computer-readable storage medium, and the storage medium may include: a ROM, a RAM, a magnetic disk, an optical disc, etc.
The embodiments provided above describe the objects, technical solutions and advantages of the present invention in further detail. It should be understood that the embodiments provided above are only preferred embodiments of the present invention and are not intended to limit the present invention; any modification, equivalent substitution, improvement, etc. made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (10)

1. An English text detection method with text orientation correction, characterized by comprising the following steps:
S1. performing maximally stable extremal region detection on each channel of the sharpened English text image, and extracting maximally stable extremal regions from the image to obtain candidate text regions;
S2. building a classifier based on a convolutional neural network model and extracting features of the candidate text regions; using a softmax function to divide the candidate text regions into a text class and a non-text class according to the features of the candidate text regions; and filtering out the non-text regions to obtain preliminary text regions, i.e., the detected English text;
S3. grouping the preliminary text regions using a two-layer text grouping algorithm;
S4. performing orientation correction on the grouped preliminary text regions to realize the correction of the English text.
2. The English text detection method with text orientation correction according to claim 1, characterized in that the channels include: the red channel, green channel, blue channel, hue channel, saturation channel, value (lightness) channel and gray channel.
3. The English text detection method with text orientation correction according to claim 1, characterized in that building the classifier based on a convolutional neural network model and extracting the features of the candidate text regions includes: obtaining a first feature of a candidate text region through the five-layer architecture of the classifier and a second feature of the candidate text region through a cross-layer, wherein the five-layer architecture comprises a first convolutional layer, a max-pooling layer, a second convolutional layer, a pyramid pooling layer and a fully connected layer connected in sequence, and the cross-layer connects the first convolutional layer directly to the fully connected layer.
4. The English text detection method with text orientation correction according to claim 3, characterized in that the first feature is obtained as follows: the candidate text regions are filtered a first time with the first convolution kernel in the first layer; the filtered candidate text regions are max-pooled in the second layer; the max-pooled candidate text regions are filtered a second time with the second convolution kernel in the third layer; pyramid pooling is applied to the twice-filtered candidate text regions in the fourth layer; and the pyramid-pooled candidate text regions are fully connected in the fifth layer, extracting the first feature of the candidate text regions.
5. The English text detection method with text orientation correction according to claim 3, characterized in that the second feature is obtained as follows: using the manually added features, the candidate text regions are filtered a first time with the first convolution kernel; the filtered candidate text regions are fully connected according to the manually added features, extracting the second feature of the candidate text regions.
6. The English text detection method with text orientation correction according to claim 5, characterized in that the manually added features include: the aspect ratio, compactness, stroke-width-to-area ratio, local contrast and boundary key points.
7. The English text detection method with text orientation correction according to claim 1, characterized in that grouping the preliminary text regions using the two-layer text grouping algorithm includes performing vertical grouping on the preliminary text regions, which specifically includes:
obtaining the minimum Y-axis coordinate b_n of the pixels with value 255 in the n-th preliminary text region; obtaining the maximum Y-axis coordinate t_{n+1} of the pixels with value 255 in the (n+1)-th preliminary text region; and obtaining the height h_{n+1} of the (n+1)-th preliminary text region;
computing the height difference d_{n,n+1}; if the height difference d_{n,n+1} is greater than a height threshold, dividing the two preliminary text regions into the same class, i.e., they belong to the same text line; if the height difference d_{n,n+1} is less than or equal to the height threshold, the two preliminary text regions are not the same class, the (n+1)-th preliminary text region is regarded as a new text line, and the new text line is split off in the Y-axis direction.
8. The English text detection method with text orientation correction according to claim 7, characterized in that grouping the preliminary text regions using the two-layer text grouping algorithm further includes performing horizontal grouping on the preliminary text regions, which specifically includes:
obtaining the distance difference Δd along the X-axis between two adjacent preliminary text regions in the same text line, the distance difference Δd including the distance d_1 between letters in the same word and the distance d_2 between words;
separating words with a width threshold according to a coefficient that denotes the average width of all letters in the text line;
obtaining the spacing ratio d_h; if the ratio d_h is less than the width threshold, the two adjacent preliminary text regions belong to the same class, i.e., the same word; if the ratio d_h is greater than or equal to the width threshold, the two adjacent preliminary text regions do not belong to the same class, i.e., they do not belong to the same word, and the latter preliminary text region is taken as the beginning of a new word.
9. The English text detection method with text orientation correction according to claim 1, characterized in that performing orientation correction on the grouped preliminary text regions to realize the correction of the English text includes:
S401. rotating the grouped preliminary text regions clockwise by α degrees using the coordinate rotation formula, and setting the initial values i = 1 and α = -30°;
S402. filtering the wrongly introduced group boxes through a model matching process to obtain the i-th candidate corrected text region;
S403. when i < 6, setting i = i + 1 and α = α + 10° and returning to step S401; when i = 6, superimposing the 1st to 6th candidate corrected texts to obtain the final corrected text.
10. The English text detection method with text orientation correction according to claim 9, characterized in that the group boxes include slanted group boxes and long-interval group boxes; a slanted group box contains only one letter; and the letters of a long-interval group box are located at the two ends.
CN201810429149.XA 2018-05-08 2018-05-08 An English text detection method with text orientation correction Active CN108647681B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810429149.XA CN108647681B (en) 2018-05-08 2018-05-08 An English text detection method with text orientation correction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810429149.XA CN108647681B (en) 2018-05-08 2018-05-08 An English text detection method with text orientation correction

Publications (2)

Publication Number Publication Date
CN108647681A true CN108647681A (en) 2018-10-12
CN108647681B CN108647681B (en) 2019-06-14

Family

ID=63749675

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810429149.XA Active CN108647681B (en) 2018-05-08 2018-05-08 An English text detection method with text orientation correction

Country Status (1)

Country Link
CN (1) CN108647681B (en)

Cited By (63)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109800735A (en) * 2019-01-31 2019-05-24 中国人民解放军国防科技大学 Accurate detection and segmentation method for ship target
CN109934229A (en) * 2019-03-28 2019-06-25 网易有道信息技术(北京)有限公司 Image processing method, device, medium and calculating equipment
CN110298343A (en) * 2019-07-02 2019-10-01 哈尔滨理工大学 A kind of hand-written blackboard writing on the blackboard recognition methods
CN110674815A (en) * 2019-09-29 2020-01-10 四川长虹电器股份有限公司 Invoice image distortion correction method based on deep learning key point detection
CN111353493A (en) * 2020-03-31 2020-06-30 中国工商银行股份有限公司 Text image direction correction method and device
WO2021056255A1 (en) * 2019-09-25 2021-04-01 Apple Inc. Text detection using global geometry estimators
US10978090B2 (en) 2013-02-07 2021-04-13 Apple Inc. Voice trigger for a digital assistant
US10984798B2 (en) 2018-06-01 2021-04-20 Apple Inc. Voice interaction at a primary device to access call functionality of a companion device
US11009970B2 (en) 2018-06-01 2021-05-18 Apple Inc. Attention aware virtual assistant dismissal
CN112825141A (en) * 2019-11-21 2021-05-21 上海高德威智能交通系统有限公司 Method and device for recognizing text, recognition equipment and storage medium
US11037565B2 (en) 2016-06-10 2021-06-15 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US11070949B2 (en) 2015-05-27 2021-07-20 Apple Inc. Systems and methods for proactively identifying and surfacing relevant content on an electronic device with a touch-sensitive display
US11087759B2 (en) 2015-03-08 2021-08-10 Apple Inc. Virtual assistant activation
CN113298079A (en) * 2021-06-28 2021-08-24 北京奇艺世纪科技有限公司 Image processing method and device, electronic equipment and storage medium
US11120372B2 (en) 2011-06-03 2021-09-14 Apple Inc. Performing actions associated with task items that represent tasks to perform
US11126400B2 (en) 2015-09-08 2021-09-21 Apple Inc. Zero latency digital assistant
US11133008B2 (en) 2014-05-30 2021-09-28 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
WO2021196013A1 (en) * 2020-03-31 2021-10-07 京东方科技集团股份有限公司 Word recognition method and device, and storage medium
US11152002B2 (en) 2016-06-11 2021-10-19 Apple Inc. Application integration with a digital assistant
US11169616B2 (en) 2018-05-07 2021-11-09 Apple Inc. Raise to speak
CN113837169A (en) * 2021-09-29 2021-12-24 平安科技(深圳)有限公司 Text data processing method and device, computer equipment and storage medium
US11237797B2 (en) 2019-05-31 2022-02-01 Apple Inc. User activity shortcut suggestions
US11257504B2 (en) 2014-05-30 2022-02-22 Apple Inc. Intelligent assistant for home automation
CN114283431A (en) * 2022-03-04 2022-04-05 南京安元科技有限公司 Text detection method based on differentiable binarization
US11321116B2 (en) 2012-05-15 2022-05-03 Apple Inc. Systems and methods for integrating third party services with a digital assistant
US11348582B2 (en) 2008-10-02 2022-05-31 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US11380310B2 (en) 2017-05-12 2022-07-05 Apple Inc. Low-latency intelligent automated assistant
US11388291B2 (en) 2013-03-14 2022-07-12 Apple Inc. System and method for processing voicemail
US11405466B2 (en) 2017-05-12 2022-08-02 Apple Inc. Synchronization and task delegation of a digital assistant
US11423886B2 (en) 2010-01-18 2022-08-23 Apple Inc. Task flow identification based on user intent
US11431642B2 (en) 2018-06-01 2022-08-30 Apple Inc. Variable latency device coordination
US11467802B2 (en) 2017-05-11 2022-10-11 Apple Inc. Maintaining privacy of personal information
US11500672B2 (en) 2015-09-08 2022-11-15 Apple Inc. Distributed personal assistant
US11516537B2 (en) 2014-06-30 2022-11-29 Apple Inc. Intelligent automated assistant for TV user interactions
US11526368B2 (en) 2015-11-06 2022-12-13 Apple Inc. Intelligent automated assistant in a messaging environment
US11532306B2 (en) 2017-05-16 2022-12-20 Apple Inc. Detecting a trigger of a digital assistant
US11580990B2 (en) 2017-05-12 2023-02-14 Apple Inc. User-specific acoustic models
US11599331B2 (en) 2017-05-11 2023-03-07 Apple Inc. Maintaining privacy of personal information
US11657813B2 (en) 2019-05-31 2023-05-23 Apple Inc. Voice identification in digital assistant systems
US11671920B2 (en) 2007-04-03 2023-06-06 Apple Inc. Method and system for operating a multifunction portable electronic device using voice-activation
US11670289B2 (en) 2014-05-30 2023-06-06 Apple Inc. Multi-command single utterance input method
US11675491B2 (en) 2019-05-06 2023-06-13 Apple Inc. User configurable task triggers
US11675829B2 (en) 2017-05-16 2023-06-13 Apple Inc. Intelligent automated assistant for media exploration
US11696060B2 (en) 2020-07-21 2023-07-04 Apple Inc. User identification using headphones
US11705130B2 (en) 2019-05-06 2023-07-18 Apple Inc. Spoken notifications
US11710482B2 (en) 2018-03-26 2023-07-25 Apple Inc. Natural assistant interaction
US11727219B2 (en) 2013-06-09 2023-08-15 Apple Inc. System and method for inferring user intent from speech inputs
US11755276B2 (en) 2020-05-12 2023-09-12 Apple Inc. Reducing description length based on confidence
US11765209B2 (en) 2020-05-11 2023-09-19 Apple Inc. Digital assistant hardware abstraction
US11783815B2 (en) 2019-03-18 2023-10-10 Apple Inc. Multimodality in digital assistant systems
US11790914B2 (en) 2019-06-01 2023-10-17 Apple Inc. Methods and user interfaces for voice-based control of electronic devices
US11798547B2 (en) 2013-03-15 2023-10-24 Apple Inc. Voice activated device for use with a voice-based digital assistant
US11809483B2 (en) 2015-09-08 2023-11-07 Apple Inc. Intelligent automated assistant for media search and playback
US11809783B2 (en) 2016-06-11 2023-11-07 Apple Inc. Intelligent device arbitration and control
US11838734B2 (en) 2020-07-20 2023-12-05 Apple Inc. Multi-device audio adjustment coordination
US11853536B2 (en) 2015-09-08 2023-12-26 Apple Inc. Intelligent automated assistant in a media environment
US11854539B2 (en) 2018-05-07 2023-12-26 Apple Inc. Intelligent automated assistant for delivering content from user experiences
US11853647B2 (en) 2015-12-23 2023-12-26 Apple Inc. Proactive assistance based on dialog communication between devices
US11888791B2 (en) 2019-05-21 2024-01-30 Apple Inc. Providing message response suggestions
US11886805B2 (en) 2015-11-09 2024-01-30 Apple Inc. Unconventional virtual assistant interactions
US11893992B2 (en) 2018-09-28 2024-02-06 Apple Inc. Multi-modal inputs for voice commands
US11914848B2 (en) 2020-05-11 2024-02-27 Apple Inc. Providing relevant data items based on context
US11947873B2 (en) 2015-06-29 2024-04-02 Apple Inc. Virtual assistant for media playback


Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103325099A (en) * 2013-07-11 2013-09-25 北京智诺英特科技有限公司 Image correcting method and device
CN105868758A (en) * 2015-01-21 2016-08-17 阿里巴巴集团控股有限公司 Method and device for detecting text area in image and electronic device
CN105279149A (en) * 2015-10-21 2016-01-27 上海应用技术学院 Chinese text automatic correction method
CN105426887A (en) * 2015-10-30 2016-03-23 北京奇艺世纪科技有限公司 Method and device for text image correction
CN105740774A (en) * 2016-01-25 2016-07-06 浪潮软件股份有限公司 Text region positioning method and apparatus for image
CN107992869A (en) * 2016-10-26 2018-05-04 深圳超多维科技有限公司 For tilting the method, apparatus and electronic equipment of word correction
CN106650725A (en) * 2016-11-29 2017-05-10 华南理工大学 Full convolutional neural network-based candidate text box generation and text detection method
CN106778757A (en) * 2016-12-12 2017-05-31 哈尔滨工业大学 Scene text detection method based on text conspicuousness
CN106997470A (en) * 2017-02-28 2017-08-01 信雅达系统工程股份有限公司 Tilt bearing calibration and the system of text image
CN107066972A (en) * 2017-04-17 2017-08-18 武汉理工大学 Natural scene Method for text detection based on multichannel extremal region

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
BAOGUANG SHI等: "Detecting Oriented Text in Natural Images by Linking Segments", 《2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION》 *
JIN DAI等: "Scene Text Detection Based on Enhanced Multi-channels MSER and a Fast Text Grouping Process", 《2018 THE 3RD IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA ANALYSIS》 *
KAIMING HE等: "Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition", 《IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE》 *
RUI ZHU等: "Text detection based on convolutional neural networks with spatial pyramid pooling", 《2016 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING》 *
TONG HE等: "Text-Attentional Convolutional Neural Network for Scene Text Detection", 《IEEE TRANSACTIONS ON IMAGE PROCESSING》 *
朱其猛: "Research and Application of Text Image Orientation Based on Character Structure Features", China Master's Theses Full-text Database, Information Science and Technology *
李玉冰: "Research on Visual Tracking Algorithms Based on Deep Networks", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (94)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11671920B2 (en) 2007-04-03 2023-06-06 Apple Inc. Method and system for operating a multifunction portable electronic device using voice-activation
US11348582B2 (en) 2008-10-02 2022-05-31 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US11900936B2 (en) 2008-10-02 2024-02-13 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US11423886B2 (en) 2010-01-18 2022-08-23 Apple Inc. Task flow identification based on user intent
US11120372B2 (en) 2011-06-03 2021-09-14 Apple Inc. Performing actions associated with task items that represent tasks to perform
US11321116B2 (en) 2012-05-15 2022-05-03 Apple Inc. Systems and methods for integrating third party services with a digital assistant
US10978090B2 (en) 2013-02-07 2021-04-13 Apple Inc. Voice trigger for a digital assistant
US11862186B2 (en) 2013-02-07 2024-01-02 Apple Inc. Voice trigger for a digital assistant
US11636869B2 (en) 2013-02-07 2023-04-25 Apple Inc. Voice trigger for a digital assistant
US11557310B2 (en) 2013-02-07 2023-01-17 Apple Inc. Voice trigger for a digital assistant
US11388291B2 (en) 2013-03-14 2022-07-12 Apple Inc. System and method for processing voicemail
US11798547B2 (en) 2013-03-15 2023-10-24 Apple Inc. Voice activated device for use with a voice-based digital assistant
US11727219B2 (en) 2013-06-09 2023-08-15 Apple Inc. System and method for inferring user intent from speech inputs
US11810562B2 (en) 2014-05-30 2023-11-07 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US11670289B2 (en) 2014-05-30 2023-06-06 Apple Inc. Multi-command single utterance input method
US11133008B2 (en) 2014-05-30 2021-09-28 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US11699448B2 (en) 2014-05-30 2023-07-11 Apple Inc. Intelligent assistant for home automation
US11257504B2 (en) 2014-05-30 2022-02-22 Apple Inc. Intelligent assistant for home automation
US11516537B2 (en) 2014-06-30 2022-11-29 Apple Inc. Intelligent automated assistant for TV user interactions
US11838579B2 (en) 2014-06-30 2023-12-05 Apple Inc. Intelligent automated assistant for TV user interactions
US11087759B2 (en) 2015-03-08 2021-08-10 Apple Inc. Virtual assistant activation
US11842734B2 (en) 2015-03-08 2023-12-12 Apple Inc. Virtual assistant activation
US11070949B2 (en) 2015-05-27 2021-07-20 Apple Inc. Systems and methods for proactively identifying and surfacing relevant content on an electronic device with a touch-sensitive display
US11947873B2 (en) 2015-06-29 2024-04-02 Apple Inc. Virtual assistant for media playback
US11853536B2 (en) 2015-09-08 2023-12-26 Apple Inc. Intelligent automated assistant in a media environment
US11500672B2 (en) 2015-09-08 2022-11-15 Apple Inc. Distributed personal assistant
US11809483B2 (en) 2015-09-08 2023-11-07 Apple Inc. Intelligent automated assistant for media search and playback
US11550542B2 (en) 2015-09-08 2023-01-10 Apple Inc. Zero latency digital assistant
US11126400B2 (en) 2015-09-08 2021-09-21 Apple Inc. Zero latency digital assistant
US11954405B2 (en) 2015-09-08 2024-04-09 Apple Inc. Zero latency digital assistant
US11809886B2 (en) 2015-11-06 2023-11-07 Apple Inc. Intelligent automated assistant in a messaging environment
US11526368B2 (en) 2015-11-06 2022-12-13 Apple Inc. Intelligent automated assistant in a messaging environment
US11886805B2 (en) 2015-11-09 2024-01-30 Apple Inc. Unconventional virtual assistant interactions
US11853647B2 (en) 2015-12-23 2023-12-26 Apple Inc. Proactive assistance based on dialog communication between devices
US11037565B2 (en) 2016-06-10 2021-06-15 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US11657820B2 (en) 2016-06-10 2023-05-23 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US11749275B2 (en) 2016-06-11 2023-09-05 Apple Inc. Application integration with a digital assistant
US11809783B2 (en) 2016-06-11 2023-11-07 Apple Inc. Intelligent device arbitration and control
US11152002B2 (en) 2016-06-11 2021-10-19 Apple Inc. Application integration with a digital assistant
US11599331B2 (en) 2017-05-11 2023-03-07 Apple Inc. Maintaining privacy of personal information
US11467802B2 (en) 2017-05-11 2022-10-11 Apple Inc. Maintaining privacy of personal information
US11837237B2 (en) 2017-05-12 2023-12-05 Apple Inc. User-specific acoustic models
US11538469B2 (en) 2017-05-12 2022-12-27 Apple Inc. Low-latency intelligent automated assistant
US11580990B2 (en) 2017-05-12 2023-02-14 Apple Inc. User-specific acoustic models
US11380310B2 (en) 2017-05-12 2022-07-05 Apple Inc. Low-latency intelligent automated assistant
US11862151B2 (en) 2017-05-12 2024-01-02 Apple Inc. Low-latency intelligent automated assistant
US11405466B2 (en) 2017-05-12 2022-08-02 Apple Inc. Synchronization and task delegation of a digital assistant
US11675829B2 (en) 2017-05-16 2023-06-13 Apple Inc. Intelligent automated assistant for media exploration
US11532306B2 (en) 2017-05-16 2022-12-20 Apple Inc. Detecting a trigger of a digital assistant
US11710482B2 (en) 2018-03-26 2023-07-25 Apple Inc. Natural assistant interaction
US11487364B2 (en) 2018-05-07 2022-11-01 Apple Inc. Raise to speak
US11907436B2 (en) 2018-05-07 2024-02-20 Apple Inc. Raise to speak
US11169616B2 (en) 2018-05-07 2021-11-09 Apple Inc. Raise to speak
US11854539B2 (en) 2018-05-07 2023-12-26 Apple Inc. Intelligent automated assistant for delivering content from user experiences
US11900923B2 (en) 2018-05-07 2024-02-13 Apple Inc. Intelligent automated assistant for delivering content from user experiences
US11009970B2 (en) 2018-06-01 2021-05-18 Apple Inc. Attention aware virtual assistant dismissal
US11431642B2 (en) 2018-06-01 2022-08-30 Apple Inc. Variable latency device coordination
US11360577B2 (en) 2018-06-01 2022-06-14 Apple Inc. Attention aware virtual assistant dismissal
US10984798B2 (en) 2018-06-01 2021-04-20 Apple Inc. Voice interaction at a primary device to access call functionality of a companion device
US11630525B2 (en) 2018-06-01 2023-04-18 Apple Inc. Attention aware virtual assistant dismissal
US11893992B2 (en) 2018-09-28 2024-02-06 Apple Inc. Multi-modal inputs for voice commands
CN109800735A (en) * 2019-01-31 2019-05-24 National University of Defense Technology Accurate detection and segmentation method for ship target
US11783815B2 (en) 2019-03-18 2023-10-10 Apple Inc. Multimodality in digital assistant systems
CN109934229B (en) * 2019-03-28 2021-08-03 Netease Youdao Information Technology (Beijing) Co., Ltd. Image processing method, device, medium and computing equipment
CN109934229A (en) * 2019-03-28 2019-06-25 Netease Youdao Information Technology (Beijing) Co., Ltd. Image processing method, device, medium and computing equipment
US11675491B2 (en) 2019-05-06 2023-06-13 Apple Inc. User configurable task triggers
US11705130B2 (en) 2019-05-06 2023-07-18 Apple Inc. Spoken notifications
US11888791B2 (en) 2019-05-21 2024-01-30 Apple Inc. Providing message response suggestions
US11657813B2 (en) 2019-05-31 2023-05-23 Apple Inc. Voice identification in digital assistant systems
US11237797B2 (en) 2019-05-31 2022-02-01 Apple Inc. User activity shortcut suggestions
US11790914B2 (en) 2019-06-01 2023-10-17 Apple Inc. Methods and user interfaces for voice-based control of electronic devices
CN110298343A (en) * 2019-07-02 2019-10-01 Harbin University of Science and Technology A kind of handwritten blackboard writing recognition method
WO2021056255A1 (en) * 2019-09-25 2021-04-01 Apple Inc. Text detection using global geometry estimators
US11488406B2 (en) 2019-09-25 2022-11-01 Apple Inc. Text detection using global geometry estimators
CN110674815A (en) * 2019-09-29 2020-01-10 Sichuan Changhong Electric Co., Ltd. Invoice image distortion correction method based on deep learning key point detection
US11928872B2 (en) 2019-11-21 2024-03-12 Shanghai Goldway Intelligent Transportation System Co., Ltd. Methods and apparatuses for recognizing text, recognition devices and storage media
CN112825141B (en) * 2019-11-21 2023-02-17 Shanghai Goldway Intelligent Transportation System Co., Ltd. Method and device for recognizing text, recognition equipment and storage medium
CN112825141A (en) * 2019-11-21 2021-05-21 Shanghai Goldway Intelligent Transportation System Co., Ltd. Method and device for recognizing text, recognition equipment and storage medium
WO2021196013A1 (en) * 2020-03-31 2021-10-07 BOE Technology Group Co., Ltd. Word recognition method and device, and storage medium
CN111353493B (en) * 2020-03-31 2023-04-28 Industrial and Commercial Bank of China Co., Ltd. Text image direction correction method and device
US11651604B2 (en) 2020-03-31 2023-05-16 Boe Technology Group Co., Ltd. Word recognition method, apparatus and storage medium
CN111353493A (en) * 2020-03-31 2020-06-30 Industrial and Commercial Bank of China Co., Ltd. Text image direction correction method and device
US11765209B2 (en) 2020-05-11 2023-09-19 Apple Inc. Digital assistant hardware abstraction
US11924254B2 (en) 2020-05-11 2024-03-05 Apple Inc. Digital assistant hardware abstraction
US11914848B2 (en) 2020-05-11 2024-02-27 Apple Inc. Providing relevant data items based on context
US11755276B2 (en) 2020-05-12 2023-09-12 Apple Inc. Reducing description length based on confidence
US11838734B2 (en) 2020-07-20 2023-12-05 Apple Inc. Multi-device audio adjustment coordination
US11696060B2 (en) 2020-07-21 2023-07-04 Apple Inc. User identification using headphones
US11750962B2 (en) 2020-07-21 2023-09-05 Apple Inc. User identification using headphones
CN113298079A (en) * 2021-06-28 2021-08-24 Beijing QIYI Century Science & Technology Co., Ltd. Image processing method and device, electronic equipment and storage medium
CN113298079B (en) * 2021-06-28 2023-10-27 Beijing QIYI Century Science & Technology Co., Ltd. Image processing method and device, electronic equipment and storage medium
CN113837169B (en) * 2021-09-29 2023-12-19 Ping An Technology (Shenzhen) Co., Ltd. Text data processing method, device, computer equipment and storage medium
CN113837169A (en) * 2021-09-29 2021-12-24 Ping An Technology (Shenzhen) Co., Ltd. Text data processing method and device, computer equipment and storage medium
CN114283431A (en) * 2022-03-04 2022-04-05 Nanjing Anyuan Technology Co., Ltd. Text detection method based on differentiable binarization

Also Published As

Publication number Publication date
CN108647681B (en) 2019-06-14

Similar Documents

Publication Publication Date Title
CN108647681B (en) A kind of English text detection method with text orientation correction
CN111325203B (en) American license plate recognition method and system based on image correction
Qureshi et al. A bibliography of pixel-based blind image forgery detection techniques
CN107491730A (en) A kind of laboratory test report recognition methods based on image procossing
Yin et al. Hot region selection based on selective search and modified fuzzy C-means in remote sensing images
CN104408449B (en) Intelligent mobile terminal scene literal processing method
CN110232387B (en) Different-source image matching method based on KAZE-HOG algorithm
CN111539409B (en) Ancient tomb question and character recognition method based on hyperspectral remote sensing technology
CN106529532A (en) License plate identification system based on integral feature channels and gray projection
CN116071763B (en) Teaching book intelligent correction system based on character recognition
CN110738216A (en) Medicine identification method based on improved SURF algorithm
Hallale et al. Twelve directional feature extraction for handwritten English character recognition
CN110689003A (en) Low-illumination imaging license plate recognition method and system, computer equipment and storage medium
CN115311746A (en) Off-line signature authenticity detection method based on multi-feature fusion
CN104899551B (en) A kind of form image sorting technique
CN110084229A (en) A kind of seal detection method, device, equipment and readable storage medium storing program for executing
CN109741351A (en) A kind of classification responsive type edge detection method based on deep learning
CN110222660B (en) Signature authentication method and system based on dynamic and static feature fusion
CN101727579A (en) Method for detecting deformed character, method and device for determining water marking information in deformed character
Su et al. Skew detection for Chinese handwriting by horizontal stroke histogram
CN111612045B (en) Universal method for acquiring target detection data set
CN114862883A (en) Target edge extraction method, image segmentation method and system
CN110555792B (en) Image tampering blind detection method based on normalized histogram comprehensive feature vector
Sathisha Bank automation system for Indian currency-a novel approach
Sushma et al. Text detection in color images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant