CN106127118A - English word recognition method and device - Google Patents

English word recognition method and device

Info

Publication number
CN106127118A
CN106127118A CN201610430159.6A
Authority
CN
China
Prior art keywords
text
connected domain
line
image
stroke width
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610430159.6A
Other languages
Chinese (zh)
Inventor
刁志敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Gotech Intelligent Technology Co Ltd
Original Assignee
Zhuhai Gotech Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhuhai Gotech Intelligent Technology Co Ltd filed Critical Zhuhai Gotech Intelligent Technology Co Ltd
Priority to CN201610430159.6A priority Critical patent/CN106127118A/en
Publication of CN106127118A publication Critical patent/CN106127118A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/30 - Writer recognition; Reading and verifying signatures
    • G06V40/37 - Writer recognition; Reading and verifying signatures based only on signature signals such as velocity or pressure, e.g. dynamic signature recognition
    • G06V40/382 - Preprocessing; Feature extraction
    • G06V40/388 - Sampling; Contour coding; Stroke extraction
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/22 - Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Character Discrimination (AREA)

Abstract

This application discloses an English word recognition method and device. The method includes: applying a stroke width transform to an input video image; performing connected domain analysis on the transformed image and screening out from the analysis result the connected domains that are text regions; merging the screened connected domains to obtain text lines; recognizing the text lines with an optical character recognition model, where the training data of the model consists of English letters and each letter has templates at multiple degrees of erosion; and performing semantic analysis on the recognized text lines to select the text lines that are semantically consistent. The application improves the accuracy of English word recognition in complex scenes.

Description

English word recognition method and device
Technical field
The present invention relates to the technical field of character recognition, and more particularly to an English word recognition method and device.
Background technology
Text is a key feature in many computer vision applications. The text in a video image usually carries rich information, so extracting and recognizing it is of great importance for the analysis and understanding of video content and for information retrieval.
Extracting the contour features of words from a video image is an important part of text recognition. For example, when recognizing English words, the contour features of each English letter are first extracted and then merged to recognize the whole word. However, since video images come from natural scenes, heavy background noise in complex scenes can destroy letter contours and make them hard to recognize, causing missed detections and recognition errors of English words and reducing recognition accuracy.
Summary of the invention
In view of this, the present invention provides an English word recognition method and device to improve the accuracy of English word recognition in complex scenes.
An English word recognition method includes:
applying a stroke width transform to an input video image;
performing connected domain analysis on the image output by the stroke width transform, and screening out from the analysis result the connected domains that are text regions;
merging the screened connected domains to obtain text lines;
recognizing the text lines with an optical character recognition model, where the training data of the optical character recognition model consists of English letters and each English letter has templates at multiple degrees of erosion;
performing semantic analysis on the recognized text lines and selecting the text lines that are semantically consistent.
Wherein, applying the stroke width transform to the input image includes:
decoding the input video image into an RGB image;
converting the RGB image into a grayscale image;
converting the grayscale image into a stroke width transform (SWT) image;
performing edge detection on the SWT image with the Canny edge detector to obtain all edge pixels;
computing the gradient direction of each edge pixel with the Sobel operator;
for each edge pixel, finding the edge pixel whose gradient direction is opposite, forming an edge pixel pair;
computing the stroke width value determined by each edge pixel pair, where the stroke width value is the Euclidean distance between the two edge pixels of the pair.
Wherein, screening out from the analysis result the connected domains that are text regions includes:
screening with the conditions that the stroke width within a connected domain is consistent, and that the proportion of pixels in the connected domain whose color matches the color of the English words to be recognized is not less than a first preset value.
Wherein, screening out from the analysis result the connected domains that are text regions may alternatively include:
screening with the conditions that the stroke width within a connected domain is consistent, and that the stroke variance of the connected domain is not less than a second preset value, the stroke mean is not less than a third preset value, and the width-to-height ratio of the connected domain does not exceed a fourth preset value.
Optionally, before the optical character recognition model is used to recognize the text lines, the method further includes: filtering the background noise of the text lines by maximum between-class variance binarization;
correspondingly, recognizing the text lines with the optical character recognition model is: recognizing, with the optical character recognition model, the text lines from which the background noise has been filtered.
An English word recognition device includes:
a stroke width transform module, configured to apply a stroke width transform to an input video image;
a connected domain analysis and screening unit, configured to perform connected domain analysis on the image output by the stroke width transform and to screen out from the analysis result the connected domains that are text regions;
a text line merging unit, configured to merge the screened connected domains to obtain text lines;
an OCR recognition unit, configured to recognize the text lines with an optical character recognition model, where the training data of the optical character recognition model consists of English letters and each English letter has templates at multiple degrees of erosion;
a semantic analysis unit, configured to perform semantic analysis on the recognized text lines and select the text lines that are semantically consistent.
Wherein, the stroke width transform module specifically includes:
an RGB image conversion unit, configured to decode the input video image into an RGB image;
a grayscale conversion unit, configured to convert the RGB image into a grayscale image;
an SWT image conversion unit, configured to convert the grayscale image into an SWT image;
an edge detection unit, configured to perform edge detection on the SWT image with the Canny edge detector to obtain all edge pixels;
a gradient direction computing unit, configured to compute the gradient direction of each edge pixel with the Sobel operator;
a stroke width computing unit, configured to find, for each edge pixel, the edge pixel whose gradient direction is opposite, forming an edge pixel pair, and to compute the stroke width value determined by each pair, whose size is the Euclidean distance between the two edge pixels of the pair.
Wherein, the connected domain analysis and screening unit is specifically configured to perform connected domain analysis on the image output by the stroke width transform and to screen out the connected domains whose stroke width is consistent and in which the proportion of pixels matching the color of the English words to be recognized is not less than the first preset value.
Alternatively, the connected domain analysis and screening unit is specifically configured to perform connected domain analysis on the image output by the stroke width transform and to screen out the connected domains whose stroke width is consistent, whose stroke variance is not less than the second preset value, whose stroke mean is not less than the third preset value, and whose width-to-height ratio does not exceed the fourth preset value.
Optionally, the device further includes: a background noise filtering unit, configured to filter the background noise of the text lines by maximum between-class variance binarization before the optical character recognition model is used to recognize the text lines.
It can be seen from the above technical solution that the present invention trains the optical character recognition model in advance on English letters at different degrees of erosion, which increases the recognition rate when letter contours are damaged and reduces the missed-detection rate of English words. The present invention also performs semantic analysis and screening on the recognized text lines to select the semantically consistent ones, which reduces the false-detection rate, thereby improving the accuracy of English word recognition in complex scenes.
Accompanying drawing explanation
In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the accompanying drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative work.
Fig. 1 is a flowchart of an English word recognition method disclosed by the present invention;
Fig. 2 is a flowchart of a stroke width transform method disclosed by the present invention;
Fig. 3 is a schematic structural diagram of an English word recognition device disclosed by the present invention;
Fig. 4 is a schematic structural diagram of another English word recognition device disclosed by the present invention;
Fig. 5 is a schematic structural diagram of another English word recognition device disclosed by the present invention.
Detailed description of the invention
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of the present invention.
Referring to Fig. 1, an embodiment of the present invention discloses an English word recognition method to improve the accuracy of English word recognition in complex scenes, including:
Step 100: applying a stroke width transform to an input video image;
The purpose of applying the stroke width transform to the input video image is to obtain connected domain information. The idea of the stroke width transform is as follows: first perform edge detection on the input video image to obtain edge information; then, starting from each edge pixel, find the edge pixel whose gradient direction is opposite, forming an edge pixel pair; compute the Euclidean distance between the two edge pixels of each pair, and assign this value to all pixels between them. After the stroke width transform, each pixel of the output image represents a possible stroke width. The stroke width information yields candidate text information, because a connected domain with consistent stroke width is likely to be a text region.
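The pairing idea above can be sketched in one dimension. This is a hedged toy illustration, not the patent's implementation: a real stroke width transform casts 2-D rays along the gradient direction, while this sketch only scans one binarized row and reports each stroke's extent and width (the distance between its two opposing edges).

```python
def stroke_widths_1d(row):
    """row: list of 0/1 pixels; returns (left_edge, right_edge, width) per stroke."""
    widths = []
    start = None
    for i, v in enumerate(row):
        if v == 1 and start is None:
            start = i                                  # entering a stroke: left edge
        elif v == 0 and start is not None:
            widths.append((start, i - 1, i - start))   # leaving: pair with right edge
            start = None
    if start is not None:                              # stroke touching the row's end
        widths.append((start, len(row) - 1, len(row) - start))
    return widths

print(stroke_widths_1d([0, 1, 1, 1, 0, 0, 1, 1, 0]))  # [(1, 3, 3), (6, 7, 2)]
```

Every pixel inside a stroke would then be assigned that stroke's width, and regions where the assigned widths agree become text candidates.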
The detailed process of the stroke width transform is shown in Fig. 2 and includes:
Step 101: decoding the input video image into an RGB image;
Step 102: converting the RGB image into a grayscale image;
Step 103: converting the grayscale image into an SWT (stroke width transform) image;
Step 104: performing edge detection on the SWT image with the Canny edge detector to obtain all edge pixels, where the Canny edge detector is the multi-stage edge detection algorithm developed by John F. Canny in 1986;
Step 105: computing the gradient direction of each edge pixel with the Sobel operator;
Step 106: for each edge pixel, finding the edge pixel whose gradient direction is opposite, forming an edge pixel pair;
Step 107: computing the stroke width value determined by each edge pixel pair, where the stroke width value is the Euclidean distance between the two edge pixels of the pair.
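Step 105 can be illustrated with a minimal pure-Python Sobel sketch. The 3x3 kernels below are the standard Sobel masks; a production implementation would of course use an image library.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical intensity change.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def gradient_direction(img, y, x):
    """Gradient direction (radians) at interior pixel (y, x) of a 2-D grey image."""
    gx = sum(SOBEL_X[dy][dx] * img[y - 1 + dy][x - 1 + dx]
             for dy in range(3) for dx in range(3))
    gy = sum(SOBEL_Y[dy][dx] * img[y - 1 + dy][x - 1 + dx]
             for dy in range(3) for dx in range(3))
    return math.atan2(gy, gx)

# A vertical edge that brightens to the right: the gradient points right (angle 0.0).
edge = [[0, 0, 255], [0, 0, 255], [0, 0, 255]]
print(gradient_direction(edge, 1, 1))  # 0.0
```

Step 106 then walks from each edge pixel along this direction looking for the partner pixel whose gradient points the opposite way; the two ends of that walk form the edge pixel pair of step 107.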
Step 200: performing connected domain analysis on the image output by the stroke width transform, and screening out from the analysis result the connected domains that are text regions;
A connected domain is an image region composed of adjacent foreground pixels with the same pixel value in the image output by the stroke width transform. Connected domain analysis means finding and labelling every connected domain in that image. In the prior art, when screening out the connected domains that are text regions during English word recognition, usually only the consistency of the stroke width within a connected domain is considered, but interference from the background color then easily causes false detections of English words. This embodiment therefore adds a screening condition: the proportion of pixels in the connected domain whose color matches the color of the English words is not less than a first preset value. For example, if the English words to be recognized are black, the proportion of black pixels in the connected domain may be required to be no less than 60%. In addition, to avoid false detections caused by English words that are too small, further screening conditions can be added: the stroke variance is not less than a second preset value, the stroke mean is not less than a third preset value, and the width-to-height ratio of the connected domain does not exceed a fourth preset value.
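The combined screening conditions can be condensed into one predicate. The thresholds below stand in for the four preset values, which the description leaves open, so the numbers are illustrative assumptions only; the check follows the text as written (stroke variance and mean not below their presets, aspect ratio not above its preset), and the stroke-width consistency check is assumed to be enforced upstream.

```python
def is_text_region(widths, color_share, bbox_w, bbox_h,
                   v1=0.6, v2=0.1, v3=2.0, v4=10.0):
    """widths: stroke-width samples inside the connected domain;
    color_share: fraction of pixels matching the target word colour;
    v1..v4: the first to fourth preset values (illustrative defaults)."""
    mean = sum(widths) / len(widths)
    var = sum((w - mean) ** 2 for w in widths) / len(widths)
    return (color_share >= v1            # colour share not below first preset
            and var >= v2                # stroke variance not below second preset
            and mean >= v3               # stroke mean not below third preset
            and bbox_w / bbox_h <= v4)   # aspect ratio not above fourth preset
```

A connected domain failing any single condition (for example a 30% colour share when 60% is required) is discarded as a non-text region.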
Step 300: merging the screened connected domains to obtain a text line;
For example, if the screened connected domains are, from left to right, a connected domain showing the content l, one showing u, one showing c, one showing k and one showing y, then merging them yields the text line lucky.
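Merging as in the l-u-c-k-y example amounts to ordering the recognised components by horizontal position and concatenating them, sketched here under the assumption that each component carries its bounding-box x coordinate:

```python
def merge_into_text_line(components):
    """components: (x_position, recognised_character) pairs of one text line."""
    return "".join(ch for _, ch in sorted(components))

print(merge_into_text_line([(30, "c"), (10, "l"), (50, "y"), (20, "u"), (40, "k")]))  # lucky
```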
Step 400: recognizing the text line with an OCR (optical character recognition) model, where the training data of the OCR model consists of English letters (the 26 uppercase English letters A-Z and/or the 26 lowercase English letters a-z), and each English letter has templates at multiple degrees of erosion;
In this embodiment, the OCR model is trained in advance on English letters at different degrees of erosion, which increases the recognition rate when letter contours are damaged. The model can be trained with, but is not limited to, the existing SVM (support vector machine) algorithm. The templates at multiple degrees of erosion may be: a template with no erosion, a lightly eroded template, a moderately eroded template and a heavily eroded template.
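The multi-degree erosion templates can be generated mechanically: repeated binary erosion of a letter mask yields progressively "worn" variants. The sketch below uses a 3x3 structuring element and a solid block in place of a real glyph; both choices are assumptions for illustration, not details fixed by the patent.

```python
def erode(mask):
    """One 3x3 binary erosion pass (border pixels become 0)."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if all(mask[y + dy][x + dx]
                   for dy in (-1, 0, 1) for dx in (-1, 0, 1)):
                out[y][x] = 1
    return out

glyph = [[1] * 5 for _ in range(5)]    # solid 5x5 block standing in for a letter mask
templates = [glyph]                    # degree 0: no erosion
for _ in range(2):                     # degrees 1 and 2: light / moderate erosion
    templates.append(erode(templates[-1]))
```

Matching a damaged letter against the whole template family is what raises the recognition rate when contours are partly missing.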
Step 500: performing semantic analysis on the recognized text lines and selecting the text lines that are semantically consistent.
The words that appear in a video image show a certain regularity over time, so this embodiment keeps semantic statistics of the English words already recognized; the more words are counted, the more accurate the statistics become. If a recognized text line does not fit the semantics, i.e. it is inconsistent with the semantic statistics obtained so far, it is discarded to reduce the false-detection rate. This is the basic idea of the semantic analysis of recognized text lines. For example, suppose the English words recognized repeatedly in the video image include happy, happiness, joy and relaxed, which are semantically similar. If the currently recognized text line is pain, then since its meaning is opposite to the former it does not fit the semantics, is a falsely detected word, and needs to be discarded. This embodiment can, but need not, use an HMM (hidden Markov model) for the semantic analysis and statistics of text lines.
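The accept/reject decision of the semantic screening can be illustrated with plain word frequencies over earlier recognitions. The patent mentions an HMM for this step; the frequency check below is a deliberately simplified stand-in, and the min_count threshold is an assumption.

```python
from collections import Counter

def fits_semantics(history, candidate, min_count=2):
    """Keep a recognised word only if it is consistent with earlier recognitions;
    plain frequency stands in here for a real semantic-similarity model."""
    return Counter(history)[candidate] >= min_count

history = ["happy", "joy", "happy", "relaxed", "joy", "happy"]
print(fits_semantics(history, "happy"))  # True: consistent with prior frames
print(fits_semantics(history, "pain"))   # False: likely a false detection
```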
As described above, applying a stroke width transform to the input video image, performing connected domain analysis on the transformed image, screening out the connected domains that are text regions, merging them into text lines, and recognizing the text lines by OCR are conventional techniques for recognizing words in video images. However, heavy background-noise interference in a video image can destroy letter contours and make them hard to recognize, causing missed detections and recognition errors and reducing the accuracy of English word recognition. To address this, this embodiment trains the OCR model in advance on English letters at different degrees of erosion, which increases the recognition rate when letter contours are damaged and reduces the missed-detection rate of English words; this embodiment also performs semantic analysis on the recognized text lines and selects the semantically consistent ones, which reduces the false-detection rate, thereby improving the accuracy of English word recognition in complex scenes.
In addition, before the OCR model is used to recognize the text line, the background noise of the text line can first be filtered by OTSU (maximum between-class variance) binarization, and the OCR model is then used to recognize the text line from which the background noise has been filtered. The benefit is that filtering the background noise sharpens the outline of the text line and reduces the interference of background noise with the English words to be recognized, further reducing false detections.
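The OTSU step picks the grey level that maximises the between-class variance of the image histogram; pixels on one side of the threshold are kept as text and the rest is discarded as background. A minimal pure-Python version:

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level maximising between-class variance of the histogram."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(i * hist[i] for i in range(levels))
    best_t, best_var, w0, sum0 = 0, -1.0, 0, 0.0
    for t in range(levels):
        w0 += hist[t]                      # weight of the class at or below t
        if w0 == 0:
            continue
        w1 = total - w0                    # weight of the class above t
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0 = sum0 / w0                     # mean of the lower class
        m1 = (total_sum - sum0) / w1       # mean of the upper class
        between = w0 * w1 * (m0 - m1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t
```

On a cleanly bimodal histogram the maximum sits at the end of the darker mode, so the two populations separate exactly.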
In addition, referring to Fig. 3, an embodiment of the present invention also discloses an English word recognition device to improve the accuracy of English word recognition in complex scenes, including:
a stroke width transform module 100, configured to apply a stroke width transform to an input video image;
a connected domain analysis and screening unit 200, configured to perform connected domain analysis on the image output by the stroke width transform and to screen out from the analysis result the connected domains that are text regions;
a text line merging unit 300, configured to merge the screened connected domains to obtain text lines;
an OCR recognition unit 400, configured to recognize the text lines with an optical character recognition model, where the training data of the optical character recognition model consists of English letters and each English letter has templates at multiple degrees of erosion;
a semantic analysis unit 500, configured to perform semantic analysis on the recognized text lines and select the text lines that are semantically consistent.
Referring to Fig. 4, the stroke width transform module 100 specifically includes:
an RGB image conversion unit 101, configured to decode the input video image into an RGB image;
a grayscale conversion unit 102, configured to convert the RGB image into a grayscale image;
an SWT image conversion unit 103, configured to convert the grayscale image into an SWT image;
an edge detection unit 104, configured to perform edge detection on the SWT image with the Canny edge detector to obtain all edge pixels;
a gradient direction computing unit 105, configured to compute the gradient direction of each edge pixel with the Sobel operator;
a stroke width computing unit 106, configured to find, for each edge pixel, the edge pixel whose gradient direction is opposite, forming an edge pixel pair, and to compute the stroke width value determined by each pair, whose size is the Euclidean distance between the two edge pixels of the pair.
The connected domain analysis and screening unit 200 is specifically configured to perform connected domain analysis on the image output by the stroke width transform and to screen out the connected domains whose stroke width is consistent and in which the proportion of pixels matching the color of the English words to be recognized is not less than the first preset value.
Alternatively, the connected domain analysis and screening unit 200 is specifically configured to perform connected domain analysis on the image output by the stroke width transform and to screen out the connected domains whose stroke width is consistent, whose stroke variance is not less than the second preset value, whose stroke mean is not less than the third preset value, and whose width-to-height ratio does not exceed the fourth preset value.
Optionally, as shown in Fig. 5, the English word recognition device further includes: a background noise filtering unit 600, configured to filter the background noise of the text lines by maximum between-class variance binarization before the optical character recognition model is used to recognize the text lines.
In summary, the present invention trains the optical character recognition model in advance on English letters at different degrees of erosion, which increases the recognition rate when letter contours are damaged and reduces the missed-detection rate of English words; the present invention also performs semantic analysis and screening on the recognized text lines to select the semantically consistent ones, which reduces the false-detection rate, thereby improving the accuracy of English word recognition in complex scenes.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the others, and the identical or similar parts of the embodiments can be referred to each other. Since the device disclosed in an embodiment corresponds to the method disclosed in an embodiment, its description is relatively simple, and the relevant parts can be found in the description of the method.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be obvious to those skilled in the art, and the general principles defined herein can be implemented in other embodiments without departing from the spirit or scope of the embodiments of the present invention. Therefore, the embodiments of the present invention are not limited to the embodiments shown herein, but accord with the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. An English word recognition method, characterized by including:
applying a stroke width transform to an input video image;
performing connected domain analysis on the image output by the stroke width transform, and screening out from the analysis result the connected domains that are text regions;
merging the screened connected domains to obtain text lines;
recognizing the text lines with an optical character recognition model, wherein the training data of the optical character recognition model consists of English letters and each English letter has templates at multiple degrees of erosion;
performing semantic analysis on the recognized text lines and selecting the text lines that are semantically consistent.
2. The method according to claim 1, characterized in that applying the stroke width transform to the input image includes:
decoding the input video image into an RGB image;
converting the RGB image into a grayscale image;
converting the grayscale image into a stroke width transform (SWT) image;
performing edge detection on the SWT image with the Canny edge detector to obtain all edge pixels;
computing the gradient direction of each edge pixel with the Sobel operator;
for each edge pixel, finding the edge pixel whose gradient direction is opposite, forming an edge pixel pair;
computing the stroke width value determined by each edge pixel pair, wherein the stroke width value is the Euclidean distance between the two edge pixels of the pair.
3. The method according to claim 1, characterized in that screening out from the analysis result the connected domains that are text regions includes:
screening out from the analysis result the connected domains that are text regions, the screening conditions including: the stroke width within a connected domain is consistent; and the proportion of pixels in the connected domain whose color matches the color of the English words to be recognized is not less than a first preset value.
4. The method according to claim 1, characterized in that screening out from the analysis result the connected domains that are text regions includes:
screening out from the analysis result the connected domains that are text regions, the screening conditions including: the stroke width within a connected domain is consistent; and the stroke variance of the connected domain is not less than a second preset value, the stroke mean is not less than a third preset value, and the width-to-height ratio of the connected domain does not exceed a fourth preset value.
5. The method according to any one of claims 1-4, characterized in that before the optical character recognition model is used to recognize the text lines, the method further includes: filtering the background noise of the text lines by maximum between-class variance binarization;
correspondingly, recognizing the text lines with the optical character recognition model is: recognizing, with the optical character recognition model, the text lines from which the background noise has been filtered.
6. An English word recognition device, characterized by including:
a stroke width transform module, configured to apply a stroke width transform to an input video image;
a connected domain analysis and screening unit, configured to perform connected domain analysis on the image output by the stroke width transform and to screen out from the analysis result the connected domains that are text regions;
a text line merging unit, configured to merge the screened connected domains to obtain text lines;
an OCR recognition unit, configured to recognize the text lines with an optical character recognition model, wherein the training data of the optical character recognition model consists of English letters and each English letter has templates at multiple degrees of erosion;
a semantic analysis unit, configured to perform semantic analysis on the recognized text lines and select the text lines that are semantically consistent.
7. The device according to claim 6, characterized in that the stroke width transform module specifically includes:
an RGB image conversion unit, configured to decode the input video image into an RGB image;
a grayscale conversion unit, configured to convert the RGB image into a grayscale image;
an SWT image conversion unit, configured to convert the grayscale image into an SWT image;
an edge detection unit, configured to perform edge detection on the SWT image with the Canny edge detector to obtain all edge pixels;
a gradient direction computing unit, configured to compute the gradient direction of each edge pixel with the Sobel operator;
a stroke width computing unit, configured to find, for each edge pixel, the edge pixel whose gradient direction is opposite, forming an edge pixel pair, and to compute the stroke width value determined by each pair, whose size is the Euclidean distance between the two edge pixels of the pair.
8. The device according to claim 6, characterized in that the connected domain analysis and screening unit is specifically configured to perform connected domain analysis on the image output by the stroke width transform and to screen out the connected domains whose stroke width is consistent and in which the proportion of pixels matching the color of the English words to be recognized is not less than the first preset value.
9. The device according to claim 6, characterized in that the connected domain analysis and screening unit is specifically configured to perform connected domain analysis on the image output by the stroke width transform and to screen out the connected domains whose stroke width is consistent, whose stroke variance is not less than the second preset value, whose stroke mean is not less than the third preset value, and whose width-to-height ratio does not exceed the fourth preset value.
10. The device according to any one of claims 6-9, characterized in that the device further includes: a background noise filtering unit, configured to filter out the background noise of the text line by maximum between-cluster variance binarization before the text line is recognized by the optical character recognition model.
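The maximum between-cluster variance binarization named in this claim is Otsu's method. A minimal histogram-based sketch (the 256-bin gray-level histogram input is an assumption for the example):

```python
def otsu_threshold(hist):
    """Otsu's method: choose the threshold that maximizes the between-class
    variance of a 256-bin gray-level histogram."""
    total = sum(hist)
    sum_all = sum(g * h for g, h in enumerate(hist))
    best_t, best_between = 0, -1.0
    w0 = 0        # pixels at or below the candidate threshold
    sum0 = 0.0    # their intensity sum
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2   # between-class variance (scaled)
        if between > best_between:
            best_between, best_t = between, t
    return best_t
```

The text line would then be binarized against the returned threshold, suppressing background noise before the OCR model runs.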
CN201610430159.6A 2016-06-15 2016-06-15 An English word recognition method and device Pending CN106127118A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610430159.6A CN106127118A (en) 2016-06-15 2016-06-15 An English word recognition method and device

Publications (1)

Publication Number Publication Date
CN106127118A true CN106127118A (en) 2016-11-16

Family

ID=57469919

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610430159.6A Pending CN106127118A (en) An English word recognition method and device

Country Status (1)

Country Link
CN (1) CN106127118A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108038481A (en) * 2017-12-11 2018-05-15 Jiangsu University of Science and Technology Text positioning method combining maximally stable extremal regions and stroke width variation
CN110929647A (en) * 2019-11-22 2020-03-27 iFLYTEK Co., Ltd. Text detection method, device, equipment and storage medium
CN112488107A (en) * 2020-12-04 2021-03-12 Beijing Hualu New Media Information Technology Co., Ltd. Video subtitle processing method and processing device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101777124A (en) * 2010-01-29 2010-07-14 Beijing Nufront Network Technology Co., Ltd. Method and device for extracting video text information
US20140023275A1 (en) * 2012-07-19 2014-01-23 Qualcomm Incorporated Redundant aspect ratio decoding of devanagari characters
CN104112130A (en) * 2014-06-26 2014-10-22 Xiaomi Inc. Optical character recognition method and device
CN104268512A (en) * 2014-09-17 2015-01-07 Tsinghua University Method and device for recognizing characters in image on basis of optical character recognition
CN104408449A (en) * 2014-10-27 2015-03-11 Ningbo Information Technology Research Institute of Xidian University Scene text processing method for intelligent mobile terminals
US20150070373A1 (en) * 2012-08-23 2015-03-12 Google Inc. Clarification of Zoomed Text Embedded in Images


Similar Documents

Publication Publication Date Title
Yi et al. Localizing text in scene images by boundary clustering, stroke segmentation, and string fragment classification
CN104182750B (en) A kind of Chinese detection method based on extreme value connected domain in natural scene image
CN104408449B (en) Intelligent mobile terminal scene literal processing method
CN106384112A (en) Rapid image text detection method based on multi-channel and multi-dimensional cascade filter
CN103093228A (en) Chinese detection method in natural scene image based on connected domain
CN108986125B (en) Object edge extraction method and device and electronic equipment
CN103295009B (en) Based on the license plate character recognition method of Stroke decomposition
EP3846122B1 (en) Method and apparatus for generating background-free image, device, and medium
Darab et al. A hybrid approach to localize farsi text in natural scene images
CN105117740A (en) Font identification method and device
Mishchenko et al. Chart image understanding and numerical data extraction
CN106127118A (en) A kind of English word recognition methods and device
Roy et al. Date-field retrieval in scene image and video frames using text enhancement and shape coding
Kesiman et al. Southeast Asian palm leaf manuscript images: a review of handwritten text line segmentation methods and new challenges
Ayesh et al. A robust line segmentation algorithm for Arabic printed text with diacritics
Owamoyo et al. Number plate recognition for Nigerian vehicles
CN104281850A (en) Character area identification method and device
CN104966109A (en) Medical laboratory report image classification method and apparatus
Jain et al. A hybrid approach for detection and recognition of traffic text sign using MSER and OCR
Mukhiddinov Scene text detection and localization using fully convolutional network
Tran et al. A novel approach for text detection in images using structural features
CN109800758A (en) A kind of natural scene character detecting method of maximum region detection
Romic et al. Character recognition based on region pixel concentration for license plate identification
CN104504385A (en) Recognition method of handwritten connected numerical string
Sun et al. Contextual models for automatic building extraction in high resolution remote sensing image using object-based boosting method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 2016-11-16