CN101777124A - Method for extracting video text message and device thereof - Google Patents

Method for extracting video text message and device thereof

Info

Publication number
CN101777124A
CN101777124A (application CN201010104243A)
Authority
CN
China
Prior art keywords
character
text
english
chinese
recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201010104243A
Other languages
Chinese (zh)
Inventor
周景超
苗广义
鲍东山
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING NUFRONT SOFTWARE TECHNOLOGY Co Ltd
Original Assignee
BEIJING NUFRONT SOFTWARE TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING NUFRONT SOFTWARE TECHNOLOGY Co Ltd filed Critical BEIJING NUFRONT SOFTWARE TECHNOLOGY Co Ltd
Priority to CN201010104243A priority Critical patent/CN101777124A/en
Publication of CN101777124A publication Critical patent/CN101777124A/en
Pending legal-status Critical Current

Landscapes

  • Character Discrimination (AREA)

Abstract

The invention discloses a method and device for extracting text information from video. The method comprises: determining the position of a text block in a video image; segmenting and recognizing characters in the text block image according to Chinese and English character features to obtain Chinese and English character strings; correcting the recognition confidences; and merging the Chinese character string and the English character string based on the corrected character recognition confidences and the positional relationship between the Chinese and English characters to obtain the text information. The invention can segment and recognize characters in mixed Chinese/English text in video images, solves the problem that video text of different styles is difficult to handle in a unified pipeline, and can organize and classify different types of text information in a video. The architecture handles various types of video effectively and can also be conveniently customized, modified and extended.

Description

Method and device for extracting video text information
Technical field
The present invention relates to the fields of image processing and information technology, and in particular to a method and device for extracting text information from video.
Background art
In existing schemes for extracting text information from video, a system usually has good processing capability for one particular class of text but cannot handle the large variety of video text styles, and video text of different styles is difficult to process within a unified pipeline.
In the prior art, the sum of squared differences is a commonly used measure in video text tracking (described in "Automatic Text Detection and Tracking in Digital Video", IEEE Transactions on Image Processing, Vol. 9, No. 1, Pages 147-156, 2000). However, this measure does not distinguish the characters inside the text region from the background; when the background changes, the sum of squared differences increases markedly, which easily leads to misjudgment.
At present, there are two main approaches to character segmentation of mixed Chinese/English text:
1) A unified recognition engine. Chinese and English character samples are pooled to train a single OCR engine (described in "Improving Chinese/English OCR Performance by Using MCE-based Character-pair Modeling and Negative Training", The Proceedings of the Seventh International Conference on Document Analysis and Recognition, 2003), so that the mixed-language problem is solved at the recognition stage. However, during character segmentation the radical of a Chinese character may be recognized as an English character, and a combination of adjacent English characters, or of a radical and an English character, may be recognized as a Chinese character; this poses a great challenge to the coverage of the OCR training samples and to the classification strategy.
2) Separation of Chinese and English regions. The character string is divided into Chinese regions and English regions according to the geometric features of the characters; the Chinese regions are recognized with a Chinese OCR engine and the English regions with an English OCR engine, and the two groups of results are finally merged to obtain the final recognition result (described in Journal of Software, Vol. 16, No. 5, 2005, in an article on the recognition of mixed Chinese/English text). In many cases the difference between Chinese and English characters is not significant, so it is difficult to separate the regions correctly, and once the regions are misjudged a correct recognition result can no longer be obtained.
In the prior art, the confidence correction used in classifier fusion is normally carried out on the same sample set. This suits dedicated classifier design, because an identical sample set provides a natural unified standard; but for users who need to fuse classifiers trained on different sample sets, a unified recognition confidence standard cannot be established.
In the industry, research on video text extraction has concentrated on stages such as text localization, segmentation, enhancement and recognition, striving to extract text information from video comprehensively and accurately; in practical applications, however, undifferentiated text information is difficult to use.
In view of the above shortcomings and defects of the prior art, a better solution is required.
Summary of the invention
In view of this, the invention provides a method and device for extracting video text information, which can extract text information from different types of video.
A method for extracting video text information provided by an embodiment of the invention comprises:
determining the position of a text block in a video image;
segmenting and recognizing characters in the text block image according to Chinese character features to obtain a Chinese character string;
determining English regions according to the geometric features and position information of connected domains in the text block image, and segmenting and recognizing characters in the English regions to obtain an English character string;
calculating the recognition confidences of the obtained Chinese and English characters respectively, and correcting the recognition confidences;
merging the Chinese character string and the English character string based on the corrected character recognition confidences and the positional relationship between the Chinese and English characters to obtain the text information.
Preferably, the method further comprises:
monitoring and tracking text blocks in consecutive video frames, and judging whether two blocks are the same text block according to the positional relationship and image content of the text blocks in adjacent video frames;
when the text block disappears, determining the position of the text block and performing the subsequent segmentation and character recognition on it.
Preferably, the method further comprises:
preprocessing the text block region image before the text block is segmented and recognized.
An embodiment of the invention also provides a device for extracting video text information, comprising:
a position determination unit for determining the position of a text block in a video image;
a first processing unit for segmenting and recognizing the text block according to Chinese character features to obtain a Chinese character string;
a second processing unit for determining English regions according to the geometric features and position information of connected domains in the text block, and segmenting and recognizing the English regions to obtain an English character string;
a computing unit for calculating the recognition confidences of the obtained Chinese and English characters respectively and correcting the recognition confidences;
a merging unit for merging the Chinese character string and the English character string based on the corrected character recognition confidences and the positional relationship between the Chinese and English characters to obtain the text information.
Preferably, the device further comprises:
a monitoring and tracking unit for monitoring and tracking text blocks in consecutive video frames;
a judging unit for judging whether two blocks are the same text block according to the position information and image content of the text blocks in adjacent video frames provided by the monitoring and tracking unit;
if the video frames contain different text blocks, the judging unit determines the regions of these different text blocks, and the first processing unit and the second processing unit then segment and recognize these text blocks respectively.
In summary, the method and device for extracting video text information provided by the invention determine the position of a text block in a video image; segment and recognize the text block image according to Chinese and English character features respectively to obtain Chinese and English character strings; correct the recognition confidences; and merge the Chinese and English character strings based on the corrected character recognition confidences and the positional relationship between the Chinese and English characters to obtain the text information. According to the invention, character segmentation and recognition can be performed on mixed Chinese/English text in video images, the problem that video text of different styles is difficult to handle in a unified pipeline can be solved, and different types of text information in a video can be organized and classified. The architecture handles various types of video effectively and can also be conveniently customized, modified and extended.
Brief description of the drawings
Fig. 1 is a flowchart of the method for extracting video text information provided by an embodiment of the invention;
Fig. 2 is a flowchart of text block localization provided by an embodiment of the invention;
Fig. 3 is a flowchart of character string segmentation and recognition of the text block image provided by an embodiment of the invention;
Fig. 4 is a schematic diagram of correcting the recognition confidences of Chinese and English characters provided by an embodiment of the invention;
Fig. 5 is a schematic diagram of extracting mixed Chinese/English/digit text from a video image provided by an embodiment of the invention;
Fig. 6 is a schematic diagram of a video image with multiple types of text provided by an embodiment of the invention;
Fig. 7 is a flowchart of layout analysis provided by an embodiment of the invention;
Fig. 8 is a schematic structural diagram of the device for extracting video text information provided by an embodiment of the invention.
Detailed description of the embodiments
In view of the deficiencies and defects of the prior art, the invention proposes a method for extracting text information from video images that can perform character segmentation and recognition more effectively on mixed Chinese/English text, solves the problem that video text of different styles is difficult to handle in a unified pipeline, and can organize and classify different types of text information in a video. The architecture handles various types of video effectively and can also be conveniently customized, modified and extended.
In the character segmentation method for mixed Chinese/English text proposed by the invention, the recognition confidences of the Chinese and English character OCR engines are corrected so that the confidences of the two engines become comparable; the character string is then segmented and recognized as Chinese characters, candidate English regions are found in the string according to character features, and the English characters in those regions are segmented and recognized; where the recognition results of the two kinds of characters complement or overlap each other, a choice is made according to character position and recognition confidence. This avoids training a complicated OCR engine, and the segmentation result does not depend heavily on a region-separation decision, which guarantees efficiency and stability.
In the technical scheme provided by the invention, classifier recognition confidences can be corrected across different sample sets; according to the actual situation, an effective method of correcting confidences on different sample sets is proposed from a statistical point of view.
In addition, character features are used for layout analysis. The invention proposes a layout analysis method that collects character features from a system and application point of view, so that the system outputs structured text information that is convenient for post-processing.
To make the principles, characteristics and advantages of the invention clear, specific implementations of the invention are described in detail below.
Embodiment one
With reference to Fig. 1, a method for extracting structured video text information provided by an embodiment of the invention comprises the following steps:
S101: determining the position of a text block in a video image;
As shown in Fig. 2, the text block is first localized in four stages: preprocessing, coarse localization, projection cutting and screening. Specifically:
(1) Preprocessing comprises computing the stroke response (described in "Stroke Filter for Text Localization in Video Images", The Proceedings of the IEEE International Conference on Image Processing, October 2006) and color clustering; color clustering uses the K-means method (described in "Constrained K-means Clustering with Background Knowledge", The Proceedings of the Eighteenth International Conference on Machine Learning, 2001). The former highlights characters according to the uniform width of character strokes, the latter according to the color features of characters; one of the two processing flows is selected according to a configuration item.
Text can be enhanced and the background suppressed by computing the stroke response. The steps of computing the stroke response are: determining the stroke-response spacing according to the configuration file; computing the stroke response; binarizing, and applying a dilation operation to the resulting binary image to connect broken strokes.
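The stroke filter cited above responds where a strip of the configured stroke width differs from its surroundings. Below is a minimal sketch of this preprocessing step, assuming OpenCV and NumPy; the simple center-minus-flank response, the stroke width and the threshold are illustrative stand-ins for the configured values, not the exact filter of the cited paper.
    import cv2
    import numpy as np

    def stroke_response(gray, stroke_width=5):
        # bright strokes respond where a strip of width `stroke_width`
        # is brighter than a wider neighborhood around it
        w = stroke_width
        center = cv2.blur(gray.astype(np.float32), (w, w))
        flank = cv2.blur(gray.astype(np.float32), (3 * w, 3 * w))
        return np.clip(center - flank, 0, None)

    def stroke_binary(gray, stroke_width=5, thresh=10.0, dilate_iter=1):
        resp = stroke_response(gray, stroke_width)
        binary = (resp > thresh).astype(np.uint8) * 255
        # dilation connects strokes broken by thresholding
        kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
        return cv2.dilate(binary, kernel, iterations=dilate_iter)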
(2) Coarse localization
Text regions are detected according to the dense arrangement of characters, and their approximate positions are obtained. Projection cutting splits detected multi-line text into single-line text and obtains relatively accurate boundaries of the text regions, which facilitates subsequent segmentation. In the verification stage, features of the text regions are extracted and false alarms are screened out.
On the binary image, the approximate positions of the text regions are first obtained by coarse localization, and the regions are then located accurately. The coarse localization steps are: labeling the connected domains; determining text regions according to the geometric constraints of real text blocks, such as size and arrangement position; and merging text regions in the horizontal or vertical direction (described in "A Robust System for Text Extraction in Video", The Proceedings of International Conference on Machine Vision, December 2007).
(3) Projection cutting
Multi-line text often appears in video images and is frequently detected as a single text block during coarse detection. The subsequent segmentation stage requires single-line text, so potential multi-line text in this region needs to be cut into several single-line texts. Taking connected domains as the unit, a projection cutting method is adopted (described in "Character location in scene images from digital camera", Pattern Recognition, Volume 36, Issue 10, Pages 2287-2299, 2003); it effectively resolves the adhesion between text lines and between text and its surrounding background that occurs in some cases, and guarantees that each candidate region after cutting is a single line of text.
(4) Screening
The candidate text blocks obtained by the above processing contain false alarms and need to be verified: according to the geometric features of the text region, according to the stroke response, and according to gradient features. The verification stage screens out most of the false alarms in the localization results; the tracking and segmentation stages can still screen false alarms according to the information acquired there.
Step S102: judging whether two blocks are the same text block according to the positional relationship and image content of the text blocks in adjacent video frames;
When a tracked text block in the video frames disappears, i.e. no longer continues or is replaced by other text, the position of the text block is determined and the subsequent segmentation and character recognition are performed on it.
During text block localization, a text block usually persists in the video for some time, so the same text block is located in tens or even hundreds of consecutive frames. If every localization result were segmented and recognized, a large amount of processing time would be wasted. With tracking, the same text block is segmented and recognized only once during the period from its appearance to its disappearance, avoiding repeated processing. Moreover, the start/end time and the disappearance mode of a text block are important evidence for the layout analysis stage. Text blocks therefore need to be tracked.
The tracking stage comprises three parts: position judgment, temporal judgment and maintenance of the tracking array. Position judgment and temporal judgment analyze the localization results from two aspects, whether the positions overlap and whether the content continues; maintenance of the tracking array provides independent text blocks according to the processing logic. Specifically:
I) Position judgment
The position of the same text block remains fixed on successive frames, so the text block positions obtained during localization coincide, while different text blocks appear at different positions on successive frames and do not coincide; positional coincidence is therefore a necessary condition for two text blocks located on successive frames to be the same text block. There are four positional relations: separate, touching without overlap, overlapping, and containing; the judgment is made according to the proportion of the overlapping area in each text block. If the blocks are separate or merely touching, they are unrelated in position and are judged to be different text blocks; if they overlap or one contains the other, they may come from the same text block and a further judgment is needed. The boundary of the text block to be tracked is determined according to its positions on the preceding and following frames.
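A minimal sketch of the position judgment follows, with each text block given as a (left, top, right, bottom) box; the 0.8 containment ratio is an assumed threshold, not a value from the patent.
    def position_relation(a, b):
        # classify the positional relation of two located text blocks
        ax0, ay0, ax1, ay1 = a
        bx0, by0, bx1, by1 = b
        iw = min(ax1, bx1) - max(ax0, bx0)
        ih = min(ay1, by1) - max(ay0, by0)
        if iw <= 0 or ih <= 0:
            return "separate"        # independent, or touching without overlap
        inter = iw * ih
        smaller = min((ax1 - ax0) * (ay1 - ay0), (bx1 - bx0) * (by1 - by0))
        return "contain" if inter > 0.8 * smaller else "overlap"

    def may_be_same_block(a, b):
        # only overlapping or containing blocks may come from the same text block
        return position_relation(a, b) in ("overlap", "contain")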
II) Temporal judgment
Temporal judgment decides from the image content whether two text blocks located in adjacent frames come from the same text. There are four temporal relations: a) keep, the text in the two frames does not change; b) replace, the text in the previous frame is replaced by new text in the following frame and the text content differs; c) disappear, the text in the previous frame disappears; d) false alarm, the text region located in the previous frame is noise.
When the text position is fixed, the sum of squared differences between the gray images of the previous and following frames is an effective criterion for judging whether the text content has changed. If the pixels of character strokes and of the background within the text region are not distinguished and the sum of squared differences is computed over the whole region, the judgment is easily disturbed by background changes and becomes unstable; here only pixels with a large stroke response are compared, and since these points lie on character strokes the algorithm is more stable. The temporal judgment is made according to the gray difference and the stroke-response difference between the two text blocks.
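A minimal sketch of this temporal comparison, assuming two gray crops of the same tracked region and the stroke response of the earlier one; the response threshold and the change threshold are assumed values.
    import numpy as np

    def content_changed(prev_gray, curr_gray, prev_resp,
                        resp_thresh=10.0, change_thresh=200.0):
        # compare only pixels whose stroke response is large, i.e. stroke pixels
        mask = prev_resp > resp_thresh
        if not mask.any():
            return True              # no reliable stroke pixels: treat as changed
        diff = prev_gray.astype(np.float32) - curr_gray.astype(np.float32)
        msd = float(np.mean(diff[mask] ** 2))   # mean squared difference on strokes
        return msd > change_thresh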
III) Maintenance of the tracking array
In order to track the text blocks appearing in the video, a tracking array needs to be maintained. Specifically, a text block newly appearing in the current frame is added to the array together with its localization result; for a text block that continues to appear, its element is kept in the array; for a text block that disappears, its start/end time and disappearance mode are determined, the image of the best quality within its lifetime is selected and submitted to the segmentation stage, and the element is then deleted from the array.
Another task of maintaining the tracking array is to pick, from the frames in which a text block appears continuously, the frame of the best quality and submit it to the segmentation stage; this reduces the difficulty of segmentation and improves the final recognition accuracy.
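A minimal sketch of this bookkeeping follows; `may_be_same_block` is the position test sketched earlier, each detection is assumed to carry a quality score and an image crop, and the field names are illustrative.
    def update_tracks(tracks, detections, frame_no):
        # tracks: list of dicts {box, first, last, best_quality, best_frame}
        for det in detections:
            for trk in tracks:
                if trk["last"] == frame_no - 1 and may_be_same_block(trk["box"], det["box"]):
                    trk["box"], trk["last"] = det["box"], frame_no   # block continues
                    if det["quality"] > trk["best_quality"]:
                        trk["best_quality"], trk["best_frame"] = det["quality"], det["image"]
                    break
            else:                    # no match: a newly appearing text block
                tracks.append({"box": det["box"], "first": frame_no, "last": frame_no,
                               "best_quality": det["quality"], "best_frame": det["image"]})
        finished = [t for t in tracks if t["last"] < frame_no]   # disappeared blocks
        alive = [t for t in tracks if t["last"] >= frame_no]
        return alive, finished       # `finished` blocks go on to segmentation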
Step S103: obtaining the text block image and preprocessing it;
With reference to Fig. 3, before segmentation and recognition, when the video image is a color image it is converted into a gray image; the Chinese and English characters are then segmented and recognized respectively, and the resulting Chinese and English character strings are merged to obtain the text information. A gray image needs no such preprocessing, and the Chinese and English characters can be segmented and recognized directly.
The text block image is binarized to separate the characters from the background and determine the character boundaries;
connected domain analysis is performed on the resulting binary image to obtain the position and size information of the character strokes.
Preprocessing comprises conversion to a gray image, binarization and connected domain analysis. The candidate text region image obtained in the localization stage is a color image, while binarization and character recognition use a gray image, so a conversion is needed, specifically:
i) extracting the luminance component;
ii) extracting the color channel (R, G or B) of the color image on which the intensity contrast between characters and background is most obvious;
iii) converting the color space and changing the distance metric between colors (described in "Color text extraction from camera-based images: the impact of the choice of the clustering distance", The Proceedings of International Conference on Document Analysis and Recognition, 2005) to obtain a gray image with obvious intensity contrast between characters and background;
iv) color enhancement: one or more representative colors are specified for the characters and for the background respectively, the pixels of the color image are clustered with the K-means method, and the luminance component of the pixels is extracted as a gray image at the same time; character pixels are enhanced and background pixels suppressed on the gray image, increasing the intensity contrast between characters and background.
In practical applications, an appropriate conversion method should be configured according to the characteristics of the video image, especially the color contrast between characters and background, to improve the effect of the subsequent binarization.
Binarization separates the characters from the background in the image and lays the foundation for determining the character boundaries. Binarization is an important and widely studied direction in the OCR field, and many algorithms have been proposed, for example:
Global binarization algorithms: Otsu (described in "A threshold selection method from gray-scale histogram", IEEE Transactions on Systems, Man and Cybernetics, Vol. 9, Pages 62-66, 1979) and Kittler (described in "Minimum Error Thresholding", Pattern Recognition, Vol. 19, Issue 1, Pages 41-47, 1986).
Local binarization algorithms: Niblack (described in An Introduction to Digital Image Processing, Prentice Hall, 1986) and Sauvola (described in "Adaptive document image binarization", Pattern Recognition, Vol. 33, Issue 2, Pages 225-236, 2000, and "Efficient Implementation of Local Adaptive Thresholding Techniques Using Integral Images", The Proceedings of SPIE, 2008).
In application, different algorithms need to be selected according to the quality of the video images to be processed.
Connected domain analysis is performed on the resulting binary image to obtain the position and size information of the character strokes. It comprises three parts: labeling, screening and merging of connected domains. Connected domain labeling reflects the connectivity between pixels in the binary image (described in "Linear-time connected-component labeling based on sequential local operations", Computer Vision and Image Understanding, Vol. 89, Issue 1, Pages 1-23, 2003); after labeling, the position, size and pixel count of each connected region in the binary image are available. In connected domain screening, rules on features such as position, size, shape and fill ratio are designed to remove unreasonable connected domains, reducing interference for the subsequent processing. Since a Chinese character generally consists of several disjoint strokes, the choice of segmentation points will be affected if its connected domains are not merged reasonably (described in "Lexicon-Driven Segmentation and Recognition of Handwritten Character Strings for Japanese Address Reading", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 11, November 2002).
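A minimal sketch of this preprocessing chain with OpenCV, using the luminance component and Otsu's global threshold; the height limits used to screen connected domains are assumed values.
    import cv2

    def preprocess_text_block(bgr, dark_text_on_light=True, min_h=4, max_h=100):
        # i) luminance component as the gray image
        gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
        # binarization (Otsu); invert so that character pixels become foreground
        flag = cv2.THRESH_BINARY_INV if dark_text_on_light else cv2.THRESH_BINARY
        _, binary = cv2.threshold(gray, 0, 255, flag + cv2.THRESH_OTSU)
        # connected domain labeling: position, size and pixel count of each component
        n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary)
        boxes = []
        for i in range(1, n):                        # label 0 is the background
            x, y, w, h, area = stats[i]
            if min_h <= h <= max_h:                  # screen unreasonable components
                boxes.append((int(x), int(y), int(w), int(h), int(area)))
        return gray, binary, boxes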
Step S104: segmenting and recognizing the text block image according to Chinese character features to obtain a Chinese character string;
The Chinese character segmentation flow comprises four parts: determining the segmentation points, pre-segmentation, character recognition and character string filtering.
According to the actual situation, the strategies for determining segmentation points are:
a. Connected domain features of the characters (described in "A Survey of Methods and Strategies in Character Segmentation", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 18, No. 7, July 1996). In the simple, ideal case there is a certain spacing between characters, the strokes of different characters do not touch, and the height and width of the characters allow the segmentation points to be determined accurately by combining the connected domain analysis results with the configuration items.
b. Vertical projection of the gray image of the character region. In some programs the character spacing is small and the strokes of adjacent characters easily stick together, so connected domain analysis is unsuitable; instead, the segmentation points should be determined from the local minima of the vertical projection of the gray image of the character region, combined with the constraint on character width in the configuration items (a minimal sketch of this strategy is given after this list).
c. Background skeleton model (described in "A Background Thinning Based Approach for Separating and Recognizing Connected Handwritten Digit Strings", Pattern Recognition, Vol. 32, Pages 921-933, 1999). When the strokes of adjacent characters touch closely, the position and width of the stroke adhesion are judged from the vertical projection of the background pixels, and the segmentation points are determined in combination with the constraint on character width in the configuration items.
d. Contour model (described in "Lexicon-Driven Segmentation and Recognition of Handwritten Character Strings for Japanese Address Reading", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 11, November 2002). When strokes touch, some segmentation points can be determined according to the shape features of the outer contour of the connected domain.
In practical applications, an appropriate segmentation strategy should be selected according to the character features, or different strategies should be combined so that they complement each other, in order to determine the segmentation points comprehensively and accurately.
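A minimal sketch of strategy b follows: the vertical projection of the gray image, with local minima kept as candidate segmentation points only when they respect a minimum character width; both the polarity handling and the width value are assumed configuration items.
    import numpy as np

    def projection_cut_points(gray, dark_text_on_light=True, min_char_width=8):
        # candidate segmentation columns from local minima of the vertical projection
        img = 255 - gray if dark_text_on_light else gray      # make strokes bright
        profile = img.astype(np.float32).sum(axis=0)          # column-wise projection
        cuts, last = [0], 0
        for x in range(1, profile.size - 1):
            local_min = profile[x] <= profile[x - 1] and profile[x] <= profile[x + 1]
            if local_min and x - last >= min_char_width:      # character width constraint
                cuts.append(x)
                last = x
        cuts.append(profile.size - 1)
        return cuts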
During pre-segmentation, the boundaries of candidate characters are determined from the segmentation points. If the character width is fixed, the character width in the configuration item is used directly as the constraint to determine the candidate character boundaries from the segmentation points; if the character width varies within a certain range with the typesetting, histogram statistics combined with the variation range of the character width are used to estimate the character width in the current case, and this estimate is then used as the constraint to determine the candidate character boundaries from the segmentation points (described in the application No. 200810246654.7 entitled "character extraction method and device").
During character recognition, the image of a single character is cropped from the image according to the position of the candidate character and recognized. Character recognition uses the Tsinghua Wentong OCR engine; the best recognition result for the current image is taken as the final recognition result, and the confidence of the current recognition result is calculated from the number of candidate results returned and the distances to the prototypes (described in "Adaptive Confidence Transform Based Classifier Combination for Chinese Character Recognition", Pattern Recognition Letters, Vol. 19, No. 10, 1998), which serves as the basis for character string filtering.
Character string segmentation adopts an over-segmentation strategy: the number of candidate characters is larger than the true number of characters and the recognition results contain misrecognized characters, so filtering is needed to obtain the correct character string. During filtering, candidates are accepted or rejected according to the positional overlap between adjacent candidate characters and their recognition confidences; the character string obtained after filtering is output as the final result.
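A minimal sketch of this filtering step over the over-segmented candidates, each given as (left, right, character, confidence); greedy selection by confidence with an assumed 0.3 overlap tolerance stands in for the acceptance rule, which the patent does not spell out in detail.
    def filter_candidates(cands, max_overlap=0.3):
        # cands: list of (left, right, char, conf); keep high-confidence candidates
        # that do not overlap the already accepted ones too much
        chosen = []
        for cand in sorted(cands, key=lambda c: c[3], reverse=True):
            left, right = cand[0], cand[1]
            clash = False
            for kl, kr, _, _ in chosen:
                inter = min(right, kr) - max(left, kl)
                if inter > max_overlap * min(right - left, kr - kl):
                    clash = True
                    break
            if not clash:
                chosen.append(cand)
        chosen.sort(key=lambda c: c[0])              # back to reading order
        return "".join(c[2] for c in chosen)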
Step S105: determining English regions according to the geometric features and position information of connected domains in the text block image, and segmenting and recognizing the English regions to obtain an English character string;
In mixed Chinese/English text, a single English character or a combination of adjacent English characters is often misrecognized as a Chinese character, while a radical of a Chinese character or a simple Chinese character with few strokes may be misrecognized as an English character; English segmentation therefore cannot rely on the recognition results alone.
In the embodiment of the invention, the English regions are determined first according to appearance features, and segmentation and recognition are then performed with this bias; this comprises English region judgment and English character recognition, and the recognition result is output in the form of an English character string.
In the candidate English region judgment stage, the candidate English regions in the image are found according to the geometric features and adjacency of the connected domains. In mixed Chinese/English text, English characters differ from Chinese characters in two respects: the widths of Chinese and English characters differ, the English character width being smaller; and the center distance between English characters is smaller while that between Chinese characters is larger, the center distance changing at the junction of Chinese and English characters.
The size and position information of the connected domains is obtained from the preprocessing result. English characters are single connected units; ignoring adhesion, the width of an English character's connected domain is its character width, while the width of Chinese characters is obtained in the Chinese character segmentation stage. The center distance of characters is the distance between the centers of the connected domains of adjacent characters. By computing the character widths and center positions and combining the two characteristics above, the candidate English regions can be determined.
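A minimal sketch of the candidate English region test follows, using connected-domain widths and center distances; `cjk_width` stands for the character width estimated in the Chinese segmentation stage, and the 0.6 ratios are assumed thresholds.
    def candidate_english_runs(components, cjk_width, width_ratio=0.6, gap_ratio=0.6):
        # components: list of (x, y, w, h) sorted by x
        # returns index ranges [start, end) of runs that look like English/digits
        runs, start = [], None
        for i, (x, _, w, _) in enumerate(components):
            narrow = w < width_ratio * cjk_width          # English characters are narrower
            if narrow and start is None:
                start = i
            close = False
            if narrow and i + 1 < len(components):
                nx, _, nw, _ = components[i + 1]
                center_gap = (nx + nw / 2.0) - (x + w / 2.0)
                close = center_gap < gap_ratio * cjk_width    # small center distance
            if start is not None and not (narrow and close):
                runs.append((start, i + 1 if narrow else i))
                start = None
        if start is not None:
            runs.append((start, len(components)))
        return runs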
The candidate English regions determined in this way often contain non-English content, such as punctuation marks or strokes of Chinese characters, which can be removed in the Chinese/English character merging stage.
A self-developed OCR engine is used for English character recognition: (1) the recognition engine focuses only on English letters and digits, and since the number of classes to distinguish is very small a high recognition accuracy can be obtained; (2) the samples can be extended and the training set customized according to the actual situation, making the recognition results closer to the real application.
The recognition engine extracts a combined feature consisting of the directional line element feature of the character (described in "A Handwritten Character Recognition System Using Directional Element Feature and Asymmetric Mahalanobis Distance", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 21, No. 3, March 1999) and the gradient feature (described in "Normalization-Cooperated Gradient Feature Extraction for Handwritten Character Recognition", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 29, No. 8, 2007); the feature is reduced in dimension with LDA (described in Introduction to Statistical Pattern Recognition, 2nd edition, Academic Press, New York, 1990); the classifier is trained with the DLQDF algorithm (described in "Discriminative Learning Quadratic Discriminant Function for Handwriting Recognition", IEEE Transactions on Neural Networks, Vol. 15, No. 2, March 2004). The classifier outputs the recognition result and its confidence, and the confidence calculation method is the same as for Chinese characters.
Step S106: calculating the recognition confidences of the obtained Chinese and English characters respectively, and correcting the recognition confidences;
Because Chinese and English recognition use different engines, the scales of the prototype spaces of the two engines differ greatly and the sample distance metrics also differ, so the computed recognition confidences are not comparable and the two kinds of recognition confidences need to be corrected before merging. Confidence correction is generally carried out on the same sample space, but here the Chinese and English characters are recognized separately and the sample spaces of the two engines do not overlap, so direct correction is impossible.
With reference to Fig. 4, for example, assuming that the recognition confidences of Chinese and English characters follow Gaussian distributions (described in "Classifier Combination Based on Confidence Transformation", Pattern Recognition, Vol. 38, Pages 11-28, 2005), the recognition confidence of the Chinese characters is taken as the reference and the recognition confidence of the English characters is corrected as follows:
(1) on a sample set (news headlines), the recognition confidences of the Chinese characters are divided into 5 levels according to their statistics, and the mean confidence of each level, a1, a2, a3, a4, a5, is computed;
(2) the English characters in the same headline are assigned the same level as the Chinese characters;
(3) the mean confidence of the English characters of each level, b1, b2, b3, b4, b5, is computed;
(4) a linear fit is performed between the mean confidences of the five levels of the Chinese and English characters (described in Statistical Inference, China Machine Press, 2005);
(5) the recognition confidence of the English characters is redefined according to the fitting parameters.
In this way the corrected English characters have confidences consistent with those of the Chinese characters.
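A minimal sketch of this five-level correction with NumPy follows. The grouping of Chinese confidences by quantile and the assignment of each headline's majority level to its English characters are assumptions about details the patent leaves open; the linear fit and the final mapping follow steps (4) and (5).
    import numpy as np

    def fit_english_correction(zh_conf, zh_headline, en_conf, en_headline, levels=5):
        # zh_conf/en_conf: raw confidences; zh_headline/en_headline: headline id per character
        zh_conf = np.asarray(zh_conf, dtype=float)
        en_conf = np.asarray(en_conf, dtype=float)
        zh_headline = np.asarray(zh_headline)
        edges = np.quantile(zh_conf, np.linspace(0.0, 1.0, levels + 1))
        zh_level = np.clip(np.searchsorted(edges, zh_conf, side="right") - 1, 0, levels - 1)
        a = np.array([zh_conf[zh_level == k].mean() for k in range(levels)])   # a1..a5
        # an English character takes the level of the Chinese characters in its headline
        head_level = {h: np.bincount(zh_level[zh_headline == h], minlength=levels).argmax()
                      for h in np.unique(zh_headline)}
        en_level = np.array([head_level[h] for h in en_headline])
        b = np.array([en_conf[en_level == k].mean() for k in range(levels)])   # b1..b5
        slope, intercept = np.polyfit(b, a, 1)        # linear fit of b onto a
        return slope, intercept

    def correct_english_confidence(conf, slope, intercept):
        # redefine the English confidence on the Chinese scale
        return slope * conf + intercept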
Step S107: merging the Chinese character string and the English character string based on the corrected character recognition confidences and the positional relationship between the Chinese and English characters to obtain the text information.
In the merging stage, the two character strings are merged by comparing the Chinese and English character strings on position and recognition confidence, and the merged result is output as the final result. The embodiment of the invention adopts a "plug-in" strategy, specifically:
English characters that were omitted are inserted at the appropriate positions in the Chinese character string; the omission occurs because during Chinese character pre-segmentation the width of the English character did not meet the requirement and it was screened out;
where Chinese and English characters overlap, the recognition confidences of the two kinds of characters are compared and results misrecognized as Chinese characters are replaced by English recognition results of higher confidence; the misrecognition occurs because two adjacent English characters were treated as one Chinese character during pre-segmentation.
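A minimal sketch of the "plug-in" merge follows, with each recognized character given as (left, right, character, confidence); treating any horizontal intersection as a conflict and requiring the English confidence to beat all conflicting Chinese results are assumptions about the exact rule.
    def merge_strings(zh_chars, en_chars):
        # zh_chars / en_chars: lists of (left, right, char, conf) in reading order
        merged = list(zh_chars)
        for el, er, ech, econf in en_chars:
            clash = [i for i, (zl, zr, _, _) in enumerate(merged) if min(er, zr) > max(el, zl)]
            if not clash:
                merged.append((el, er, ech, econf))          # omitted character: insert it
            elif all(merged[i][3] < econf for i in clash):
                # misrecognized Chinese result(s): replace by the higher-confidence English one
                merged = [m for i, m in enumerate(merged) if i not in clash]
                merged.append((el, er, ech, econf))
        merged.sort(key=lambda c: c[0])
        return "".join(c[2] for c in merged)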
For example, as shown in Fig. 5, a text image with mixed Chinese, English and digits is captured from the screen, its content being a headline about London escorting the G20 summit with 7,200,000 pounds. In the result obtained by Chinese character segmentation and recognition alone, the combinations containing '7' and '72' are removed in the character string filtering stage and '20' is misrecognized as a Chinese character; by comparing the Chinese and English character strings on position and recognition confidence, the correct headline is obtained after merging.
Step S108: analyzing the layout of the video image to obtain the text features in the video image; organizing and classifying the text information obtained after merging.
The text contained in video is of many kinds, and different kinds of text carry different meanings; as shown in Fig. 6, the text in the image includes types such as title, subtitle, station logo, caption and scrolling bar. In video search and automatic video cataloguing, structured text information needs to be extracted from the video, and the type of the text is a feature as important as its content.
The text is organized and classified carefully and accurately according to the text features, and structured text information is output to meet the needs of different applications; as shown in Fig. 7, this comprises feature collection, text organization and text classification. Layout analysis uses the temporal features of text blocks, which can only be determined after a program segment has been processed, so an offline mode is adopted, i.e. layout analysis is carried out after a program segment has been processed.
Layout analysis comprises feature collection, text organization and text classification.
The text features used in layout analysis include:
Polarity, which reflects the relative brightness of the characters and the background in the text region; polarity 0 denotes dark characters on a light background and polarity 1 denotes light characters on a dark background. The segmentation stage can judge the text polarity automatically with an algorithm, or the polarity can be given in the configuration file to guide segmentation.
Color, including the character color and the background color. In some cases polarity is not enough to distinguish different types of text; for example, white and yellow characters on a red background both have polarity 1, and color information then needs to be considered.
Character size, including the mean width and height of a single character in the text line. The width and height of single characters are available after pre-segmentation in the segmentation stage, and the mean width and height of single characters in the text line are computed from them.
Text block position, including the upper, lower, left and right boundaries of the text block.
Recognition result: the character string obtained after the text block image is segmented and recognized, provided by the segmentation stage.
Start/end time of the text block: the moments at which the text block appears and disappears.
Temporal relation of the text block: the tracking stage provides four relations during temporal judgment, namely keep, disappear, replace and false alarm; the text blocks submitted belong to two kinds, disappear and replace.
These features are the basis of layout analysis; in the subsequent processing, the features should be combined flexibly and rules designed according to the characteristics of the video being processed, and there is no unified processing flow.
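A minimal sketch of a record collecting the features listed above for one text block; the field names and types are illustrative, not the patent's.
    from dataclasses import dataclass
    from typing import Tuple

    @dataclass
    class TextBlockFeatures:
        polarity: int                      # 0: dark text on light background, 1: the reverse
        char_color: Tuple[int, int, int]   # representative character color
        bg_color: Tuple[int, int, int]     # representative background color
        char_width: float                  # mean single-character width in the line
        char_height: float                 # mean single-character height in the line
        box: Tuple[int, int, int, int]     # upper, lower, left, right boundaries
        text: str                          # recognition result of the text block
        start_frame: int                   # frame at which the block appears
        end_frame: int                     # frame at which the block disappears
        temporal_relation: str             # "disappear" or "replace"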
Text organization includes: merging multi-line text on the same frame; merging the same text block across consecutive frames.
After projection cutting, the text blocks being processed are all single-line text, and these single lines may need to be combined to express a complete meaning, such as a multi-line news headline. Within the same frame, spatially scattered single-line text groups are combined into a complete logic unit according to information such as the position, character size and color of the text blocks, in combination with the characteristics of the video being processed.
In some cases, text that appears consecutively may need to be combined to express a complete meaning, or the same text may appear intermittently several times, as with a news headline; temporally scattered text groups then need to be combined into a complete logic unit according to information such as the recognition results, character size and color of the text blocks.
Text classification: the form in which text appears differs between video programs. For one class of program some classification rules can be summarized by observation, but in another class of program these rules may no longer hold. Text classification therefore has no concrete unified processing flow, and classification can be done by combining text features with templates.
Embodiment two
With reference to Fig. 8, an embodiment of the invention also provides a device 200 for extracting video text information, comprising:
a position determination unit 210 for determining the position of the text block region in a video image;
a Chinese character processing unit 220 for segmenting and recognizing the text block according to Chinese character features to obtain a Chinese character string;
an English character processing unit 230 for determining English regions according to the geometric features and position information of connected domains in the text block, and segmenting and recognizing the English regions to obtain an English character string;
a computing unit 240 for calculating the recognition confidences of the obtained Chinese and English characters respectively and correcting the recognition confidences;
a merging unit 250 for merging the Chinese character string and the English character string based on the corrected character recognition confidences and the positional relationship between the Chinese and English characters to obtain the text information.
The device 200 further comprises:
a monitoring and tracking unit 260 for monitoring and tracking text blocks in consecutive video frames;
a judging unit 270 for judging whether two blocks are the same text block according to the position information and image content of the text blocks in adjacent video frames provided by the monitoring and tracking unit;
if the video frames contain different text blocks, the judging unit 270 determines the regions of these different text blocks, and the Chinese character processing unit 220 and the English character processing unit 230 then segment and recognize these text blocks respectively.
The computing unit 240 includes a correction subunit 241 for correcting the recognition confidence of the English characters with the recognition confidence of the Chinese characters as the reference; the correction subunit 241 comprises:
a division module 241a for dividing the recognition confidences of the Chinese characters into several levels and calculating the mean confidence of each level, the English characters of the same text line having the same level as the Chinese characters;
a computing module 241b for calculating the mean confidence of the English characters of each level;
an adjusting module 241c for performing a linear fit between the mean confidences of each level of the Chinese and English characters, and redefining the recognition confidence of the English characters according to the fitting parameters.
The device 200 is also provided with a preprocessing unit 270 for preprocessing the text block before it is segmented and recognized; the preprocessing unit 270 specifically comprises:
an image processing module 270a for binarizing the text block region image and separating the characters from the background in the image to determine the character boundaries;
an image analysis module 270b for performing connected domain analysis on the resulting binary image to obtain the position and size information of the character strokes.
In summary, the method and device for extracting structured video text information provided by the invention determine the position of a text block in a video image by localization; track the text block; segment and recognize the text block image according to Chinese and English character features respectively to obtain Chinese and English character strings; correct the recognition confidences of the Chinese and English characters; and merge the Chinese and English character strings based on the corrected character recognition confidences and the positional relationship between the Chinese and English characters to obtain the text information. According to the invention, character segmentation and recognition can be performed on mixed Chinese/English text in video images, the problem that video text of different styles is difficult to handle in a unified pipeline can be solved, and different types of text information in a video can be organized and classified. The architecture handles various types of video effectively and can also be conveniently customized, modified and extended.
The disclosed embodiments enable those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be obvious to those skilled in the art, and the general principles defined here may also be applied to other embodiments without departing from the scope and purport of the invention. The embodiments described above are only preferred embodiments of the invention and are not intended to limit it; any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall be included within its protection scope.

Claims (10)

1. A method for extracting video text information, characterized by comprising:
determining the position of a text block in a video image;
segmenting and recognizing characters in the text block image according to Chinese character features to obtain a Chinese character string;
determining English regions according to the geometric features and position information of connected domains in the text block image, and segmenting and recognizing characters in the English regions to obtain an English character string;
calculating the recognition confidences of the obtained Chinese and English characters respectively, and correcting the recognition confidences;
merging the Chinese character string and the English character string based on the corrected character recognition confidences and the positional relationship between the Chinese and English characters to obtain the text information.
2. The method of claim 1, characterized by further comprising:
monitoring and tracking text blocks in consecutive video frames, and judging whether two blocks are the same text block according to the positional relationship and image content of the text blocks in adjacent video frames;
when the text block disappears, determining the position of the text block and performing the subsequent segmentation and character recognition on it.
3. The method of claim 2, characterized in that judging whether two blocks are the same text block according to the positional relationship and image content of the text blocks in adjacent video frames is specifically:
if the regions of the text blocks in adjacent video frames are separate or do not overlap, judging that the text blocks in the adjacent video frames are different text blocks;
if the regions of the text blocks in adjacent video frames overlap or one contains the other, judging that the text blocks in the adjacent video frames are the same text block.
4. The method of claim 1, characterized in that the step of correcting the recognition confidences comprises, taking the recognition confidence of the Chinese characters as the reference, correcting the recognition confidence of the English characters:
dividing the recognition confidences of the Chinese characters into several levels and calculating the mean confidence of each level, the English characters of the same text line having the same level as the Chinese characters;
calculating the mean confidence of the English characters of each level;
taking the mean confidence of each level of the Chinese characters as the benchmark, performing a linear fit between the mean confidences of the Chinese and English characters of the same level;
redefining the recognition confidence of the English characters according to the fitting parameters.
5. The method of claim 1, characterized by further comprising, before the text block is segmented and recognized, a step of preprocessing the text block region image:
when the video image is a color image, converting the video image into a gray image;
binarizing the text block region image, and separating the characters from the background in the image to determine the character boundaries;
performing connected domain analysis on the resulting binary image to obtain the position and size information of the character strokes.
6. The method of claim 1, characterized by further comprising:
performing layout analysis on the video image to obtain the text features in the video image;
organizing and classifying the text information according to the text features.
7. a device that extracts video text message is characterized in that, comprising:
Position determination unit is used for determining the position of video image Chinese version piece;
First processing unit is cut apart and character recognition described text block according to the Chinese character feature, obtains the Chinese character string;
Second processing unit is determined English zone according to the geometric properties and the positional information of connected domain in the described text block, and described English zone is cut apart and character recognition, obtains the English character string;
Computing unit is used for calculating respectively the recognition confidence of resulting Chinese character, English character, and recognition confidence is proofreaied and correct;
Merge cells is used for based on character recognition degree of confidence after proofreading and correct and the relation of the position between Chinese character and the English character described Chinese character string and Chinese character string being merged, and obtains text message.
8. device as claimed in claim 7 is characterized in that, also comprises:
The monitoring tracking cell, the text block that is used for monitoring and following the tracks of the continuous videos picture frame;
Judging unit, positional information and picture material that the adjacent video picture frame Chinese version piece that provides according to described monitoring tracking cell is provided judge whether to be the one text piece;
If in the described video frame image is different text block, described judging unit is determined the zone of this difference text block, and then described first processing unit is cut apart and character recognition these different text block respectively with second processing unit.
9. device as claimed in claim 7 is characterized in that, has the syndrome unit in the described computing unit, is used for being as the criterion with the recognition confidence of Chinese character, and the recognition confidence of English character is proofreaied and correct, and this syndrome unit comprises:
Diversity module is used for the recognition confidence of Chinese character is divided into some grades, and calculates the degree of confidence average of each grade, and the English character of same capable text block has identical grade with Chinese character;
Computing module is used to calculate the degree of confidence average of the English character of each grade;
Adjusting module, the degree of confidence average that is used for each grade of Chinese character is a target, the degree of confidence average of centering, each grade of English character is carried out linear fit; And, redefine the recognition confidence of English character according to fitting parameter.
10. The device as claimed in claim 7, characterized in that a preprocessing unit is further provided, configured to preprocess the text block before the text block is segmented and recognized, the preprocessing unit specifically comprising:
an image processing module, configured to convert the text block image into a grayscale image, binarize the grayscale image, and separate the characters from the background in the image to determine character boundaries;
an image analysis module, configured to perform connected component analysis on the generated binary image to obtain the position and size information of character strokes.
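For the preprocessing unit of claim 10, an OpenCV sketch of one reasonable realization: grayscale conversion, Otsu binarization to separate characters from background, and connected component analysis to recover stroke positions and sizes. The Otsu threshold and the polarity assumption (dark text on a light background) are illustrative choices, not details from the patent:

    import cv2

    def preprocess_text_block(block_bgr):
        """Grayscale + binarize a text block and return stroke bounding boxes.

        Returns the binary image and a list of (x, y, w, h) boxes, one per
        connected component (a stroke or group of touching strokes).
        """
        gray = cv2.cvtColor(block_bgr, cv2.COLOR_BGR2GRAY)

        # Otsu's threshold separates characters from background; THRESH_BINARY_INV
        # assumes dark text on a lighter background (swap for the opposite polarity).
        _, binary = cv2.threshold(gray, 0, 255,
                                  cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

        # Connected component analysis; each stats row is [x, y, w, h, area].
        n_labels, _, stats, _ = cv2.connectedComponentsWithStats(binary, connectivity=8)
        boxes = []
        for label in range(1, n_labels):                 # label 0 is the background
            x, y, w, h, area = stats[label]
            if area >= 3:                                # drop isolated noise pixels
                boxes.append((int(x), int(y), int(w), int(h)))
        return binary, boxes

cv2.connectedComponentsWithStats is used here because it returns bounding boxes and areas in a single call; contour extraction would serve equally well.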
CN201010104243A 2010-01-29 2010-01-29 Method for extracting video text message and device thereof Pending CN101777124A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010104243A CN101777124A (en) 2010-01-29 2010-01-29 Method for extracting video text message and device thereof

Publications (1)

Publication Number Publication Date
CN101777124A true CN101777124A (en) 2010-07-14

Family

ID=42513582

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010104243A Pending CN101777124A (en) 2010-01-29 2010-01-29 Method for extracting video text message and device thereof

Country Status (1)

Country Link
CN (1) CN101777124A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101002198A (en) * 2004-06-23 2007-07-18 Google公司 Systems and methods for spell correction of non-roman characters and words
CN1808468A (en) * 2005-01-17 2006-07-26 佳能信息技术(北京)有限公司 Optical character recognition method and system
CN101097600A (en) * 2006-06-29 2008-01-02 北大方正集团有限公司 Character recognizing method and system
CN101251892A (en) * 2008-03-07 2008-08-27 北大方正集团有限公司 Method and apparatus for cutting character

Cited By (45)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102169413A (en) * 2011-03-30 2011-08-31 黄冬明 Device and method for obtaining character stroke lines based on video stream image
CN102411775A (en) * 2011-08-10 2012-04-11 Tcl集团股份有限公司 Method and system for correcting brightness of text image
CN102411775B (en) * 2011-08-10 2015-02-11 Tcl集团股份有限公司 Method and system for correcting brightness of text image
CN104106078A (en) * 2012-01-09 2014-10-15 高通股份有限公司 Ocr cache update
CN105247509A (en) * 2013-03-11 2016-01-13 微软技术许可有限责任公司 Detection and reconstruction of east asian layout features in a fixed format document
US10127221B2 (en) 2013-03-11 2018-11-13 Microsoft Technology Licensing, Llc Detection and reconstruction of East Asian layout features in a fixed format document
CN104751153A (en) * 2013-12-31 2015-07-01 中国科学院深圳先进技术研究院 Scene text recognizing method and device
CN104751153B (en) * 2013-12-31 2018-08-14 中国科学院深圳先进技术研究院 A kind of method and device of identification scene word
CN103744971B (en) * 2014-01-10 2017-09-15 广东小天才科技有限公司 Method and equipment for actively pushing information
CN103744971A (en) * 2014-01-10 2014-04-23 广东小天才科技有限公司 Method and equipment for actively pushing information
CN104933429A (en) * 2015-06-01 2015-09-23 深圳市诺比邻科技有限公司 Method and device for extracting information from image
CN106845473B (en) * 2015-12-03 2020-06-02 富士通株式会社 Method and device for determining whether image is image with address information
CN106845473A (en) * 2015-12-03 2017-06-13 富士通株式会社 For determine image whether be the image with address information method and apparatus
CN106127118A (en) * 2016-06-15 2016-11-16 珠海迈科智能科技股份有限公司 A kind of English word recognition methods and device
CN107784301B (en) * 2016-08-31 2021-06-11 百度在线网络技术(北京)有限公司 Method and device for recognizing character area in image
CN107784301A (en) * 2016-08-31 2018-03-09 百度在线网络技术(北京)有限公司 Method and apparatus for identifying character area in image
CN106650714A (en) * 2016-10-08 2017-05-10 迪堡金融设备有限公司 Paper note serial number identification method and apparatus
CN107301414A (en) * 2017-06-23 2017-10-27 厦门商集企业咨询有限责任公司 Chinese positioning, segmentation and recognition methods in a kind of natural scene image
CN107301414B (en) * 2017-06-23 2020-07-07 厦门商集网络科技有限责任公司 Chinese positioning, segmenting and identifying method in natural scene image
CN109389115A (en) * 2017-08-11 2019-02-26 腾讯科技(上海)有限公司 Text recognition method, device, storage medium and computer equipment
CN109389139A (en) * 2017-08-11 2019-02-26 中国农业大学 A kind of locust method of counting and device
CN109389115B (en) * 2017-08-11 2023-05-23 腾讯科技(上海)有限公司 Text recognition method, device, storage medium and computer equipment
WO2019085971A1 (en) * 2017-11-03 2019-05-09 腾讯科技(深圳)有限公司 Method and apparatus for positioning text over image, electronic device, and storage medium
US11087168B2 (en) 2017-11-03 2021-08-10 Tencent Technology (Shenzhen) Company Ltd Method and apparatus for positioning text over image, electronic apparatus, and storage medium
CN107920272A (en) * 2017-11-14 2018-04-17 维沃移动通信有限公司 A kind of barrage screening technique, device and mobile terminal
CN110532833A (en) * 2018-05-23 2019-12-03 北京国双科技有限公司 A kind of video analysis method and device
CN109508406A (en) * 2018-12-12 2019-03-22 北京奇艺世纪科技有限公司 A kind of information processing method, device and computer readable storage medium
CN109508406B (en) * 2018-12-12 2020-11-13 北京奇艺世纪科技有限公司 Information processing method and device and computer readable storage medium
CN110032348B (en) * 2019-03-21 2022-05-24 北京空间飞行器总体设计部 Character display method, device and medium
CN110032348A (en) * 2019-03-21 2019-07-19 北京空间飞行器总体设计部 A kind of character display method, device, medium
CN110147724A (en) * 2019-04-11 2019-08-20 北京百度网讯科技有限公司 For detecting text filed method, apparatus, equipment and medium in video
CN110147724B (en) * 2019-04-11 2022-07-01 北京百度网讯科技有限公司 Method, apparatus, device, and medium for detecting text region in video
CN110188762A (en) * 2019-04-23 2019-08-30 山东大学 Chinese and English mixing merchant store fronts title recognition methods, system, equipment and medium
CN110188762B (en) * 2019-04-23 2021-02-05 山东大学 Chinese-English mixed merchant store name identification method, system, equipment and medium
CN110717492A (en) * 2019-10-16 2020-01-21 电子科技大学 Method for correcting direction of character string in drawing based on joint features
CN110717492B (en) * 2019-10-16 2022-06-21 电子科技大学 Method for correcting direction of character string in drawing based on joint features
CN112749599A (en) * 2019-10-31 2021-05-04 北京金山云网络技术有限公司 Image enhancement method and device and server
CN111310441A (en) * 2020-01-20 2020-06-19 上海眼控科技股份有限公司 Text correction method, device, terminal and medium based on BERT (binary offset transcription) voice recognition
CN111291794A (en) * 2020-01-21 2020-06-16 上海眼控科技股份有限公司 Character recognition method, character recognition device, computer equipment and computer-readable storage medium
CN112101317A (en) * 2020-11-17 2020-12-18 深圳壹账通智能科技有限公司 Page direction identification method, device, equipment and computer readable storage medium
CN112418215A (en) * 2020-11-17 2021-02-26 峰米(北京)科技有限公司 Video classification identification method and device, storage medium and equipment
CN112101324A (en) * 2020-11-18 2020-12-18 鹏城实验室 Multi-view image coexisting character detection method, equipment and computer storage medium
CN112633343A (en) * 2020-12-16 2021-04-09 国网江苏省电力有限公司检修分公司 Power equipment terminal strip wiring checking method and device
CN112633343B (en) * 2020-12-16 2024-04-19 国网江苏省电力有限公司检修分公司 Method and device for checking wiring of power equipment terminal strip
WO2024139300A1 (en) * 2022-12-30 2024-07-04 成都云天励飞技术有限公司 Video text processing method and apparatus, and electronic device and storage medium

Similar Documents

Publication Publication Date Title
CN101777124A (en) Method for extracting video text message and device thereof
CN107748888B (en) A kind of image text row detection method and device
CN102332096B (en) Video caption text extraction and identification method
CN104951784B (en) A kind of vehicle is unlicensed and license plate shading real-time detection method
CN101510258B (en) Certificate verification method, system and certificate verification terminal
CN103761531B (en) The sparse coding license plate character recognition method of Shape-based interpolation contour feature
CN109255350B (en) New energy license plate detection method based on video monitoring
Kumar et al. Segmentation of isolated and touching characters in offline handwritten Gurmukhi script recognition
CN103034848B (en) A kind of recognition methods of form types
CN108596166A (en) A kind of container number identification method based on convolutional neural networks classification
CN102629322B (en) Character feature extraction method based on stroke shape of boundary point and application thereof
Sheikh et al. Traffic sign detection and classification using colour feature and neural network
CN103679678B (en) A kind of semi-automatic splicing restored method of rectangle character features a scrap of paper
CN106875546A (en) A kind of recognition methods of VAT invoice
CN103903018A (en) Method and system for positioning license plate in complex scene
CN102663377A (en) Character recognition method based on template matching
CN103902981A (en) Method and system for identifying license plate characters based on character fusion features
CN103914680A (en) Character image jet-printing, recognition and calibration system and method
AU2009281901A1 (en) Segmenting printed media pages into articles
Roy et al. Wavelet-gradient-fusion for video text binarization
CN103336961A (en) Interactive natural scene text detection method
Garz et al. A binarization-free clustering approach to segment curved text lines in historical manuscripts
CN104834891A (en) Method and system for filtering Chinese character image type spam
Garlapati et al. A system for handwritten and printed text classification
CN103049749A (en) Method for re-recognizing human body under grid shielding

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C53 Correction of patent of invention or patent application
CB03 Change of inventor or designer information

Inventor after: Zhou Jingchao

Inventor after: Miao Guangyi

Inventor after: Bao Dongshan

Inventor before: Zhou Jingchao

Inventor before: Miao Guangyi

Inventor before: Bao Dongshan

C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20100714