CN107203763A - Character recognition method and device - Google Patents


Publication number
CN107203763A
Authority
CN
China
Prior art keywords
languages
recognition result
picture
identified
significant character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610157743.9A
Other languages
Chinese (zh)
Other versions
CN107203763B (en)
Inventor
张明明
杨建武
于晓明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New Founder Holdings Development Co ltd
Peking University
Beijing Founder Electronics Co Ltd
Original Assignee
Peking University
Peking University Founder Group Co Ltd
Beijing Founder Electronics Co Ltd
Application filed by Peking University, Peking University Founder Group Co Ltd, and Beijing Founder Electronics Co Ltd
Priority to CN201610157743.9A
Publication of CN107203763A
Application granted
Publication of CN107203763B
Legal status: Expired - Fee Related

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Discrimination (AREA)

Abstract

The present invention provides a character recognition method and device. The method includes: obtaining an image to be recognized; performing text recognition on the image using the OCR core library of at least one language to generate a recognition result for each language, each recognition result including at least one character; determining the valid character ratio of the recognition result of each language; and determining, according to those valid character ratios, the language of the text in the image and the valid recognition result for that text. The language of the text no longer has to be determined manually before recognition is performed: the language is determined automatically at the same time as the recognition result, so no manual operation is required, recognition time is shortened, and recognition efficiency is improved.

Description

Character recognition method and device
Technical field
The present invention relates to multimedia technology, and in particular to a character recognition method and device.
Background
With the continuing development of multimedia technology, multimedia material such as pictures and videos is used in more and more technologies. Pictures, videos, and other multimedia content often carry a large amount of text, such as explanatory text in pictures, and subtitles and other descriptive text in videos. Processing such content requires extracting the text from a single picture or from each frame of a video, and extracting the text requires a text recognition method.
In the prior art, after the language of the text in a picture has been determined manually, the optical character recognition (OCR) core library corresponding to that language is used to recognize the text.
However, text recognition is generally performed on a large number of pictures. Because the prior art requires the language of the text in each picture to be determined manually before the OCR core library for that language can be applied, a large amount of manual work is needed, recognition takes a long time, and recognition efficiency is low.
Summary of the invention
The present invention provides a character recognition method and device to solve the problem in the prior art that the language of the text in each picture must be determined manually before the OCR core library for that language can be used, which requires a large amount of manual work, lengthens recognition time, and lowers recognition efficiency.
One aspect of the present invention provides a character recognition method, including:
obtaining an image to be recognized;
performing text recognition on the image to be recognized using the OCR core library of at least one language, and generating a recognition result for each language, each recognition result including at least one character;
determining the valid character ratio of the recognition result of each language;
determining, according to the valid character ratio of the recognition result of each language, the language of the text in the image to be recognized and the valid recognition result for that text.
In the method described above, determining the valid character ratio of the recognition result of each language includes:
determining the number of characters in the recognition result of each language, and determining the character code of each character in the recognition result of each language;
determining, for the recognition result of each language, the number of valid characters, that is, characters whose character code falls within the character code range of that language;
determining the valid character ratio of the recognition result of each language from the number of characters and the number of valid characters in that recognition result.
In the method described above, determining, according to the valid character ratio of the recognition result of each language, the language of the text in the image to be recognized and the valid recognition result for that text includes:
comparing the valid character ratios of the recognition results of the languages, determining the language with the largest valid character ratio to be the language of the text in the image to be recognized, and determining the recognition result of that language to be the valid recognition result for the text in the image to be recognized.
In the method described above, performing text recognition on the image to be recognized using the OCR core library of at least one language and generating a recognition result for each language includes:
performing text recognition on the image to be recognized using the OCR core libraries of three languages and generating a recognition result for each language, where the three OCR core libraries are the Chinese OCR core library, the English OCR core library, and the Tibetan OCR core library.
Accordingly, determining, according to the valid character ratio of the recognition result of each language, the language of the text in the image to be recognized and the valid recognition result for that text includes the following, where R1, R2, and R3 denote the valid character ratios of the Tibetan, Chinese, and English recognition results respectively:
if R1 is greater than or equal to a preset ratio, the language of the text in the image to be recognized is determined to be Tibetan, and the valid recognition result is the Tibetan recognition result;
if R1 is less than the preset ratio, R1 is greater than or equal to R2, and R1 is greater than or equal to R3, the language of the text in the image to be recognized is determined to be Tibetan, and the valid recognition result is the Tibetan recognition result;
if R1 is less than the preset ratio, R1 is greater than or equal to R2, and R1 is less than R3, the language of the text in the image to be recognized is determined to be English, and the valid recognition result is the English recognition result;
if R1 is less than the preset ratio, R1 is less than R2, and R2 is greater than or equal to R3, the language of the text in the image to be recognized is determined to be Chinese, and the valid recognition result is the Chinese recognition result;
if R1 is less than the preset ratio, R1 is less than R2, and R2 is less than R3, the language of the text in the image to be recognized is determined to be English, and the valid recognition result is the English recognition result.
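The five conditions above can be collapsed into a short decision cascade. The following is a minimal sketch, not the patented implementation: the function name `judge_language` and the default `preset_ratio=0.8` are assumptions made for illustration, since the text does not give a value for the preset ratio.

```python
def judge_language(r1: float, r2: float, r3: float, preset_ratio: float = 0.8) -> str:
    """Return the judged language from the three ratios.

    r1, r2, r3 are the ratios for the Tibetan, Chinese, and English
    recognition results. preset_ratio is an assumed placeholder value;
    the source only says that it is preset.
    """
    # Condition 1: a sufficiently high Tibetan ratio decides immediately.
    if r1 >= preset_ratio:
        return "tibetan"
    # Conditions 2 and 3: Tibetan is at least as good as Chinese,
    # so only English can still beat it.
    if r1 >= r2:
        return "tibetan" if r1 >= r3 else "english"
    # Conditions 4 and 5: Chinese beats Tibetan, so compare Chinese with English.
    return "chinese" if r2 >= r3 else "english"


print(judge_language(1.0, 2 / 3, 0.6))  # condition 1 applies
print(judge_language(0.1, 0.9, 0.3))    # condition 4 applies
```

Below the preset ratio the cascade simply picks the largest of the three ratios, with ties resolved in the order Tibetan, Chinese, English.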
Another aspect of the present invention provides a character recognition device, including:
an acquisition module, configured to obtain an image to be recognized;
a recognition module, configured to perform text recognition on the image to be recognized using the OCR core library of at least one language and to generate a recognition result for each language, each recognition result including at least one character;
a determining module, configured to determine the valid character ratio of the recognition result of each language;
a judging module, configured to determine, according to the valid character ratio of the recognition result of each language, the language of the text in the image to be recognized and the valid recognition result for that text.
In the device described above, the determining module includes:
a first determining submodule, configured to determine the number of characters in the recognition result of each language and the character code of each character in the recognition result of each language;
a second determining submodule, configured to determine, for the recognition result of each language, the number of valid characters, that is, characters whose character code falls within the character code range of that language;
a calculating submodule, configured to determine the valid character ratio of the recognition result of each language from the number of characters and the number of valid characters in that recognition result.
In the device described above, the judging module is specifically configured to:
compare the valid character ratios of the recognition results of the languages, determine the language with the largest valid character ratio to be the language of the text in the image to be recognized, and determine the recognition result of that language to be the valid recognition result for the text in the image to be recognized.
In the device described above, the recognition module is specifically configured to:
perform text recognition on the image to be recognized using the OCR core libraries of three languages and generate a recognition result for each language, where the three OCR core libraries are the Chinese OCR core library, the English OCR core library, and the Tibetan OCR core library.
Accordingly, the judging module is specifically configured to proceed as follows, where R1, R2, and R3 denote the valid character ratios of the Tibetan, Chinese, and English recognition results respectively:
if R1 is greater than or equal to a preset ratio, determine that the language of the text in the image to be recognized is Tibetan and that the valid recognition result is the Tibetan recognition result;
if R1 is less than the preset ratio, R1 is greater than or equal to R2, and R1 is greater than or equal to R3, determine that the language of the text in the image to be recognized is Tibetan and that the valid recognition result is the Tibetan recognition result;
if R1 is less than the preset ratio, R1 is greater than or equal to R2, and R1 is less than R3, determine that the language of the text in the image to be recognized is English and that the valid recognition result is the English recognition result;
if R1 is less than the preset ratio, R1 is less than R2, and R2 is greater than or equal to R3, determine that the language of the text in the image to be recognized is Chinese and that the valid recognition result is the Chinese recognition result;
if R1 is less than the preset ratio, R1 is less than R2, and R2 is less than R3, determine that the language of the text in the image to be recognized is English and that the valid recognition result is the English recognition result.
The present invention obtains an image to be recognized; performs text recognition on it using the OCR core library of at least one language, generating a recognition result for each language, each recognition result including at least one character; calculates the valid character ratio of the recognition result of each language; and determines, according to those ratios, the language of the text in the image and the valid recognition result for that text. The language of the text therefore does not have to be determined manually before recognition is performed. The language is determined automatically at the same time as the recognition result, so no manual operation is required, recognition time is shortened, and recognition efficiency is improved.
Brief description of the drawings
Fig. 1 is a flow chart of the character recognition method provided by Embodiment 1 of the present invention;
Fig. 2 is a flow chart of the character recognition method provided by Embodiment 2 of the present invention;
Fig. 3 is a schematic diagram of an image to be recognized in the character recognition method provided by Embodiment 2 of the present invention;
Fig. 4 is a schematic diagram of the Chinese recognition result of Fig. 3 in the character recognition method provided by Embodiment 2 of the present invention;
Fig. 5 is a schematic diagram of the English recognition result of Fig. 3 in the character recognition method provided by Embodiment 2 of the present invention;
Fig. 6 is a schematic diagram of the Tibetan recognition result of Fig. 3 in the character recognition method provided by Embodiment 2 of the present invention;
Fig. 7 is a flow chart of the character recognition method provided by Embodiment 3 of the present invention;
Fig. 8 is a structural diagram of the character recognition device provided by Embodiment 4 of the present invention;
Fig. 9 is a structural diagram of the character recognition device provided by Embodiment 5 of the present invention.
Detailed description of the embodiments
To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a flow chart of the character recognition method provided by Embodiment 1 of the present invention. As shown in Fig. 1, the method of this embodiment includes:
Step 101: obtain an image to be recognized.
Specifically, in this embodiment, multimedia content such as pictures and videos carries a large amount of text, such as explanatory text in pictures, long-post text images published on microblogs, and subtitles and other caption information in videos.
The image to be recognized is obtained first; it may be a single picture or a single frame of a video. The image may then be segmented so that subsequent steps can recognize the text in the segmented image piece by piece. The image may also be preprocessed, for example by adjusting its brightness and contrast or by converting it to a black-and-white image.
Step 102: perform text recognition on the image to be recognized using the OCR core library of at least one language, and generate a recognition result for each language, each recognition result including at least one character.
Specifically, in this embodiment, OCR technology provides OCR core libraries for many languages, so text recognition can be performed on the image to be recognized using the OCR core library of at least one language, generating a recognition result for each language; each recognition result includes at least one character.
For example, the Chinese, English, Tibetan, German, and French OCR core libraries, among others, may be used to perform text recognition on the image to be recognized, generating Chinese, English, Tibetan, German, and French recognition results.
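Step 102 amounts to running the same image through one recognizer per language and keeping the outputs keyed by language. The sketch below shows only that control flow. The `engines` mapping and the stub recognizers are hypothetical stand-ins, because the text does not name a concrete OCR core library or its API.

```python
from typing import Callable, Dict


def recognize_all(image: bytes, engines: Dict[str, Callable[[bytes], str]]) -> Dict[str, str]:
    """Run every per-language recognizer on the same image.

    engines maps a language name to a callable that takes image bytes and
    returns recognized text. These callables are hypothetical wrappers
    around whatever OCR core library is actually used.
    """
    if not engines:
        raise ValueError("at least one language engine is required")
    return {language: engine(image) for language, engine in engines.items()}


# Stub engines standing in for real OCR core libraries (illustrative only).
stub_engines = {
    "chinese": lambda img: "stub chinese output",
    "english": lambda img: "stub english output",
}
results = recognize_all(b"raw image bytes", stub_engines)
print(sorted(results))
```

Keeping the results in a dictionary keyed by language makes the later steps (ratio calculation and selection) a matter of iterating over the same keys.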
Step 103: determine the valid character ratio of the recognition result of each language.
Specifically, in this embodiment, the valid character ratio is calculated for the recognition result of each language generated in step 102, so that each language's recognition result has its own valid character ratio.
Step 104: determine, according to the valid character ratio of the recognition result of each language, the language of the text in the image to be recognized and the valid recognition result for that text.
Specifically, in this embodiment, based on the valid character ratios determined in step 103, a preset decision strategy is applied to determine the language of the text in the image obtained in step 101, and which language's recognition result is the valid recognition result for that text.
For example, the image to be recognized may be recognized multiple times with the OCR core library of at least one language, and the valid character ratios obtained for each language's recognition results may be averaged. The language with the highest mean is then determined to be the language of the text in the image, and the recognition result of that language is determined to be the valid recognition result.
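The averaging strategy in this example can be sketched as follows. The function name and the sample ratios are illustrative assumptions; the text states only that the ratios from repeated recognition are averaged and the language with the highest mean is chosen.

```python
from statistics import mean
from typing import Dict, List, Tuple


def pick_by_mean_ratio(ratios_per_language: Dict[str, List[float]]) -> Tuple[str, Dict[str, float]]:
    """Average each language's ratios over several runs and pick the top mean."""
    means = {lang: mean(runs) for lang, runs in ratios_per_language.items()}
    # max() keeps the first language encountered on a tie; the source does
    # not specify a tie-breaking rule.
    best = max(means, key=means.get)
    return best, means


# Made-up ratios from three recognition runs per language.
best, means = pick_by_mean_ratio({
    "chinese": [0.6, 0.7, 0.65],
    "english": [0.5, 0.6, 0.55],
    "tibetan": [1.0, 0.9, 0.95],
})
print(best)  # tibetan
```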
This embodiment obtains an image to be recognized; performs text recognition on it using the OCR core library of at least one language, generating a recognition result for each language, each recognition result including at least one character; calculates the valid character ratio of the recognition result of each language; and determines, according to those ratios, the language of the text in the image and the valid recognition result for that text. The language of the text therefore does not have to be determined manually before recognition is performed. The language is determined automatically at the same time as the recognition result, so no manual operation is required, recognition time is shortened, and recognition efficiency is improved.
Fig. 2 is a flow chart of the character recognition method provided by Embodiment 2 of the present invention. As shown in Fig. 2, on the basis of Embodiment 1, step 103 of the method of this embodiment includes:
Step 1031: determine the number of characters in the recognition result of each language, and determine the character code of each character in the recognition result of each language.
Specifically, in this embodiment, the recognition result of each language consists of multiple characters, so the number of characters in each recognition result generated in step 102 can be determined, together with the character code of each character in each recognition result.
Step 1032: determine, from the character codes of the characters in the recognition result of each language, the number of valid characters that fall within the character code range of that language.
Specifically, in this embodiment, under Unicode encoding different characters correspond to different codes, and the characters of different languages occupy different character code ranges. Using the character codes determined in step 1031, the number of valid characters, that is, characters whose code falls within the character code range of the corresponding language, can therefore be determined for each recognition result.
For example, Fig. 3 is a schematic diagram of an image to be recognized in the character recognition method provided by Embodiment 2 of the present invention, and the image shown in Fig. 3 is recognized as follows. The Chinese, English, and Tibetan OCR core libraries are each used to perform text recognition on the image in Fig. 3. Fig. 4, Fig. 5, and Fig. 6 are schematic diagrams of the Chinese, English, and Tibetan recognition results of Fig. 3, respectively. As shown in Fig. 4, Fig. 5, and Fig. 6, a Chinese recognition result, an English recognition result, and a Tibetan recognition result are generated. First, the number of characters in each result is determined: 6 in the Chinese recognition result, 10 in the English recognition result, and 6 in the Tibetan recognition result; the character code of each character in each result is also determined. The character code range of Chinese is 0x4E00-0x9FA5, that of English is 0x0000-0x00FF, and that of Tibetan is 0x0F00-0x0FFF; special characters and the like in a recognition result do not fall within the character code range of any of these languages. It is thus determined that 4 characters in the Chinese recognition result fall within the Chinese range, 6 characters in the English recognition result fall within the English range, and 6 characters in the Tibetan recognition result fall within the Tibetan range.
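Steps 1031 and 1032 can be sketched with Python's built-in `ord`, using the three code ranges quoted above. The names `CODE_RANGES` and `count_valid` and the sample strings are illustrative assumptions. The sketch also assumes that every code point in the result counts toward the total, including whitespace, which the text does not specify.

```python
# Inclusive Unicode code point ranges as quoted in the text.
CODE_RANGES = {
    "chinese": (0x4E00, 0x9FA5),
    "english": (0x0000, 0x00FF),
    "tibetan": (0x0F00, 0x0FFF),
}


def count_valid(text: str, language: str) -> int:
    """Count the characters of text whose code point lies in the language's range."""
    low, high = CODE_RANGES[language]
    return sum(1 for ch in text if low <= ord(ch) <= high)


sample = "\u4e2d\u6587abc"               # two CJK ideographs followed by "abc"
print(len(sample))                        # 5 characters in total
print(count_valid(sample, "chinese"))     # 2
print(count_valid(sample, "english"))     # 3
print(count_valid("\u0f56\u0f7c\u0f51", "tibetan"))  # 3
```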
Step 1033: determine the valid character ratio of the recognition result of each language from the number of characters and the number of valid characters in that recognition result.
Specifically, in this embodiment, the valid character ratio of each recognition result can be calculated from the number of characters determined in step 1031 and the number of valid characters determined in step 1032.
For example, from the 6 characters in the Chinese recognition result and its 4 valid characters, the valid character ratio of the Chinese recognition result is determined to be 2/3; from the 10 characters in the English recognition result and its 6 valid characters, the valid character ratio of the English recognition result is determined to be 3/5; and from the 6 characters in the Tibetan recognition result and its 6 valid characters, the valid character ratio of the Tibetan recognition result is determined to be 1/1.
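The arithmetic in this example can be checked with exact fractions. This is a small sketch; the helper name `valid_ratio` is an assumption, and the zero-length guard is added for robustness only, since the method states that each recognition result contains at least one character.

```python
from fractions import Fraction


def valid_ratio(valid_count: int, total_count: int) -> Fraction:
    """Return valid_count / total_count as an exact fraction."""
    if total_count == 0:
        # Not expected under the method (results hold at least one character);
        # returning 0 here is an assumption, not taken from the source.
        return Fraction(0)
    return Fraction(valid_count, total_count)


print(valid_ratio(4, 6))    # 2/3  (Chinese result)
print(valid_ratio(6, 10))   # 3/5  (English result)
print(valid_ratio(6, 6))    # 1    (Tibetan result)
```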
Step 104 specifically includes:
Compare the significant character ratios of the recognition results of the languages, determine that the language with the maximum significant character ratio is the language of the text in the picture to be identified, and determine that the recognition result of that language is the effective recognition result of the text in the picture to be identified.
In this embodiment, specifically, after the significant character ratio of each language's recognition result has been determined, the ratios can be compared and the language with the largest ratio taken as the final language, which yields both the language of the picture to be identified and its text recognition result. More precisely, the language with the maximum significant character ratio is taken as the language of the text in the picture to be identified, and the recognition result of that language is taken as the effective recognition result of the text in the picture to be identified.
For example, the significant character ratio of the Chinese recognition result is 67%, that of the English recognition result is 60%, and that of the Tibetan recognition result is 100%. Because the Tibetan ratio is the largest, the language of the text in the picture to be identified is determined to be Tibetan, and the Tibetan recognition result is taken as the effective recognition result of the text in the picture to be identified.
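The selection in step 104 of this embodiment amounts to taking the maximum over the per-language ratios. The sketch below assumes the ratios and recognition results are held in dictionaries keyed by language name; the function name `select_effective_result` and the placeholder result strings are hypothetical. The description does not say how ties are broken, so the tie behaviour here (the first language in dictionary order wins) is also an assumption.

```python
from typing import Dict, Tuple


def select_effective_result(
    ratios: Dict[str, float], results: Dict[str, str]
) -> Tuple[str, str]:
    """Return (language, recognition result) with the largest ratio."""
    # max() keeps the first key encountered when several ratios are equal.
    best_language = max(ratios, key=ratios.get)
    return best_language, results[best_language]


# Ratios from the example above; the result strings are placeholders.
example_ratios = {"chinese": 0.67, "english": 0.60, "tibetan": 1.00}
example_results = {
    "chinese": "<chinese text>",
    "english": "<english text>",
    "tibetan": "<tibetan text>",
}
language, effective_result = select_effective_result(example_ratios, example_results)
```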
This embodiment determines the character number in each language's recognition result and, from the character code of each character, the number of significant characters that fall within that language's character code interval. From the character number and the significant character number, the significant character ratio of each language's recognition result is calculated. The language with the maximum significant character ratio is then taken as the language of the text in the picture to be identified, and its recognition result as the effective recognition result of the text in the picture to be identified. The language of the text therefore does not have to be determined manually before text recognition is performed: the language of the text in the picture to be identified is judged automatically and the recognition result is determined at the same time, which removes manual operation, shortens recognition time, and improves recognition efficiency.
Fig. 7 is a flow chart of the character recognition method provided by Embodiment 3 of the present invention. As shown in Fig. 7, on the basis of Embodiment 1 and Embodiment 2, in the method of this embodiment, step 102 specifically includes:
Perform text recognition on the picture to be identified using the OCR core libraries of three languages and generate the recognition result of each language, where the OCR core libraries of the three languages are the Chinese OCR core library, the English OCR core library, and the Tibetan OCR core library;
Accordingly, step 104 specifically includes:
If the significant character ratio R1 of the Tibetan recognition result is greater than or equal to a preset ratio, it is judged that the language of the text in the picture to be identified is Tibetan and that the effective recognition result of the text in the picture to be identified is the Tibetan recognition result;
If the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is greater than or equal to the significant character ratio R2 of the Chinese recognition result, and R1 is greater than or equal to the significant character ratio R3 of the English recognition result, it is judged that the language of the text in the picture to be identified is Tibetan and that the effective recognition result of the text in the picture to be identified is the Tibetan recognition result;
If the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is greater than or equal to the significant character ratio R2 of the Chinese recognition result, and R1 is less than the significant character ratio R3 of the English recognition result, it is judged that the language of the text in the picture to be identified is English and that the effective recognition result of the text in the picture to be identified is the English recognition result;
If the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is less than the significant character ratio R2 of the Chinese recognition result, and R2 is greater than or equal to the significant character ratio R3 of the English recognition result, it is judged that the language of the text in the picture to be identified is Chinese and that the effective recognition result of the text in the picture to be identified is the Chinese recognition result;
If the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is less than the significant character ratio R2 of the Chinese recognition result, and R2 is less than the significant character ratio R3 of the English recognition result, it is judged that the language of the text in the picture to be identified is English and that the effective recognition result of the text in the picture to be identified is the English recognition result.
In this embodiment, specifically, after the picture to be identified has been obtained and preprocessed, text recognition can be performed on it using the OCR core libraries of three languages: the Chinese OCR core library, the English OCR core library, and the Tibetan OCR core library. This generates a recognition result for each language: a Chinese recognition result, an English recognition result, and a Tibetan recognition result.
The significant character ratios of the Chinese recognition result, the English recognition result, and the Tibetan recognition result are then calculated respectively.
First, it is judged whether the significant character ratio R1 of the Tibetan recognition result is greater than or equal to a preset ratio T1. If R1 is greater than or equal to T1, it is judged that the language of the text in the picture to be identified is Tibetan and that the effective recognition result of the text in the picture to be identified is the Tibetan recognition result. If R1 is less than T1, it is then judged whether R1 is greater than or equal to the significant character ratio R2 of the Chinese recognition result.
Next, when R1 is less than T1 and R1 is greater than or equal to R2, it is judged whether R1 is greater than or equal to the significant character ratio R3 of the English recognition result. If R1 is greater than or equal to R3, it is judged that the language of the text in the picture to be identified is Tibetan and that the effective recognition result is the Tibetan recognition result; if R1 is less than R3, it is judged that the language of the text in the picture to be identified is English and that the effective recognition result is the English recognition result.
When R1 is less than T1 and R1 is less than R2, it is judged whether R2 is greater than or equal to R3. If R2 is greater than or equal to R3, it is judged that the language of the text in the picture to be identified is Chinese and that the effective recognition result is the Chinese recognition result; if R2 is less than R3, it is judged that the language of the text in the picture to be identified is English and that the effective recognition result is the English recognition result.
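The judgment flow described above can be sketched as a small decision function. The function name `judge_language` is hypothetical, and the description does not give a value for the preset ratio T1, so the 0.8 used in the example calls is purely an assumed value for illustration.

```python
def judge_language(r1: float, r2: float, r3: float, t1: float) -> str:
    """Decide the language from the Tibetan (r1), Chinese (r2) and
    English (r3) significant character ratios and the preset ratio t1."""
    if r1 >= t1:
        # The Tibetan ratio reaches the preset ratio: Tibetan is chosen
        # without comparing against the other two languages.
        return "tibetan"
    if r1 >= r2:
        # Chinese is ruled out; compare Tibetan with English.
        return "tibetan" if r1 >= r3 else "english"
    # Tibetan is ruled out; compare Chinese with English.
    return "chinese" if r2 >= r3 else "english"


T1 = 0.8  # assumed value; the description only calls it a preset ratio

case_threshold = judge_language(1.00, 0.67, 0.60, T1)
case_tibetan = judge_language(0.50, 0.40, 0.30, T1)
case_english_a = judge_language(0.50, 0.40, 0.60, T1)
case_chinese = judge_language(0.20, 0.70, 0.60, T1)
case_english_b = judge_language(0.20, 0.30, 0.60, T1)
```

Note that, unlike a plain maximum over the three ratios, the first branch returns Tibetan whenever R1 reaches T1, even if R2 or R3 happens to be larger; for instance, `judge_language(0.85, 0.90, 0.95, T1)` returns Tibetan under the assumed T1.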
In this embodiment, when the Chinese OCR core library, the English OCR core library, and the Tibetan OCR core library are each used to perform text recognition on the picture to be identified, a decision procedure that compares the significant character ratios of the Tibetan, Chinese, and English recognition results with one another finally determines the language of the text in the picture to be identified and the effective recognition result. The language of the text therefore does not have to be determined manually before text recognition is performed: the language of the text in the picture to be identified is judged automatically and the recognition result is determined at the same time, which removes manual operation, shortens recognition time, and improves recognition efficiency.
Fig. 8 is a schematic structural diagram of the character recognition device provided by Embodiment 4 of the present invention. As shown in Fig. 8, the character recognition device provided by this embodiment includes:
An acquisition module 31, configured to obtain a picture to be identified;
An identification module 32, configured to perform text recognition on the picture to be identified using the OCR core library of at least one language and to generate a recognition result for each language, the recognition result including at least one character;
A determining module 33, configured to determine the significant character ratio of the recognition result of each language;
A determination module 34, configured to judge, according to the significant character ratio of the recognition result of each language, the language of the text in the picture to be identified and the effective recognition result of the text in the picture to be identified.
The character recognition device of this embodiment can perform the character recognition method provided by Embodiment 1 of the present invention; its implementation principle is similar and is not repeated here.
This embodiment obtains a picture to be identified, performs text recognition on it using the OCR core library of at least one language, and generates a recognition result for each language, the recognition result including at least one character. It then calculates the significant character ratio of the recognition result of each language and, according to those ratios, judges the language of the text in the picture to be identified and the effective recognition result of the text in the picture to be identified. The language of the text therefore does not have to be determined manually before text recognition is performed: the language of the text in the picture to be identified is judged automatically and the recognition result is determined at the same time, which removes manual operation, shortens recognition time, and improves recognition efficiency.
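One way to picture how modules 31 to 34 fit together is the sketch below. It is illustrative only: the class name, the method names, and the representation of an OCR core library as a callable that maps image bytes to text are all assumptions, since the description does not specify a programming interface. The judgment step uses the maximum-ratio rule of Embodiment 2 as one possible choice.

```python
from typing import Callable, Dict, Tuple

# Hypothetical stand-in for an OCR core library: image bytes -> recognized text.
OcrEngine = Callable[[bytes], str]


class CharacterRecognitionDevice:
    def __init__(
        self,
        engines: Dict[str, OcrEngine],
        intervals: Dict[str, Tuple[int, int]],
    ) -> None:
        self.engines = engines      # one OCR engine per language
        self.intervals = intervals  # inclusive character code interval per language

    def acquire(self, path: str) -> bytes:
        """Acquisition module 31: read the picture to be identified."""
        with open(path, "rb") as handle:
            return handle.read()

    def recognize(self, image: bytes) -> Dict[str, str]:
        """Identification module 32: run every language's OCR engine."""
        return {lang: engine(image) for lang, engine in self.engines.items()}

    def determine_ratios(self, results: Dict[str, str]) -> Dict[str, float]:
        """Determining module 33: significant character ratio per language."""
        ratios = {}
        for lang, text in results.items():
            low, high = self.intervals[lang]
            hits = sum(1 for ch in text if low <= ord(ch) <= high)
            # Assumption: an empty recognition result gets a ratio of 0.
            ratios[lang] = hits / len(text) if text else 0.0
        return ratios

    def judge(
        self, results: Dict[str, str], ratios: Dict[str, float]
    ) -> Tuple[str, str]:
        """Determination module 34: pick the language with the largest ratio."""
        best = max(ratios, key=ratios.get)
        return best, results[best]

    def run(self, image: bytes) -> Tuple[str, str]:
        results = self.recognize(image)
        return self.judge(results, self.determine_ratios(results))
```

In use, the `engines` dictionary would wrap whatever OCR core libraries are available; stub functions that return fixed strings are enough to exercise the flow.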
Fig. 9 is a schematic structural diagram of the character recognition device provided by Embodiment 5 of the present invention. On the basis of Embodiment 4, as shown in Fig. 9, in the character recognition device provided by this embodiment, the determining module 33 includes:
A first determination sub-module 331, configured to determine the character number in the recognition result of each language and to determine the character code of each character in the recognition result of each language;
A second determination sub-module 332, configured to determine, from the character code of each character in the recognition result of each language, the number of significant characters that fall within the character code interval of that language;
A calculating sub-module 333, configured to determine the significant character ratio of the recognition result of each language according to the character number of the recognition result of each language and the significant character number of the recognition result of each language.
The determination module 34 is specifically configured to:
Compare the significant character ratios of the recognition results of the languages, determine that the language with the maximum significant character ratio is the language of the text in the picture to be identified, and determine that the recognition result of that language is the effective recognition result of the text in the picture to be identified.
Alternatively, the identification module 32 is specifically configured to:
Perform text recognition on the picture to be identified using the OCR core libraries of three languages and generate the recognition result of each language, where the OCR core libraries of the three languages are the Chinese OCR core library, the English OCR core library, and the Tibetan OCR core library;
Accordingly, the determination module 34 is specifically configured to:
If the significant character ratio R1 of the Tibetan recognition result is greater than or equal to a preset ratio, judge that the language of the text in the picture to be identified is Tibetan and that the effective recognition result of the text in the picture to be identified is the Tibetan recognition result;
If the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is greater than or equal to the significant character ratio R2 of the Chinese recognition result, and R1 is greater than or equal to the significant character ratio R3 of the English recognition result, judge that the language of the text in the picture to be identified is Tibetan and that the effective recognition result of the text in the picture to be identified is the Tibetan recognition result;
If the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is greater than or equal to the significant character ratio R2 of the Chinese recognition result, and R1 is less than the significant character ratio R3 of the English recognition result, judge that the language of the text in the picture to be identified is English and that the effective recognition result of the text in the picture to be identified is the English recognition result;
If the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is less than the significant character ratio R2 of the Chinese recognition result, and R2 is greater than or equal to the significant character ratio R3 of the English recognition result, judge that the language of the text in the picture to be identified is Chinese and that the effective recognition result of the text in the picture to be identified is the Chinese recognition result;
If the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is less than the significant character ratio R2 of the Chinese recognition result, and R2 is less than the significant character ratio R3 of the English recognition result, judge that the language of the text in the picture to be identified is English and that the effective recognition result of the text in the picture to be identified is the English recognition result.
The character recognition device of this embodiment can perform the character recognition methods provided by Embodiment 1 and Embodiment 2 of the present invention; its implementation principle is similar and is not repeated here.
This embodiment determines the character number in each language's recognition result and, from the character code of each character, the number of significant characters that fall within that language's character code interval. From the character number and the significant character number, the significant character ratio of each language's recognition result is calculated; the language with the maximum significant character ratio is then taken as the language of the text in the picture to be identified, and its recognition result as the effective recognition result of the text in the picture to be identified. In addition, when the Chinese OCR core library, the English OCR core library, and the Tibetan OCR core library are each used to perform text recognition on the picture to be identified, a decision procedure that compares the significant character ratios of the Tibetan, Chinese, and English recognition results with one another finally determines the language of the text in the picture to be identified and the effective recognition result. The language of the text therefore does not have to be determined manually before text recognition is performed: the language of the text in the picture to be identified is judged automatically and the recognition result is determined at the same time, which removes manual operation, shortens recognition time, and improves recognition efficiency.
A person of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be implemented by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium; when executed, the program performs the steps of the above method embodiments. The storage medium includes ROM, RAM, magnetic disks, optical discs, and other media capable of storing program code.
Finally, it should be noted that the above embodiments merely illustrate the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. A character recognition method, characterized by comprising:
Obtaining a picture to be identified;
Performing text recognition on the picture to be identified using the optical character recognition (OCR) core library of at least one language, and generating a recognition result for each language, the recognition result including at least one character;
Determining the significant character ratio of the recognition result of each language;
Judging, according to the significant character ratio of the recognition result of each language, the language of the text in the picture to be identified and the effective recognition result of the text in the picture to be identified.
2. The method according to claim 1, characterized in that determining the significant character ratio of the recognition result of each language comprises:
Determining the character number in the recognition result of each language, and determining the character code of each character in the recognition result of each language;
Determining, from the character code of each character in the recognition result of each language, the number of significant characters that fall within the character code interval of that language;
Determining the significant character ratio of the recognition result of each language according to the character number of the recognition result of each language and the significant character number of the recognition result of each language.
3. The method according to claim 1 or 2, characterized in that judging, according to the significant character ratio of the recognition result of each language, the language of the text in the picture to be identified and the effective recognition result of the text in the picture to be identified comprises:
Comparing the significant character ratios of the recognition results of the languages, determining that the language with the maximum significant character ratio is the language of the text in the picture to be identified, and determining that the recognition result of that language is the effective recognition result of the text in the picture to be identified.
4. The method according to claim 1 or 2, characterized in that performing text recognition on the picture to be identified using the OCR core library of at least one language and generating a recognition result for each language comprises:
Performing text recognition on the picture to be identified using the OCR core libraries of three languages and generating the recognition result of each language, wherein the OCR core libraries of the three languages are the Chinese OCR core library, the English OCR core library, and the Tibetan OCR core library;
Accordingly, judging, according to the significant character ratio of the recognition result of each language, the language of the text in the picture to be identified and the effective recognition result of the text in the picture to be identified comprises:
If the significant character ratio R1 of the Tibetan recognition result is greater than or equal to a preset ratio, judging that the language of the text in the picture to be identified is Tibetan and that the effective recognition result of the text in the picture to be identified is the Tibetan recognition result;
If the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is greater than or equal to the significant character ratio R2 of the Chinese recognition result, and R1 is greater than or equal to the significant character ratio R3 of the English recognition result, judging that the language of the text in the picture to be identified is Tibetan and that the effective recognition result of the text in the picture to be identified is the Tibetan recognition result;
If the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is greater than or equal to the significant character ratio R2 of the Chinese recognition result, and R1 is less than the significant character ratio R3 of the English recognition result, judging that the language of the text in the picture to be identified is English and that the effective recognition result of the text in the picture to be identified is the English recognition result;
If the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is less than the significant character ratio R2 of the Chinese recognition result, and R2 is greater than or equal to the significant character ratio R3 of the English recognition result, judging that the language of the text in the picture to be identified is Chinese and that the effective recognition result of the text in the picture to be identified is the Chinese recognition result;
If the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is less than the significant character ratio R2 of the Chinese recognition result, and R2 is less than the significant character ratio R3 of the English recognition result, judging that the language of the text in the picture to be identified is English and that the effective recognition result of the text in the picture to be identified is the English recognition result.
5. A character recognition device, characterized by comprising:
An acquisition module, configured to obtain a picture to be identified;
An identification module, configured to perform text recognition on the picture to be identified using the OCR core library of at least one language and to generate a recognition result for each language, the recognition result including at least one character;
A determining module, configured to determine the significant character ratio of the recognition result of each language;
A determination module, configured to judge, according to the significant character ratio of the recognition result of each language, the language of the text in the picture to be identified and the effective recognition result of the text in the picture to be identified.
6. The device according to claim 5, characterized in that the determining module comprises:
A first determination sub-module, configured to determine the character number in the recognition result of each language and to determine the character code of each character in the recognition result of each language;
A second determination sub-module, configured to determine, from the character code of each character in the recognition result of each language, the number of significant characters that fall within the character code interval of that language;
A calculating sub-module, configured to determine the significant character ratio of the recognition result of each language according to the character number of the recognition result of each language and the significant character number of the recognition result of each language.
7. The device according to claim 5 or 6, characterized in that the determination module is specifically configured to:
Compare the significant character ratios of the recognition results of the languages, determine that the language with the maximum significant character ratio is the language of the text in the picture to be identified, and determine that the recognition result of that language is the effective recognition result of the text in the picture to be identified.
8. The device according to claim 5 or 6, characterized in that the identification module is specifically configured to:
Perform text recognition on the picture to be identified using the OCR core libraries of three languages and generate the recognition result of each language, wherein the OCR core libraries of the three languages are the Chinese OCR core library, the English OCR core library, and the Tibetan OCR core library;
Accordingly, the determination module is specifically configured to:
If the significant character ratio R1 of the recognition result of Tibetan language languages is more than or equal to preset ratio, institute is judged State the identification of word of the languages of word in picture to be identified for Tibetan language languages, in the picture to be identified Effective result is the recognition result of Tibetan language languages;
If the significant character ratio R1 of the recognition result of Tibetan language languages is less than preset ratio, and Tibetan language languages The significant character ratio R1 of recognition result is more than or equal to the significant character ratio of the recognition result of Chinese languages R2, and identification knots of the significant character ratio R1 more than or equal to English languages of the recognition result of Tibetan language languages The significant character ratio R3 of fruit, then judge the languages of word in the picture to be identified as Tibetan language languages, The effective result of identification of word in the picture to be identified is the recognition result of Tibetan language languages;
If the significant character ratio R1 of the recognition result of Tibetan language languages is less than preset ratio, and Tibetan language languages The significant character ratio R1 of recognition result is more than or equal to the significant character ratio of the recognition result of Chinese languages R2, and recognition results of the significant character ratio R1 less than English languages of the recognition result of Tibetan language languages Significant character ratio R3, then judge the languages of word in the picture to be identified as English languages, described The effective result of identification of word in picture to be identified is the recognition result of English languages;
If the significant character ratio R1 of the recognition result of Tibetan language languages is less than preset ratio, and Tibetan language languages The significant character ratio R1 of recognition result is less than the significant character ratio R2 of the recognition result of Chinese languages, And the significant character ratio R2 of the recognition result of Chinese languages is more than or equal to the recognition result of English languages Significant character ratio R3, then judge the languages of word in the picture to be identified as Chinese languages, described The effective result of identification of word in picture to be identified is the recognition result of Chinese languages;
If the significant character ratio R1 of the recognition result of Tibetan language languages is less than preset ratio, and Tibetan language languages The significant character ratio R1 of recognition result is less than the significant character ratio R2 of the recognition result of Chinese languages, And the significant character ratio R2 of the recognition result of Chinese languages is less than the effective of the recognition result of English languages Character ratio R3, then judge the languages of the word in the picture to be identified as English languages, described wait to know The effective result of identification of word in other picture is the recognition result of English languages.
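The five branches of claim 8 can be written as one small decision function. This is a sketch only: the function name `decide_language` and the parameter name `preset_ratio` are hypothetical, and no value for the preset ratio is stated in this claim, so the caller must supply one.

```python
def decide_language(r1, r2, r3, preset_ratio):
    """r1, r2, r3 are the significant character ratios of the Tibetan,
    Chinese, and English recognition results respectively.
    `preset_ratio` is a caller-supplied threshold (not given in the claim)."""
    if r1 >= preset_ratio:
        return "tibetan"          # branch 1: Tibetan ratio reaches the threshold
    if r1 >= r2:
        # branches 2 and 3: Tibetan is at least as good as Chinese
        return "tibetan" if r1 >= r3 else "english"
    # branches 4 and 5: Chinese is strictly better than Tibetan
    return "chinese" if r2 >= r3 else "english"
```

Below the threshold, the branches reduce to choosing the largest of R1, R2, and R3, with ties resolved in the order Tibetan, Chinese, English. The threshold branch additionally lets a sufficiently high Tibetan ratio win even when another ratio is larger.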
CN201610157743.9A 2016-03-18 2016-03-18 Character recognition method and device Expired - Fee Related CN107203763B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610157743.9A CN107203763B (en) 2016-03-18 2016-03-18 Character recognition method and device

Publications (2)

Publication Number Publication Date
CN107203763A (en) 2017-09-26
CN107203763B (en) 2020-03-06

Family

ID=59904263

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610157743.9A Expired - Fee Related CN107203763B (en) 2016-03-18 2016-03-18 Character recognition method and device

Country Status (1)

Country Link
CN (1) CN107203763B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101571922A (en) * 2008-05-04 2009-11-04 中兴通讯股份有限公司 Character recognition tool for mobile terminal automation testing and method thereof
US20090317003A1 (en) * 2008-06-22 2009-12-24 Andre Heilper Correcting segmentation errors in ocr
CN101782896A (en) * 2009-01-21 2010-07-21 汉王科技股份有限公司 PDF character extraction method combined with OCR technology
CN104156706A (en) * 2014-08-12 2014-11-19 华北电力大学句容研究中心 Chinese character recognition method based on optical character recognition technology
CN104317847A (en) * 2014-10-13 2015-01-28 孙伟力 Method and system for identifying languages in network text information

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109558875A (en) * 2018-11-14 2019-04-02 广州同略信息科技有限公司 Method, apparatus, terminal and storage medium based on image automatic identification
CN111339787A (en) * 2018-12-17 2020-06-26 北京嘀嘀无限科技发展有限公司 Language identification method and device, electronic equipment and storage medium
CN111339787B (en) * 2018-12-17 2023-09-19 北京嘀嘀无限科技发展有限公司 Language identification method and device, electronic equipment and storage medium
CN112883967A (en) * 2021-02-24 2021-06-01 北京有竹居网络技术有限公司 Image character recognition method, device, medium and electronic equipment
CN112883966A (en) * 2021-02-24 2021-06-01 北京有竹居网络技术有限公司 Image character recognition method, device, medium and electronic equipment
CN112883968A (en) * 2021-02-24 2021-06-01 北京有竹居网络技术有限公司 Image character recognition method, device, medium and electronic equipment
CN112883966B (en) * 2021-02-24 2023-02-24 北京有竹居网络技术有限公司 Image character recognition method, device, medium and electronic equipment
CN112883968B (en) * 2021-02-24 2023-02-28 北京有竹居网络技术有限公司 Image character recognition method, device, medium and electronic equipment
CN112883967B (en) * 2021-02-24 2023-02-28 北京有竹居网络技术有限公司 Image character recognition method, device, medium and electronic equipment

Also Published As

Publication number Publication date
CN107203763B (en) 2020-03-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220624

Address after: 3007, Hengqin international financial center building, No. 58, Huajin street, Hengqin new area, Zhuhai, Guangdong 519031

Patentee after: New founder holdings development Co.,Ltd.

Patentee after: Peking University

Patentee after: BEIJING FOUNDER ELECTRONICS Co.,Ltd.

Address before: 100871, Beijing, Haidian District, Cheng Fu Road, No. 298, Zhongguancun Fangzheng building, 9 floor

Patentee before: PEKING UNIVERSITY FOUNDER GROUP Co.,Ltd.

Patentee before: Peking University

Patentee before: BEIJING FOUNDER ELECTRONICS Co.,Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20200306