Summary of the Invention
The present invention provides a character recognition method and device, to solve the following problem in the prior art: the language of the text in each picture must first be determined manually before the OCR core library corresponding to that language can be used for text recognition, which requires a large amount of manual operation, lengthens recognition time, and lowers recognition efficiency.
One aspect of the present invention provides a character recognition method, including:
obtaining a picture to be identified;
performing text recognition on the picture to be identified using the OCR core library of at least one language, to generate a recognition result for each language, each recognition result including at least one character;
determining the significant character ratio of the recognition result of each language;
judging, according to the significant character ratio of the recognition result of each language, the language of the text in the picture to be identified and the effective recognition result of the text in the picture to be identified.
In the method described above, determining the significant character ratio of the recognition result of each language includes:
determining the number of characters in the recognition result of each language, and determining the character code of each character in the recognition result of each language;
determining, from the character codes of the characters in the recognition result of each language, the number of significant characters, that is, characters whose codes fall within the character code interval of that language;
determining the significant character ratio of the recognition result of each language according to the number of characters and the number of significant characters in that recognition result.
In the method described above, judging, according to the significant character ratio of the recognition result of each language, the language of the text in the picture to be identified and the effective recognition result of the text in the picture to be identified includes:
comparing the significant character ratios of the recognition results of the languages, determining the language with the largest significant character ratio to be the language of the text in the picture to be identified, and determining the recognition result of the language with the largest significant character ratio to be the effective recognition result of the text in the picture to be identified.
In the method described above, performing text recognition on the picture to be identified using the OCR core library of at least one language, to generate a recognition result for each language, includes:
performing text recognition on the picture to be identified using the OCR core libraries of three languages, to generate a recognition result for each language, where the three OCR core libraries are the Chinese OCR core library, the English OCR core library, and the Tibetan OCR core library.
Accordingly, judging, according to the significant character ratio of the recognition result of each language, the language of the text in the picture to be identified and the effective recognition result of the text in the picture to be identified includes:
if the significant character ratio R1 of the Tibetan recognition result is greater than or equal to a preset ratio, judging that the language of the text in the picture to be identified is Tibetan, and that the effective recognition result of the text in the picture to be identified is the Tibetan recognition result;
if the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is greater than or equal to the significant character ratio R2 of the Chinese recognition result, and R1 is greater than or equal to the significant character ratio R3 of the English recognition result, judging that the language of the text in the picture to be identified is Tibetan, and that the effective recognition result of the text in the picture to be identified is the Tibetan recognition result;
if the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is greater than or equal to the significant character ratio R2 of the Chinese recognition result, and R1 is less than the significant character ratio R3 of the English recognition result, judging that the language of the text in the picture to be identified is English, and that the effective recognition result of the text in the picture to be identified is the English recognition result;
if the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is less than the significant character ratio R2 of the Chinese recognition result, and R2 is greater than or equal to the significant character ratio R3 of the English recognition result, judging that the language of the text in the picture to be identified is Chinese, and that the effective recognition result of the text in the picture to be identified is the Chinese recognition result;
if the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is less than the significant character ratio R2 of the Chinese recognition result, and R2 is less than the significant character ratio R3 of the English recognition result, judging that the language of the text in the picture to be identified is English, and that the effective recognition result of the text in the picture to be identified is the English recognition result.
Another aspect of the present invention provides a character recognition device, including:
an acquisition module, configured to obtain a picture to be identified;
an identification module, configured to perform text recognition on the picture to be identified using the OCR core library of at least one language, to generate a recognition result for each language, each recognition result including at least one character;
a determining module, configured to determine the significant character ratio of the recognition result of each language;
a determination module, configured to judge, according to the significant character ratio of the recognition result of each language, the language of the text in the picture to be identified and the effective recognition result of the text in the picture to be identified.
In the device described above, the determining module includes:
a first determination sub-module, configured to determine the number of characters in the recognition result of each language, and to determine the character code of each character in the recognition result of each language;
a second determination sub-module, configured to determine, from the character codes of the characters in the recognition result of each language, the number of significant characters, that is, characters whose codes fall within the character code interval of that language;
a calculating sub-module, configured to determine the significant character ratio of the recognition result of each language according to the number of characters and the number of significant characters in that recognition result.
In the device described above, the determination module is specifically configured to:
compare the significant character ratios of the recognition results of the languages, determine the language with the largest significant character ratio to be the language of the text in the picture to be identified, and determine the recognition result of the language with the largest significant character ratio to be the effective recognition result of the text in the picture to be identified.
In the device described above, the identification module is specifically configured to:
perform text recognition on the picture to be identified using the OCR core libraries of three languages, to generate a recognition result for each language, where the three OCR core libraries are the Chinese OCR core library, the English OCR core library, and the Tibetan OCR core library.
Accordingly, the determination module is specifically configured to:
if the significant character ratio R1 of the Tibetan recognition result is greater than or equal to a preset ratio, judge that the language of the text in the picture to be identified is Tibetan, and that the effective recognition result of the text in the picture to be identified is the Tibetan recognition result;
if the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is greater than or equal to the significant character ratio R2 of the Chinese recognition result, and R1 is greater than or equal to the significant character ratio R3 of the English recognition result, judge that the language of the text in the picture to be identified is Tibetan, and that the effective recognition result of the text in the picture to be identified is the Tibetan recognition result;
if the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is greater than or equal to the significant character ratio R2 of the Chinese recognition result, and R1 is less than the significant character ratio R3 of the English recognition result, judge that the language of the text in the picture to be identified is English, and that the effective recognition result of the text in the picture to be identified is the English recognition result;
if the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is less than the significant character ratio R2 of the Chinese recognition result, and R2 is greater than or equal to the significant character ratio R3 of the English recognition result, judge that the language of the text in the picture to be identified is Chinese, and that the effective recognition result of the text in the picture to be identified is the Chinese recognition result;
if the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is less than the significant character ratio R2 of the Chinese recognition result, and R2 is less than the significant character ratio R3 of the English recognition result, judge that the language of the text in the picture to be identified is English, and that the effective recognition result of the text in the picture to be identified is the English recognition result.
The present invention obtains a picture to be identified; performs text recognition on the picture to be identified using the OCR core library of at least one language, generating a recognition result for each language, each recognition result including at least one character; calculates the significant character ratio of the recognition result of each language; and judges, according to these significant character ratios, the language of the text in the picture to be identified and the effective recognition result of that text. It is therefore no longer necessary to determine the language of the text in the picture manually before performing text recognition. The language of the text in the picture to be identified is judged automatically and the recognition result of that text is determined at the same time, so no manual operation is needed, recognition time is shortened, and recognition efficiency is improved.
Detailed Description
To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a flow chart of the character recognition method provided by embodiment one of the present invention. As shown in Fig. 1, the method of this embodiment includes:
Step 101, obtaining a picture to be identified.
Specifically, in this embodiment, multimedia information such as pictures and videos often carries a large amount of text information: pictures may carry explanatory text, microblog pictures include long-microblog text pictures, and videos contain subtitles and other caption information.
The picture to be identified is obtained first; it may be a single picture or a single frame of a video. The picture to be identified may then be segmented, so that subsequent steps can recognize the text in the segmented picture piece by piece. The picture to be identified may also be preprocessed, for example by adjusting its brightness and contrast or by converting it to a black-and-white picture.
Step 102, performing text recognition on the picture to be identified using the OCR core library of at least one language, to generate a recognition result for each language, each recognition result including at least one character.
Specifically, in this embodiment, OCR technology provides OCR core libraries for multiple languages, so text recognition can be performed on the picture to be identified using the OCR core library of at least one language, generating a recognition result for each language; each of these recognition results includes at least one character.
For example, the Chinese OCR core library, the English OCR core library, the Tibetan OCR core library, the German OCR core library, the French OCR core library, and so on may be used to perform text recognition on the picture to be identified, generating a Chinese recognition result, an English recognition result, a Tibetan recognition result, a German recognition result, and a French recognition result.
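Step 102 can be sketched as follows. This is a minimal illustration only: the `OcrEngine` type, the `recognize_all` function, and the stand-in engines are hypothetical names introduced here, and any real OCR core library would have to be wrapped to fit this assumed interface.

```python
from typing import Callable, Dict

# Assumption: an OCR core library is modelled as a callable that maps
# raw image bytes to the recognized text for one language.
OcrEngine = Callable[[bytes], str]


def recognize_all(image: bytes, engines: Dict[str, OcrEngine]) -> Dict[str, str]:
    """Run every language-specific engine on the same picture (step 102)."""
    if not engines:
        raise ValueError("at least one OCR engine is required")
    return {language: engine(image) for language, engine in engines.items()}


# Stand-in engines for illustration; they ignore the image entirely.
fake_engines: Dict[str, OcrEngine] = {
    "chinese": lambda image: "中文",
    "english": lambda image: "abc",
}
results = recognize_all(b"", fake_engines)
```

Keeping the engines behind a plain dictionary means further languages can be added without changing the later steps, which only consume the per-language results.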
Step 103, determining the significant character ratio of the recognition result of each language.
Specifically, in this embodiment, the significant character ratio is calculated for the recognition result of each language generated in step 102, so that the recognition result of each language has its own significant character ratio.
Step 104, judging, according to the significant character ratio of the recognition result of each language, the language of the text in the picture to be identified and the effective recognition result of the text in the picture to be identified.
Specifically, in this embodiment, based on the significant character ratios determined in step 103 for the recognition results of the languages, a preset judgment policy is applied to decide the language of the text in the picture to be identified obtained in step 101, and which language's recognition result is the effective recognition result of the text in the picture to be identified.
For example, the picture to be identified may be recognized several times using the OCR core library of at least one language, and the significant character ratios obtained for each language's recognition results over these passes may be averaged. The language with the largest mean is then determined to be the language of the text in the picture to be identified, and the recognition result with the largest mean is determined to be the effective recognition result of the text in the picture to be identified.
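The averaging policy in this example can be sketched as below. The function name `language_by_mean_ratio` and the sample ratios are hypothetical and are not taken from the source. The source does not say which pass's text is returned as the effective recognition result, so the sketch returns only the language.

```python
from statistics import mean
from typing import Dict, List


def language_by_mean_ratio(ratios_per_language: Dict[str, List[float]]) -> str:
    """Average each language's ratios over several passes, then take the largest mean."""
    means = {
        language: mean(ratios)
        for language, ratios in ratios_per_language.items()
    }
    return max(means, key=means.get)


# Hypothetical ratios from two recognition passes per language.
sample = {
    "chinese": [0.6, 0.7],
    "english": [0.5, 0.6],
    "tibetan": [1.0, 0.9],
}
best_language = language_by_mean_ratio(sample)
```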
This embodiment obtains a picture to be identified; performs text recognition on the picture to be identified using the OCR core library of at least one language, generating a recognition result for each language, each recognition result including at least one character; calculates the significant character ratio of the recognition result of each language; and judges, according to these significant character ratios, the language of the text in the picture to be identified and the effective recognition result of that text. It is therefore no longer necessary to determine the language of the text in the picture manually before performing text recognition. The language of the text in the picture to be identified is judged automatically and the recognition result of that text is determined at the same time, so no manual operation is needed, recognition time is shortened, and recognition efficiency is improved.
Fig. 2 is a flow chart of the character recognition method provided by embodiment two of the present invention. As shown in Fig. 2, on the basis of embodiment one, step 103 in the method of this embodiment includes:
Step 1031, determining the number of characters in the recognition result of each language, and determining the character code of each character in the recognition result of each language.
Specifically, in this embodiment, the recognition result of each language consists of multiple characters, so the number of characters in the recognition result of each language generated in step 102 can be determined, together with the character code of each character in that recognition result.
Step 1032, determining, from the character codes of the characters in the recognition result of each language, the number of significant characters, that is, characters whose codes fall within the character code interval of that language.
Specifically, in this embodiment, under Unicode encoding different characters correspond to different codes, and the characters of different languages correspond to different character code intervals. Using the character codes determined in step 1031, the number of characters in each language's recognition result whose codes fall within the character code interval of that language can therefore be counted; this is the number of significant characters.
For example, Fig. 3 is a schematic diagram of a picture to be identified in the character recognition method provided by embodiment two of the present invention, and the picture shown in Fig. 3 is recognized as follows. The Chinese OCR core library, the English OCR core library, and the Tibetan OCR core library are each used to perform text recognition on the picture in Fig. 3. Fig. 4 is a schematic diagram of the Chinese recognition result of Fig. 3, Fig. 5 is a schematic diagram of the English recognition result of Fig. 3, and Fig. 6 is a schematic diagram of the Tibetan recognition result of Fig. 3, all in the character recognition method provided by embodiment two of the present invention. As shown in Fig. 4, Fig. 5, and Fig. 6, a Chinese recognition result, an English recognition result, and a Tibetan recognition result are generated.
First, the number of characters in each recognition result is determined: 6 characters in the Chinese recognition result, 10 characters in the English recognition result, and 6 characters in the Tibetan recognition result. The character code of each character in each recognition result is also determined.
The character code interval of Chinese is 0x4E00-0x9FA5, the character code interval of English is 0x0000-0x00FF, and the character code interval of Tibetan is 0x0F00-0x0FFF; special characters and the like in a recognition result do not fall within the character code interval of any language. Accordingly, the number of significant characters in the Chinese recognition result, that is, characters whose codes fall within the Chinese character code interval, is determined to be 4; the number of significant characters in the English recognition result is determined to be 6; and the number of significant characters in the Tibetan recognition result is determined to be 6.
Step 1033, determining the significant character ratio of the recognition result of each language according to the number of characters and the number of significant characters in that recognition result.
Specifically, in this embodiment, the significant character ratio of the recognition result of each language can be calculated from the number of characters determined in step 1031 and the number of significant characters determined in step 1032.
For example, from the 6 characters and 4 significant characters of the Chinese recognition result, the significant character ratio of the Chinese recognition result is determined to be 2/3; from the 10 characters and 6 significant characters of the English recognition result, the significant character ratio of the English recognition result is determined to be 3/5; and from the 6 characters and 6 significant characters of the Tibetan recognition result, the significant character ratio of the Tibetan recognition result is determined to be 1/1.
Step 104 specifically includes:
comparing the significant character ratios of the recognition results of the languages, determining the language with the largest significant character ratio to be the language of the text in the picture to be identified, and determining the recognition result of the language with the largest significant character ratio to be the effective recognition result of the text in the picture to be identified.
Specifically, in this embodiment, once the significant character ratio of the recognition result of each language has been determined, the ratios can be compared and the language with the largest ratio selected, which yields both the language of the picture to be identified and the text recognition result. More precisely, the language with the largest significant character ratio is taken as the language of the text in the picture to be identified, and the recognition result of that language is taken as the effective recognition result of the text in the picture to be identified.
For example, the significant character ratio of the Chinese recognition result is about 67%, that of the English recognition result is 60%, and that of the Tibetan recognition result is 100%. The significant character ratio of the Tibetan recognition result is the largest, so the language of the text in the picture to be identified is determined to be Tibetan, and the Tibetan recognition result is taken as the effective recognition result of the text in the picture to be identified.
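The comparison in step 104 amounts to taking the language with the largest ratio. In the sketch below, the function name `pick_language` and the placeholder result strings are hypothetical, and the ratios are those of the example above.

```python
from typing import Dict, Tuple


def pick_language(ratios: Dict[str, float], results: Dict[str, str]) -> Tuple[str, str]:
    """Return the language with the largest ratio and its recognition result."""
    best = max(ratios, key=ratios.get)
    return best, results[best]


ratios = {"chinese": 0.67, "english": 0.60, "tibetan": 1.00}
# Placeholder texts; the real recognition results appear only in the figures.
results = {
    "chinese": "<chinese result>",
    "english": "<english result>",
    "tibetan": "<tibetan result>",
}
language, effective_result = pick_language(ratios, results)
```

The source does not state how equal ratios are resolved in this general case; `max` keeps the first language encountered, which is a choice made for this sketch only.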
This embodiment determines the number of characters in the recognition result of each language and, from the character codes of the characters in that recognition result, the number of significant characters, that is, characters whose codes fall within the character code interval of that language. The significant character ratio of the recognition result of each language is then calculated from the number of characters and the number of significant characters. The language with the largest significant character ratio is taken as the language of the text in the picture to be identified, and the recognition result of that language is taken as the effective recognition result of the text in the picture to be identified. It is therefore no longer necessary to determine the language of the text in the picture manually before performing text recognition. The language of the text in the picture to be identified is judged automatically and the recognition result of that text is determined at the same time, so no manual operation is needed, recognition time is shortened, and recognition efficiency is improved.
Fig. 7 is a flow chart of the character recognition method provided by embodiment three of the present invention. As shown in Fig. 7, on the basis of embodiment one and embodiment two, step 102 in the method of this embodiment specifically includes:
performing text recognition on the picture to be identified using the OCR core libraries of three languages, to generate a recognition result for each language, where the three OCR core libraries are the Chinese OCR core library, the English OCR core library, and the Tibetan OCR core library.
Accordingly, step 104 specifically includes:
if the significant character ratio R1 of the Tibetan recognition result is greater than or equal to a preset ratio, judging that the language of the text in the picture to be identified is Tibetan, and that the effective recognition result of the text in the picture to be identified is the Tibetan recognition result;
if the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is greater than or equal to the significant character ratio R2 of the Chinese recognition result, and R1 is greater than or equal to the significant character ratio R3 of the English recognition result, judging that the language of the text in the picture to be identified is Tibetan, and that the effective recognition result of the text in the picture to be identified is the Tibetan recognition result;
if the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is greater than or equal to the significant character ratio R2 of the Chinese recognition result, and R1 is less than the significant character ratio R3 of the English recognition result, judging that the language of the text in the picture to be identified is English, and that the effective recognition result of the text in the picture to be identified is the English recognition result;
if the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is less than the significant character ratio R2 of the Chinese recognition result, and R2 is greater than or equal to the significant character ratio R3 of the English recognition result, judging that the language of the text in the picture to be identified is Chinese, and that the effective recognition result of the text in the picture to be identified is the Chinese recognition result;
if the significant character ratio R1 of the Tibetan recognition result is less than the preset ratio, R1 is less than the significant character ratio R2 of the Chinese recognition result, and R2 is less than the significant character ratio R3 of the English recognition result, judging that the language of the text in the picture to be identified is English, and that the effective recognition result of the text in the picture to be identified is the English recognition result.
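The five cases above can be sketched as one decision function. The function name `judge_language` is hypothetical, and the preset ratio of 0.8 used in the usage lines is an assumed value for illustration only; the source does not give a value.

```python
def judge_language(r1: float, r2: float, r3: float, preset_ratio: float) -> str:
    """Decide the language from the Tibetan (r1), Chinese (r2), English (r3) ratios."""
    if r1 >= preset_ratio:
        # Case 1: Tibetan ratio reaches the preset ratio.
        return "tibetan"
    if r1 >= r2:
        # Cases 2 and 3: Tibetan is at least Chinese, compare with English.
        return "tibetan" if r1 >= r3 else "english"
    # Cases 4 and 5: Chinese exceeds Tibetan, compare Chinese with English.
    return "chinese" if r2 >= r3 else "english"


# Usage with an assumed preset ratio of 0.8.
case_1 = judge_language(0.9, 1.0, 1.0, 0.8)
case_4 = judge_language(0.3, 0.6, 0.5, 0.8)
```

When R1 is below the preset ratio, the four remaining cases select the language with the largest ratio, with ties resolved in the order Tibetan, Chinese, English. The preset ratio only adds a shortcut that accepts Tibetan even when another ratio is higher.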
Specifically, in this embodiment, after the picture to be identified has been obtained and preprocessed, text recognition can be performed on it using the OCR core libraries of three languages, namely the Chinese OCR core library, the English OCR core library, and the Tibetan OCR core library. A recognition result is thereby generated for each language: the Chinese recognition result, the English recognition result, and the Tibetan recognition result.
The significant character ratios of the Chinese recognition result, the English recognition result, and the Tibetan recognition result are then calculated separately.
First, it is judged whether the significant character ratio R1 of the Tibetan
recognition result is greater than or equal to the preset ratio T1. If R1 is greater
than or equal to T1, the language of the text in the picture to be identified is judged
to be Tibetan, and the effective recognition result for the text in the picture to be
identified is the Tibetan recognition result. If R1 is less than T1, it is then judged
whether R1 is greater than or equal to the significant character ratio R2 of the Chinese
recognition result.
Then, when the significant character ratio R1 of the Tibetan recognition result is less
than the preset ratio T1 and R1 is greater than or equal to the significant character
ratio R2 of the Chinese recognition result, it is judged whether R1 is greater than or
equal to the significant character ratio R3 of the English recognition result. If R1 is
greater than or equal to R3, the language of the text in the picture to be identified is
judged to be Tibetan, and the effective recognition result for the text in the picture
to be identified is the Tibetan recognition result. If R1 is less than R3, the language
of the text in the picture to be identified is judged to be English, and the effective
recognition result for the text in the picture to be identified is the English
recognition result.
When the significant character ratio R1 of the Tibetan recognition result is less than
the preset ratio T1 and R1 is less than the significant character ratio R2 of the
Chinese recognition result, it is judged whether R2 is greater than or equal to the
significant character ratio R3 of the English recognition result. If R2 is greater than
or equal to R3, the language of the text in the picture to be identified is judged to be
Chinese, and the effective recognition result for the text in the picture to be
identified is the Chinese recognition result. If R2 is less than R3, the language of the
text in the picture to be identified is judged to be English, and the effective
recognition result for the text in the picture to be identified is the English
recognition result.
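The nested checks described above can be sketched as a small function. This is only an
illustrative sketch: the function names, the language labels, and the example value used
for T1 are assumptions, since the text does not fix a value for the preset ratio.

```python
from typing import Dict, Tuple


def decide_language(r1: float, r2: float, r3: float, t1: float) -> str:
    """Follow the nested checks: r1 = Tibetan, r2 = Chinese, r3 = English ratio."""
    # Step 1: a Tibetan ratio at or above the preset ratio T1 wins outright.
    if r1 >= t1:
        return "tibetan"
    # Step 2: Tibetan is at least as good as Chinese, so compare it with English.
    if r1 >= r2:
        return "tibetan" if r1 >= r3 else "english"
    # Step 3: Chinese beats Tibetan, so compare Chinese with English.
    return "chinese" if r2 >= r3 else "english"


def select_result(results: Dict[str, str], r1: float, r2: float, r3: float,
                  t1: float) -> Tuple[str, str]:
    """Return the judged language together with its recognition result."""
    language = decide_language(r1, r2, r3, t1)
    return language, results[language]
```

For example, with an assumed T1 of 0.6, the ratios R1 = 0.2, R2 = 0.7, R3 = 0.5 fall
through to the third step and select the Chinese recognition result.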
In the present embodiment, when the Chinese OCR core library, the English OCR core
library, and the Tibetan OCR core library are each used to perform text recognition on
the picture to be identified, a decision procedure compares the significant character
ratios of the Tibetan, Chinese, and English recognition results with one another and
finally determines the language of the text in the picture to be identified and the
effective recognition result. It is therefore unnecessary to determine the language of
the text in the picture manually before performing text recognition. The language of the
text in the picture to be identified is judged automatically, and the recognition result
for that text is determined at the same time. No manual operation is required, which
shortens the recognition time and improves recognition efficiency.
Fig. 8 is a schematic structural diagram of the character recognition device provided by
Embodiment 4 of the present invention. As shown in Fig. 8, the character recognition
device provided by the present embodiment includes:
an acquisition module 31, configured to obtain a picture to be identified;
an identification module 32, configured to perform text recognition on the picture to be
identified using the OCR core library of at least one language and to generate a
recognition result for each language, each recognition result including at least one
character;
a determining module 33, configured to determine the significant character ratio of the
recognition result of each language;
a determination module 34, configured to judge, according to the significant character
ratio of the recognition result of each language, the language of the text in the
picture to be identified and the effective recognition result for that text.
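The four modules listed above could be arranged as in the following sketch. It is not
the device's actual implementation: the class name, the method names, and the callable
signatures (`OcrEngine`, `RatioFn`, `JudgeFn`) are all hypothetical, and the OCR core
libraries are represented by plain callables so that the example stays self-contained.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical signatures, not taken from the source:
OcrEngine = Callable[[bytes], str]           # picture bytes -> recognized text
RatioFn = Callable[[str, str], float]        # (language, text) -> significant character ratio
JudgeFn = Callable[[Dict[str, float]], str]  # ratios per language -> judged language


@dataclass
class CharacterRecognitionDevice:
    engines: Dict[str, OcrEngine]
    ratio_fn: RatioFn
    judge_fn: JudgeFn

    def acquire(self, source: Callable[[], bytes]) -> bytes:
        """Acquisition module 31: obtain the picture to be identified."""
        return source()

    def identify(self, picture: bytes) -> Dict[str, str]:
        """Identification module 32: run the OCR engine of every language."""
        return {lang: engine(picture) for lang, engine in self.engines.items()}

    def determine_ratios(self, results: Dict[str, str]) -> Dict[str, float]:
        """Determining module 33: one significant character ratio per language."""
        return {lang: self.ratio_fn(lang, text) for lang, text in results.items()}

    def judge(self, results: Dict[str, str],
              ratios: Dict[str, float]) -> Tuple[str, str]:
        """Determination module 34: judged language and its recognition result."""
        language = self.judge_fn(ratios)
        return language, results[language]

    def run(self, source: Callable[[], bytes]) -> Tuple[str, str]:
        """Chain the four modules in the order described in the text."""
        picture = self.acquire(source)
        results = self.identify(picture)
        ratios = self.determine_ratios(results)
        return self.judge(results, ratios)
```

Passing the ratio and judgment rules in as callables keeps the sketch neutral about which
judgment rule a given embodiment uses.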
The character recognition device of the present embodiment can perform the character
recognition method provided by Embodiment 1 of the present invention. Its implementation
principle is similar and is not repeated here.
In the present embodiment, a picture to be identified is obtained, and text recognition
is performed on it using the OCR core library of at least one language, generating a
recognition result for each language, each recognition result including at least one
character. The significant character ratio of the recognition result of each language
is calculated, and the language of the text in the picture to be identified and the
effective recognition result for that text are judged according to these ratios. It is
therefore unnecessary to determine the language of the text in the picture manually
before performing text recognition. The language of the text in the picture to be
identified is judged automatically, and the recognition result for that text is
determined at the same time. No manual operation is required, which shortens the
recognition time and improves recognition efficiency.
Fig. 9 is a schematic structural diagram of the character recognition device provided by
Embodiment 5 of the present invention. On the basis of Embodiment 4, as shown in Fig. 9,
in the character recognition device provided by the present embodiment, the determining
module 33 includes:
a first determination sub-module 331, configured to determine the number of characters
in the recognition result of each language, and to determine the character code of each
character in the recognition result of each language;
a second determination sub-module 332, configured to determine, from the character codes
of the characters in the recognition result of each language, the number of significant
characters, that is, characters whose codes fall within the character code interval of
that language;
a calculation sub-module 333, configured to determine the significant character ratio of
the recognition result of each language from the number of characters and the number of
significant characters in that recognition result.
The determination module 34 is specifically configured to:
compare the significant character ratios of the recognition results of the languages,
determine the language with the maximum significant character ratio to be the language
of the text in the picture to be identified, and determine the recognition result of
that language to be the effective recognition result for the text in the picture to be
identified.
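The maximum-ratio rule above amounts to an argmax over the languages. A minimal sketch
follows, in which the function name and the dictionary layout are assumptions. The text
does not say how ties are resolved; here the first language with the maximum ratio in
the dictionary's order is taken.

```python
from typing import Dict, Tuple


def pick_by_max_ratio(results: Dict[str, str],
                      ratios: Dict[str, float]) -> Tuple[str, str]:
    """Return the language with the largest ratio and its recognition result."""
    # max() returns the first key with the largest ratio when several are tied.
    language = max(ratios, key=ratios.get)
    return language, results[language]
```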
Alternatively, the identification module 32 is specifically configured to:
perform text recognition on the picture to be identified using the OCR core libraries of
three languages and generate a recognition result for each language, wherein the OCR
core libraries of the three languages are the Chinese OCR core library, the English OCR
core library, and the Tibetan OCR core library;
Accordingly, the determination module 34 is specifically configured such that:
if the significant character ratio R1 of the Tibetan recognition result is greater than
or equal to the preset ratio, the language of the text in the picture to be identified
is judged to be Tibetan, and the effective recognition result for the text in the
picture to be identified is the Tibetan recognition result;
if the significant character ratio R1 of the Tibetan recognition result is less than the
preset ratio, R1 is greater than or equal to the significant character ratio R2 of the
Chinese recognition result, and R1 is greater than or equal to the significant character
ratio R3 of the English recognition result, the language of the text in the picture to
be identified is judged to be Tibetan, and the effective recognition result for the text
in the picture to be identified is the Tibetan recognition result;
if the significant character ratio R1 of the Tibetan recognition result is less than the
preset ratio, R1 is greater than or equal to the significant character ratio R2 of the
Chinese recognition result, and R1 is less than the significant character ratio R3 of
the English recognition result, the language of the text in the picture to be identified
is judged to be English, and the effective recognition result for the text in the
picture to be identified is the English recognition result;
if the significant character ratio R1 of the Tibetan recognition result is less than the
preset ratio, R1 is less than the significant character ratio R2 of the Chinese
recognition result, and R2 is greater than or equal to the significant character ratio
R3 of the English recognition result, the language of the text in the picture to be
identified is judged to be Chinese, and the effective recognition result for the text in
the picture to be identified is the Chinese recognition result;
if the significant character ratio R1 of the Tibetan recognition result is less than the
preset ratio, R1 is less than the significant character ratio R2 of the Chinese
recognition result, and R2 is less than the significant character ratio R3 of the
English recognition result, the language of the text in the picture to be identified is
judged to be English, and the effective recognition result for the text in the picture
to be identified is the English recognition result.
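As an observation that the text itself does not state, the five conditions above appear
to reduce to a shorter rule: Tibetan is chosen outright when R1 reaches the preset
ratio; otherwise the language with the largest ratio is chosen, with ties resolved in
the order Tibetan, Chinese, English. The sketch below writes out both forms and checks
on a grid of ratio values that they agree. The function names and the grid resolution
are assumptions.

```python
from itertools import product

# Tie-break order implied by the "greater than or equal to" comparisons.
PRIORITY = ("tibetan", "chinese", "english")


def judge_by_clauses(r1: float, r2: float, r3: float, t1: float) -> str:
    """Literal transcription of the five conditions."""
    if r1 >= t1:
        return "tibetan"
    if r1 < t1 and r1 >= r2 and r1 >= r3:
        return "tibetan"
    if r1 < t1 and r1 >= r2 and r1 < r3:
        return "english"
    if r1 < t1 and r1 < r2 and r2 >= r3:
        return "chinese"
    if r1 < t1 and r1 < r2 and r2 < r3:
        return "english"
    raise AssertionError("the five conditions cover every case")


def judge_compact(r1: float, r2: float, r3: float, t1: float) -> str:
    """Threshold check, then the maximum ratio with PRIORITY as tie-break."""
    if r1 >= t1:
        return "tibetan"
    ratios = dict(zip(PRIORITY, (r1, r2, r3)))
    # max() keeps the first maximal item, so earlier PRIORITY entries win ties.
    return max(PRIORITY, key=ratios.__getitem__)


def forms_agree(t1: float, steps: int = 10) -> bool:
    """Compare both forms on every (r1, r2, r3) triple of a regular grid."""
    grid = [i / steps for i in range(steps + 1)]
    return all(
        judge_by_clauses(a, b, c, t1) == judge_compact(a, b, c, t1)
        for a, b, c in product(grid, repeat=3)
    )
```

A grid check of this kind is evidence rather than proof, although the case analysis is
short enough to verify by hand.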
The character recognition device of the present embodiment can perform the character
recognition methods provided by Embodiment 1 and Embodiment 2 of the present invention.
Its implementation principle is similar and is not repeated here.
The present embodiment determines the number of characters in the recognition result of
each language, and determines, from the character code of each character in that
recognition result, the number of significant characters whose codes fall within the
character code interval of that language. From the number of characters and the number
of significant characters, the significant character ratio of the recognition result of
each language can be calculated. The language with the maximum significant character
ratio is then taken as the language of the text in the picture to be identified, and the
recognition result of that language is taken as the effective recognition result for the
text in the picture to be identified. Furthermore, when the Chinese OCR core library, the
English OCR core library, and the Tibetan OCR core library are each used to perform text
recognition on the picture to be identified, a decision procedure compares the
significant character ratios of the Tibetan, Chinese, and English recognition results
with one another and finally determines the language of the text in the picture to be
identified and the effective recognition result. It is therefore unnecessary to
determine the language of the text in the picture manually before performing text
recognition. The language of the text in the picture to be identified is judged
automatically, and the recognition result for that text is determined at the same time.
No manual operation is required, which shortens the recognition time and improves
recognition efficiency.
A person of ordinary skill in the art will understand that all or part of the steps of
the above method embodiments can be carried out by hardware under the control of program
instructions. The program may be stored in a computer-readable storage medium. When
executed, the program performs the steps of the above method embodiments. The storage
medium includes various media capable of storing program code, such as ROM, RAM,
magnetic disks, or optical discs.
Finally, it should be noted that the above embodiments merely illustrate the technical
solutions of the present invention and do not limit them. Although the present invention
has been described in detail with reference to the foregoing embodiments, a person of
ordinary skill in the art should understand that the technical solutions described in
the foregoing embodiments may still be modified, or some of their technical features may
be replaced by equivalents, and that such modifications or replacements do not cause the
essence of the corresponding technical solutions to depart from the spirit and scope of
the technical solutions of the embodiments of the present invention.