CN103577818B - Method and apparatus for recognizing text in images - Google Patents

Method and apparatus for recognizing text in images

Info

Publication number
CN103577818B
CN103577818B (application CN201210279370.4A)
Authority
CN
China
Prior art keywords
sentence
recognition result
block
word
confidence level
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210279370.4A
Other languages
Chinese (zh)
Other versions
CN103577818A (en)
Inventor
韩钧宇
丁二锐
吴中勤
文林福
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201210279370.4A priority Critical patent/CN103577818B/en
Publication of CN103577818A publication Critical patent/CN103577818A/en
Application granted granted Critical
Publication of CN103577818B publication Critical patent/CN103577818B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

The present invention provides a method and apparatus for recognizing text in images. The method includes: S1, obtaining the character region in an image to be recognized; S2, recognizing each character block in the character region and recording the position information of each block; S3, performing layout analysis based on the position information of each block to obtain a sentence-structure distribution; S4, correcting the recognition results of the blocks through semantic analysis based on the sentence-structure distribution, to obtain a corrected recognition result. By making effective use of the semantic information between characters to revise the recognition results of the blocks, the present invention improves the accuracy of image text recognition and better meets users' recognition needs.

Description

Method and apparatus for recognizing text in images
【Technical field】
The present invention relates to computer application technologies, and in particular to a method and apparatus for recognizing text in images.
【Background technology】
With the rapid development of the mobile Internet, applications based on images captured by mobile terminal cameras are becoming increasingly widespread. Image text recognition technology recognizes the text in an image and converts it into editable text, relieving the user of the burden of typing the corresponding text and making it convenient to store and edit that text. However, image text recognition is a highly complex technical problem; especially when the image content is complex, the recognition accuracy often fails to meet users' needs.
Existing image text recognition methods mainly include the following steps:
1) determine the character region in the image; 2) segment the character region into character blocks; 3) extract features from each block and match them against a feature database to obtain the matched characters as the recognition result.
Although such methods have fairly strong recognition ability, they are based on recognizing single characters in isolation; recognition errors therefore occur easily and there is no effective correction mechanism, so the recognition accuracy is relatively low.
【Summary of the invention】
In view of this, the present invention provides a method and apparatus for recognizing text in images, so as to improve the accuracy of image text recognition.
The specific technical solutions are as follows:
A method for recognizing text in images, the method including:
S1, obtaining the character region in an image to be recognized;
S2, recognizing each character block in the character region and recording the position information of each block;
S3, performing layout analysis based on the position information of each block to obtain a sentence-structure distribution;
S4, correcting the recognition results of the blocks through semantic analysis based on the sentence-structure distribution, to obtain a corrected recognition result.
According to a preferred embodiment of the present invention, step S1 specifically includes:
the server receiving an image to be recognized sent by a mobile terminal and extracting the character region from the image to be recognized; or,
the server receiving a character region that the mobile terminal has extracted from the image to be recognized and sent.
According to a preferred embodiment of the present invention, step S3 specifically includes:
taking, based on the coordinates of the block centers in the image to be recognized, blocks whose vertical positions differ by less than a preset first threshold as a text line in the horizontal direction; or,
taking blocks whose horizontal positions differ by less than a preset second threshold as a text line in the vertical direction; or,
taking blocks whose vertical positions differ by less than the preset first threshold and whose sizes differ by less than a preset size threshold as a text line in the horizontal direction; or,
taking blocks whose horizontal positions differ by less than the preset second threshold and whose sizes differ by less than the preset size threshold as a text line in the vertical direction.
According to a preferred embodiment of the present invention, step S4 specifically includes:
S41, matching the recognition results of the blocks in a text line against a word lexicon to obtain the recognition results that form words;
S42, combining the word-forming recognition results and the non-word-forming recognition results in block order to obtain candidate sentences;
S43, determining the semantic confidence of each sentence, matching each sentence against a sentence database, and determining the matching confidence of each sentence according to the match;
S44, combining the semantic confidence and matching confidence of each sentence to determine its total confidence, and selecting the sentence with the highest total confidence as the corrected recognition result.
According to a preferred embodiment of the present invention, step S41 further includes: deleting, from the recognition results of the non-leading blocks in a text line, the recognition results that cannot form a word with any recognition result of an adjacent block, except for recognition results that are independently meaningful or whose adjacent block lacks recognition results.
According to a preferred embodiment of the present invention, step S2 further includes: determining the confidence of each recognition result of a block according to the similarity between the recognition result and the block in the image;
in step S43, the semantic confidence of a sentence is obtained by summing the confidences of the recognition results in the sentence, with the confidences of the word-forming recognition results weighted up in the summation.
According to a preferred embodiment of the present invention, step S43 specifically includes: selecting the top n1 sentences by semantic confidence, n1 being a preset positive integer, matching the selected sentences against the sentence database, and determining the matching confidence of each sentence according to the match.
According to a preferred embodiment of the present invention, in step S43 the matching confidence C_m of sentence i is determined using the following formula:
C_m = N_i × α × P_i
where N_i is the number of characters in sentence i, α is a preset coefficient, and P_i is the ratio of the maximum number of continuously matching characters between sentence i and sentence L to the total number of characters of sentence L, sentence L being the sentence in the sentence database matched to sentence i.
According to a preferred embodiment of the present invention, the method further includes:
S5, searching with the corrected recognition result, determining the network document that best matches the corrected recognition result, and extracting from that network document the text content matching the corrected recognition result as an extended recognition result.
According to a preferred embodiment of the present invention, the extracting from the network document the text content matching the corrected recognition result as the extended recognition result is:
extracting from the network document the smallest sentence or smallest paragraph containing the corrected recognition result as the extended recognition result.
An apparatus for recognizing text in images, the apparatus including:
a region acquisition unit, for obtaining the character region in an image to be recognized;
a character recognition unit, for recognizing each character block in the character region;
a position recording unit, for recording the position information of each block;
a layout analysis unit, for performing layout analysis based on the position information of each block to obtain a sentence-structure distribution;
a semantic analysis unit, for correcting the recognition results of the blocks through semantic analysis based on the sentence-structure distribution, to obtain a corrected recognition result.
According to a preferred embodiment of the present invention, the region acquisition unit receives an image to be recognized sent by a mobile terminal and extracts the character region from the image to be recognized; or it receives a character region that the mobile terminal has extracted from the image to be recognized and sent.
According to a preferred embodiment of the present invention, the layout analysis unit is specifically configured to:
take, based on the coordinates of the block centers in the image to be recognized, blocks whose vertical positions differ by less than a preset first threshold as a text line in the horizontal direction; or,
take blocks whose horizontal positions differ by less than a preset second threshold as a text line in the vertical direction; or,
take blocks whose vertical positions differ by less than the preset first threshold and whose sizes differ by less than a preset size threshold as a text line in the horizontal direction; or,
take blocks whose horizontal positions differ by less than the preset second threshold and whose sizes differ by less than the preset size threshold as a text line in the vertical direction.
According to a preferred embodiment of the present invention, the semantic analysis unit specifically includes:
a dictionary matching subunit, for matching the recognition results of the blocks in a text line against a word lexicon to obtain the word-forming recognition results;
a sentence determination subunit, for combining the word-forming recognition results and the non-word-forming recognition results in block order to obtain candidate sentences;
a semantic confidence determination subunit, for determining the semantic confidence of each sentence;
a matching confidence determination subunit, for matching each sentence against a sentence database and determining the matching confidence of each sentence according to the match;
a correction subunit, for combining the semantic confidence and matching confidence of each sentence into a total confidence, and selecting the sentence with the highest total confidence as the corrected recognition result.
According to a preferred embodiment of the present invention, the dictionary matching subunit is further configured to delete, from the recognition results of the non-leading blocks in a text line, the recognition results that cannot form a word with any recognition result of an adjacent block, except for recognition results that are independently meaningful or whose adjacent block lacks recognition results.
According to a preferred embodiment of the present invention, the character recognition unit is further configured to determine the confidence of each recognition result of a block according to the similarity between the recognition result and the block in the image;
the semantic confidence determination subunit is specifically configured to sum the confidences of the recognition results in a sentence to obtain the sentence's semantic confidence, with the confidences of the word-forming recognition results weighted up in the summation.
According to a preferred embodiment of the present invention, the matching confidence determination subunit is specifically configured to select the top n1 sentences by semantic confidence, n1 being a preset positive integer, match the selected sentences against the sentence database, and determine the matching confidence of each sentence according to the match.
According to a preferred embodiment of the present invention, the matching confidence determination subunit determines the matching confidence C_m of sentence i using the following formula:
C_m = N_i × α × P_i
where N_i is the number of characters in sentence i, α is a preset coefficient, and P_i is the ratio of the maximum number of continuously matching characters between sentence i and sentence L to the total number of characters of sentence L, sentence L being the sentence in the sentence database matched to sentence i.
According to a preferred embodiment of the present invention, the apparatus further includes: a network expansion unit, for searching with the corrected recognition result, determining the network document that best matches the corrected recognition result, and extracting from that network document the text content matching the corrected recognition result as an extended recognition result.
According to a preferred embodiment of the present invention, when performing the extraction, the network expansion unit specifically extracts from the network document the smallest sentence or smallest paragraph containing the corrected recognition result as the extended recognition result.
As can be seen from the above technical solutions, the present invention obtains a sentence-structure distribution through layout analysis and corrects the recognition results of the character blocks through semantic analysis based on that distribution. By making effective use of the semantic information between characters to revise the recognition results of the blocks, it improves the accuracy of image text recognition and better meets users' recognition needs.
【Description of the drawings】
Fig. 1 is a flowchart of the image text recognition method provided by Embodiment 1 of the present invention;
Fig. 2 is an example of a character region provided by Embodiment 1 of the present invention;
Fig. 3 is a schematic diagram of the semantic-analysis correction process provided by Embodiment 1 of the present invention;
Fig. 4 is a schematic diagram of the system provided by an embodiment of the present invention;
Fig. 5 is a structural diagram of the image recognition apparatus provided by Embodiment 2 of the present invention;
Fig. 6 is a structural diagram of the semantic analysis unit provided by Embodiment 2 of the present invention.
【Detailed description of the embodiments】
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in detail below with reference to the accompanying drawings and specific embodiments.
Embodiment 1
Fig. 1 is a flowchart of the image text recognition method provided by an embodiment of the present invention. As shown in Fig. 1, the method may include the following steps:
Step 101: obtain the character region in the image to be recognized.
The server obtains an image containing text from the mobile terminal. The image may be the original image captured by the mobile terminal, in which case the server extracts the character region from the image to be recognized in this step. Alternatively, after capturing the original image, the mobile terminal may itself extract the character region from the image and send the character region to the server.
Existing techniques may be used to extract the character region after removing the image background, including but not limited to the following:
Method 1: first perform color run-length coding according to color Euclidean distance, then perform color clustering; generate and select character layers based on the clustering result, e.g. retaining connected components larger than a certain size; generate the image layers based on the Euclidean distance between the connected components and each cluster center; then classify each image layer as a character layer, noise layer, or background layer according to the relationship between its pixel count and the layer's segmentation-threshold pixel count; finally, remove the noise and background layers to obtain the character layer, i.e. the character region.
Method 2: select a large number of sample images containing text and images without text, and use the Canny operator to extract the edge information of both classes of images as training samples for sparse-representation classification dictionaries; feed the two classes of training samples into a classification sparse-representation dictionary training algorithm to obtain a text sparse-representation classification dictionary and a non-text sparse-representation classification dictionary; convert the image to be recognized to grayscale and extract its edge information with the Canny operator; extract candidate character regions from the grayscale edge information using sparse representation over the classification dictionaries; apply a run-length smoothing algorithm in the horizontal and vertical directions to connect the isolated edges of the candidate character regions into larger regions; then perform projection analysis to find the corresponding text lines, discarding isolated edges in the candidate character regions that lie outside the text lines; finally, recognize the detected character region.
If the mobile terminal performs the extraction, it may use existing character-region extraction software or extract the character region manually.
In addition, one or more character regions may be obtained in this step. Since the content of this step belongs to the prior art, it is not described in detail here.
Step 102: recognize each character block in the character region and record the position information of each block.
As in the prior art, recognizing each block in the character region may include the following steps: binarize the character region; segment the binarized character region into character blocks; extract the feature information of each block and match it against a character feature database, taking the matching results as the block's recognition results. The specific implementation is not repeated here.
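The feature-matching step can be illustrated with a toy sketch. Everything below is hypothetical: the three-dimensional feature vectors, the character database, and the cosine-similarity-to-confidence mapping merely stand in for whatever features and matcher an actual implementation uses.

```python
import math

def cosine_similarity(a, b):
    # Similarity between two feature vectors; in [0, 1] for non-negative features.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recognize_block(block_features, feature_db, top_k=3):
    """Return the top_k candidate characters with confidences (0-100 scale),
    by matching the block's features against a character feature database."""
    scored = [(round(cosine_similarity(block_features, vec) * 100), char)
              for char, vec in feature_db.items()]
    scored.sort(reverse=True)
    return [(char, conf) for conf, char in scored[:top_k]]

# Toy database: each "character" is described by a 3-dimensional feature vector.
db = {"A": [1.0, 0.0, 0.0], "B": [0.0, 1.0, 0.0], "C": [0.7, 0.7, 0.0]}
candidates = recognize_block([0.9, 0.1, 0.0], db)
```

As in the example below, each block thus yields an ordered list of candidate recognition results, each with a confidence.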
The recorded position information of a block may be the coordinates of the block's center in the image, and may further include the size of the block, etc.
It should be noted that a block may have multiple recognition results: for one block, the recognition results whose similarity with the block in the image satisfies a preset requirement are usually kept, and each recognition result has a confidence determined by its similarity. Taking the blocks in the image shown in Fig. 2 as an example:
The recognition results of the first block are: In (44), in (44);
the recognition results of the second block are: State (32), enclose (31), ware (29);
the recognition results of the third block are: nosebleed (41), bright (40), postal (39);
the recognition results of the fourth block are: politics (67), attack (48), change (46).
The numbers in parentheses are the confidences of the recognition results. (The candidates are English glosses of visually similar Chinese characters; the correct reading of the example, as shown later, is "China Post".)
Step 103: perform layout analysis based on the position information of each block to obtain the sentence-structure distribution.
In this step, using the coordinates of the block centers in the image, blocks whose vertical positions differ by less than a preset first threshold (i.e. lying approximately on one horizontal line) are taken as a text line in the horizontal direction; alternatively, blocks whose horizontal positions differ by less than a preset second threshold (i.e. approximately aligned in one vertical column) are taken as a text line in the vertical direction. Whether to take horizontal or vertical text lines depends on the writing direction of the text in the image: if it is written horizontally, horizontal text lines are taken in this step; if it is written vertically, vertical text lines are taken. This can be set in advance.
More preferably, the size information of the blocks may also be combined: blocks whose vertical positions differ by less than the preset first threshold and whose sizes differ by less than a preset size threshold are taken as a horizontal text line; or blocks whose horizontal positions differ by less than the preset second threshold and whose sizes differ by less than the preset size threshold are taken as a vertical text line.
The first and second thresholds can be set based on empirical values and adjusted as the situation requires. For example, for the character region shown in Fig. 2, since the vertical position gaps of the four blocks are within the preset first threshold, the four blocks are joined into one text line.
When the present invention is implemented, obtaining the sentence-structure distribution in this step is not limited to the threshold-based method above; slanted text lines that are neither horizontal nor vertical may also be extracted by other mechanisms.
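Under the stated assumptions (horizontal writing, preset thresholds), the threshold-based grouping might be sketched as below; the block representation and the threshold values are illustrative, not taken from the patent.

```python
def group_horizontal_lines(blocks, first_threshold, size_threshold=None):
    """Group blocks into horizontal text lines: blocks whose vertical center
    positions differ by less than first_threshold (and, optionally, whose
    sizes differ by less than size_threshold) fall into the same line.
    Each block is a dict with center coordinates cx, cy and height h."""
    lines = []
    for block in sorted(blocks, key=lambda b: b["cx"]):  # left-to-right order
        for line in lines:
            ref = line[0]
            close = abs(block["cy"] - ref["cy"]) < first_threshold
            similar = size_threshold is None or abs(block["h"] - ref["h"]) < size_threshold
            if close and similar:
                line.append(block)
                break
        else:
            lines.append([block])
    return lines

# Four blocks roughly on one horizontal line, plus one block on another line.
blocks = [{"cx": 10, "cy": 50, "h": 20}, {"cx": 32, "cy": 52, "h": 21},
          {"cx": 55, "cy": 49, "h": 20}, {"cx": 78, "cy": 51, "h": 19},
          {"cx": 12, "cy": 90, "h": 20}]
lines = group_horizontal_lines(blocks, first_threshold=8)
```

With these inputs the first four blocks form one text line, mirroring the Fig. 2 example.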
Step 104: correct the recognition results of the blocks through semantic analysis based on the sentence-structure distribution, obtaining the corrected recognition result.
The recognition results of the blocks belonging to the same text line are corrected by semantic analysis. During the correction, the recognition results of the blocks are first matched in order against a word lexicon to determine the word combinations that the recognition results of the blocks in the line can form; then the semantic confidence of each sentence that the line's recognition results can combine into is determined, and the sentence with the highest semantic confidence is selected as the corrected recognition result. Alternatively, each sentence is further matched against a sentence database, the matching confidence of each sentence is determined according to the match, the semantic confidence and matching confidence are combined into a total confidence for each sentence, and the sentence with the highest total confidence is selected as the corrected recognition result.
The semantic-analysis correction process is described in detail below with reference to Fig. 3. As shown in Fig. 3, the process may include the following steps:
Step 301: match the recognition results of the blocks in the text line against the word lexicon, and delete from the recognition results of the non-leading blocks those results that cannot form a word with any recognition result of an adjacent block, except for results that are independently meaningful or whose adjacent block lacks recognition results.
It should be noted that deleting such non-word-forming recognition results is a step performed to improve the efficiency of the subsequent sentence confidence computation and selection; it is not mandatory.
Step 302: combine the word-forming recognition results and the non-word-forming recognition results in block order to obtain the candidate sentences, and determine the semantic confidence of each sentence.
That is, this step determines all sentences that the text line could possibly be recognized as: the word-forming and non-word-forming recognition results are combined according to the order of the blocks, yielding all possible sentences.
Continuing with the recognition results shown in Fig. 2: "enclose" and "ware" in the second block's results and "nosebleed" in the third block's results cannot form a word with any recognition result of an adjacent block and are not independently meaningful, so they can be deleted in step 301.
Matching the remaining recognition results against the word lexicon yields the words: China, postal, bright attack, open policy.
All possible sentences generated include:
In State bright attack
In State bright change
In State open policy
In State postal
In State postal change
In State postal attack
China bright change
China open policy
China bright attack
China postal
China postal attack
China postal change
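Step 302 can be sketched as enumerating every reading of the text line over a small lattice of units (lexicon words spanning two blocks, or single-character candidates). The unit lists below loosely mirror the example above with placeholder English glosses; they are illustrative, not the patent's data. A reading assembled from single characters that duplicates a lexicon word is dropped, since it is scored as the word instead.

```python
def enumerate_sentences(units_by_start, n_blocks, start=0):
    """Enumerate all readings of a text line.  units_by_start maps a block
    index to the units (word or single character) beginning there; each unit
    is (text, length_in_blocks).  Returns every full-line combination."""
    if start == n_blocks:
        return [""]
    sentences = []
    for text, length in units_by_start.get(start, []):
        for rest in enumerate_sentences(units_by_start, n_blocks, start + length):
            sentences.append(text + rest)
    return sentences

# Simplified stand-in for the four-block example: a two-block word or the
# surviving single-character candidates at each position.
units = {
    0: [("China ", 2), ("In ", 1)],
    1: [("State ", 1)],
    2: [("bright attack ", 2), ("bright politics ", 2), ("postal politics ", 2),
        ("bright ", 1), ("postal ", 1)],
    3: [("attack ", 1), ("change ", 1)],
}
sentences = enumerate_sentences(units, 4)
# Deduplicate: a singles reading identical to a word reading collapses into it.
unique = sorted(set(s.strip() for s in sentences))
```

With these units the enumeration yields twelve distinct sentences, matching the count in the example list.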
When determining the semantic confidence of a sentence, the confidences of the recognition results in the sentence can be summed, with the confidences of the word-forming recognition results weighted up, for example doubled; the sum is then taken as the sentence's semantic confidence.
Continuing the example, the semantic confidences of the sentences are listed below; the numbers in parentheses are the confidences of the recognition results, and the numbers in square brackets are the semantic confidences of the sentences.
In (44) State (32) bright attack (40×2+48×2) 【252】
In (44) State (32) bright (40) change (46) 【162】
In (44) State (32) open policy (40×2+67×2) 【290】
In (44) State (32) postal (39×2+67×2) 【288】
In (44) State (32) postal (39) change (46) 【161】
In (44) State (32) postal (39) attack (48) 【163】
China (44×2+32×2) bright (40) change (46) 【238】
China (44×2+32×2) open policy (40×2+67×2) 【366】
China (44×2+32×2) bright attack (40×2+48×2) 【328】
China (44×2+32×2) postal (39×2+67×2) 【364】
China (44×2+32×2) postal (39) attack (48) 【239】
China (44×2+32×2) postal (39) change (46) 【237】
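The scoring just illustrated — summing per-block confidences with word-forming results doubled — can be sketched as follows; the doubling factor is one possible weighting, as noted above.

```python
def semantic_confidence(units):
    """units: list of (block_confidences, forms_word) pairs, one per word or
    single character in the sentence.  Confidences of word-forming units are
    doubled before summing; single characters are counted once."""
    total = 0
    for confidences, forms_word in units:
        weight = 2 if forms_word else 1
        total += weight * sum(confidences)
    return total

# "China postal": two lexicon words with block confidences (44, 32) and (39, 67).
china_postal = semantic_confidence([([44, 32], True), ([39, 67], True)])
# "In State postal change": four single characters, no words.
in_state = semantic_confidence([([44], False), ([32], False),
                                ([39], False), ([46], False)])
```

The two calls reproduce the 364 and 161 values from the worked example.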
Step 303: select the top n1 sentences by semantic confidence, n1 being a preset positive integer, match the selected sentences against the sentence database, and determine the matching confidence of each sentence according to the match.
Selecting only the top few sentences by semantic confidence for matching against the sentence database makes this step more efficient; of course, all sentences could also be matched against the sentence database to determine their matching confidences.
Continuing the example, suppose the top three sentences by semantic confidence are selected, i.e. "China open policy", "China postal", and "China bright attack".
The sentence database used for matching contains common sentences. The higher the degree of match between a selected sentence and a common sentence, the more likely that sentence is to be the correct recognition result; the matching confidence is therefore determined according to the match with the sentence database. The match can be characterized by information such as the number of characters of the sentence itself and the ratio of the maximum number of continuously matching characters between the sentence and its matched database sentence to the number of characters of that matched sentence.
Suppose the matching confidence C_m of sentence i is determined using the following formula:
C_m = N_i × α × P_i    (1)
where N_i is the number of characters in sentence i, α is a preset coefficient, e.g. 100, and P_i is the ratio of the maximum number of continuously matching characters between sentence i and sentence L to the total number of characters of sentence L. Sentence L is the sentence in the sentence database matched to sentence i: when the selected sentences are matched against the sentence database, sentence L is obtained first. Sentence L may be a sentence that completely matches sentence i character for character, or the sentence that matches sentence i on the most characters; that is, the sentence in the database whose character match with sentence i reaches a certain degree and is maximal is selected as the matched sentence L. In specific implementations of the present invention, sentence L may also be obtained by other sentence-matching strategies.
For example, the sentence "China open policy" is matched to "China Unicom open policy cooperation room" (nine characters in the original Chinese) in the sentence database; the matching confidence computed by formula (1) is 4 × 100 × (2/9) ≈ 88.
The sentence "China postal" is matched to "China Post" in the sentence database; the matching confidence computed by formula (1) is 4 × 100 × (4/4) = 400.
The sentence "China bright attack" finds no match in the sentence database, so its matching confidence is 0.
Step 304: combine the semantic confidence and matching confidence of each sentence into a total confidence, and select the sentence with the highest total confidence as the corrected recognition result.
Assuming that the value that semantic confidence degree and matching confidence level are summed in this step is as the total of each sentence Confidence level, then connecting upper example:
Total confidence level of sentence " China Post " is:366+88=454.
Total confidence level of sentence " China Post " is:364+400=764.
Total confidence level of sentence " China bright attacks " is:328+0=328.
Final choice " China Post " is as the recognition result after correction.
After the semantic-analysis correction, the recognition result already has a certain accuracy, and the server can return the corrected recognition result to the mobile terminal for display. To further improve recognition accuracy, however, the result can be extended by means of a web search, that is, step 105 can further be executed.
Continuing with Fig. 1, step 105: Determine the network document whose matching state with the corrected recognition result is optimal, and intercept from that network document the network text content matching the corrected recognition result as the extended recognition result.
The corrected recognition result is searched on the network, the matching state between each document returned by the search and the corrected recognition result is calculated, and the document with the optimal matching state is determined. The so-called optimal matching state may be the largest number of matched characters, or the largest ratio of matched characters to the total number of characters of the network text content, and so on.
When intercepting the network text content, the network text content containing the corrected recognition result is intercepted from the determined network document. Specifically, based on the punctuation or line breaks in the network document, the minimum sentence or minimum paragraph containing the corrected recognition result can be intercepted as the extended recognition result. The server can then return the extended recognition result to the mobile terminal for display.
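The minimum-sentence interception described above can be sketched as follows (the punctuation set and the function name are assumptions; the patent only specifies cutting at punctuation or line breaks):

```python
import re


def intercept_min_sentence(document: str, result: str):
    # Split the network document at sentence punctuation or line breaks, then
    # return the shortest fragment that still contains the corrected result.
    fragments = re.split(r"[。！？；.!?;\n]+", document)
    hits = [f.strip() for f in fragments if result in f]
    return min(hits, key=len) if hits else None


doc = "China Post Group Corporation was founded in 1995. Unrelated text follows."
print(intercept_min_sentence(doc, "China Post"))
# China Post Group Corporation was founded in 1995
```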
Continuing the example above, "China Post" is searched on the network to obtain the network text content with the largest number of characters matching the current recognition result, and the minimum sentence containing the current recognition result, "China Post Group Corporation", is intercepted as the extended recognition result.
The corrected recognition result and the extended recognition result may be displayed either one at a time or both together, for example displayed as "China Post - China Post Group Corporation".
The above is a detailed description of the method provided by the present invention. The device provided by the present invention is described in detail below with reference to Embodiment 2. The device is arranged on the server and is mainly used in the system architecture shown in Fig. 4, a system composed of a mobile terminal and a server. The mobile terminal may send a captured image containing text to the server as the image to be recognized, and the server extracts the character area from it; alternatively, the mobile terminal takes the captured image containing text as the image to be recognized, extracts the character area from it, and sends the character area to the server. The server then executes the flow shown in Embodiment 1, and finally returns to the mobile terminal one or both of the recognition result corrected on the basis of semantic analysis and the recognition result extended through the network.
Embodiment 2
Fig. 5 is a structural diagram of the image recognition device provided by Embodiment 2 of the present invention. As shown in Fig. 5, the device includes: an area acquisition unit 500, a word recognition unit 510, a position recording unit 520, a printed page analysis unit 530, and a semantic analysis unit 540.
The area acquisition unit 500 first obtains the character area in the image to be recognized. The area acquisition unit 500 may receive the image to be recognized sent by the mobile terminal and extract the character area from it; alternatively, it may receive the character area that the mobile terminal extracts from the image to be recognized and sends.
The word recognition unit 510 recognizes each block in the character area separately; an existing recognition method may be used, for example: binarizing the character area; dividing the binarized character area into blocks; and extracting the feature information of each block and matching it against a feature database, taking the matching result as the recognition result of each block.
The position recording unit 520 records the location information of each block. The recorded location information may be the coordinate information of the block center in the image, and may further include the size information of the block, etc.
The printed page analysis unit 530 performs printed page analysis based on the location information of each block to obtain the sentence structure distribution. The printed page analysis unit 530 may be specifically configured to:
using the coordinate information of block centers in the image to be recognized, take blocks whose vertical position gap is less than a preset first threshold as a text line in the horizontal direction; alternatively,
using the coordinate information of block centers in the image to be recognized, take blocks whose horizontal position gap is less than a preset second threshold as a text line in the vertical direction; alternatively,
using the coordinate information of block centers in the image to be recognized, take blocks whose vertical position gap is less than the preset first threshold and whose size difference is less than a preset size threshold as a text line in the horizontal direction; alternatively,
using the coordinate information of block centers in the image to be recognized, take blocks whose horizontal position gap is less than the preset second threshold and whose size difference is less than the preset size threshold as a text line in the vertical direction.
Whether to take text lines in the horizontal direction or in the vertical direction depends on the layout of the text in the image: if the text is written horizontally, text lines in the horizontal direction are taken in this step; if it is written vertically, text lines in the vertical direction are taken. This can be set in advance. The first threshold and the second threshold can be set based on empirical values, and the values can be adjusted as the case may be.
In a specific implementation of the present invention, the implementation of the printed page analysis unit is not limited to the threshold-based method set out above; oblique text lines that are neither horizontal nor vertical may also be extracted by other mechanisms.
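A minimal sketch of the first threshold rule, grouping blocks into horizontal text lines by the vertical gap between block centers (the greedy clustering strategy and the data layout are assumptions; the patent only specifies the gap test itself):

```python
def group_horizontal_lines(blocks, first_threshold):
    # blocks: (x_center, y_center) coordinates of block centers.
    # A block joins an existing line when its vertical gap to that line's
    # last block is below the preset first threshold; otherwise it starts
    # a new horizontal text line.
    lines = []
    for x, y in sorted(blocks, key=lambda b: (b[1], b[0])):
        for line in lines:
            if abs(line[-1][1] - y) < first_threshold:
                line.append((x, y))
                break
        else:
            lines.append([(x, y)])
    return lines


# Five blocks whose centers form two horizontal text lines.
blocks = [(10, 100), (30, 102), (50, 99), (12, 200), (34, 203)]
print(len(group_horizontal_lines(blocks, first_threshold=10)))  # 2
```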
The semantic analysis unit 540 performs correction based on semantic analysis on the recognition results of the blocks according to the sentence structure distribution, to obtain the corrected recognition result.
The structure of the semantic analysis unit 540 is described in detail below. As shown in Fig. 6, the semantic analysis unit 540 may specifically include: a dictionary matching subunit 541, a sentence determination subunit 542, a semantic confidence determination subunit 543, a matching confidence determination subunit 544, and a correction subunit 545.
The dictionary matching subunit 541 matches the recognition results of the blocks in a text line against a word library to obtain recognition results that constitute words.
Preferably, the dictionary matching subunit 541 may also delete, from the recognition results of blocks that are not first in the text line, those recognition results that cannot form a word with the recognition results of adjacent blocks, except for recognition results that can stand alone semantically or whose adjacent block's recognition result is missing.
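The pruning rule applied by the dictionary matching subunit can be sketched as follows (the word library, the "standalone" set, and the pairwise adjacency check are illustrative assumptions):

```python
def prune_candidates(line_results, lexicon, standalone):
    # line_results: candidate characters per block along one text line.
    # A non-leading candidate survives only if it forms a lexicon word with
    # some candidate of the preceding block, stands alone semantically, or
    # the preceding block produced no recognition result at all.
    kept = [list(line_results[0])]
    for current in line_results[1:]:
        prev = kept[-1]
        kept.append([
            c for c in current
            if not prev or c in standalone or any(p + c in lexicon for p in prev)
        ])
    return kept


# "邮" followed by candidates "政"/"攻": only "邮政" is in the word library,
# so the non-word-forming candidate "攻" is deleted.
print(prune_candidates([["邮"], ["政", "攻"]], {"邮政"}, set()))
# [['邮'], ['政']]
```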
The sentence determination subunit 542 combines the recognition results that constitute words and the recognition results that do not constitute words in block order to obtain candidate sentences.
The semantic confidence determination subunit 543 determines the semantic confidence of each sentence. The semantic confidence is determined based on the confidence of each recognition result; in this case, the word recognition unit 510 shown in Fig. 5 also determines the confidence of each block's recognition result according to the similarity between the recognition result and the block in the picture. The semantic confidence determination subunit 543 then sums the confidences of the recognition results in a sentence to obtain the semantic confidence of the sentence, raising the confidence of word-constituting recognition results during the summation.
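The summation performed by the semantic confidence determination subunit can be sketched as follows (the boost factor of 1.5 is an assumption; the patent only says that the confidence of word-constituting results is raised):

```python
def semantic_confidence(results, boost=1.5):
    # results: (confidence, constitutes_word) per recognition result in the
    # sentence; word-constituting results are up-weighted before summing.
    return sum(c * (boost if word else 1.0) for c, word in results)


# Two word-constituting results and one isolated character.
print(semantic_confidence([(100, True), (100, True), (60, False)]))  # 360.0
```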
The matching confidence determination subunit 544 matches each sentence against the sentence database and determines the matching confidence of each sentence according to the matching state.
To improve the efficiency of the matching confidence calculation, the matching confidence determination subunit 544 may select the top n1 sentences by semantic confidence, n1 being a preset positive integer, match the selected sentences against the sentence database, and determine the matching confidence of each sentence according to the matching state.
The sentence database used for matching contains common sentences. The higher the degree of match between a selected sentence and a common sentence, the more likely that sentence is the correct recognition result; therefore, the matching confidence here is determined according to the matching state against the sentence database. The so-called matching state can be reflected in information such as the number of characters of the sentence itself, and the ratio of the maximum number of continuously matched characters between the sentence and its matched sentence in the sentence database to the number of characters of that matched sentence.
Specifically, the matching confidence Cm of sentence i may be determined using the following formula:
Cm = Ni × α × Pi
wherein Ni is the number of characters contained in sentence i, α is a preset coefficient, and Pi is the ratio of the maximum number of continuously matched characters between sentence i and sentence L to the total number of characters of sentence L, where sentence L is the matched sentence of sentence i in the sentence database. That is to say, when a selected sentence is matched against the sentence database, sentence L is obtained first; sentence L may be a sentence that matches sentence i completely on its characters, or the sentence that matches sentence i on the most characters. In other words, among the sentences in the sentence database whose character-level match with sentence i reaches a certain degree, the one with the greatest degree of match is taken as the matched sentence, i.e. sentence L.
The correction subunit 545 combines the semantic confidence and the matching confidence of each sentence to determine the total confidence of each sentence, and selects the sentence with the highest total confidence as the corrected recognition result.
Continuing with Fig. 5, in order to further improve recognition accuracy, the device may also include: a network expanding unit 550, for searching with the corrected recognition result, determining the network document whose matching state with the corrected recognition result is optimal, and intercepting from that network document the network text content matching the corrected recognition result as the extended recognition result.
Specifically, when performing the interception, the minimum sentence or minimum paragraph containing the corrected recognition result can be intercepted from the network document as the extended recognition result.
Since the device is arranged in the server, the server can return either one of the corrected recognition result and the extended recognition result obtained by the device to the mobile terminal for display, or return both.
From the above description it can be seen that the method and device provided by the present invention have the following advantages:
1) The semantic information between characters is efficiently used to revise the recognition results of the blocks, which improves the precision of image text recognition and better meets the recognition needs of users.
2) The massive network text resources on the Internet are fully utilized to extend the recognition result, further mining the user's intent and better meeting the usage needs of users.
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit the present invention; any modification, equivalent substitution, improvement, etc. made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.

Claims (19)

1. A method of image text recognition, characterized in that the method comprises:
S1, obtaining the character area in an image to be recognized;
S2, recognizing each block in the character area respectively and recording the location information of each block;
S3, performing printed page analysis based on the location information of each block to obtain a sentence structure distribution;
S41, matching the recognition result of each block in a text line with a word library to obtain recognition results that constitute words;
S42, combining the recognition results that constitute words and the recognition results that do not constitute words in block order to obtain sentences;
S43, determining the semantic confidence of each sentence, or further matching each sentence with a sentence database and determining the matching confidence of each sentence according to the matching state;
S44, selecting a sentence as the corrected recognition result according to the semantic confidence or the total confidence of the sentence, wherein the total confidence of a sentence is determined by combining its semantic confidence and matching confidence.
2. The method according to claim 1, characterized in that S1 specifically comprises:
a server receiving the image to be recognized sent by a mobile terminal, and extracting the character area from the image to be recognized; alternatively,
the server receiving the character area extracted from the image to be recognized and sent by the mobile terminal.
3. The method according to claim 1, characterized in that S3 specifically comprises:
using the coordinate information of block centers in the image to be recognized, taking blocks whose vertical position gap is less than a preset first threshold as a text line in the horizontal direction; alternatively,
using the coordinate information of block centers in the image to be recognized, taking blocks whose horizontal position gap is less than a preset second threshold as a text line in the vertical direction; alternatively,
using the coordinate information of block centers in the image to be recognized, taking blocks whose vertical position gap is less than the preset first threshold and whose size difference is less than a preset size threshold as a text line in the horizontal direction; alternatively,
using the coordinate information of block centers in the image to be recognized, taking blocks whose horizontal position gap is less than the preset second threshold and whose size difference is less than the preset size threshold as a text line in the vertical direction.
4. The method according to claim 1, characterized in that S41 further comprises: deleting, from the recognition results of blocks that are not first in a text line, those recognition results that cannot form a word with the recognition results of adjacent blocks, except for recognition results that can stand alone semantically or whose adjacent block's recognition result is missing.
5. The method according to claim 1, characterized in that S2 further comprises: determining the confidence of the recognition result of each block according to the similarity between the recognition result and the block in the picture;
and in S43, the confidences of the recognition results in a sentence are summed to obtain the semantic confidence of the sentence, wherein the confidence of word-constituting recognition results is raised during the summation.
6. The method according to claim 1, characterized in that matching each sentence with the sentence database in S43 and determining the matching confidence of each sentence according to the matching state specifically comprises: selecting the top n1 sentences by semantic confidence, n1 being a preset positive integer, matching the selected sentences with the sentence database, and determining the matching confidence of each sentence according to the matching state.
7. The method according to claim 1, characterized in that in S43 the matching confidence Cm of sentence i is determined using the following formula:
Cm = Ni × α × Pi
wherein Ni is the number of characters contained in sentence i, α is a preset coefficient, and Pi is the ratio of the maximum number of continuously matched characters between sentence i and sentence L to the total number of characters of sentence L, where the sentence L is the matched sentence of sentence i in the sentence database.
8. The method according to claim 1, characterized in that the method further comprises:
S5, searching with the corrected recognition result, determining the network document whose matching state with the corrected recognition result is optimal, and intercepting from that network document the network text content matching the corrected recognition result as the extended recognition result.
9. The method according to claim 8, characterized in that intercepting from the network document the network text content matching the corrected recognition result as the extended recognition result is:
intercepting from the network document the minimum sentence or the minimum paragraph containing the corrected recognition result as the extended recognition result.
10. A device for image recognition, characterized in that the device comprises:
an area acquisition unit, for obtaining the character area in an image to be recognized;
a word recognition unit, for recognizing each block in the character area respectively;
a position recording unit, for recording the location information of each block;
a printed page analysis unit, for performing printed page analysis based on the location information of each block to obtain a sentence structure distribution;
a semantic analysis unit, for matching the recognition result of each block in a text line with a word library to obtain recognition results that constitute words; combining the recognition results that constitute words and the recognition results that do not constitute words in block order to obtain sentences; determining the semantic confidence of each sentence, or further matching each sentence with a sentence database and determining the matching confidence of each sentence according to the matching state; and selecting a sentence as the corrected recognition result according to the semantic confidence or the total confidence of the sentence, wherein the total confidence of a sentence is determined by combining its semantic confidence and matching confidence.
11. The device according to claim 10, characterized in that the area acquisition unit receives the image to be recognized sent by a mobile terminal and extracts the character area from the image to be recognized; alternatively, it receives the character area extracted from the image to be recognized and sent by the mobile terminal.
12. The device according to claim 10, characterized in that the printed page analysis unit is specifically configured to:
using the coordinate information of block centers in the image to be recognized, take blocks whose vertical position gap is less than a preset first threshold as a text line in the horizontal direction; alternatively,
using the coordinate information of block centers in the image to be recognized, take blocks whose horizontal position gap is less than a preset second threshold as a text line in the vertical direction; alternatively,
using the coordinate information of block centers in the image to be recognized, take blocks whose vertical position gap is less than the preset first threshold and whose size difference is less than a preset size threshold as a text line in the horizontal direction; alternatively,
using the coordinate information of block centers in the image to be recognized, take blocks whose horizontal position gap is less than the preset second threshold and whose size difference is less than the preset size threshold as a text line in the vertical direction.
13. The device according to claim 10 or 12, characterized in that the semantic analysis unit specifically comprises:
a dictionary matching subunit, for matching the recognition result of each block in a text line with a word library to obtain recognition results that constitute words;
a sentence determination subunit, for combining the recognition results that constitute words and the recognition results that do not constitute words in block order to obtain sentences;
a semantic confidence determination subunit, for determining the semantic confidence of each sentence;
a matching confidence determination subunit, for matching each sentence with the sentence database and determining the matching confidence of each sentence according to the matching state;
a correction subunit, for combining the semantic confidence and the matching confidence of each sentence to determine the total confidence of each sentence, and selecting the sentence with the highest total confidence as the corrected recognition result.
14. The device according to claim 13, characterized in that the dictionary matching subunit is further configured to delete, from the recognition results of blocks that are not first in a text line, those recognition results that cannot form a word with the recognition results of adjacent blocks, except for recognition results that can stand alone semantically or whose adjacent block's recognition result is missing.
15. The device according to claim 13, characterized in that the word recognition unit is further configured to determine the confidence of the recognition result of each block according to the similarity between the recognition result and the block in the picture;
and the semantic confidence determination subunit is specifically configured to: sum the confidences of the recognition results in a sentence to obtain the semantic confidence of the sentence, wherein the confidence of word-constituting recognition results is raised during the summation.
16. The device according to claim 13, characterized in that the matching confidence determination subunit is specifically configured to: select the top n1 sentences by semantic confidence, n1 being a preset positive integer, match the selected sentences with the sentence database, and determine the matching confidence of each sentence according to the matching state.
17. The device according to claim 13, characterized in that the matching confidence determination subunit determines the matching confidence Cm of sentence i using the following formula:
Cm = Ni × α × Pi
wherein Ni is the number of characters contained in sentence i, α is a preset coefficient, and Pi is the ratio of the maximum number of continuously matched characters between sentence i and sentence L to the total number of characters of sentence L, where the sentence L is the matched sentence of sentence i in the sentence database.
18. The device according to claim 10, characterized in that the device further comprises: a network expanding unit, for searching with the corrected recognition result, determining the network document whose matching state with the corrected recognition result is optimal, and intercepting from that network document the network text content matching the corrected recognition result as the extended recognition result.
19. The device according to claim 18, characterized in that, when performing the interception, the network expanding unit specifically intercepts from the network document the minimum sentence or the minimum paragraph containing the corrected recognition result as the extended recognition result.
CN201210279370.4A 2012-08-07 2012-08-07 A kind of method and apparatus of pictograph identification Active CN103577818B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210279370.4A CN103577818B (en) 2012-08-07 2012-08-07 A kind of method and apparatus of pictograph identification

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210279370.4A CN103577818B (en) 2012-08-07 2012-08-07 A kind of method and apparatus of pictograph identification

Publications (2)

Publication Number Publication Date
CN103577818A CN103577818A (en) 2014-02-12
CN103577818B true CN103577818B (en) 2018-09-04

Family

ID=50049568

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210279370.4A Active CN103577818B (en) 2012-08-07 2012-08-07 A kind of method and apparatus of pictograph identification

Country Status (1)

Country Link
CN (1) CN103577818B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021142765A1 (en) * 2020-01-17 2021-07-22 Microsoft Technology Licensing, Llc Text line detection

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104951741A (en) * 2014-03-31 2015-09-30 阿里巴巴集团控股有限公司 Character recognition method and device thereof
CN104143084A (en) * 2014-07-17 2014-11-12 武汉理工大学 Auxiliary reading glasses for visual impairment people
CN105574530B (en) * 2014-10-08 2019-11-22 富士通株式会社 The method and apparatus for extracting the line of text in document
CN105631393A (en) * 2014-11-06 2016-06-01 阿里巴巴集团控股有限公司 Information recognition method and device
CN105678207A (en) * 2014-11-19 2016-06-15 富士通株式会社 Device and method for identifying content of target nameplate image from given image
CN106709489B (en) * 2015-07-13 2020-03-03 腾讯科技(深圳)有限公司 Character recognition processing method and device
CN108399405B (en) * 2017-02-07 2023-06-27 腾讯科技(上海)有限公司 Business license identification method and device
US10275687B2 (en) * 2017-02-16 2019-04-30 International Business Machines Corporation Image recognition with filtering of image classification output distribution
GB2571530B (en) * 2018-02-28 2020-09-23 Canon Europa Nv An image processing method and an image processing system
CN109308476B (en) * 2018-09-06 2019-08-27 邬国锐 Billing information processing method, system and computer readable storage medium
CN109033798B (en) * 2018-09-14 2020-07-07 北京金堤科技有限公司 Click verification code identification method and device based on semantics
CN111615702B (en) * 2018-12-07 2023-10-17 华为云计算技术有限公司 Method, device and equipment for extracting structured data from image
CN109934210B (en) 2019-05-17 2019-08-09 上海肇观电子科技有限公司 Printed page analysis method, reading aids, circuit and medium
CN112183513B (en) * 2019-07-03 2023-09-05 杭州海康威视数字技术股份有限公司 Method and device for recognizing characters in image, electronic equipment and storage medium
CN110490190B (en) * 2019-07-04 2021-10-26 贝壳技术有限公司 Structured image character recognition method and system
CN111539412B (en) * 2020-04-21 2021-02-26 上海云从企业发展有限公司 Image analysis method, system, device and medium based on OCR
CN112541496B (en) * 2020-12-24 2023-08-22 北京百度网讯科技有限公司 Method, device, equipment and computer storage medium for extracting POI (point of interest) names

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1604073A (en) * 2004-11-22 2005-04-06 北京北大方正技术研究院有限公司 Method for conducting title and text logic connection for newspaper pages
CN101447017A (en) * 2008-11-27 2009-06-03 浙江工业大学 Method and system for quickly identifying and counting votes on the basis of layout analysis
CN101493896A (en) * 2008-01-24 2009-07-29 夏普株式会社 Document image processing apparatus and method
CN101770576A (en) * 2008-12-31 2010-07-07 北京新岸线网络技术有限公司 Method and device for extracting characters
CN102456136A (en) * 2010-10-29 2012-05-16 方正国际软件(北京)有限公司 Image-text splitting method and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006092346A (en) * 2004-09-24 2006-04-06 Fuji Xerox Co Ltd Equipment, method, and program for character recognition

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research and Implementation of a Printed Chinese Character Recognition System; Liu Juning; China Master's Theses Full-text Database; 2011-09-15 (No. 9); pp. 3-4, 11, 14 *

Also Published As

Publication number Publication date
CN103577818A (en) 2014-02-12

Similar Documents

Publication Publication Date Title
CN103577818B (en) A kind of method and apparatus of pictograph identification
CN110569832B (en) Text real-time positioning and identifying method based on deep learning attention mechanism
US20200065601A1 (en) Method and system for transforming handwritten text to digital ink
Burie et al. ICDAR2015 competition on smartphone document capture and OCR (SmartDoc)
JP5031741B2 (en) Grammatical analysis of document visual structure
EP1598770B1 (en) Low resolution optical character recognition for camera acquired documents
US8494273B2 (en) Adaptive optical character recognition on a document with distorted characters
JP2020184109A (en) Learning model generation device, character recognition device, learning model generation method, character recognition method, and program
CN108805076A (en) The extracting method and system of environmental impact assessment report table word
CN111191649A (en) Method and equipment for identifying bent multi-line text image
CN112434690A (en) Method, system and storage medium for automatically capturing and understanding elements of dynamically analyzing text image characteristic phenomena
US20150055866A1 (en) Optical character recognition by iterative re-segmentation of text images using high-level cues
CN111046760A (en) Handwriting identification method based on domain confrontation network
CN111753120A (en) Method and device for searching questions, electronic equipment and storage medium
CN115131804A (en) Document identification method and device, electronic equipment and computer readable storage medium
CN104408403B (en) A kind of referee method that secondary typing is inconsistent and device
CN108304815A (en) A kind of data capture method, device, server and storage medium
CN112949649B (en) Text image identification method and device and computing equipment
CN109508712A (en) A kind of Chinese written language recognition methods based on image
RU2657181C1 (en) Method of improving quality of separate frame recognition
CN106339726A (en) Method and device for handwriting recognition
CN112560849B (en) Neural network algorithm-based grammar segmentation method and system
CN115050025A (en) Knowledge point extraction method and device based on formula recognition
Su et al. HITHCD-2018: Handwritten Chinese Character Database of 21K-Category
Dimov et al. CBIR approach to the recognition of a sign language alphabet

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by sipo to initiate substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant