CN103577818B - Method and apparatus for recognizing text in images - Google Patents
Method and apparatus for recognizing text in images

- Publication number: CN103577818B (application CN201210279370.4A)
- Authority: CN (China)
- Prior art keywords: sentence, recognition result, block, word, confidence
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The present invention provides a method and apparatus for recognizing text in images. The method includes: S1, obtaining the text region in an image to be recognized; S2, recognizing each character block in the text region and recording the position information of each block; S3, performing layout analysis based on the position information of each block to obtain a sentence-structure distribution; S4, correcting the recognition result of each block through semantic analysis based on the sentence-structure distribution, to obtain the corrected recognition result. By making effective use of the semantic information between words to revise the recognition result of each block, the present invention improves the precision of image text recognition and better meets users' recognition needs.
Description
【Technical field】
The present invention relates to the field of computer application technology, and in particular to a method and apparatus for recognizing text in images.
【Background technology】
With the rapid development of the mobile Internet, applications based on images captured by mobile terminal cameras have become increasingly widespread. Image text recognition technology recognizes the text in an image and converts it into editable text, relieving the user of the burden of typing the corresponding text and making it convenient to store and edit. However, image text recognition is a highly complex technical problem; especially when the image content is complex, the recognition precision often cannot satisfy users' demands.
Existing image text recognition methods mainly include the following steps:
1) determine the text region in the image; 2) segment the text region into character blocks; 3) extract features from each block and match the extracted features against a feature database, taking each matched character as the recognition result.
Although the above method has a fairly strong text recognition ability, it recognizes each character in isolation; recognition errors therefore occur easily, and there is no effective correction mechanism, so the recognition precision is relatively low.
【Invention content】
In view of this, the present invention provides a method and apparatus for recognizing text in images, in order to improve the precision of image text recognition.
The specific technical solution is as follows:
A method for recognizing text in images, the method including:
S1, obtaining the text region in an image to be recognized;
S2, recognizing each character block in the text region and recording the position information of each block;
S3, performing layout analysis based on the position information of each block to obtain a sentence-structure distribution;
S4, correcting the recognition result of each block through semantic analysis based on the sentence-structure distribution, to obtain the corrected recognition result.
According to a preferred embodiment of the present invention, step S1 specifically includes:
a server receiving the image to be recognized sent by a mobile terminal, and extracting the text region from the image to be recognized; or
the server receiving a text region that the mobile terminal has extracted from the image to be recognized and sent.
According to a preferred embodiment of the present invention, step S3 specifically includes:
using the coordinates of the block centers in the image to be recognized, taking blocks whose vertical positions differ by less than a preset first threshold as one horizontal text line; or
taking blocks whose horizontal positions differ by less than a preset second threshold as one vertical text line; or
taking blocks whose vertical positions differ by less than the preset first threshold and whose sizes differ by less than a preset size threshold as one horizontal text line; or
taking blocks whose horizontal positions differ by less than the preset second threshold and whose sizes differ by less than the preset size threshold as one vertical text line.
According to a preferred embodiment of the present invention, step S4 specifically includes:
S41, matching the recognition results of the blocks in a text line against a word lexicon to obtain the recognition results that form words;
S42, combining the word-forming recognition results and the non-word-forming recognition results in block order to obtain candidate sentences;
S43, determining the semantic confidence of each sentence, matching each sentence against a sentence database, and determining the matching confidence of each sentence according to the matching situation;
S44, combining the semantic confidence and the matching confidence of each sentence to determine its total confidence, and selecting the sentence with the highest total confidence as the corrected recognition result.
According to a preferred embodiment of the present invention, step S41 further includes: deleting, from the recognition results of the non-leading blocks in a text line, any recognition result that cannot form a word with a recognition result of an adjacent block, except recognition results that are independently meaningful or whose adjacent block lacks recognition results.
According to a preferred embodiment of the present invention, step S2 further includes: determining the confidence of each block's recognition results according to the similarity between the recognition result and the block in the image;
in step S43, the semantic confidence of a sentence is obtained by summing the confidences of the recognition results in the sentence, where the confidences of word-forming recognition results are raised in the summation.
According to a preferred embodiment of the present invention, step S43 specifically includes: selecting the n1 sentences ranked highest by semantic confidence, n1 being a preset positive integer, matching the selected sentences against the sentence database, and determining the matching confidence of each sentence according to the matching situation.
According to a preferred embodiment of the present invention, the matching confidence Cm of a sentence i is determined in step S43 using the following formula:
Cm = Ni × α × Pi
where Ni is the number of characters in sentence i, α is a preset coefficient, and Pi is the ratio of the longest continuous run of characters matched between sentence i and a sentence L to the total number of characters in sentence L, sentence L being the matched sentence for sentence i in the sentence database.
According to a preferred embodiment of the present invention, the method further includes:
S5, searching with the corrected recognition result, determining the network document that best matches the corrected recognition result, and intercepting from that network document the network text content matching the corrected recognition result as an extended recognition result.
According to a preferred embodiment of the present invention, intercepting from the network document the network text content matching the corrected recognition result as the extended recognition result is:
intercepting from the network document the minimal sentence or minimal paragraph containing the corrected recognition result as the extended recognition result.
An apparatus for image text recognition, the apparatus including:
a region acquisition unit, for obtaining the text region in an image to be recognized;
a text recognition unit, for recognizing each character block in the text region;
a position recording unit, for recording the position information of each block;
a layout analysis unit, for performing layout analysis based on the position information of each block to obtain a sentence-structure distribution;
a semantic analysis unit, for correcting the recognition result of each block through semantic analysis based on the sentence-structure distribution, to obtain the corrected recognition result.
According to a preferred embodiment of the present invention, the region acquisition unit receives the image to be recognized sent by a mobile terminal and extracts the text region from the image to be recognized; or receives a text region that the mobile terminal has extracted from the image to be recognized and sent.
According to a preferred embodiment of the present invention, the layout analysis unit is specifically configured to:
using the coordinates of the block centers in the image to be recognized, take blocks whose vertical positions differ by less than a preset first threshold as one horizontal text line; or
take blocks whose horizontal positions differ by less than a preset second threshold as one vertical text line; or
take blocks whose vertical positions differ by less than the preset first threshold and whose sizes differ by less than a preset size threshold as one horizontal text line; or
take blocks whose horizontal positions differ by less than the preset second threshold and whose sizes differ by less than the preset size threshold as one vertical text line.
According to a preferred embodiment of the present invention, the semantic analysis unit specifically includes:
a lexicon matching subunit, for matching the recognition results of the blocks in a text line against a word lexicon to obtain the recognition results that form words;
a sentence determination subunit, for combining the word-forming recognition results and the non-word-forming recognition results in block order to obtain candidate sentences;
a semantic confidence determination subunit, for determining the semantic confidence of each sentence;
a matching confidence determination subunit, for matching each sentence against a sentence database and determining the matching confidence of each sentence according to the matching situation;
a correction subunit, for combining the semantic confidence and the matching confidence of each sentence to determine its total confidence, and selecting the sentence with the highest total confidence as the corrected recognition result.
According to a preferred embodiment of the present invention, the lexicon matching subunit is further configured to delete, from the recognition results of the non-leading blocks in a text line, any recognition result that cannot form a word with a recognition result of an adjacent block, except recognition results that are independently meaningful or whose adjacent block lacks recognition results.
According to a preferred embodiment of the present invention, the text recognition unit is further configured to determine the confidence of each block's recognition results according to the similarity between the recognition result and the block in the image;
the semantic confidence determination subunit is specifically configured to obtain the semantic confidence of a sentence by summing the confidences of the recognition results in the sentence, where the confidences of word-forming recognition results are raised in the summation.
According to a preferred embodiment of the present invention, the matching confidence determination subunit is specifically configured to: select the n1 sentences ranked highest by semantic confidence, n1 being a preset positive integer, match the selected sentences against the sentence database, and determine the matching confidence of each sentence according to the matching situation.
According to a preferred embodiment of the present invention, the matching confidence determination subunit determines the matching confidence Cm of a sentence i using the following formula:
Cm = Ni × α × Pi
where Ni is the number of characters in sentence i, α is a preset coefficient, and Pi is the ratio of the longest continuous run of characters matched between sentence i and a sentence L to the total number of characters in sentence L, sentence L being the matched sentence for sentence i in the sentence database.
According to a preferred embodiment of the present invention, the apparatus further includes: a network extension unit, for searching with the corrected recognition result, determining the network document that best matches the corrected recognition result, and intercepting from that network document the network text content matching the corrected recognition result as an extended recognition result.
According to a preferred embodiment of the present invention, when performing the interception, the network extension unit specifically intercepts from the network document the minimal sentence or minimal paragraph containing the corrected recognition result as the extended recognition result.
As can be seen from the above technical solutions, the present invention obtains a sentence-structure distribution through layout analysis and corrects the recognition result of each block through semantic analysis based on that distribution, thereby making effective use of the semantic information between words to revise the recognition results, improving the precision of image text recognition and better meeting users' recognition needs.
【Description of the drawings】
Fig. 1 is a flow chart of the image text recognition method provided by Embodiment 1 of the present invention;
Fig. 2 is an example of a text region provided by Embodiment 1 of the present invention;
Fig. 3 is a schematic diagram of the semantic-analysis correction process provided by Embodiment 1 of the present invention;
Fig. 4 is a schematic diagram of the system provided by an embodiment of the present invention;
Fig. 5 is a structural diagram of the image text recognition apparatus provided by Embodiment 2 of the present invention;
Fig. 6 is a structural diagram of the semantic analysis unit provided by Embodiment 2 of the present invention.
【Detailed implementation】
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in detail below with reference to the drawings and specific embodiments.
Embodiment 1
Fig. 1 is a flow chart of the image text recognition method provided by an embodiment of the present invention. As shown in Fig. 1, the method may include the following steps:
Step 101: obtain the text region in the image to be recognized.
The server obtains an image containing text information sent by a mobile terminal. The image may be the original image captured by the mobile terminal, in which case the server extracts the text region from the image to be recognized in this step. Alternatively, after capturing the original image, the mobile terminal may itself extract the text region from the image to be recognized and send the text region to the server.
An existing method may be used to extract the text region: the image background is removed and the text region is extracted. The following methods may be used, though the extraction is not limited to them:
Method 1: first perform color run-length coding according to color Euclidean distance, then perform color clustering; based on the clustering result, generate and select character layers, e.g. retaining connected domains whose area exceeds a certain value; generate the image levels based on the Euclidean distances between the connected domains and each color cluster center; finally, determine the text level, noise level and background level according to the relationship between the number of pixels in each image level and that level's segmentation-threshold pixel count; removing the noise level and background level then yields the text level, i.e. the text region.
Method 2: select a large number of sample images containing text and pictures without text, and use the canny operator to extract the edge information of these two classes of pictures as training samples for a sparse-representation classification dictionary; input the two classes of training samples into the classification-dictionary training algorithm to obtain a text sparse-representation classification dictionary and a non-text sparse-representation classification dictionary; convert the image to be recognized into a grayscale image and use the canny operator to extract its edge information; extract candidate text regions from the grayscale edge information using sparse representation based on the classification dictionaries; use a run-length smoothing algorithm in the horizontal and vertical directions to connect the isolated edges of the candidate text regions into larger regions, then perform projection analysis to find the corresponding text lines, discarding the isolated edges of the candidate text regions that fall outside the text lines; the detected text region is then recognized.
If the mobile terminal performs the text-region extraction, existing text-region extraction software or a manual method may be used.
In addition, one, two or more text regions may be obtained in this step. Since the content of this step is prior art, it is not described in detail here.
Step 102: recognize each character block in the text region and record the position information of each block.
The recognition of each block in the text region may likewise follow the prior art and include the following steps: binarize the text region; segment the binarized text region into blocks; extract the feature information of each block and match it against a character feature database, taking the matching results as the recognition results of each block. The specific implementation is not repeated here.
The recorded position information of a block may be the coordinates of the block center in the image, and may further include the size of the block, etc.
It should be noted that a block may have more than one recognition result; that is, for a given block, the recognition results are usually those whose similarity with the block in the image meets a preset requirement, and each recognition result has a confidence according to its similarity. Taking the blocks in the image shown in Fig. 2 as an example:
the recognition results of the first block are: In (44), in (44);
the recognition results of the second block are: State (32), Enclose (31), Ware (29);
the recognition results of the third block are: Nosebleed (41), Bright (40), Postal (39);
the recognition results of the fourth block are: Political (67), Attack (48), Change (46).
The numbers in brackets are the confidences of the recognition results.
Step 103: perform layout analysis based on the position information of each block to obtain the sentence-structure distribution.
In this step, using the coordinates of the block centers in the image, blocks whose vertical positions differ by less than a preset first threshold (i.e. that lie approximately on one horizontal line) are taken as one horizontal text line; or blocks whose horizontal positions differ by less than a preset second threshold (i.e. that are approximately arranged in one vertical column) are taken as one vertical text line. Whether horizontal or vertical text lines are taken depends on the layout of the text in the image: for laterally written text, horizontal text lines are taken in this step; for longitudinally written text, vertical text lines are taken. This can be set in advance.
More preferably, the size information of the blocks may further be combined: blocks whose vertical positions differ by less than the preset first threshold and whose sizes differ by less than a preset size threshold are taken as one horizontal text line; or blocks whose horizontal positions differ by less than the preset second threshold and whose sizes differ by less than the preset size threshold are taken as one vertical text line.
The first and second thresholds may be set based on empirical values and adjusted as the case may be. For example, for the text region shown in Fig. 2, since the vertical position gaps of the four blocks are within the preset first threshold, the four blocks are connected into one text line.
In specific implementations of the present invention, obtaining the sentence-structure distribution in this step is not limited to the threshold-based method set out above; oblique text lines that are neither horizontal nor vertical may also be extracted by other mechanisms.
Step 104: correct the recognition result of each block through semantic analysis based on the sentence-structure distribution, to obtain the corrected recognition result.
The recognition results of the blocks belonging to the same text line are corrected by semantic analysis. During the correction, the recognition results of the blocks are first matched in order against a word lexicon to determine which word combinations the recognition results of the blocks of the text line can form; then the semantic confidence of each sentence that the recognition results of the text line can be combined into is determined, and the sentence with the highest semantic confidence is selected as the corrected recognition result. Alternatively, each sentence is further matched against a sentence database, the matching confidence of each sentence is determined according to the matching situation, the semantic confidence and the matching confidence are combined to determine the total confidence of each sentence, and the sentence with the highest total confidence is selected as the corrected recognition result.
The correction process based on semantic analysis is described in detail below with reference to Fig. 3. As shown in Fig. 3, the correction process may include the following steps:
Step 301: match the recognition results of the blocks in the text line against the word lexicon, and delete, from the recognition results of the non-leading blocks in the text line, any recognition result that cannot form a word with a recognition result of an adjacent block, except recognition results that are independently meaningful or whose adjacent block lacks recognition results.
It should be noted that deleting the recognition results of non-leading blocks that cannot form a word with a recognition result of an adjacent block is a step executed to improve the efficiency of the subsequent sentence confidence calculation and selection; it is not essential.
Step 302: combine the word-forming recognition results and the non-word-forming recognition results in block order to obtain the candidate sentences, and determine the semantic confidence of each sentence.
This step determines all the sentences the text line could possibly be recognized as: the word-forming recognition results and the non-word-forming recognition results are combined according to the order of the blocks, obtaining all possible sentences.
Continuing with the recognition results of Fig. 2: in the recognition results of the second block, "Enclose" and "Ware", as well as "Nosebleed" in the recognition results of the third block, cannot form a word with any recognition result of an adjacent block and are not independently meaningful either, so they can be deleted in step 301.
Matching the remaining recognition results against the word lexicon yields the words: China, Post, Bright-attack, Open-policy.
All possible sentences are generated:
In State Bright-attack
In State Bright Change
In State Open-policy
In State Post
In State Postal Change
In State Postal Attack
China Bright Change
China Open-policy
China Bright-attack
China Post
China Postal Attack
China Postal Change
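The block-order combination of step 302 can be sketched as a product over the pruned per-block candidates. This is a hypothetical sketch using the glosses and confidences of the Fig. 2 example; for brevity it enumerates only the surface reading sequences, whereas the example above additionally distinguishes whether adjacent readings are merged into lexicon words.

```python
from itertools import product

# pruned candidate readings per block after step 301 (assumed input format)
candidates = [
    [("In", 44)],
    [("State", 32)],
    [("Bright", 40), ("Postal", 39)],
    [("Political", 67), ("Attack", 48), ("Change", 46)],
]

# every choice of one reading per block, kept in block order
sequences = [list(combo) for combo in product(*candidates)]
for seq in sequences:
    print(" ".join(gloss for gloss, _conf in seq))
print(len(sequences))  # 1 * 1 * 2 * 3 = 6 reading sequences
```

Each of the six sequences is then segmented against the word lexicon (e.g. In + State merged into "China"), which is what expands the example's candidate list to the twelve sentences above.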
When determining the semantic confidence of each sentence, the confidences of the recognition results in the sentence may be summed, where the confidences of word-forming recognition results may be raised, e.g. doubled; the summed confidence is taken as the semantic confidence of the sentence.
Continuing the example, the semantic confidence of each sentence is as follows, where the numbers in round brackets are the confidences of the recognition results and the numbers in square brackets are the semantic confidences of the sentences:
In(44) State(32) Bright-attack(40×2+48×2)【252】
In(44) State(32) Bright(40) Change(46)【162】
In(44) State(32) Open-policy(40×2+67×2)【290】
In(44) State(32) Post(39×2+67×2)【288】
In(44) State(32) Postal(39) Change(46)【161】
In(44) State(32) Postal(39) Attack(48)【163】
China(44×2+32×2) Bright(40) Change(46)【238】
China(44×2+32×2) Open-policy(40×2+67×2)【366】
China(44×2+32×2) Bright-attack(40×2+48×2)【328】
China(44×2+32×2) Post(39×2+67×2)【364】
China(44×2+32×2) Postal(39) Attack(48)【239】
China(44×2+32×2) Postal(39) Change(46)【237】
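The semantic-confidence computation can be sketched as follows. This is an illustrative Python sketch; the doubling factor for word constituents follows the example above, and the nested-tuple input format is an assumption of the sketch.

```python
def semantic_confidence(parts):
    """Sum the recognition-result confidences over a sentence, doubling
    the confidence of results that form lexicon words."""
    total = 0
    for confidences, forms_word in parts:
        weight = 2 if forms_word else 1
        total += weight * sum(confidences)
    return total

# "China Open-policy": two lexicon words -> (44+32)*2 + (40+67)*2 = 366
print(semantic_confidence([((44, 32), True), ((40, 67), True)]))
# "In State Postal Change": no words -> 44 + 32 + 39 + 46 = 161
print(semantic_confidence([((44,), False), ((32,), False),
                           ((39,), False), ((46,), False)]))
```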
Step 303: select the n1 sentences ranked highest by semantic confidence, n1 being a preset positive integer, match the selected sentences against the sentence database, and determine the matching confidence of each sentence according to the matching situation.
In this step, only the several sentences with the highest semantic confidence may be matched against the sentence database, which is more efficient; of course, all sentences may also be matched against the sentence database and the matching confidences of all sentences determined.
Continuing the example, suppose the top 3 sentences by semantic confidence are selected, i.e. "China Open-policy", "China Post" and "China Bright-attack".
The sentence database used for the matching contains common sentences. The higher the degree of match between a selected sentence and a common sentence, the more likely that sentence is the correct recognition result; therefore, the matching confidence here is determined according to the matching situation with the sentence database. The matching situation may be embodied in information such as the number of characters of the sentence itself, the longest continuous character match between the sentence and its matched sentence in the sentence database, and the character-count ratio of that matched sentence.
Suppose the matching confidence Cm of a sentence i is determined using the following formula:
Cm = Ni × α × Pi  (1)
where Ni is the number of characters in sentence i, α is a preset coefficient, e.g. 100, and Pi is the ratio of the longest continuous run of characters matched between sentence i and a sentence L to the total number of characters in sentence L, where sentence L is the matched sentence for sentence i in the sentence database. That is, when a selected sentence is matched against the sentence database, sentence L is obtained first; sentence L may be a sentence that matches sentence i completely, or the sentence that matches sentence i on the most characters. In other words, the sentence in the database whose character match with sentence i reaches a certain degree and whose degree of match is greatest is chosen as the matched sentence, i.e. sentence L. In specific implementations of the present invention, sentence L may also be obtained by other sentence-matching strategies.
For example, the sentence "China Open-policy" (four characters) is matched to "China Unicom open-policy cooperation hall" (nine characters) in the sentence database, whose longest continuous match with the sentence is two characters; according to formula (1), the matching confidence is: Cm = 4 × 100 × 2/9 ≈ 88.
The sentence "China Post" (four characters) is matched to "China Post" in the sentence database; according to formula (1), the matching confidence is: Cm = 4 × 100 × 4/4 = 400.
The sentence "China Bright-attack" finds no match in the sentence database, so its matching confidence is 0.
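Formula (1) can be sketched as follows. This is a hypothetical Python sketch: ASCII strings stand in for the Chinese characters of the example, and truncating the result to an integer is an assumption made so the sketch reproduces the example values.

```python
def matching_confidence(sentence, match, alpha=100):
    """Cm = Ni * alpha * Pi: Ni is the character count of sentence i, and
    Pi is the ratio of the longest continuous run of sentence i's
    characters found in matched sentence L to L's character count."""
    longest = 0
    for start in range(len(sentence)):
        for end in range(start + 1, len(sentence) + 1):
            if sentence[start:end] in match:
                longest = max(longest, end - start)
    return int(len(sentence) * alpha * longest / len(match))

# a 4-character sentence vs. a 9-character match sharing a 2-character
# run: 4 * 100 * 2/9 ~= 88, as for "China Open-policy" above
print(matching_confidence("ABCD", "XACDWQZKJ"))
# a complete 4-of-4 match gives 4 * 100 * 1 = 400, as for "China Post"
print(matching_confidence("WXYZ", "WXYZ"))
```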
Step 304: combine the semantic confidence and the matching confidence of each sentence to determine the total confidence of each sentence, and select the sentence with the highest total confidence as the corrected recognition result.
Suppose in this step the sum of the semantic confidence and the matching confidence is taken as the total confidence of each sentence; then, continuing the example:
the total confidence of "China Open-policy" is: 366 + 88 = 454;
the total confidence of "China Post" is: 364 + 400 = 764;
the total confidence of "China Bright-attack" is: 328 + 0 = 328.
"China Post" is finally selected as the corrected recognition result.
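The combination in step 304 then reduces to a sum and an argmax, sketched here over the example's values:

```python
# (semantic confidence, matching confidence) per candidate sentence
candidates = {
    "China Open-policy": (366, 88),
    "China Post": (364, 400),
    "China Bright-attack": (328, 0),
}
totals = {s: sem + match for s, (sem, match) in candidates.items()}
best = max(totals, key=totals.get)
print(totals["China Post"])  # 764
print(best)                  # China Post
```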
After having carried out semantic analysis correction, have been able to ensure that recognition result has certain accuracy, server can
It is shown so that the recognition result after correction is returned to mobile terminal, but in order to further improve accuracy of identification, Ke Yijie
The mode for closing web search is extended, you can further to execute step 105.
With continued reference to Fig. 1, step 105: determine the network documentation that best matches the corrected recognition result, and intercept from that network documentation the network character content matching the corrected recognition result as the extended recognition result.
The corrected recognition result is searched for on the network, the matching state between each document found and the corrected recognition result is computed, and the document with the best matching state is determined. The "best matching state" can be the largest number of matched characters, the largest ratio of matched characters to the character count of the network character content, or the like.
When intercepting network character content, the network character content containing the corrected recognition result is intercepted from the determined network documentation. Specifically, based on punctuation or carriage returns in the network documentation, the minimum sentence or minimum paragraph containing the corrected recognition result can be intercepted as the extended recognition result. The server can then return the extended recognition result to the mobile terminal for display.
Continuing the example above, "China Post" is searched for on the network to obtain the network character content with the most characters matching the current recognition result, and the minimum sentence containing the current recognition result, "China Post Group Corporation", is intercepted as the extended recognition result.
Either the corrected recognition result or the extended recognition result can be selected for display, or both can be displayed, for example as "China Post - China Post Group Corporation".
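The interception described for step 105 can be sketched as below. The punctuation set used for splitting and the English sample document are illustrative assumptions; the patent only requires punctuation or carriage returns as boundaries:

```python
import re

def intercept_minimum_sentence(document, result):
    """Return the shortest punctuation-delimited sentence containing result, or None."""
    # Split on sentence-ending punctuation (Chinese and Western) or line breaks.
    sentences = re.split(r"[。！？!?.;；\n]", document)
    containing = [s.strip() for s in sentences if result in s]
    if not containing:
        return None
    # The shortest containing sentence is the "minimum sentence".
    return min(containing, key=len)

doc = ("China Post Group Corporation is a state-owned enterprise.\n"
       "It operates nationwide postal services.")
snippet = intercept_minimum_sentence(doc, "China Post")
# snippet is the minimum sentence containing "China Post"
```

Splitting on paragraph boundaries instead (e.g. `"\n\n"`) would yield the "minimum paragraph" variant.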
The above is a detailed description of the method provided by the present invention; the apparatus provided by the present invention is described in detail below with reference to Embodiment Two. The apparatus is arranged on a server and is mainly used in the system architecture shown in Fig. 4, which consists of a mobile terminal and a server. The mobile terminal can send a captured image containing text to the server as the image to be recognized, from which the server extracts the character area; alternatively, the mobile terminal takes the captured image containing text as the image to be recognized, extracts the character area from it, and sends the character area to the server. The server then executes the flow shown in Embodiment One, and finally returns one or both of the recognition result corrected by semantic analysis and the recognition result extended via the network to the mobile terminal.
Embodiment Two
Fig. 5 is a structural diagram of the image recognition apparatus provided by Embodiment Two of the present invention. As shown in Fig. 5, the apparatus includes: an area acquisition unit 500, a word recognition unit 510, a position recording unit 520, a printed page analysis unit 530, and a semantic analysis unit 540.
The area acquisition unit 500 first obtains the character area in the image to be recognized. The area acquisition unit 500 can receive the image to be recognized sent by the mobile terminal and extract the character area from it; alternatively, it can receive the character area that the mobile terminal extracted from the image to be recognized and sent.
The word recognition unit 510 identifies each block in the character area respectively, and an existing recognition method may be used, specifically including: binarizing the character area; segmenting the binarized character area into blocks; extracting the characteristic information of each block and matching it against a characteristic database, taking the matching result as the recognition result of each block.
The position recording unit 520 records the location information of each block. The recorded location information can be the coordinate information of the block center in the image, and can further include the size information of the block, etc.
The printed page analysis unit 530 carries out printed page analysis based on the location information of each block to obtain the sentence structure distribution. The printed page analysis unit 530 can be specifically configured to:
using the coordinate information of block centers in the image to be recognized, take blocks whose vertical position gap is less than a preset first threshold as one text line in the horizontal direction; alternatively,
using the coordinate information of block centers in the image to be recognized, take blocks whose horizontal position gap is less than a preset second threshold as one text line in the vertical direction; alternatively,
using the coordinate information of block centers in the image to be recognized, take blocks whose vertical position gap is less than the preset first threshold and whose size difference is less than a preset size threshold as one text line in the horizontal direction; alternatively,
using the coordinate information of block centers in the image to be recognized, take blocks whose horizontal position gap is less than the preset second threshold and whose size difference is less than the preset size threshold as one text line in the vertical direction.
Whether text lines in the horizontal direction or in the vertical direction are taken depends on the layout of the text in the image: if the text is written horizontally, horizontal text lines are taken in this step; if it is written vertically, vertical text lines are taken. This can be set in advance. The first threshold and second threshold above can be set based on empirical values, and the values can be adjusted as the case may be.
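As a rough sketch, the first of the four grouping rules above (horizontal text lines from a vertical-gap threshold, without the size check) might look like this; the block representation and the threshold value are illustrative assumptions:

```python
def group_into_horizontal_lines(blocks, first_threshold=10):
    """Group blocks into horizontal text lines by vertical center distance.

    blocks: list of dicts with block-center coordinates "cx", "cy".
    """
    lines = []
    # Process blocks top-to-bottom, then left-to-right.
    for block in sorted(blocks, key=lambda b: (b["cy"], b["cx"])):
        for line in lines:
            # Same line when the vertical position gap is below the first threshold.
            if abs(line[-1]["cy"] - block["cy"]) < first_threshold:
                line.append(block)
                break
        else:
            lines.append([block])
    # Order blocks within each line left-to-right.
    for line in lines:
        line.sort(key=lambda b: b["cx"])
    return lines

blocks = [{"cx": 40, "cy": 50}, {"cx": 10, "cy": 52}, {"cx": 12, "cy": 90}]
lines = group_into_horizontal_lines(blocks)
# Two text lines: the blocks near cy = 50 are grouped; the block at cy = 90 stands alone.
```

The vertical-line variant swaps the roles of `cx` and `cy`, and the size-checked variants add a comparison of block sizes before appending.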
In specific implementations of the present invention, the implementation of the printed page analysis unit is not limited to the threshold-based judgment methods set out above; oblique text lines in directions that are neither horizontal nor vertical can also be extracted by other mechanisms.
The semantic analysis unit 540 carries out correction based on semantic analysis on the recognition results of the blocks according to the sentence structure distribution, obtaining the corrected recognition result.
The structure of the semantic analysis unit 540 is described in detail below. As shown in Fig. 6, the semantic analysis unit 540 can specifically include: a dictionary coupling subelement 541, a sentence determination subelement 542, a semantic confidence degree determination subelement 543, a matching confidence level determination subelement 544, and a correction subelement 545.
The dictionary coupling subelement 541 matches the recognition results of the blocks in a text line against the word library, obtaining the recognition results that constitute words.
Preferably, the dictionary coupling subelement 541 can also delete, from the recognition results of non-first blocks in a text line, those recognition results that cannot form a word with the recognition results of adjacent blocks, except recognition results that are independently meaningful or whose adjacent block's recognition result is missing.
The sentence determination subelement 542 combines the recognition results that constitute words and the recognition results that do not constitute words in block order to obtain each sentence.
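The combination performed by the sentence determination subelement 542 can be sketched as a Cartesian product over per-block candidate lists taken in block order; the candidate sets below are invented for illustration:

```python
from itertools import product

def build_candidate_sentences(per_block_candidates):
    """per_block_candidates: candidate recognition results per block, in block order."""
    # Every combination of one candidate per block, concatenated in block order.
    return ["".join(combo) for combo in product(*per_block_candidates)]

# Two blocks, each with two plausible readings (hypothetical example):
candidates = build_candidate_sentences([["post", "past"], ["office", "offense"]])
# Yields "postoffice", "postoffense", "pastoffice", "pastoffense".
```

In practice the candidate lists would mix word-forming results kept by the dictionary matching step with raw per-block results.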
The semantic confidence degree determination subelement 543 determines the semantic confidence degree of each sentence. The semantic confidence degree is determined based on the confidence level of each recognition result; in this case, the word recognition unit 510 shown in Fig. 5 also determines the confidence level of each block's recognition result according to the similarity between the recognition result and the block in the image. The semantic confidence degree determination subelement 543 then sums the confidence levels of the recognition results in a sentence to obtain the sentence's semantic confidence degree, where the confidence levels of recognition results constituting words are increased during summation.
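A minimal sketch of this summation, assuming the "increase" for word-forming results is a multiplicative boost; the factor 1.2 is an illustrative choice, not specified by the patent:

```python
def semantic_confidence(results, boost=1.2):
    """results: list of (confidence, forms_word) pairs for one candidate sentence."""
    total = 0.0
    for confidence, forms_word in results:
        # Boost the confidence of recognition results that constitute words.
        total += confidence * boost if forms_word else confidence
    return total

# Three recognition results; the first two form a dictionary word:
score = semantic_confidence([(90, True), (85, True), (70, False)])
# 90 * 1.2 + 85 * 1.2 + 70 = 280.0
```

An additive bonus per word-forming result would be an equally valid reading of "increased during summation".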
The matching confidence level determination subelement 544 matches each sentence against the sentence database and determines each sentence's matching confidence level according to the matching state.
To improve the efficiency of the matching confidence calculation, the matching confidence level determination subelement 544 can select the top n1 sentences by semantic confidence degree, n1 being a preset positive integer, match the selected sentences against the sentence database, and determine each sentence's matching confidence level according to the matching state.
The sentence database used for matching contains common sentences. The higher the matching degree between a selected sentence and a common sentence, the greater the possibility that it is the correct recognition result; therefore, the matching confidence level here is determined according to the matching state with the sentence database. The so-called matching state can be embodied in information such as the character count of the sentence itself, and the ratio of the maximum continuous matching characters between the sentence and its match statement in the sentence database to the character count of that match statement.
Specifically, the following formula may be used to determine the matching confidence level Cm of sentence i:

Cm = Ni × α × Pi

where Ni is the number of characters that sentence i contains, α is a preset coefficient, and Pi is the ratio of the maximum number of continuous matching characters between sentence i and sentence L to the total character count of sentence L. Sentence L is the match statement of sentence i in the sentence database; that is, when the selected sentences are matched against the sentence database, sentence L is obtained first. Sentence L can be a sentence that matches sentence i completely, or the sentence that matches sentence i on the most characters; in other words, the sentence in the sentence database whose character matching degree with sentence i reaches a certain level and whose matching degree is greatest can be selected as the match statement, i.e., sentence L.
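A hedged sketch of the formula Cm = Ni × α × Pi, taking the longest common substring as the "maximum continuous matching" characters and α = 100 as an illustrative coefficient (the patent leaves α as a preset value):

```python
def longest_common_substring_len(a, b):
    """Length of the longest continuous run of characters shared by a and b."""
    best = 0
    for i in range(len(a)):
        for j in range(i + 1, len(a) + 1):
            if a[i:j] in b:
                best = max(best, j - i)
    return best

def matching_confidence(sentence_i, sentence_l, alpha=100):
    """Cm = Ni * alpha * Pi for sentence i against its match statement L."""
    n_i = len(sentence_i)
    p_i = longest_common_substring_len(sentence_i, sentence_l) / len(sentence_l)
    return n_i * alpha * p_i

# An exact 4-character match scores Ni * alpha = 400:
score = matching_confidence("ABCD", "ABCD")
```

With α = 100, an exact four-character match scores 400, which is consistent with the "China Post" example above.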
The correction subelement 545 combines the semantic confidence degree and matching confidence level of each sentence to determine each sentence's total confidence level, and selects the sentence with the highest total confidence level as the corrected recognition result.
With continued reference to Fig. 5, to further improve recognition accuracy, the apparatus can also include: a network expanding element 550, configured to search using the corrected recognition result, determine the network documentation that best matches the corrected recognition result, and intercept from that network documentation the network character content matching the corrected recognition result as the extended recognition result.
Specifically, when executing the interception operation, the minimum sentence or minimum paragraph containing the corrected recognition result can be intercepted from the network documentation as the extended recognition result.
Since the apparatus is arranged in the server, the server can return either the corrected recognition result or the extended recognition result obtained by the above apparatus to the mobile terminal for display, or return both.
As can be seen from the above description, the method and apparatus provided by the present invention have the following advantages:
1) The semantic information between words is efficiently utilized to correct the recognition results of the blocks, improving the precision of pictograph identification and better meeting users' identification demands.
2) The large volume of network character resources on the Internet is fully utilized to extend the recognition result, further mining user intent and better serving users' demands.
The foregoing is merely preferred embodiments of the present invention and is not intended to limit the present invention. Any modification, equivalent substitution, improvement, etc. made within the spirit and principle of the present invention shall be included within the scope of protection of the present invention.
Claims (19)
1. A method of pictograph identification, characterized in that the method includes:
S1, obtaining the character area in an image to be recognized;
S2, identifying each block in the character area respectively and recording the location information of each block;
S3, carrying out printed page analysis based on the location information of each block to obtain the sentence structure distribution;
S41, matching the recognition results of the blocks in a text line against a word library to obtain the recognition results constituting words;
S42, combining the recognition results constituting words and the recognition results not constituting words in block order to obtain each sentence;
S43, determining the semantic confidence degree of each sentence, or further matching each sentence against a sentence database and determining the matching confidence level of each sentence according to the matching state;
S44, selecting a sentence as the corrected recognition result according to the semantic confidence degree or the total confidence level of the sentences, wherein a sentence's total confidence level is determined by combining its semantic confidence degree and matching confidence level.
2. The method according to claim 1, characterized in that S1 specifically includes:
the server receiving the image to be recognized sent by the mobile terminal and extracting the character area from the image to be recognized; alternatively,
the server receiving the character area extracted from the image to be recognized and sent by the mobile terminal.
3. The method according to claim 1, characterized in that S3 specifically includes:
using the coordinate information of block centers in the image to be recognized, taking blocks whose vertical position gap is less than a preset first threshold as one text line in the horizontal direction; alternatively,
using the coordinate information of block centers in the image to be recognized, taking blocks whose horizontal position gap is less than a preset second threshold as one text line in the vertical direction; alternatively,
using the coordinate information of block centers in the image to be recognized, taking blocks whose vertical position gap is less than the preset first threshold and whose size difference is less than a preset size threshold as one text line in the horizontal direction; alternatively,
using the coordinate information of block centers in the image to be recognized, taking blocks whose horizontal position gap is less than the preset second threshold and whose size difference is less than the preset size threshold as one text line in the vertical direction.
4. The method according to claim 1, characterized in that S41 further includes: deleting, from the recognition results of non-first blocks in a text line, the recognition results that cannot form a word with the recognition results of adjacent blocks, except recognition results that are independently meaningful or whose adjacent block's recognition result is missing.
5. The method according to claim 1, characterized in that S2 further includes: determining the confidence level of each block's recognition result according to the similarity between the recognition result and the block in the picture;
and in S43, summing the confidence levels of the recognition results in a sentence to obtain the sentence's semantic confidence degree, wherein the confidence levels of recognition results constituting words are increased during summation.
6. The method according to claim 1, characterized in that, in S43, matching each sentence against the sentence database and determining the matching confidence level of each sentence according to the matching state specifically includes: selecting the top n1 sentences by semantic confidence degree, n1 being a preset positive integer, matching the selected sentences against the sentence database, and determining the matching confidence level of each sentence according to the matching state.
7. The method according to claim 1, characterized in that, in S43, the matching confidence level Cm of sentence i is determined using the following formula:

Cm = Ni × α × Pi

where Ni is the number of characters that sentence i contains, α is a preset coefficient, and Pi is the ratio of the maximum number of continuous matching characters between sentence i and sentence L to the total character count of sentence L, the sentence L being the match statement of sentence i in the sentence database.
8. The method according to claim 1, characterized in that the method further includes:
S5, searching using the corrected recognition result, determining the network documentation that best matches the corrected recognition result, and intercepting from that network documentation the network character content matching the corrected recognition result as the extended recognition result.
9. The method according to claim 8, characterized in that intercepting from the network documentation the network character content matching the corrected recognition result as the extended recognition result is:
intercepting from the network documentation the minimum sentence or minimum paragraph containing the corrected recognition result as the extended recognition result.
10. An apparatus for image recognition, characterized in that the apparatus includes:
an area acquisition unit, configured to obtain the character area in an image to be recognized;
a word recognition unit, configured to identify each block in the character area respectively;
a position recording unit, configured to record the location information of each block;
a printed page analysis unit, configured to carry out printed page analysis based on the location information of each block to obtain the sentence structure distribution;
a semantic analysis unit, configured to match the recognition results of the blocks in a text line against a word library to obtain the recognition results constituting words; combine the recognition results constituting words and the recognition results not constituting words in block order to obtain each sentence; determine the semantic confidence degree of each sentence, or further match each sentence against a sentence database and determine the matching confidence level of each sentence according to the matching state; and select a sentence as the corrected recognition result according to the semantic confidence degree or the total confidence level of the sentences, wherein a sentence's total confidence level is determined by combining its semantic confidence degree and matching confidence level.
11. The apparatus according to claim 10, characterized in that the area acquisition unit receives the image to be recognized sent by the mobile terminal and extracts the character area from the image to be recognized; alternatively, it receives the character area extracted from the image to be recognized and sent by the mobile terminal.
12. The apparatus according to claim 10, characterized in that the printed page analysis unit is specifically configured to:
using the coordinate information of block centers in the image to be recognized, take blocks whose vertical position gap is less than a preset first threshold as one text line in the horizontal direction; alternatively,
using the coordinate information of block centers in the image to be recognized, take blocks whose horizontal position gap is less than a preset second threshold as one text line in the vertical direction; alternatively,
using the coordinate information of block centers in the image to be recognized, take blocks whose vertical position gap is less than the preset first threshold and whose size difference is less than a preset size threshold as one text line in the horizontal direction; alternatively,
using the coordinate information of block centers in the image to be recognized, take blocks whose horizontal position gap is less than the preset second threshold and whose size difference is less than the preset size threshold as one text line in the vertical direction.
13. The apparatus according to claim 10 or 12, characterized in that the semantic analysis unit specifically includes:
a dictionary coupling subelement, configured to match the recognition results of the blocks in a text line against the word library to obtain the recognition results constituting words;
a sentence determination subelement, configured to combine the recognition results constituting words and the recognition results not constituting words in block order to obtain each sentence;
a semantic confidence degree determination subelement, configured to determine the semantic confidence degree of each sentence;
a matching confidence level determination subelement, configured to match each sentence against the sentence database and determine the matching confidence level of each sentence according to the matching state;
a correction subelement, configured to combine the semantic confidence degree and matching confidence level of each sentence to determine each sentence's total confidence level, and select the sentence with the highest total confidence level as the corrected recognition result.
14. The apparatus according to claim 13, characterized in that the dictionary coupling subelement is further configured to delete, from the recognition results of non-first blocks in a text line, the recognition results that cannot form a word with the recognition results of adjacent blocks, except recognition results that are independently meaningful or whose adjacent block's recognition result is missing.
15. The apparatus according to claim 13, characterized in that the word recognition unit is further configured to determine the confidence level of each block's recognition result according to the similarity between the recognition result and the block in the picture;
and the semantic confidence degree determination subelement is specifically configured to sum the confidence levels of the recognition results in a sentence to obtain the sentence's semantic confidence degree, wherein the confidence levels of recognition results constituting words are increased during summation.
16. The apparatus according to claim 13, characterized in that the matching confidence level determination subelement is specifically configured to select the top n1 sentences by semantic confidence degree, n1 being a preset positive integer, match the selected sentences against the sentence database, and determine the matching confidence level of each sentence according to the matching state.
17. The apparatus according to claim 13, characterized in that the matching confidence level determination subelement determines the matching confidence level Cm of sentence i using the following formula:

Cm = Ni × α × Pi

where Ni is the number of characters that sentence i contains, α is a preset coefficient, and Pi is the ratio of the maximum number of continuous matching characters between sentence i and sentence L to the total character count of sentence L, the sentence L being the match statement of sentence i in the sentence database.
18. The apparatus according to claim 10, characterized in that the apparatus further includes: a network expanding element, configured to search using the corrected recognition result, determine the network documentation that best matches the corrected recognition result, and intercept from that network documentation the network character content matching the corrected recognition result as the extended recognition result.
19. The apparatus according to claim 18, characterized in that, when executing the interception operation, the network expanding element specifically intercepts from the network documentation the minimum sentence or minimum paragraph containing the corrected recognition result as the extended recognition result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210279370.4A CN103577818B (en) | 2012-08-07 | 2012-08-07 | A kind of method and apparatus of pictograph identification |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103577818A CN103577818A (en) | 2014-02-12 |
CN103577818B true CN103577818B (en) | 2018-09-04 |
Family
ID=50049568
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201210279370.4A Active CN103577818B (en) | 2012-08-07 | 2012-08-07 | A kind of method and apparatus of pictograph identification |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103577818B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021142765A1 (en) * | 2020-01-17 | 2021-07-22 | Microsoft Technology Licensing, Llc | Text line detection |
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104951741A (en) * | 2014-03-31 | 2015-09-30 | 阿里巴巴集团控股有限公司 | Character recognition method and device thereof |
CN104143084A (en) * | 2014-07-17 | 2014-11-12 | 武汉理工大学 | Auxiliary reading glasses for visual impairment people |
CN105574530B (en) * | 2014-10-08 | 2019-11-22 | 富士通株式会社 | The method and apparatus for extracting the line of text in document |
CN105631393A (en) * | 2014-11-06 | 2016-06-01 | 阿里巴巴集团控股有限公司 | Information recognition method and device |
CN105678207A (en) * | 2014-11-19 | 2016-06-15 | 富士通株式会社 | Device and method for identifying content of target nameplate image from given image |
CN106709489B (en) * | 2015-07-13 | 2020-03-03 | 腾讯科技(深圳)有限公司 | Character recognition processing method and device |
CN108399405B (en) * | 2017-02-07 | 2023-06-27 | 腾讯科技(上海)有限公司 | Business license identification method and device |
US10275687B2 (en) * | 2017-02-16 | 2019-04-30 | International Business Machines Corporation | Image recognition with filtering of image classification output distribution |
GB2571530B (en) * | 2018-02-28 | 2020-09-23 | Canon Europa Nv | An image processing method and an image processing system |
CN109308476B (en) * | 2018-09-06 | 2019-08-27 | 邬国锐 | Billing information processing method, system and computer readable storage medium |
CN109033798B (en) * | 2018-09-14 | 2020-07-07 | 北京金堤科技有限公司 | Click verification code identification method and device based on semantics |
CN111615702B (en) * | 2018-12-07 | 2023-10-17 | 华为云计算技术有限公司 | Method, device and equipment for extracting structured data from image |
CN109934210B (en) | 2019-05-17 | 2019-08-09 | 上海肇观电子科技有限公司 | Printed page analysis method, reading aids, circuit and medium |
CN112183513B (en) * | 2019-07-03 | 2023-09-05 | 杭州海康威视数字技术股份有限公司 | Method and device for recognizing characters in image, electronic equipment and storage medium |
CN110490190B (en) * | 2019-07-04 | 2021-10-26 | 贝壳技术有限公司 | Structured image character recognition method and system |
CN111539412B (en) * | 2020-04-21 | 2021-02-26 | 上海云从企业发展有限公司 | Image analysis method, system, device and medium based on OCR |
CN112541496B (en) * | 2020-12-24 | 2023-08-22 | 北京百度网讯科技有限公司 | Method, device, equipment and computer storage medium for extracting POI (point of interest) names |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1604073A (en) * | 2004-11-22 | 2005-04-06 | 北京北大方正技术研究院有限公司 | Method for conducting title and text logic connection for newspaper pages |
CN101447017A (en) * | 2008-11-27 | 2009-06-03 | 浙江工业大学 | Method and system for quickly identifying and counting votes on the basis of layout analysis |
CN101493896A (en) * | 2008-01-24 | 2009-07-29 | 夏普株式会社 | Document image processing apparatus and method |
CN101770576A (en) * | 2008-12-31 | 2010-07-07 | 北京新岸线网络技术有限公司 | Method and device for extracting characters |
CN102456136A (en) * | 2010-10-29 | 2012-05-16 | 方正国际软件(北京)有限公司 | Image-text splitting method and system |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2006092346A (en) * | 2004-09-24 | 2006-04-06 | Fuji Xerox Co Ltd | Equipment, method, and program for character recognition |
Non-Patent Citations (1)
Title |
---|
印刷体汉字识别系统研究与实现;刘聚宁;《中国优秀硕士学位论文全文数据库》;20110915(第9期);第3-4、11、14页 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
EXSB | Decision made by sipo to initiate substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||