CN103679165B - OCR (optical character recognition) character recognition method and system - Google Patents
- Publication number: CN103679165B
- Application number: CN201310752624.4A
- Authority: CN (China)
- Prior art keywords: word string, noise, character, ocr, less
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
The invention provides an OCR (optical character recognition) character recognition method. The method comprises: performing OCR character recognition on an image in a target area selected by a user to obtain a recognized word string; counting the sub-word strings in the recognized word string; when the number of sub-word strings is greater than 2, judging whether the number of characters in the first sub-word string W1 and the number of characters in the K-th sub-word string WK are less than a preset value; if the number of characters in W1 and/or WK is less than the preset value, judging whether the noise probability score of W1 and/or WK is greater than a preset noise threshold; and, if so, determining that W1 and/or WK is noise and deleting it from the word string to obtain a new word string. The method can improve the accuracy of translating OCR recognition results. The invention also provides an OCR character recognition system.
Description
Technical field
The present invention relates to the field of character recognition technology, and in particular to an OCR character recognition method and system.
Background
Many translation apps currently support photo translation. A typical workflow is as follows: the user points a mobile terminal (such as a smartphone) at the foreign-language text to be translated and takes a photo, and the photo is covered with a gray layer; the user slides a finger over the gray-covered photo to "wipe" out the words to be translated; OCR is performed on the region the user wiped to obtain the foreign-language text; and a machine translation module is called to translate the OCR result, which is finally presented to the user.
The whole process is shown in Fig. 1. There is, however, a problem with this process: because the finger blocks the screen while the user is "wiping" a word, neighboring words to the left or right are often "wiped" into the OCR region as well. As shown in the figure, the user wants to translate only the word "Obama", but in practice a few letters on each side are also marked, so the OCR result is "it Obama I" and the final machine translation is "Obama, I". Such a translation confuses the user and degrades the user experience.
Summary of the invention
The present invention aims to solve at least one of the above technical defects.
To this end, one object of the present invention is to propose an OCR character recognition method that can improve the accuracy of translating OCR recognition results.
Another object of the present invention is to propose an OCR character recognition system.
To achieve the above objects, an embodiment of the first aspect of the present invention discloses an OCR character recognition method comprising the following steps: performing OCR character recognition on an image in a target area selected by a user to obtain a recognized word string, wherein the word string includes K sub-word strings, each sub-word string includes at least one character, and K is a positive integer; counting the sub-word strings in the recognized word string; if the number of sub-word strings in the word string is greater than 2, judging whether the number of characters in the first sub-word string W1 and the number of characters in the K-th sub-word string WK are less than a preset value; if the number of characters in W1 and/or the number of characters in WK is less than the preset value, judging whether the noise probability score of W1 and/or the noise probability score of WK is greater than a preset noise threshold; and, if so, determining that W1 and/or WK is noise and deleting W1 and/or WK from the word string to obtain a new word string.
According to the OCR character recognition method of the embodiments of the present invention, noise reduction is applied to the OCR recognition result during OCR translation, so that OCR noise, which is typically introduced by user misoperation, can be identified and deleted. After denoising, the translation result is cleaner and more accurate, which improves the user experience.
In addition, the OCR character recognition method according to the above embodiment of the present invention may have the following additional technical features:
In some examples, the method further includes: if the number of sub-word strings in the word string is equal to 2, judging whether the number of characters in W1 is less than the number of characters in WK; if the number of characters in W1 is less than the number of characters in WK, further judging whether the number of characters in W1 is less than the preset value; if the number of characters in W1 is less than the preset value, further judging whether the noise probability score of W1 is greater than the preset noise threshold; and, if so, determining that W1 is noise and deleting W1 from the word string to obtain a new word string.
In some examples, the method further includes: if the number of characters in W1 is greater than the number of characters in WK, further judging whether the number of characters in WK is less than the preset value; if the number of characters in WK is less than the preset value, further judging whether the noise probability score of WK is greater than the preset noise threshold; and, if so, determining that WK is noise and deleting WK from the word string to obtain a new word string.
In some examples, the noise probability score is obtained by the following formulas:
$$P_{\text{left}} = \alpha \log p(W_1) + \beta \log p(W_2 \mid W_1),$$
$$P_{\text{right}} = \alpha \log p(W_K) + \beta \log p(W_K \mid W_{K-1}).$$
In some examples, the method further includes: translating the new word string.
An embodiment of the second aspect of the present invention provides an OCR character recognition system, including: a recognition module, configured to perform OCR character recognition on an image in a target area selected by a user to obtain a recognized word string, wherein the word string includes K sub-word strings, each sub-word string includes at least one character, and K is a positive integer; a computing module, configured to count the sub-word strings in the recognized word string; and a denoising module, configured to, when the number of sub-word strings in the word string is greater than 2, judge whether the number of characters in the first sub-word string W1 and the number of characters in the K-th sub-word string WK are less than a preset value; if so, judge whether the noise probability score of W1 and/or the noise probability score of WK is greater than a preset noise threshold; and, if it is greater, determine that W1 and/or WK is noise and delete W1 and/or WK from the word string to obtain a new word string.
According to the OCR character recognition system of the embodiments of the present invention, noise reduction is applied to the OCR recognition result during OCR translation, so that OCR noise, which is typically introduced by user misoperation, can be identified and deleted. After denoising, the translation result is cleaner and more accurate, which improves the user experience.
In addition, the OCR character recognition system according to the above embodiment of the present invention may have the following additional technical features:
In some examples, the denoising module is further configured to: if the number of sub-word strings in the word string is equal to 2, judge whether the number of characters in W1 is less than the number of characters in WK; if the number of characters in W1 is less than the number of characters in WK, further judge whether the number of characters in W1 is less than the preset value; if the number of characters in W1 is less than the preset value, further judge whether the noise probability score of W1 is greater than the preset noise threshold; and, if so, determine that W1 is noise and delete W1 from the word string to obtain a new word string.
In some examples, the denoising module is further configured to: if the number of characters in W1 is greater than the number of characters in WK, further judge whether the number of characters in WK is less than the preset value; if the number of characters in WK is less than the preset value, further judge whether the noise probability score of WK is greater than the preset noise threshold; and, if so, determine that WK is noise and delete WK from the word string to obtain a new word string.
In some examples, the noise probability score is obtained by the following formulas:
$$P_{\text{left}} = \alpha \log p(W_1) + \beta \log p(W_2 \mid W_1),$$
$$P_{\text{right}} = \alpha \log p(W_K) + \beta \log p(W_K \mid W_{K-1}).$$
In some examples, the system further includes: a translation module, configured to translate the new word string.
Additional aspects and advantages of the present invention will be set forth in part in the following description, and in part will become apparent from the description or be learned through practice of the present invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments in conjunction with the accompanying drawings, in which:
Fig. 1 is a schematic diagram of an OCR recognition and translation interface;
Fig. 2 is a flow chart of an OCR character recognition method according to an embodiment of the present invention;
Fig. 3 is a flow chart of an OCR character recognition method according to another embodiment of the present invention; and
Fig. 4 is a structural diagram of an OCR character recognition system according to an embodiment of the present invention.
Detailed description
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote, throughout, the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they are intended only to explain the present invention and are not to be construed as limiting the claims.
In the description of the present invention, it should be understood that orientation or positional terms such as "longitudinal", "lateral", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner" and "outer" are based on the orientations or positional relationships shown in the drawings. They are used only to facilitate and simplify the description of the present invention, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they therefore cannot be construed as limiting the present invention.
In the description of the present invention, it should be noted that, unless otherwise specified and limited, the terms "mounted", "connected" and "coupled" should be understood broadly: for example, a connection may be mechanical or electrical, or internal between two elements, and may be direct or indirect through an intermediary. Those of ordinary skill in the art can understand the specific meaning of these terms according to the specific situation.
The OCR character recognition method and system according to embodiments of the present invention are described below with reference to the drawings.
Fig. 2 is a flow chart of an OCR character recognition method according to an embodiment of the present invention.
As shown in Fig. 2, the OCR character recognition method according to an embodiment of the present invention includes the following steps:
Step S201: Perform OCR character recognition on the image in the target area selected by the user to obtain a recognized word string, wherein the word string includes K sub-word strings, each sub-word string includes at least one character, and K is a positive integer.
Step S202: Count the sub-word strings in the recognized word string.
Step S203: If the number of sub-word strings in the word string is greater than 2, judge whether the number of characters in the first sub-word string W1 and the number of characters in the K-th sub-word string WK are less than a preset value.
Step S204: If the number of characters in W1 and/or the number of characters in WK is less than the preset value, judge whether the noise probability score of W1 and/or the noise probability score of WK is greater than a preset noise threshold.
Step S205: If so, determine that W1 and/or WK is noise and delete W1 and/or WK from the word string to obtain a new word string.
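Steps S203 to S205 can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the patented implementation: the function name `denoise_edges`, the pluggable `noise_score` callable and the default values for `max_len` and `theta` are hypothetical, and the sub-word strings are assumed to be already available as a list.

```python
from typing import Callable, List


def denoise_edges(
    words: List[str],
    noise_score: Callable[[List[str], int], float],  # hypothetical scorer: (words, index) -> score
    max_len: int = 3,      # assumed "preset value" for the character count
    theta: float = 10.5,   # assumed "preset noise" threshold
) -> List[str]:
    """Drop W1 and/or WK when they are short and score above theta (K > 2 only)."""
    if len(words) <= 2:
        # S203 only applies when there are more than 2 sub-word strings.
        return list(words)
    drop = set()
    for idx in (0, len(words) - 1):  # W1 and WK
        # S203/S204: short edge word whose noise score exceeds the threshold.
        if len(words[idx]) < max_len and noise_score(words, idx) > theta:
            drop.add(idx)
    # S205: delete the noisy edge words to obtain the new word string.
    return [w for i, w in enumerate(words) if i not in drop]
```

Passing the scorer in as a parameter keeps the edge-selection logic separate from the way the noise probability score is computed.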
In one embodiment of the present invention, the OCR character recognition method further includes the following steps:
1. If the number of sub-word strings in the word string is equal to 2, judge whether the number of characters in W1 is less than the number of characters in WK.
2. If the number of characters in W1 is less than the number of characters in WK, further judge whether the number of characters in W1 is less than the preset value.
3. If the number of characters in W1 is less than the preset value, further judge whether the noise probability score of W1 is greater than the preset noise threshold.
4. If so, determine that W1 is noise and delete W1 from the word string to obtain a new word string.
Further, the method also includes:
1. If the number of characters in W1 is greater than the number of characters in WK, further judge whether the number of characters in WK is less than the preset value.
2. If the number of characters in WK is less than the preset value, further judge whether the noise probability score of WK is greater than the preset noise threshold.
3. If so, determine that WK is noise and delete WK from the word string to obtain a new word string.
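The two-word case described in the steps above can be sketched as follows. The function name `denoise_pair`, the `noise_score` callable and the defaults are hypothetical. The steps above do not say what happens when both words have the same length, so this sketch assumes the string is left unchanged in that case.

```python
from typing import Callable, List


def denoise_pair(
    words: List[str],
    noise_score: Callable[[List[str], int], float],  # hypothetical scorer
    max_len: int = 3,      # assumed "preset value"
    theta: float = 10.5,   # assumed "preset noise" threshold
) -> List[str]:
    """For K == 2, test only the shorter of the two words as a noise candidate."""
    if len(words) != 2:
        return list(words)
    w1, wk = words
    if len(w1) < len(wk):
        idx = 0            # W1 is the candidate
    elif len(w1) > len(wk):
        idx = 1            # WK is the candidate
    else:
        return list(words)  # equal lengths: assumption, nothing is tested
    if len(words[idx]) < max_len and noise_score(words, idx) > theta:
        return [w for i, w in enumerate(words) if i != idx]
    return list(words)
```

Testing only the shorter word means a two-word selection can never be reduced to an empty string.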
In one embodiment of the present invention, the noise probability score is obtained by the following formulas:
$$P_{\text{left}} = \alpha \log p(W_1) + \beta \log p(W_2 \mid W_1),$$
$$P_{\text{right}} = \alpha \log p(W_K) + \beta \log p(W_K \mid W_{K-1}).$$
After the new word string is obtained, the OCR character recognition method of the embodiment of the present invention further includes: translating the new word string.
As a specific example, assume that in OCR translation the OCR recognition result (i.e. the recognized word string) is a word string W containing K words: W1 W2 W3 W4 … WK-2 WK-1 WK. In this word string, W1 and WK may be noise introduced by user misoperation. Usually, the noise on either side is no longer than one word. Denoising the OCR recognition result therefore amounts to calculating the noise probability scores of W1 and WK separately; if a noise probability score is greater than a certain threshold (i.e. the preset noise threshold in the above example), W1 and/or WK is determined to be noise.
With reference to Fig. 3, the specific steps for determining whether a word is noise include:
Step S301: Start; input the word string W = W1 … WK.
Step S302: Judge whether K is equal to 1; if so, go to step S303; otherwise, go to step S304.
Step S303: Return W1.
Step S304: Judge whether K is equal to 2; if so, go to step S305; otherwise, go to step S308.
Step S305: Judge whether the number of characters in W1 is less than the number of characters in W2 (i.e. WK, since K equals 2), i.e. whether len(W1) < len(W2); if so, go to step S306; otherwise, go to step S307.
Step S306: Let T = {W1}, where T denotes the set containing the sub-word string W1.
Step S307: Let T = {WK}, where T denotes the set containing the sub-word string WK.
Step S308: Let T = {W1, WK}, where T denotes the set containing the sub-word strings W1 and WK. For the example shown in Fig. 1, T = {it, I}.
Step S309: Delete from the set T any word whose character length (i.e. number of characters) is greater than the preset value. Because an English word that needs to be translated usually contains more than 3 letters, the preset value may be set to, but is not limited to, 3.
Step S310: For each word in the set T, calculate the noise probability score NoisyScore(); if the noise probability score is greater than the threshold θ (i.e. the preset noise threshold), the sub-word string in the set T is considered to be noise.
Step S311: End.
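The candidate selection of steps S302 to S309 can be sketched as follows. The function name `select_candidates` is hypothetical. The sketch returns positions rather than words, an illustrative choice so that repeated words at both ends stay distinguishable. It follows Fig. 3 as described: in the two-word case every outcome other than len(W1) < len(W2) selects WK, and S309 keeps words whose length does not exceed the preset value.

```python
from typing import List


def select_candidates(words: List[str], max_len: int = 3) -> List[int]:
    """Return the indices of the edge words that form the candidate set T."""
    k = len(words)
    if k <= 1:
        # S302/S303: a single word is returned as is, so there is nothing to test.
        return []
    if k == 2:
        # S305-S307: only the shorter word is a candidate (ties fall to WK).
        t = [0] if len(words[0]) < len(words[1]) else [1]
    else:
        # S308: both edge words are candidates.
        t = [0, k - 1]
    # S309: discard candidates longer than the preset value.
    return [i for i in t if len(words[i]) <= max_len]
```

For the Fig. 1 example, `select_candidates(["it", "Obama", "I"])` yields the positions of "it" and "I", which are then scored in step S310.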
In the above example, the noise probability score NoisyScore() can be computed with a method similar to a statistical language model: for the leftmost word (i.e. W1), P_left is calculated, and for the rightmost word (i.e. WK), P_right is calculated. The formulas are:
$$P_{\text{left}} = \alpha \log p(W_1) + \beta \log p(W_2 \mid W_1);$$
$$P_{\text{right}} = \alpha \log p(W_K) + \beta \log p(W_K \mid W_{K-1}).$$
Here p(w_i | w_{i-1}) denotes the probability of the bigram w_{i-1} w_i, and p(w_i) denotes the probability of the unigram w_i; both are obtained statistically (the statistical formulas are not reproduced in this text).
α and β are the weights of the unigram and the bigram, respectively; their values are, for example but not limited to, -1 and -0.5.
Based on experimental statistics, the threshold θ of the noise probability score NoisyScore() (i.e. the preset noise threshold) can be set to 10.5.
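The scoring formulas can be illustrated with a small Python sketch. Several points here are assumptions rather than details taken from the patent: the unigram and bigram probabilities are estimated as relative frequencies from a toy corpus, unseen events are floored at a small `eps` to avoid log(0), and the logarithm is natural. The helper names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def build_counts(sentences: List[str]) -> Tuple[Counter, Counter]:
    """Count unigrams and adjacent-word bigrams in a whitespace-tokenized corpus."""
    uni, bi = Counter(), Counter()
    for s in sentences:
        toks = s.split()
        uni.update(toks)
        bi.update(zip(toks, toks[1:]))
    return uni, bi


def p_unigram(w: str, uni: Counter, eps: float = 1e-6) -> float:
    total = sum(uni.values())
    return max(uni[w] / total, eps) if total else eps


def p_bigram(prev: str, w: str, uni: Counter, bi: Counter, eps: float = 1e-6) -> float:
    # Assumed estimate: count(prev w) / count(prev), floored at eps.
    return max(bi[(prev, w)] / uni[prev], eps) if uni[prev] else eps


def noisy_score(words: List[str], idx: int, uni: Counter, bi: Counter,
                alpha: float = -1.0, beta: float = -0.5) -> float:
    """P_left for idx == 0, otherwise P_right for the last word."""
    if idx == 0:
        return (alpha * math.log(p_unigram(words[0], uni))
                + beta * math.log(p_bigram(words[0], words[1], uni, bi)))
    last = len(words) - 1
    return (alpha * math.log(p_unigram(words[last], uni))
            + beta * math.log(p_bigram(words[last - 1], words[last], uni, bi)))
```

Because α and β are negative, rare words and rare word pairs receive high scores, so a score above θ marks an edge word that fits its neighbor poorly. With the `eps` floor chosen here, a word absent from the corpus always exceeds θ = 10.5; a real system would need a large corpus and a more careful smoothing scheme.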
According to the OCR character recognition method of the embodiments of the present invention, noise reduction is applied to the OCR recognition result during OCR translation, so that OCR noise, which is typically introduced by user misoperation, can be identified and deleted. After denoising, the translation result is cleaner and more accurate, which improves the user experience.
Fig. 4 is a structural diagram of an OCR character recognition system according to an embodiment of the present invention. As shown in Fig. 4, the OCR character recognition system 400 according to an embodiment of the present invention includes a recognition module 410, a computing module 420 and a denoising module 430.
The recognition module 410 is configured to perform OCR character recognition on the image in the target area selected by the user to obtain a recognized word string, wherein the word string includes K sub-word strings, each sub-word string includes at least one character, and K is a positive integer. The computing module 420 is configured to count the sub-word strings in the recognized word string. The denoising module 430 is configured to, when the number of sub-word strings in the word string is greater than 2, judge whether the number of characters in the first sub-word string W1 and the number of characters in the K-th sub-word string WK are less than a preset value; if so, judge whether the noise probability score of W1 and/or the noise probability score of WK is greater than a preset noise threshold; and, if it is greater, determine that W1 and/or WK is noise and delete W1 and/or WK from the word string to obtain a new word string.
In one embodiment of the present invention, the denoising module 430 is further configured to: if the number of sub-word strings in the word string is equal to 2, judge whether the number of characters in W1 is less than the number of characters in WK; if the number of characters in W1 is less than the number of characters in WK, further judge whether the number of characters in W1 is less than the preset value; if the number of characters in W1 is less than the preset value, further judge whether the noise probability score of W1 is greater than the preset noise threshold; and, if so, determine that W1 is noise and delete W1 from the word string to obtain a new word string.
Further, the denoising module 430 is also configured to: if the number of characters in W1 is greater than the number of characters in WK, further judge whether the number of characters in WK is less than the preset value; if the number of characters in WK is less than the preset value, further judge whether the noise probability score of WK is greater than the preset noise threshold; and, if so, determine that WK is noise and delete WK from the word string to obtain a new word string.
The noise probability score can be obtained by the following formulas:
$$P_{\text{left}} = \alpha \log p(W_1) + \beta \log p(W_2 \mid W_1),$$
$$P_{\text{right}} = \alpha \log p(W_K) + \beta \log p(W_K \mid W_{K-1}).$$
The OCR character recognition system 400 of the embodiment of the present invention further includes a translation module (not shown in the figure), which is configured to translate the new word string.
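The division of labor between the modules can be sketched as a small pipeline. This is only an illustration under assumptions: the class name `OcrPipeline` is hypothetical, the recognition, denoising and translation modules are injected as plain callables (no real OCR or translation library is implied), and sub-word strings are assumed to be separated by whitespace.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class OcrPipeline:
    recognize: Callable[[object], str]          # recognition module 410 (hypothetical callable)
    denoise: Callable[[List[str]], List[str]]   # denoising module 430
    translate: Callable[[str], str]             # translation module

    def run(self, image_region: object) -> str:
        text = self.recognize(image_region)
        # Computing module 420: split into sub-word strings; K is len(words).
        words = text.split()
        cleaned = self.denoise(words)
        return self.translate(" ".join(cleaned))
```

For instance, with a stub recognizer that returns "it Obama I" and a denoiser that drops both edge words, only "Obama" reaches the translation callable.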
Specifically, with reference to Fig. 3, the processing procedure of the OCR character recognition system 400 of the embodiment of the present invention is as follows:
Assume that in OCR translation the OCR recognition result (i.e. the recognized word string) is a word string W containing K words: W1 W2 W3 W4 … WK-2 WK-1 WK. In this word string, W1 and WK may be noise introduced by user misoperation. Usually, the noise on either side is no longer than one word. Denoising the OCR recognition result therefore amounts to calculating the noise probability scores of W1 and WK separately; if a noise probability score is greater than a certain threshold (i.e. the preset noise threshold in the above example), W1 and/or WK is determined to be noise.
With reference to Fig. 3, the specific processing procedure includes:
Step S301: Start; input the word string W = W1 … WK.
Step S302: Judge whether K is equal to 1; if so, go to step S303; otherwise, go to step S304.
Step S303: Return W1.
Step S304: Judge whether K is equal to 2; if so, go to step S305; otherwise, go to step S308.
Step S305: Judge whether the number of characters in W1 is less than the number of characters in W2 (i.e. WK, since K equals 2), i.e. whether len(W1) < len(W2); if so, go to step S306; otherwise, go to step S307.
Step S306: Let T = {W1}, where T denotes the set containing the sub-word string W1.
Step S307: Let T = {WK}, where T denotes the set containing the sub-word string WK.
Step S308: Let T = {W1, WK}, where T denotes the set containing the sub-word strings W1 and WK. For the example shown in Fig. 1, T = {it, I}.
Step S309: Delete from the set T any word whose character length (i.e. number of characters) is greater than the preset value. Because an English word that needs to be translated usually contains more than 3 letters, the preset value may be set to, but is not limited to, 3.
Step S310: For each word in the set T, calculate the noise probability score NoisyScore(); if the noise probability score is greater than the threshold θ (i.e. the preset noise threshold), the sub-word string in the set T is considered to be noise.
Step S311: End.
In the above example, the noise probability score NoisyScore() can be computed with a method similar to a statistical language model: for the leftmost word (i.e. W1), P_left is calculated, and for the rightmost word (i.e. WK), P_right is calculated. The formulas are:
$$P_{\text{left}} = \alpha \log p(W_1) + \beta \log p(W_2 \mid W_1);$$
$$P_{\text{right}} = \alpha \log p(W_K) + \beta \log p(W_K \mid W_{K-1}).$$
Here p(w_i | w_{i-1}) denotes the probability of the bigram w_{i-1} w_i, and p(w_i) denotes the probability of the unigram w_i; both are obtained statistically (the statistical formulas are not reproduced in this text).
α and β are the weights of the unigram and the bigram, respectively; their values are, for example but not limited to, -1 and -0.5.
Based on experimental statistics, the threshold θ of the noise probability score NoisyScore() (i.e. the preset noise threshold) can be set to 10.5.
According to the OCR character recognition system of the embodiments of the present invention, noise reduction is applied to the OCR recognition result during OCR translation, so that OCR noise, which is typically introduced by user misoperation, can be identified and deleted. After denoising, the translation result is cleaner and more accurate, which improves the user experience.
In the description of this specification, references to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" mean that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in one or more embodiments or examples.
Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions and variations can be made to these embodiments without departing from the principles and spirit of the present invention; the scope of the present invention is defined by the claims and their equivalents.
Claims (10)
1. An OCR character recognition method, characterized by comprising the following steps:
performing OCR character recognition on an image in a target area selected by a user to obtain a recognized word string, wherein the word string includes K sub-word strings, each sub-word string includes at least one character, and K is a positive integer;
counting the sub-word strings in the recognized word string;
if the number of sub-word strings in the word string is greater than 2, judging whether the number of characters in the first sub-word string W1 and the number of characters in the K-th sub-word string WK are less than a preset value;
if the number of characters in W1 and/or the number of characters in WK is less than the preset value, judging whether the noise probability score of W1 and/or the noise probability score of WK is greater than a preset noise threshold, wherein the noise probability score is used to evaluate whether a sub-word string is noise;
if so, determining that W1 and/or WK is noise and deleting W1 and/or WK from the word string to obtain a new word string.
2. The OCR character recognition method according to claim 1, characterized by further comprising:
if the number of sub-word strings in the word string is equal to 2, judging whether the number of characters in W1 is less than the number of characters in WK;
if the number of characters in W1 is less than the number of characters in WK, further judging whether the number of characters in W1 is less than the preset value;
if the number of characters in W1 is less than the preset value, further judging whether the noise probability score of W1 is greater than the preset noise threshold;
if so, determining that W1 is noise and deleting W1 from the word string to obtain a new word string.
3. The OCR character recognition method according to claim 2, characterized by further comprising:
if the number of characters in W1 is greater than the number of characters in WK, further judging whether the number of characters in WK is less than the preset value;
if the number of characters in WK is less than the preset value, further judging whether the noise probability score of WK is greater than the preset noise threshold;
if so, determining that WK is noise and deleting WK from the word string to obtain a new word string.
4. The OCR character recognition method according to claim 1, characterized in that the noise probability score is obtained by the following formulas: $P_{\text{left}} = \alpha \log p(W_1) + \beta \log p(W_2 \mid W_1)$ and $P_{\text{right}} = \alpha \log p(W_K) + \beta \log p(W_K \mid W_{K-1})$, wherein α and β are the weights of the unigram and the bigram, respectively, $p(w_i \mid w_{i-1})$ is the probability of the bigram $w_{i-1} w_i$, and $p(w_i)$ is the probability of the unigram $w_i$.
5. The OCR character recognition method according to any one of claims 1-4, characterized by further comprising: translating the new word string.
6. An OCR character recognition system, characterized by comprising:
a recognition module, configured to perform OCR character recognition on an image in a target area selected by a user to obtain a recognized word string, wherein the word string includes K sub-word strings, each sub-word string includes at least one character, and K is a positive integer;
a computing module, configured to count the sub-word strings in the recognized word string;
a denoising module, configured to, when the number of sub-word strings in the word string is greater than 2, judge whether the number of characters in the first sub-word string W1 and the number of characters in the K-th sub-word string WK are less than a preset value; if so, judge whether the noise probability score of W1 and/or the noise probability score of WK is greater than a preset noise threshold; and, if it is greater, determine that W1 and/or WK is noise and delete W1 and/or WK from the word string to obtain a new word string, wherein the noise probability score is used to evaluate whether a sub-word string is noise.
7. The OCR character recognition system according to claim 6, characterized in that the denoising module is further configured to:
if the number of sub-word strings in the word string is equal to 2, judge whether the number of characters in W1 is less than the number of characters in WK;
if the number of characters in W1 is less than the number of characters in WK, further judge whether the number of characters in W1 is less than the preset value;
if the number of characters in W1 is less than the preset value, further judge whether the noise probability score of W1 is greater than the preset noise threshold;
if so, determine that W1 is noise and delete W1 from the word string to obtain a new word string.
8. The OCR character recognition system according to claim 7, characterized in that the denoising module is further configured to:
if the number of characters in W1 is greater than the number of characters in WK, further judge whether the number of characters in WK is less than the preset value;
if the number of characters in WK is less than the preset value, further judge whether the noise probability score of WK is greater than the preset noise threshold;
if so, determine that WK is noise and delete WK from the word string to obtain a new word string.
9. The OCR character recognition system according to claim 6, characterized in that the noise probability score is obtained by the following formulas: $P_{\text{left}} = \alpha \log p(W_1) + \beta \log p(W_2 \mid W_1)$ and $P_{\text{right}} = \alpha \log p(W_K) + \beta \log p(W_K \mid W_{K-1})$, wherein α and β are the weights of the unigram and the bigram, respectively, $p(w_i \mid w_{i-1})$ is the probability of the bigram $w_{i-1} w_i$, and $p(w_i)$ is the probability of the unigram $w_i$.
10. The OCR character recognition system according to any one of claims 6-9, characterized by further comprising: a translation module
configured to perform OCR translation on the new word string.
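The denoising logic of claims 6-9 can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the parameter names (`min_chars` for the preset value, `noise_threshold` for the preset noise), and the callable probability lookups `p_uni` and `p_bi` are all assumptions introduced for illustration; the claims do not say how the unigram and bigram probabilities are obtained. The sketch follows the claim wording literally: a short edge sub-word string is treated as noise when its score is greater than the threshold. When exactly two sub-word strings have the same number of characters, the claims specify no action, so the sketch leaves the word string unchanged.

```python
import math
from typing import Callable, List

# Hypothetical lookups (not defined in the claims):
#   p_uni(w)         -> unigram probability p(w), assumed > 0 (e.g. smoothed)
#   p_bi(prev, cur)  -> bigram probability p(cur | prev), assumed > 0
Unigram = Callable[[str], float]
Bigram = Callable[[str, str], float]


def left_score(words: List[str], p_uni: Unigram, p_bi: Bigram,
               alpha: float, beta: float) -> float:
    """P_left = alpha * log p(W1) + beta * log p(W2 | W1)."""
    return (alpha * math.log(p_uni(words[0]))
            + beta * math.log(p_bi(words[0], words[1])))


def right_score(words: List[str], p_uni: Unigram, p_bi: Bigram,
                alpha: float, beta: float) -> float:
    """P_right = alpha * log p(Wk) + beta * log p(Wk | Wk-1)."""
    return (alpha * math.log(p_uni(words[-1]))
            + beta * math.log(p_bi(words[-2], words[-1])))


def denoise(words: List[str], p_uni: Unigram, p_bi: Bigram,
            alpha: float, beta: float,
            min_chars: int, noise_threshold: float) -> List[str]:
    """Drop the first and/or last sub-word string if judged to be noise."""
    k = len(words)
    if k < 2:
        # The claims only cover two or more sub-word strings.
        return list(words)

    if k == 2:
        # Claims 7-8: only the shorter of the two strings is a candidate.
        check_left = len(words[0]) < len(words[1])
        check_right = len(words[0]) > len(words[1])
    else:
        # Claim 6: with more than two strings, both edges are candidates.
        check_left = check_right = True

    drop_left = (check_left and len(words[0]) < min_chars
                 and left_score(words, p_uni, p_bi, alpha, beta) > noise_threshold)
    drop_right = (check_right and len(words[-1]) < min_chars
                  and right_score(words, p_uni, p_bi, alpha, beta) > noise_threshold)

    start = 1 if drop_left else 0
    end = k - 1 if drop_right else k
    return list(words[start:end])
```

Passing the probability lookups as callables keeps the sketch independent of any particular language model; a real system would supply its own estimates and tune `alpha`, `beta`, `min_chars`, and `noise_threshold`.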
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310752624.4A CN103679165B (en) | 2013-12-31 | 2013-12-31 | OCR (optical character recognition) character recognition method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103679165A CN103679165A (en) | 2014-03-26 |
CN103679165B CN103679165B (en) | 2017-02-08 |
Family
ID=50316655
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310752624.4A Active CN103679165B (en) | 2013-12-31 | 2013-12-31 | OCR (optical character recognition) character recognition method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103679165B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106599857A (en) * | 2016-12-20 | 2017-04-26 | 广东欧珀移动通信有限公司 | Image identification method, apparatus, computer-readable storage medium and terminal device |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5448474A (en) * | 1993-03-03 | 1995-09-05 | International Business Machines Corporation | Method for isolation of Chinese words from connected Chinese text |
CN1477559A (en) * | 2002-08-23 | 2004-02-25 | 华为技术有限公司 | Method for implementing long character string prefix matching |
CN103186587A (en) * | 2011-12-30 | 2013-07-03 | 牟颖 | Method for quickly translating English word of book through mobile phone |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2144189A3 (en) * | 2008-07-10 | 2014-03-05 | Samsung Electronics Co., Ltd. | Method for recognizing and translating characters in camera-based image |
Also Published As
Publication number | Publication date |
---|---|
CN103679165A (en) | 2014-03-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |