CN106682671A - Image character recognition system - Google Patents
- Publication number
- CN106682671A (application CN201611254376.0A)
- Authority
- CN
- China
- Prior art keywords
- pictures
- sub
- picture
- character
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
- G06V10/751—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Character Discrimination (AREA)
Abstract
The invention relates to the field of image recognition processing, and in particular to an image character recognition system. The system comprises an image character segmentation module, a feature picture generation module, a storage module, a normalization processing module and an image character recognition module. The image character segmentation module segments the image to be processed into sub-pictures, each containing a single character, and stores them in the storage module. The feature picture generation module produces character feature pictures according to the typeface of the image characters to be recognized and stores them in the storage module. The normalization processing module retrieves the feature pictures and the sub-pictures to be recognized from the storage module, normalizes them according to their types, and stores the processed picture information back in the storage module. The image character recognition module retrieves the sub-pictures from the storage module and uses an exclusive-OR (XOR) algorithm to calculate the degree of coincidence between each sub-picture and the feature pictures, thereby recognizing the character content of the sub-picture and outputting the recognition result.
Description
Technical field
The present invention relates to the field of image recognition, and more particularly to an image character recognition system.
Background art
As society and technology advance, the body of knowledge created by humanity is growing exponentially. Before electronic books appeared, most knowledge was passed on in printed form. Over five thousand years of Chinese history a large number of outstanding books were produced, and nearly all of them have suffered some degree of damage over time, so digitizing them for storage is urgent. In library management, fast search of book contents helps locate a book quickly. Because the number of books is so large, and because early printed books have no electronic manuscript from the author, converting paper books to electronic form is necessary.
Optical character recognition (OCR) is the main tool for converting paper books into electronic documents. It typically uses a large number of character samples to train a complex network and generate a model file, which is then used to recognize the characters in a picture.
The main function of OCR is to recognize characters in photographed or scanned pictures. In the prior art, recognizing text in an image first requires cutting the character string apart into small pictures that each contain a single character; the cut characters are then recognized by some method. The most common segmentation method is the projection method: after the text image is binarized, the boundary between two characters is found by vertical projection, and the characters are cut apart along that boundary. However, when characters in the image adhere to one another, or when the image contains Chinese characters with a left-right structure, simple projection rarely segments well. For this reason segmentation has long been a difficulty in OCR, and its quality directly affects how well the text is recognized.
In addition, OCR must handle scans and photographs of special fonts and official seals, such as early printed books or certificates issued by government bodies, whose fonts are often unusual for historical reasons or for confidentiality and security needs. Existing OCR relies mainly on machine learning, whose models are computationally expensive. Because the training samples do not cover stylized fonts, recognition accuracy on those fonts is low, which seriously hinders the digitization of paper documents.
Most prior art recognizes characters with neural-network machine learning, which requires preparing a large number of samples and spending a great deal of time on training, and the resulting model file is very large. Recognition rates also vary across fonts; for some stylized fonts the rate is low and cannot meet the character recognition needs of certain special scenarios.
Summary of the invention
The object of the invention is to overcome the above deficiencies of the prior art by providing an image character recognition system that generates feature pictures from the font selected by the user and, after effectively segmenting the image text to be recognized, recognizes that text automatically using the targeted character feature pictures. It provides a fast tool for image text recognition.
To achieve the above object, the invention provides the following technical solution: an image character recognition system that performs image character recognition through the following steps:
(1) Segment the text in the image to be recognized into sub-pictures that each contain a single character, and mark the digit, letter and punctuation sub-pictures and the text-character sub-pictures separately;
(2) For each digit, letter and punctuation mark, select one corresponding sub-picture, move the character in that sub-picture by a set distance l upward, downward, left, right, and toward the upper left, lower left, upper right and lower right, make the corresponding feature pictures, and label each feature picture accordingly;
For text characters, select the font matching the image to be recognized and generate sample pictures; move the character in each sample picture by the set distance l in the same eight directions, make the corresponding feature pictures, and label each feature picture accordingly;
(3) Normalize the feature pictures and the pictures to be recognized:
Resize the feature pictures and the sub-pictures to be recognized to the same size, convert the gray value (0-255) of each pixel in each picture to 0 or 1 according to a set threshold, and store the converted pixel values by position in the storage module;
(4) Compare each sub-picture to be recognized with the feature pictures of the corresponding type: XOR the values at the same pixel positions, count the number of 1s in the result and record it as the error count, and output the label of the feature picture with the smallest error count as the recognition result.
Specifically, in step (4), the system compares the digit, letter and punctuation sub-pictures to be recognized with the digit, letter and punctuation feature pictures: the values at the same pixel positions are XORed, the number of 1s is counted as the error count, and the label of the feature picture with the smallest error count is output as the recognition result;
Text-character sub-pictures to be recognized are compared with the text-character feature pictures in the same way: the values at the same pixel positions are XORed, the number of 1s is counted as the error count, and the label of the feature picture with the smallest error count is output as the recognition result.
Further, n*h < l < N*h.
Further, n≤1/4.
Further, the segmentation of text-character pictures includes the following process:
Find the initial cut positions of the text-character picture by the projection method, and cut the picture to be recognized into an initial sub-picture sequence at those positions;
Process the initial sub-pictures in the sequence using the following rules:
A. Segment the text of the image to be recognized by the projection method into a sub-picture sequence, and mark the digits, letters and punctuation marks in it;
B. Test each unmarked sub-picture against the condition L ≤ M*h, where L is the width of the character projection of the sub-picture, M is a coefficient, and h is the line height;
Sub-pictures that do not satisfy the condition are cut again, with the cut position determined by the following formula:
F(x) = g(x) t(x)
Repeat step B until every unmarked sub-picture in the sequence satisfies L ≤ M*h;
C. Test the total width of each pair of adjacent sub-pictures in the sequence, other than digit, letter and punctuation pictures, against the condition L_combined ≤ M*h;
If the condition is satisfied, merge the adjacent sub-pictures in order;
Repeat step C until no pair of adjacent sub-pictures, other than digits, letters and punctuation, satisfies L_combined ≤ M*h;
D. Test the unmarked sub-pictures in the sequence: if there are three adjacent sub-pictures such that the width of the first and of the third satisfies L ≤ 0.5h and the width of the middle one satisfies L ≥ h, cut the middle sub-picture at the cut point determined by the formula:
F(x) = g(x) t(x)
The cut divides the middle sub-picture into a first middle sub-picture and a second middle sub-picture;
Merge the first sub-picture with the first middle sub-picture;
Merge the second middle sub-picture with the third sub-picture.
Further, 0.9≤M≤1.3.
Preferably, M = 1.2.
Further, the system is a computer or server loaded with a program that implements the above image character recognition function.
Compared with the prior art, the beneficial effects of the invention are as follows. The invention provides an image character recognition system that constructs original feature pictures from the font selected by the user and then moves the character in each picture by a set distance in different directions to make the corresponding feature templates. Templates made this way adapt better to imperfect character segmentation and therefore tolerate errors better. On the basis of these feature pictures, an XOR algorithm measures the similarity between each sub-picture to be recognized and the feature templates; the calculation is simple, and recognition is efficient and reliable.
In addition, the invention judges the quality of the cut sub-pictures step by step and processes them accordingly; this layer-by-layer screening and processing ensures the segmentation quality of the sub-pictures and prepares the conditions for the final recognition rate. Compared with traditional segmentation, the system introduces a correction value on top of the amplitude, taking the distance between the cut position and the character edge into account when choosing the cut point, which gives higher accuracy. When a character with a special structure produces several small values or extreme points, the formula quickly finds the best cut point, improving both the accuracy and the efficiency of segmentation, and it cuts adhering characters better.
On the basis of the feature pictures and the segmented image characters, an XOR algorithm is used to measure the similarity between each sub-picture to be recognized and the feature templates; the calculation is simple, and recognition is efficient and reliable.
Description of the drawings:
Fig. 1 is a structural diagram of the image character recognition system.
Fig. 2 is a schematic diagram of the steps (signal flow) by which the system performs image character recognition.
Fig. 3 is a schematic diagram of making a digit template.
Fig. 4 is a schematic diagram of making a text-character template.
Specific embodiment
The invention is described in further detail below with reference to test examples and specific embodiments. This should not be understood as limiting the scope of the above subject matter to the following embodiments; all techniques realized on the basis of the content of the invention fall within its scope.
The invention provides an image character recognition system which, as shown in Fig. 1, comprises an image character segmentation module, a feature picture generation module, a storage module, a normalization processing module and an image character recognition module;
The image character segmentation module segments the characters in the image to be processed into sub-pictures that each contain a single character, and stores the resulting sub-picture sequence in the storage module;
The feature picture generation module produces character feature pictures according to the font, selected by the user, of the image text to be recognized, and stores the feature pictures in the storage module;
The normalization processing module retrieves the feature pictures and the sub-pictures to be recognized from the storage module, normalizes them according to their types, and stores the processed picture information in the storage module;
The image character recognition module retrieves the sub-pictures from the storage module, uses an XOR algorithm to calculate the degree of matching between each sub-picture and the feature pictures, thereby recognizes the character content of the sub-picture, and outputs the recognition result.
As shown in Fig. 2, text recognition in the system includes the following steps:
(1) Segment the text in the image to be recognized into sub-pictures that each contain a single character, and mark the digit, letter and punctuation sub-pictures and the text-character sub-pictures separately (this step marks only the type of each sub-picture; it does not recognize its content). In practice, the text of the image is first cut by the projection method into a sub-picture sequence, and the digits, letters and punctuation marks are marked using features such as a narrow projection width (for example, less than 0.4h), a small projection area (0.5h*0.8h), and a gap to the adjacent sub-picture that is clearly larger than the gap between ordinary text characters. With these features the digit, letter and punctuation sub-pictures can be cut out first. Once they have been cut out and marked, the unmarked sub-pictures (text-character pictures) are cut further into sub-pictures that each contain a single character. Cutting in stages in this way gives a better segmentation result.
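As an illustration of the first pass in step (1), the following minimal sketch cuts a binarized text line at empty columns and flags narrow pieces as digit, letter or punctuation candidates. It assumes the image is a list of rows of 0/1 values with 1 for character strokes; the function names and the use of width alone as the marking test are assumptions made for the example, with only the 0.4h ratio taken from the text.

```python
def vertical_projection(binary):
    """Count foreground (1) pixels in each column of a binary image."""
    if not binary:
        return []
    width = len(binary[0])
    return [sum(row[x] for row in binary) for x in range(width)]


def split_by_projection(binary):
    """Return (start, end) column ranges, end exclusive, of non-empty runs."""
    spans = []
    start = None
    for x, count in enumerate(vertical_projection(binary)):
        if count > 0 and start is None:
            start = x                      # a character run begins
        elif count == 0 and start is not None:
            spans.append((start, x))       # an empty column ends the run
            start = None
    if start is not None:
        spans.append((start, len(binary[0])))
    return spans


def mark_narrow(spans, line_height, ratio=0.4):
    """Flag spans narrower than ratio * line_height (hypothetical marking test)."""
    return [(s, e, (e - s) < ratio * line_height) for s, e in spans]


line = [
    [0, 1, 1, 0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0, 1, 0, 0],
]
spans = split_by_projection(line)
marked = mark_narrow(spans, line_height=4)
```

A fuller implementation would also use the projection area and the gap between neighbouring sub-pictures, which the text lists as further marking features.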
(2) For each digit, letter and punctuation mark, select one corresponding sub-picture, move the character in that sub-picture by a set distance l upward, downward, left, right, and toward the upper left, lower left, upper right and lower right, and make the corresponding feature pictures, as shown in Fig. 3. Label each feature picture accordingly (the label records the character content of the feature picture; for example, the 9 feature pictures in Fig. 3 are labeled "8");
For text characters, select the font matching the image to be recognized and generate sample pictures; move the character in each sample picture by the set distance l in the same eight directions, make the corresponding feature pictures, and label each feature picture accordingly (the label records the character content of the feature picture; for example, the 9 feature pictures in Fig. 4 are labeled "word"). When the character in a template is moved by the set distance, any part of the character that falls outside the sub-picture frame is removed. The eight moved pictures together with the original picture form 9 reference samples of the same character under different cutting conditions, as shown in Fig. 3. These samples correspond to the irregular, imperfect cuts that occur in practice, so character recognition based on feature templates made this way tolerates errors better.
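The template construction in step (2) can be sketched as follows. The sketch assumes binary pictures stored as lists of rows, fills vacated pixels with 0, and drops pixels that leave the frame; the function names are illustrative and not taken from the patent.

```python
def shift(image, dx, dy):
    """Shift a binary image by (dx, dy); positive dx is right, positive dy is down.

    Pixels that leave the frame are dropped and vacated pixels become 0.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                out[ny][nx] = image[y][x]
    return out


def make_templates(image, distance):
    """Return the original picture followed by its eight shifted copies."""
    offsets = [(0, 0)] + [
        (dx, dy)
        for dy in (-distance, 0, distance)
        for dx in (-distance, 0, distance)
        if (dx, dy) != (0, 0)
    ]
    return [shift(image, dx, dy) for dx, dy in offsets]


dot = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
templates = make_templates(dot, 1)
```

All nine pictures would then carry the same label, so a sub-picture that was cut slightly off-centre can still match one of them.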
(3) Normalize the feature pictures and the pictures to be recognized:
Resize the feature pictures and the sub-pictures to be recognized to the same size, convert the gray value (0-255) of each pixel in each picture to 0 or 1 according to a set threshold, and store the converted pixel values by position in the storage module;
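A minimal sketch of the normalization in step (3) is given below. The nearest-neighbour resize, the 32x32 target size, the threshold of 128 and the convention that dark pixels become 1 are all assumptions chosen for the example; the text only requires a common size and a set threshold.

```python
def resize_nearest(gray, new_height, new_width):
    """Resize a grayscale picture (list of rows) by nearest-neighbour sampling."""
    height, width = len(gray), len(gray[0])
    return [
        [gray[y * height // new_height][x * width // new_width]
         for x in range(new_width)]
        for y in range(new_height)
    ]


def binarize(gray, threshold=128):
    """Map gray values 0-255 to 0 or 1.

    Assumption: pixels darker than the threshold are strokes and become 1.
    """
    return [[1 if value < threshold else 0 for value in row] for row in gray]


def normalize(gray, size=(32, 32), threshold=128):
    """Resize to a common size, then binarize against the threshold."""
    return binarize(resize_nearest(gray, size[0], size[1]), threshold)


sample = [
    [0, 255],
    [255, 0],
]
normalized = normalize(sample, size=(4, 4))
```

Because every picture ends up with the same dimensions and only 0/1 values, the later pixel-by-pixel comparison needs no further alignment.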
(4) Compare each sub-picture to be recognized with the feature pictures of the corresponding type by XORing the values at the same pixel positions (where the feature picture and the picture to be recognized have the same value at a pixel, the XOR result is 0; where they differ, the XOR result is 1). Count the number of 1s in the result and record it as the error count, and output the label of the feature picture with the smallest error count as the recognition result.
Specifically, in step (4), the digit, letter and punctuation sub-pictures to be recognized are compared with the digit, letter and punctuation feature pictures: the values at the same pixel positions are XORed, the number of 1s is counted as the error count, and the label of the feature picture with the smallest error count is output as the recognition result;
Text-character sub-pictures to be recognized are compared with the text-character feature pictures in the same way: the values at the same pixel positions are XORed, the number of 1s is counted as the error count, and the label of the feature picture with the smallest error count is output as the recognition result.
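The comparison in step (4) amounts to counting differing pixels and taking the template with the fewest. The sketch below assumes normalized binary pictures of equal size and a list of (label, picture) pairs; the names are illustrative.

```python
def error_count(a, b):
    """Number of pixel positions where two equal-sized binary pictures differ."""
    return sum(
        pa ^ pb
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )


def recognize(sub_picture, templates):
    """Return the label of the template with the smallest error count.

    templates is a list of (label, binary picture) pairs.
    """
    best_label, _ = min(
        ((label, error_count(sub_picture, picture)) for label, picture in templates),
        key=lambda pair: pair[1],
    )
    return best_label


templates = [
    ("1", [[0, 1, 0], [0, 1, 0], [0, 1, 0]]),
    ("7", [[1, 1, 1], [0, 0, 1], [0, 0, 1]]),
]
noisy_one = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]
result = recognize(noisy_one, templates)
```

In the system described here the template list would hold all nine shifted pictures for every character of the matching type, each carrying that character's label.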
The system uses an XOR algorithm to measure the similarity between each sub-picture to be recognized and the feature templates; the calculation is simple, and recognition is efficient and reliable.
Further, the segmentation of text-character pictures includes the following process:
Find the initial cut positions of the text-character picture by the projection method, and cut the picture to be recognized into an initial sub-picture sequence at those positions;
Process the initial sub-pictures in the sequence using the following rules:
A. Segment the text of the image to be recognized by the projection method into a sub-picture sequence, and mark the digits, letters and punctuation marks in it;
B. Test each unmarked sub-picture against the condition L ≤ M*h, where L is the width of the character projection of the sub-picture, M is a coefficient, and h is the line height;
Sub-pictures that do not satisfy the condition are cut again, with the cut position determined by the following formula:
F(x) = g(x) t(x)
Repeat step B until every unmarked sub-picture in the sequence satisfies L ≤ M*h.
In the formula, F(x) is the amplitude, x is the coordinate of the projection point along the line direction, h is the line height of the current characters, g(x) is a correction value, and t(x) is the projection value at x. Together g(x) and t(x) determine the amplitude at each projection point, and the point of minimum amplitude is taken as the cut point between two characters. Compared with simply taking the minimum projection value, the correction g(x) brings the distance between the cut position and the character edge into consideration, so the cut point is more accurate. When a character with a special structure produces several small values or extreme points, the formula still finds the best cut point quickly, which improves both the accuracy and the efficiency of segmentation.
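To make the role of the correction value concrete, the sketch below picks the cut point that minimizes F(x) = g(x) t(x). The text does not define g(x), so `example_correction` is a purely hypothetical stand-in that grows as the candidate cut moves away from the centre of the sub-picture; t(x) is assumed to be the count of foreground pixels in column x.

```python
def projection_profile(binary):
    """t(x): number of foreground (1) pixels in column x (assumed definition)."""
    return [sum(row[x] for row in binary) for x in range(len(binary[0]))]


def example_correction(x, width):
    """Hypothetical g(x): larger for cuts farther from the centre column."""
    centre = (width - 1) / 2
    return 1.0 + abs(x - centre) / width


def find_cut(binary, correction=example_correction):
    """Return the interior column with the smallest amplitude g(x) * t(x)."""
    profile = projection_profile(binary)
    width = len(profile)
    amplitudes = [correction(x, width) * profile[x] for x in range(width)]
    # Skip the outermost columns so that neither side of the cut is empty.
    return min(range(1, width - 1), key=lambda x: amplitudes[x])


adhering = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 1, 1],
]
cut = find_cut(adhering)
```

Columns 1 and 4 have the same projection value here, so the projection alone cannot choose between them; with this illustrative g(x) the more central column wins, which is the kind of tie the correction value is described as resolving.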
C. Test the total width of each pair of adjacent sub-pictures in the sequence, other than digit, letter and punctuation pictures, against the condition L_combined ≤ M*h;
If the condition is satisfied, merge the adjacent sub-pictures in order;
Repeat step C until no pair of adjacent sub-pictures, other than digits, letters and punctuation, satisfies L_combined ≤ M*h;
D. Test the unmarked sub-pictures in the sequence: if there are three adjacent sub-pictures such that the width of the first and of the third satisfies L ≤ 0.5h and the width of the middle one satisfies L ≥ h, cut the middle sub-picture at the cut point determined by the formula:
F(x) = g(x) t(x)
The cut divides the middle sub-picture into a first middle sub-picture and a second middle sub-picture;
Merge the first sub-picture with the first middle sub-picture;
Merge the second middle sub-picture with the third sub-picture.
In some cases two consecutive characters with a left-right structure adhere in the middle. When the projection method is applied, the outer radicals of the front and back characters may be cut off on their own, while the adhering radicals between the two characters are not separated and come out as if they were a single character. The system handles this case well: it finds the best cut point for the adhering middle part with the above formula and then recombines the radicals of the front and back characters, which gives a better segmentation result.
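Rule D can be sketched on column ranges alone. The code below assumes each sub-picture is represented by a (start, end) column range with end exclusive, and that the caller supplies a `find_cut(start, end)` function returning a column inside the middle range (for example one based on the amplitude formula); both the representation and the function names are assumptions for illustration.

```python
def apply_rule_d(spans, line_height, find_cut):
    """Split a wide middle span flanked by two narrow spans, then re-pair.

    spans: list of (start, end) column ranges in reading order, end exclusive.
    find_cut: caller-supplied function (start, end) -> cut column.
    """
    result = []
    i = 0
    while i < len(spans):
        if i + 2 < len(spans):
            first, middle, third = spans[i], spans[i + 1], spans[i + 2]
            narrow_sides = (
                first[1] - first[0] <= 0.5 * line_height
                and third[1] - third[0] <= 0.5 * line_height
            )
            wide_middle = middle[1] - middle[0] >= line_height
            if narrow_sides and wide_middle:
                cut = find_cut(middle[0], middle[1])
                result.append((first[0], cut))   # first + first middle part
                result.append((cut, third[1]))   # second middle part + third
                i += 3
                continue
        result.append(spans[i])
        i += 1
    return result


def midpoint(start, end):
    """Placeholder cut rule used only for this example."""
    return (start + end) // 2


fixed = apply_rule_d([(0, 4), (5, 17), (18, 22)], line_height=10, find_cut=midpoint)
```

The merged ranges include the blank columns that separated the original pieces, which is harmless once the pictures are normalized to a common size.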
The above rules are applied in sequence and repeated; after continued iteration, every resulting sub-picture contains only a single character. Good segmentation prepares the conditions for image text recognition.
Further, 0.9 ≤ M ≤ 1.3. Setting the sub-picture width threshold within this range gives better segmentation and recognition results.
Preferably, M = 1.2. Repeated experiments confirm that setting M to 1.2 gives better segmentation results.
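The width tests of rules B and C with the preferred coefficient M = 1.2 can be sketched as follows. The sketch assumes the list holds only unmarked text-character sub-pictures as (start, end) column ranges, and that the combined width of a pair runs from the start of the first to the end of the second, gap included; both points are assumptions, as are the function names.

```python
def needs_split(span, line_height, m=1.2):
    """Rule B test: a sub-picture wider than M*h must be cut again."""
    return (span[1] - span[0]) > m * line_height


def merge_adjacent(spans, line_height, m=1.2):
    """Rule C: merge neighbours while their combined width is <= M*h."""
    merged = list(spans)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged) - 1):
            combined = merged[i + 1][1] - merged[i][0]
            if combined <= m * line_height:
                merged[i:i + 2] = [(merged[i][0], merged[i + 1][1])]
                changed = True
                break  # restart the scan after every merge
    return merged


pieces = [(0, 5), (6, 11), (12, 22), (23, 45)]
merged = merge_adjacent(pieces, line_height=10)
too_wide = [span for span in merged if needs_split(span, line_height=10)]
```

With a line height of 10 the threshold is 12 columns, so the two narrow halves are joined into one character and the 22-column piece is reported as still needing a cut.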
Further, the system is a computer or server loaded with a program that implements the above image character recognition function.
Claims (8)
1. An image character recognition system, characterised in that the system performs image character recognition through the following steps:
(1) Segment the text in the image to be recognized into sub-pictures that each contain a single character, and mark the digit, letter and punctuation sub-pictures and the text-character sub-pictures separately;
(2) For each digit, letter and punctuation mark, select one corresponding sub-picture, move the character in that sub-picture by a set distance l upward, downward, left, right, and toward the upper left, lower left, upper right and lower right, make the corresponding feature pictures, and label each feature picture accordingly;
For text characters, select the font matching the image to be recognized and generate sample pictures; move the character in each sample picture by the set distance l in the same eight directions, make the corresponding feature pictures, and label each feature picture accordingly;
(3) Normalize the feature pictures and the pictures to be recognized, and store the pixel values of each picture by position in the storage module;
(4) Compare each sub-picture to be recognized with the feature pictures of the corresponding type: XOR the values at the same pixel positions, count the number of 1s in the result and record it as the error count, and output the label of the feature picture with the smallest error count as the recognition result.
2. The system as claimed in claim 1, characterised in that n*h < l < N*h.
3. The system as claimed in claim 2, characterised in that n≤1/4.
4. The system as claimed in claim 1, characterised in that the normalization process includes: resizing the feature pictures and the sub-pictures to be recognized to the same size;
converting the gray value of each pixel in each picture to 0 or 1 according to a set threshold, and storing the converted pixel values by position in the storage module.
5. The system as claimed in any one of claims 1 to 4, characterised in that the segmentation of text-character pictures includes the following process:
A. Mark the digits, letters and punctuation marks in the picture sequence;
B. Test each unmarked sub-picture against the condition L ≤ M*h, where L is the width of the character projection of the sub-picture, M is a coefficient, and h is the line height;
Sub-pictures that do not satisfy the condition are cut again, with the cut position determined by the following formula:
F(x) = g(x) t(x)
Repeat step B until every unmarked sub-picture in the sequence satisfies L ≤ M*h;
C. Test the total width of each pair of adjacent sub-pictures in the sequence, other than digit, letter and punctuation pictures, against the condition L_combined ≤ M*h;
If the condition is satisfied, merge the adjacent sub-pictures in order;
Repeat step C until no pair of adjacent sub-pictures, other than digits, letters and punctuation, satisfies L_combined ≤ M*h;
D. Test the unmarked sub-pictures in the sequence: if there are three adjacent sub-pictures such that the width of the first and of the third satisfies L ≤ 0.5h and the width of the middle one satisfies L ≥ h, cut the middle sub-picture at the cut point determined by the formula:
F(x) = g(x) t(x)
The cut divides the middle sub-picture into a first middle sub-picture and a second middle sub-picture;
Merge the first sub-picture with the first middle sub-picture;
Merge the second middle sub-picture with the third sub-picture.
6. The system as claimed in claim 5, characterised in that 0.9≤M≤1.3.
7. The system as claimed in claim 6, characterised in that M=1.2.
8. The system as claimed in claim 7, characterised in that the system is a computer or server loaded with a program that implements the above image character recognition function.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611254376.0A CN106682671A (en) | 2016-12-29 | 2016-12-29 | Image character recognition system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611254376.0A CN106682671A (en) | 2016-12-29 | 2016-12-29 | Image character recognition system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106682671A (en) | 2017-05-17 |
Family
ID=58872298
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611254376.0A Pending CN106682671A (en) | 2016-12-29 | 2016-12-29 | Image character recognition system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106682671A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106682698A (en) * | 2016-12-29 | 2017-05-17 | 成都数联铭品科技有限公司 | OCR identification method based on template matching |
CN107292280A (en) * | 2017-07-04 | 2017-10-24 | 盛世贞观(北京)科技有限公司 | A kind of seal automatic font identification method and identifying device |
CN107545391A (en) * | 2017-09-07 | 2018-01-05 | 安徽共生物流科技有限公司 | A kind of logistics document intellectual analysis and automatic storage method based on image recognition |
CN109034149A (en) * | 2017-06-08 | 2018-12-18 | 北京君正集成电路股份有限公司 | A kind of character identifying method and device |
CN110390508A (en) * | 2019-06-10 | 2019-10-29 | 平安科技(深圳)有限公司 | Schedule method, apparatus and storage medium are created based on OCR |
CN110942074A (en) * | 2018-09-25 | 2020-03-31 | 京东数字科技控股有限公司 | Character segmentation recognition method and device, electronic equipment and storage medium |
CN113627849A (en) * | 2021-08-12 | 2021-11-09 | 深圳市全景世纪科技有限公司 | Method and system for improving automatic goods customer information acquisition recognition rate |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH07131622A (en) * | 1993-10-29 | 1995-05-19 | Matsushita Graphic Commun Syst Inc | Facsimile equipment |
JPH08129443A (en) * | 1994-07-13 | 1996-05-21 | Yashima Denki Co Ltd | Holograph storing and reproducing device, holograph reproducing method, and picture reproducing method |
JP2001351065A (en) * | 2000-06-05 | 2001-12-21 | Japan Science & Technology Corp | Method for recognizing character, computer readable recording medium recording character recognition program and character recognition device |
JP2002230482A (en) * | 2000-11-28 | 2002-08-16 | Fujitsu Ltd | Character recognizing method and device |
JP2004334913A (en) * | 2004-08-19 | 2004-11-25 | Matsushita Electric Ind Co Ltd | Document recognition device and document recognition method |
JP2006331354A (en) * | 2005-05-30 | 2006-12-07 | Sharp Corp | Character recognition device, character recognition method, its program and recording medium |
JP2008004116A (en) * | 2007-08-02 | 2008-01-10 | Hitachi Ltd | Method and device for retrieving character in video |
CN101571921A (en) * | 2008-04-28 | 2009-11-04 | 富士通株式会社 | Method and device for identifying key words |
CN102663378A (en) * | 2012-03-22 | 2012-09-12 | 杭州新锐信息技术有限公司 | Method for indentifying joined-up handwritten characters |
CN102915440A (en) * | 2011-08-03 | 2013-02-06 | 汉王科技股份有限公司 | Method and device for character segmentation |
JP2015215893A (en) * | 2014-05-08 | 2015-12-03 | 株式会社Nttドコモ | Identification method and facility for sign character of exercise participant |
CN105447522A (en) * | 2015-11-25 | 2016-03-30 | 成都数联铭品科技有限公司 | Complex image character identification system |
CN105654072A (en) * | 2016-03-24 | 2016-06-08 | 哈尔滨工业大学 | Automatic character extraction and recognition system and method for low-resolution medical bill image |
CN105678292A (en) * | 2015-12-30 | 2016-06-15 | 成都数联铭品科技有限公司 | Complex optical text sequence identification system based on convolution and recurrent neural network |
CN106682698A (en) * | 2016-12-29 | 2017-05-17 | 成都数联铭品科技有限公司 | OCR identification method based on template matching |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106682698A (en) * | 2016-12-29 | 2017-05-17 | 成都数联铭品科技有限公司 | OCR identification method based on template matching |
CN109034149A (en) * | 2017-06-08 | 2018-12-18 | 北京君正集成电路股份有限公司 | A kind of character identifying method and device |
CN107292280A (en) * | 2017-07-04 | 2017-10-24 | 盛世贞观(北京)科技有限公司 | A kind of seal automatic font identification method and identifying device |
CN107545391A (en) * | 2017-09-07 | 2018-01-05 | 安徽共生物流科技有限公司 | A kind of logistics document intellectual analysis and automatic storage method based on image recognition |
CN110942074A (en) * | 2018-09-25 | 2020-03-31 | 京东数字科技控股有限公司 | Character segmentation recognition method and device, electronic equipment and storage medium |
CN110942074B (en) * | 2018-09-25 | 2024-04-09 | 京东科技控股股份有限公司 | Character segmentation recognition method and device, electronic equipment and storage medium |
CN110390508A (en) * | 2019-06-10 | 2019-10-29 | 平安科技(深圳)有限公司 | Schedule method, apparatus and storage medium are created based on OCR |
CN113627849A (en) * | 2021-08-12 | 2021-11-09 | 深圳市全景世纪科技有限公司 | Method and system for improving automatic goods customer information acquisition recognition rate |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN106682671A (en) | Image character recognition system | |
CN106682698A (en) | OCR identification method based on template matching | |
CN110738207B (en) | Character detection method for fusing character area edge information in character image | |
CN111401372B (en) | Method for extracting and identifying image-text information of scanned document | |
CN104809481B (en) | A kind of natural scene Method for text detection based on adaptive Color-based clustering | |
CN106611174A (en) | OCR recognition method for unusual fonts | |
US6252988B1 (en) | Method and apparatus for character recognition using stop words | |
CN105512611A (en) | Detection and identification method for form image | |
CN105447522A (en) | Complex image character identification system | |
CN106682667A (en) | Image-text OCR (optical character recognition) system for uncommon fonts | |
Ferrer et al. | Lbp based line-wise script identification | |
CN104008384A (en) | Character identification method and character identification apparatus | |
CN105426856A (en) | Image table character identification method | |
JP2006053920A (en) | Character recognition program, method and device | |
CN105469053A (en) | Bayesian optimization-based image table character segmentation method | |
CN107463866A (en) | A kind of method of the hand-written laboratory report of identification for performance evaluation | |
Yin et al. | Decipherment of historical manuscript images | |
Yadav et al. | A robust approach for offline English character recognition | |
CN109685061A (en) | The recognition methods of mathematical formulae suitable for structuring | |
CN106778759A (en) | For the feature image automatic creation system of pictograph identification | |
Darma et al. | Segmentation of balinese script on lontar manuscripts using projection profile | |
CN106682666A (en) | Characteristic template manufacturing method for unusual font OCR identification | |
CN108062548B (en) | Braille square self-adaptive positioning method and system | |
CN110674678A (en) | Method and device for identifying sensitive mark in video | |
CN112580738B (en) | AttentionOCR text recognition method and device based on improvement |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20170517 |