CN109993162A - Laotian block letter text optical character recognition methods based on convolutional neural networks - Google Patents
Laotian block letter text optical character recognition methods based on convolutional neural networks
- Publication number
- CN109993162A
- Authority
- CN
- China
- Prior art keywords
- laotian
- image
- character
- text
- neural networks
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
- G06V20/63—Scene text, e.g. street names
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Character Discrimination (AREA)
- Character Input (AREA)
Abstract
The invention discloses an optical character recognition method for printed Laotian text based on convolutional neural networks, belonging to the fields of natural language processing and machine learning. After a scanned image of printed text is input, the method first binarizes the image and corrects its rotation. Because Laotian is a horizontally written script, read from left to right, in which vowel marks are attached to consonants, the preprocessed image is then segmented with the projection histogram method, first by rows and then by columns, cutting the whole page into Laotian character combinations. Each segmented character combination is fed into a four-branch parallel convolutional neural network model built around the features of Laotian, which outputs the corresponding character text. Finally, the character sequence is post-processed according to Laotian language rules to produce the final text output. The invention has practical value for Laotian character recognition and for digitizing paper documents.
Description
Technical field
The present invention relates to an optical character recognition method for printed Laotian text based on convolutional neural networks. It belongs to the field of optical character recognition for less commonly used languages within natural language processing, and is suited to applications that convert Laotian images to text.
Background art
Optical character recognition (OCR) uses a computer to automatically recognize text that is printed or handwritten on paper, and is an important branch of pattern recognition and natural language processing. At present, OCR research worldwide mainly targets mainstream languages such as Chinese and English. In China there has also been some research on recognizing minority languages, mainly scripts such as Mongolian, Tibetan, and Balakrishnan. One of the main problems studied in OCR is the classification of character images.
Laos is a country along the Belt and Road that borders China, so research on its official language, Laotian, is of considerable significance. However, the economy of Laos is relatively underdeveloped, and its information technology research and industry lag behind. The digital corpora needed for Laotian natural language processing research are lacking, so there is a large demand for digitizing Laotian paper documents. Existing digitization relies mostly on manual entry, which suffers from a shortage of people proficient in Laotian, slow input speed, and accuracy that depends on the skill of the person typing. A text entry method based on optical character recognition is therefore needed to speed up entry and improve the accuracy of the entered text.
Summary of the invention
The technical problem to be solved by the present invention is to provide an optical character recognition method for printed Laotian text based on convolutional neural networks, in order to solve the problem of digitizing and entering Laotian paper documents.
The technical solution adopted by the present invention is an optical character recognition method for printed Laotian text based on convolutional neural networks, comprising the following steps:
Step1: Input a digital image that a computer can process. The input image is a scanned image of printed matter obtained with a scanner.
Step2: Preprocess the image. Using digital image processing techniques, convert the input image to a black-and-white binary image with a local adaptive binarization method, remove noise, and then correct image distortion according to the minimum bounding rectangle.
Step3: Analyze the binarized, distortion-free image with the projection histogram method, first horizontally and then vertically. Using the peak/valley features of the histogram, split at the histogram valleys so that the whole image is cut into single-character images.
Step4: Input the single-character images into the four-branch parallel convolutional neural network model. Each branch determines the character written at one position within a Laotian character (tone mark, upper vowel, consonant, lower vowel, and so on), and the characters are output.
Step5: Check the characters output by Step4 to determine whether an upper vowel or lower vowel has been wrongly segmented as a separate row. If such an error exists, return to Step3 to correct it. Then correct the output text according to the conventional character collocations of Laotian.
Step6: Output the text corresponding to the input image.
Specifically, Step2 first converts the RGB color image to a binary image with a local adaptive binarization method, then reduces noise with a morphological opening operation, and then corrects image skew with the minimum bounding rectangle method, obtaining a distortion-free binarized image.
Specifically, each branch of the convolutional neural network in Step4 is built with the layer structure "input - batch normalization - convolution - convolution - pooling - dropout - convolution - pooling - dropout - convolution - pooling - batch normalization - flatten - fully connected - batch normalization - fully connected - output", and the resulting neural network model recognizes Laotian characters.
The beneficial effects of the present invention are:
(1) The method uses a four-branch parallel neural network designed around the features of Laotian, which greatly reduces the number of classes, lowers model complexity, and speeds up recognition.
(2) For a language such as Laotian, in which the differences between characters are small, the method significantly improves recognition accuracy compared with feature-based methods such as template matching and with traditional machine learning methods such as support vector machines.
(3) The method recognizes characters with convolutional neural networks to which batch normalization layers are added. This gives better generalization, and recognition remains good even when character resolution is low.
Brief description of the drawings
Fig. 1 is the overall flowchart of the invention.
Fig. 2 is a schematic of the histogram method used in the invention. The lower part of the figure is the pixel-count histogram, and the characters in the upper part form the Laotian word for "northward".
Fig. 3 shows the Laotian writing rules on which the invention is based: different characters have different writing positions. The word in the figure is the Laotian word for "data".
Fig. 4 is the structure of a single branch of the convolutional neural network used in the invention. In the figure, the BatchNormalization layers are batch normalization layers, the Conv2D layers are convolution layers, the MaxPooling layers are pooling layers, Dropout is the dropout layer, Flatten is the flattening layer, and the Dense layers are fully connected layers. The stacked layers form the neural network model, which produces the corresponding Output character for the Laotian character supplied at Input.
Fig. 5 lists the per-layer parameters of a single branch of the convolutional neural network used in the invention. Layer is the layer name, Output Shape is the size of each layer's output tensor, and Kernel Size is each layer's convolution kernel size. The specific neural network model can be defined from these parameters.
Specific embodiments
The present invention is further described below with reference to the drawings and specific embodiments.
Embodiment 1: As shown in Figs. 1-5, an optical character recognition method for printed Laotian text based on convolutional neural networks comprises the following steps:
Step1: Input a digital image that a computer can process. The input image is a scanned image of printed matter obtained with a scanner. Specifically, this embodiment uses Python with the OpenCV library to read the scanned image from local disk and store it in memory as an image.
Step2: Preprocess the image. Using digital image processing techniques, convert the input image to a black-and-white binary image with a local adaptive binarization method, remove noise, and then correct image distortion according to the minimum bounding rectangle.
Step3: Analyze the binarized, distortion-free image with the projection histogram method, first horizontally and then vertically. Using the peak/valley features of the histogram, split at the histogram valleys so that the whole image is cut into single-character images.
Step4: Input the single-character images into the four-branch parallel convolutional neural network model. Each branch determines the character written at one position within a Laotian character (tone mark, upper vowel, consonant, lower vowel, and so on), and the characters are output.
Step5: Check the characters output by Step4 to determine whether an upper vowel or lower vowel has been wrongly segmented as a separate row. If such an error exists, return to Step3 to correct it. Then correct the output text according to the conventional character collocations of Laotian.
Step6: Output the text corresponding to the input image.
Further, because scanned pages inevitably pick up noise and become rotated or skewed during scanning, Step2 preprocesses the input Laotian text image with digital image processing techniques: local adaptive binarization converts the RGB color image to a binary image, a morphological opening operation reduces noise, and the minimum bounding rectangle method corrects image skew. The result is a distortion-free binarized image of high quality with prominent features.
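The binarization part of Step2 can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows with values from 0 to 255, and it thresholds each pixel against the mean of its local neighbourhood. The function name, the `window` and `offset` parameters, and the local-mean rule are illustrative assumptions; the patent does not specify which local adaptive method or parameters are used, and in practice a library routine such as OpenCV's `cv2.adaptiveThreshold` would likely take this role.

```python
def adaptive_binarize(gray, window=3, offset=5):
    """Local-mean adaptive threshold on a grayscale image (list of rows, 0-255).

    A pixel becomes ink (1) if it is darker than the mean of its
    window x window neighbourhood minus `offset`; otherwise background (0).
    Both parameters are placeholder values, not taken from the patent.
    """
    h, w = len(gray), len(gray[0])
    r = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clip the neighbourhood at the image borders.
            ys = range(max(0, y - r), min(h, y + r + 1))
            xs = range(max(0, x - r), min(w, x + r + 1))
            total = sum(gray[j][i] for j in ys for i in xs)
            mean = total / (len(ys) * len(xs))
            out[y][x] = 1 if gray[y][x] < mean - offset else 0
    return out
```

Because the threshold follows the local mean, a page with uneven brightness can still separate ink from paper, which a single global threshold may fail to do.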
Further, the projection histogram method in Step3 analyzes the image at the row or column level. Using the peak/valley features of the histogram, the image is split at the histogram valleys. Applied to scanned images of Laotian printed matter, this gives good segmentation results and cuts the whole image into single-character images. The images are then converted to arrays and stored together in a file.
As shown in Fig. 2, a schematic of the histogram method used in the invention, the lower part of the figure is the pixel-count histogram and the characters in the upper part form the Laotian word for "northward".
Further, because the writing positions of Laotian are divided into four parts, Step4 uses a purpose-designed four-branch parallel convolutional neural network model that classifies each position separately. Compared with a conventional convolutional neural network, the model replaces one multi-class classifier with four single-position classifiers, which greatly reduces the number of classes as well as the model's parameter count and complexity.
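One way to see why the four-branch design reduces the number of classes is to compare output sizes. A single classifier over whole character combinations could need up to the product of the per-position class counts, whereas four branches need only their sum. The counts below are illustrative assumptions, not figures from the patent, and each includes one extra "empty" class for a position that carries no mark.

```python
from math import prod

# Illustrative class counts per writing position (assumed, not from the patent).
branch_classes = {"tone": 5, "upper_vowel": 7, "consonant": 28, "lower_vowel": 4}

# Upper bound on classes for one classifier over whole character combinations.
single_classifier_outputs = prod(branch_classes.values())

# Total output units when each position has its own classifier.
four_branch_outputs = sum(branch_classes.values())

print(single_classifier_outputs, four_branch_outputs)
```

Not every combination occurs in real text, so the product is an upper bound, but the gap between the two numbers shows the direction of the saving.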
In this embodiment, as shown in Fig. 4, each branch of the convolutional neural network in Step4 is built with the layer structure "input - batch normalization - convolution - convolution - pooling - dropout - convolution - pooling - dropout - convolution - pooling - batch normalization - flatten - fully connected - batch normalization - fully connected - output", and the resulting neural network model recognizes Laotian characters. The "convolution - pooling - dropout" operations in the model compress and extract image features and make the feature maps smaller. The batch normalization operations normalize the input data, prevent overfitting, and improve classification performance.
The invention builds the neural network model with the Keras library on top of the TensorFlow library. Using the sequential model method provided by the library, the network is built as shown in Fig. 4 with the parameters in Fig. 5: predefined layers such as Conv2D are stacked one by one, and the model is then compiled and run. Before the network is used, the model must first be trained. The single-character image files and the manually annotated label for each image are read, and the single-character images are fed in batches into the four-branch parallel convolutional neural network model. Each branch determines the character written at one position within a Laotian character (tone mark, upper vowel, consonant, lower vowel, and so on). The network output is compared with the labels, and the network parameters are adjusted by backpropagation. When the output accuracy of the network stabilizes, the model structure and parameters are saved. In the application phase, the model structure and parameters saved during training are loaded, the single-character image files are read and fed into the model, and the recognition result for the character written in each image is output and stored in memory as an array.
Further, Step5 checks the characters output by Step4 by traversing them from front to back, and combines the algorithm with features of the Laotian language. In Laotian, vowels written above or below a consonant are never used on their own. The output is therefore analyzed to determine whether an upper vowel or lower vowel has been wrongly segmented as a separate row: a row that contains only vowels indicates a segmentation error, and in that case the method returns to Step3 to correct it. Then, based on the conventional character collocations of Laotian, 14 Laotian compound vowel combinations are summarized and used to correct the output text. For these fixed compound vowel collocations, the output of the neural network model is analyzed and its erroneous outputs are corrected, improving the overall accuracy of optical character recognition.
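The check for a wrongly segmented vowel row can be sketched as below. The sketch assumes each segmented row is available as a string of predicted characters, which is a hypothetical data layout, and it classifies characters by their Unicode code points in the Lao block. The 14 compound vowel combinations are not listed in the text, so the correction against them is not shown.

```python
# Lao vowel signs by writing position (Unicode code points).
UPPER_VOWELS = set("\u0eb1\u0eb4\u0eb5\u0eb6\u0eb7\u0ebb")  # written above a consonant
LOWER_VOWELS = set("\u0eb8\u0eb9")                          # written below a consonant


def is_consonant(ch):
    """True for code points in the Lao consonant range U+0E81..U+0EAE."""
    return "\u0e81" <= ch <= "\u0eae"


def find_orphan_vowel_rows(rows):
    """Return indices of rows that hold upper/lower vowels but no consonant.

    `rows` is a list of strings, one per segmented row, containing the
    characters predicted for that row (an assumed representation).
    """
    bad = []
    for i, row in enumerate(rows):
        has_vowel = any(ch in UPPER_VOWELS or ch in LOWER_VOWELS for ch in row)
        has_consonant = any(is_consonant(ch) for ch in row)
        if has_vowel and not has_consonant:
            bad.append(i)
    return bad
```

Any index returned by `find_orphan_vowel_rows` marks a row that would be sent back to Step3 and merged with its neighbouring row before recognition is repeated.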
The invention proposes an optical character recognition method for printed Laotian text based on convolutional neural networks. By combining digital image processing techniques, convolutional neural networks, and Laotian language features, it achieves optical character recognition of scanned printed Laotian paper documents. This greatly increases the speed of digitizing Laotian paper documents, reduces the errors introduced by manual entry, and provides basic support for Laotian natural language processing research.
The embodiments of the present invention have been explained in detail above with reference to the drawings, but the invention is not limited to these embodiments. Within the knowledge of a person of ordinary skill in the art, various changes may be made without departing from the spirit of the invention.
Claims (3)
1. An optical character recognition method for printed Laotian text based on convolutional neural networks, characterized by comprising the following steps:
Step1: Input a digital image that a computer can process, the input image being a scanned image of printed matter obtained with a scanner;
Step2: Preprocess the image, that is, using digital image processing techniques, convert the input image to a black-and-white binary image with a local adaptive binarization method, remove noise, and then correct image distortion according to the minimum bounding rectangle;
Step3: Analyze the binarized, distortion-free image with the projection histogram method, first horizontally and then vertically, and, using the peak/valley features of the histogram, split at the histogram valleys so that the whole image is cut into single-character images;
Step4: Input the single-character images into the four-branch parallel convolutional neural network model, in which each branch determines the character written at one position within a Laotian character (tone mark, upper vowel, consonant, lower vowel, and so on), and output the characters;
Step5: Check the characters output by Step4 to determine whether an upper vowel or lower vowel has been wrongly segmented as a separate row, return to Step3 to correct any such error, and then correct the output text according to the conventional character collocations of Laotian;
Step6: Output the text corresponding to the input image.
2. The optical character recognition method for printed Laotian text based on convolutional neural networks according to claim 1, characterized in that Step2 first converts the RGB color image to a binary image with a local adaptive binarization method, then reduces noise with a morphological opening operation, and then corrects image skew with the minimum bounding rectangle method, obtaining a distortion-free binarized image.
3. The optical character recognition method for printed Laotian text based on convolutional neural networks according to claim 1, characterized in that each branch of the convolutional neural network in Step4 is built with the layer structure "input - batch normalization - convolution - convolution - pooling - dropout - convolution - pooling - dropout - convolution - pooling - batch normalization - flatten - fully connected - batch normalization - fully connected - output", and the resulting neural network model recognizes Laotian characters.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910156076.6A CN109993162A (en) | 2019-03-01 | 2019-03-01 | Laotian block letter text optical character recognition methods based on convolutional neural networks |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109993162A true CN109993162A (en) | 2019-07-09 |
Family
ID=67129950
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910156076.6A Pending CN109993162A (en) | 2019-03-01 | 2019-03-01 | Laotian block letter text optical character recognition methods based on convolutional neural networks |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109993162A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102456138A (en) * | 2010-11-03 | 2012-05-16 | 汉王科技股份有限公司 | Method and device for pre-processing block Arab characters |
CN107305630A (en) * | 2016-04-25 | 2017-10-31 | 腾讯科技(深圳)有限公司 | Text sequence recognition methods and device |
CN108520274A (en) * | 2018-03-27 | 2018-09-11 | 天津大学 | High reflecting surface defect inspection method based on image procossing and neural network classification |
CN108664975A (en) * | 2018-04-24 | 2018-10-16 | 新疆大学 | A kind of hand-written Letter Identification Method of Uighur, system and electronic equipment |
CN109034281A (en) * | 2018-07-18 | 2018-12-18 | 中国科学院半导体研究所 | The Chinese handwritten body based on convolutional neural networks is accelerated to know method for distinguishing |
CN109190630A (en) * | 2018-08-29 | 2019-01-11 | 摩佰尔(天津)大数据科技有限公司 | Character identifying method |
Non-Patent Citations (2)
Title |
---|
ITKARE: "Fashion Classification and object detection using CNN", 《INFORMATION AND COMMUNICATION TECHNOLOGY FOR COMPETITIVE STRATEGIES(ICTCS 2020)》 * |
柴伟佳: "卷积神经网络的多字体汉字识别", 《中国图象图形学报》 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111626292A (en) * | 2020-05-09 | 2020-09-04 | 北京邮电大学 | Character recognition method of building indication mark based on deep learning technology |
CN111626292B (en) * | 2020-05-09 | 2023-06-30 | 北京邮电大学 | Text recognition method of building indication mark based on deep learning technology |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20190709 |