CN109993162A - Laotian block letter text optical character recognition methods based on convolutional neural networks - Google Patents

Laotian block letter text optical character recognition methods based on convolutional neural networks

Info

Publication number
CN109993162A
Authority
CN
China
Prior art keywords
laotian
image
character
text
neural networks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910156076.6A
Other languages
Chinese (zh)
Inventor
周兰江
郝永彬
周枫
张建安
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kunming University of Science and Technology
Original Assignee
Kunming University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kunming University of Science and Technology filed Critical Kunming University of Science and Technology
Priority to CN201910156076.6A priority Critical patent/CN109993162A/en
Publication of CN109993162A publication Critical patent/CN109993162A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/63 Scene text, e.g. street names
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Discrimination (AREA)
  • Character Input (AREA)

Abstract

The invention discloses an optical character recognition method for printed Lao text based on convolutional neural networks, belonging to the technical fields of natural language processing and machine learning. After a scanned image of printed text is input, the method first binarizes the image and corrects its rotation. Because Lao is written horizontally from left to right, with vowels attached around the consonants, the preprocessed image is then segmented with the projection histogram method, first by rows and then by columns, so that the whole page is cut into Lao character combinations. Each segmented combination is fed into a four-branch parallel convolutional neural network model designed around the characteristics of Lao script, which outputs the corresponding character text. Finally, the character sequence is post-processed according to Lao language rules to produce the final text output. The invention has practical value for Lao character recognition and for digitizing paper documents.

Description

Optical character recognition method for printed Lao text based on convolutional neural networks
Technical field
The present invention relates to an optical character recognition method for printed Lao text based on convolutional neural networks. It belongs to the field of optical character recognition for less commonly taught languages within natural language processing, and is suited to applications that convert Lao text images into characters.
Background art
Optical character recognition (OCR) uses a computer to automatically recognize text printed or handwritten on paper, and is an important branch of pattern recognition and natural language processing. At present, OCR research worldwide mainly targets major languages such as Chinese and English. Within China there is also some research on recognizing minority languages, mainly for scripts such as Mongolian and Tibetan. One of the main problems studied in OCR is the classification of character images.
Laos lies along the Belt and Road and borders China, so research on its official language, Lao, is of considerable significance. However, the Laotian economy is relatively underdeveloped, and its information technology research and industry lag behind. The digital corpora needed for Lao natural language processing research are lacking, so there is a large demand for digitizing Lao paper documents. In digitizing such documents, the existing input method is mostly manual entry, which suffers from a shortage of people proficient in Lao, slow input speed, and accuracy that depends on the skill of the person typing. A text entry method based on optical character recognition is therefore needed to speed up entry and improve the accuracy of the entered text.
Summary of the invention
The technical problem to be solved by the present invention is to provide an optical character recognition method for printed Lao text based on convolutional neural networks, in order to solve the problem of digitizing Lao paper documents.
The technical solution adopted by the present invention is an optical character recognition method for printed Lao text based on convolutional neural networks, comprising the following steps:
Step1: input a digital image that a computer can process; the input image is a scanned picture of printed matter obtained with a scanner;
Step2: preprocess the image. Using digital image processing techniques, convert the input image to a black-and-white binary image with a local adaptive binarization method, remove noise, and then correct image skew based on the minimum bounding rectangle;
Step3: analyze the binarized, distortion-free image with the projection histogram method, first horizontally and then vertically. Based on the peaks and valleys of the histogram, split at the valleys so that the whole image is cut into single-character images;
Step4: feed each single-character image into a four-branch parallel convolutional neural network model. Each branch identifies the character written at one position of a Lao character, such as the tone mark, upper vowel, consonant, or lower vowel, and the recognized characters are output;
Step5: examine the characters output by Step4 to determine whether an upper or lower vowel has been wrongly segmented as a separate line. If such an error exists, return to Step3 to correct it. Then correct the output text according to the fixed collocation conventions of Lao;
Step6: output the text corresponding to the input image.
Specifically, in Step2 the RGB color image is first converted to a binary image with the local adaptive binarization method, noise is then reduced with a morphological opening operation, and image skew is corrected with the minimum bounding rectangle method, yielding a distortion-free binarized image.
Specifically, each branch of the convolutional neural network in Step4 is built with the layer structure "input - batch normalization - convolution - convolution - pooling - dropout - convolution - pooling - dropout - convolution - pooling - batch normalization - flatten - fully connected - batch normalization - fully connected - output", and the resulting neural network model recognizes Lao characters.
The beneficial effects of the present invention are:
(1) The method designs a four-branch parallel neural network around the characteristics of Lao script, which greatly reduces the number of classes per classifier, lowers model complexity, and speeds up recognition.
(2) For a language such as Lao, in which the differences between characters are small, the method significantly improves recognition accuracy compared with feature-based methods such as template matching and with traditional machine learning methods such as support vector machines.
(3) The method recognizes characters with convolutional neural networks and adds batch normalization layers, giving better generalization; recognition remains good even when character resolution is low.
Brief description of the drawings
Fig. 1 is the overall flow chart of the present invention;
Fig. 2 is a schematic diagram of the histogram method used in the present invention; the lower part of the figure is the pixel-count histogram, and the characters in the upper part are the Lao word for "northward";
Fig. 3 shows the Lao writing rules on which the invention is based; different characters have different writing positions, and the word in the figure is the Lao word for "data";
Fig. 4 is the structure diagram of a single branch of the convolutional neural network used in the present invention. In the figure, BatchNormalization is the batch normalization layer, Conv2D is the convolution layer, MaxPooling is the pooling layer, Dropout is the dropout layer, Flatten is the flattening layer, and Dense is the fully connected layer. These layers are stacked to form the neural network model, which produces the Output character result for the Lao character supplied at Input.
Fig. 5 lists the per-layer parameters of the single-branch convolutional neural network used in the present invention. Layer is the layer name, Output Shape is the size of each layer's output tensor, and Kernel Size is the size of each layer's convolution kernel. A specific neural network model can be defined from these parameters.
Specific embodiments
The present invention is further described below with reference to the drawings and specific embodiments.
Embodiment 1: as shown in Figs. 1-5, an optical character recognition method for printed Lao text based on convolutional neural networks comprises the following steps:
Step1: input a digital image that a computer can process; the input image is a scanned picture of printed matter obtained with a scanner. Specifically, this embodiment uses Python with the OpenCV library to read the scanned picture of printed matter from local disk and store it in memory as an image;
Step2: preprocess the image. Using digital image processing techniques, convert the input image to a black-and-white binary image with a local adaptive binarization method, remove noise, and then correct image skew based on the minimum bounding rectangle.
Step3: analyze the binarized, distortion-free image with the projection histogram method, first horizontally and then vertically. Based on the peaks and valleys of the histogram, split at the valleys so that the whole image is cut into single-character images.
Step4: feed each single-character image into a four-branch parallel convolutional neural network model. Each branch identifies the character written at one position of a Lao character, such as the tone mark, upper vowel, consonant, or lower vowel, and the recognized characters are output.
Step5: examine the characters output by Step4 to determine whether an upper or lower vowel has been wrongly segmented as a separate line. If such an error exists, return to Step3 to correct it. Then correct the output text according to the fixed collocation conventions of Lao.
Step6: output the text corresponding to the input image.
Further, because scanned pages inevitably contain noise and some rotation or distortion introduced during scanning, Step2 preprocesses the input Lao text image with digital image processing techniques: the RGB color image is converted to a binary image with a local adaptive binarization method, noise is reduced with a morphological opening operation, and image skew is corrected with the minimum bounding rectangle method. The result is a distortion-free, high-quality binarized image with prominent features.
Further, the projection histogram method in Step3 analyzes the image at the row or column level. Splitting at the valleys of the histogram, according to its peak and valley features, gives good segmentation results on scanned pictures of Lao printed matter and cuts the whole image into single-character images. The images are then converted to arrays and stored together in a file.
As shown in Fig. 2, the lower part of the figure is the pixel-count histogram and the characters in the upper part are the Lao word for "northward".
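As a rough illustration of row-then-column projection segmentation, the sketch below counts ink pixels per row, splits wherever the count drops to zero, and repeats the same procedure per column inside each text line. The function names `ink_runs` and `segment_page` and the `min_ink` threshold are hypothetical; the patent does not specify how a valley is detected.

```python
import numpy as np


def ink_runs(profile, min_ink=1):
    """Return (start, end) pairs of consecutive bins whose ink count >= min_ink.

    The gaps between runs are the histogram valleys where the image is cut.
    """
    runs, start = [], None
    for i, value in enumerate(profile):
        if value >= min_ink and start is None:
            start = i
        elif value < min_ink and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_page(binary):
    """Cut a binary page (nonzero = ink) into boxes, rows first, then columns.

    Returns a list of (top, bottom, left, right) boxes in reading order.
    """
    ink = (np.asarray(binary) > 0).astype(int)
    boxes = []
    for top, bottom in ink_runs(ink.sum(axis=1)):        # horizontal projection
        line = ink[top:bottom]
        for left, right in ink_runs(line.sum(axis=0)):   # vertical projection
            boxes.append((top, bottom, left, right))
    return boxes
```

Each box can be cropped with `binary[top:bottom, left:right]` and resized to the classifier's input size. Note that a strict zero-ink valley test is what causes an upper or lower vowel separated from its consonant by a blank row to be cut off as its own line, which is the error Step5 checks for.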
Further, because Lao writing positions fall into four parts, Step4 uses a specially designed four-branch parallel convolutional neural network model that performs a separate classification for each position. Compared with a conventional convolutional neural network, the model replaces one multi-class classifier with four single-position classifiers, which greatly reduces the number of classes and, with it, the number of model parameters and the model complexity.
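The class-count argument can be made concrete with a small sketch. A single classifier over whole character combinations needs, in the worst case, one class per combination of the four positions (a product), whereas four per-position branches need only the sum of their label counts. The helper names and the encoding order in `compose_cluster` (consonant, then vowel signs, then tone mark) are assumptions for illustration; the patent does not state how branch outputs are joined.

```python
from math import prod


def output_sizes(branch_classes):
    """Compare output sizes of one joint classifier vs. per-position branches.

    branch_classes maps a position name to its number of labels, including a
    "nothing written here" label. 'joint' is an upper bound, since not every
    combination necessarily occurs in real text.
    """
    sizes = list(branch_classes.values())
    return {"joint": prod(sizes), "split": sum(sizes)}


def compose_cluster(consonant, lower_vowel="", upper_vowel="", tone=""):
    """Join the four branch predictions into one text cluster.

    Assumed order: base consonant, vowel signs, then tone mark. An empty
    string means the branch predicted that its position is unoccupied.
    """
    return consonant + lower_vowel + upper_vowel + tone
```

For example, with purely illustrative label counts of 3, 2, 2, and 2, `output_sizes` reports 24 joint classes against 9 split outputs; the gap widens quickly as the per-position label sets grow.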
In this embodiment, as shown in Fig. 4, each branch of the convolutional neural network in Step4 is built with the layer structure "input - batch normalization - convolution - convolution - pooling - dropout - convolution - pooling - dropout - convolution - pooling - batch normalization - flatten - fully connected - batch normalization - fully connected - output", and the resulting neural network model recognizes Lao characters. The "convolution - pooling - dropout" blocks in the model compress the image and extract its features, reducing the size of the feature maps. The batch normalization operations added to the model normalize the input data, help prevent overfitting, and improve classification performance.
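One branch with this layer order might be written in Keras as below. The layer sequence follows the text; everything else is an assumption, because the per-layer parameters of Fig. 5 are not reproduced here: the 32x32 grayscale input, the filter counts, the 3x3 kernels, the dropout rate of 0.25, the 256-unit hidden layer, and the optimizer and loss are placeholders, and `build_branch` is a hypothetical name.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_branch(num_classes, input_shape=(32, 32, 1)):
    """Build one per-position classifier with the layer order from the text.

    All sizes and rates are illustrative assumptions, not the Fig. 5 values.
    """
    model = keras.Sequential([
        keras.Input(shape=input_shape),
        layers.BatchNormalization(),
        layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
        layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        layers.Conv2D(128, (3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.BatchNormalization(),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.BatchNormalization(),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Four such models, one per writing position and each with its own `num_classes`, would together form the four-branch arrangement; whether the branches share any layers is not stated in the text, so the sketch keeps them independent.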
The present invention builds the neural network model with the Keras library on top of the TensorFlow library. Using the Sequential model provided by the library, the network is built layer by layer as shown in Fig. 4 and with the parameters in Fig. 5, from predefined layers such as Conv2D, and the model is then compiled and run. Before the network is used, the model must first be trained: the single-character image file and the manually annotated label of each image are read, the single-character images are fed in batches to the four-branch parallel convolutional neural network model, and each branch identifies the character written at one position of a Lao character, such as the tone mark, upper vowel, consonant, or lower vowel. The network output is compared with the labels and the network parameters are adjusted by backpropagation. When the output accuracy of the network stabilizes, the model structure and parameters are saved. In the application stage, the saved model structure and parameters are loaded, the single-character image file is read and fed to the model, and the model outputs the recognition result for the characters in each image, which is stored in memory as an array.
Further, Step5 examines the characters output by Step4 by traversing them from front to back, combining the algorithm with Lao language features. Because vowels written above or below a consonant are never used on their own in Lao, the output is analyzed to determine whether an upper or lower vowel has been wrongly segmented as a separate line: a line consisting only of such vowels is a segmentation error, and when one is found the method returns to Step3 to correct it. Then, based on the fixed collocation conventions of Lao, 14 Lao compound vowel combinations are summarized and used to correct the output text. Using these fixed compound vowel collocations, the output of the neural network model is checked and its erroneous outputs are corrected, improving the overall accuracy of optical character recognition.
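The first check in Step5 can be sketched as a scan for recognized lines that contain only marks and no base consonant. The sketch assumes that recognized text is held as one Unicode string per segmented line and uses the Unicode general category "Mn" (nonspacing mark) within the Lao block as a stand-in for "upper or lower vowel or tone mark"; both choices, and the function names, are assumptions. The 14 compound vowel combinations are not enumerated in the text, so that correction step is not shown.

```python
import unicodedata

LAO_BLOCK_START, LAO_BLOCK_END = "\u0e80", "\u0eff"


def is_lao_mark(ch):
    """True for Lao nonspacing marks (signs written above or below a consonant)."""
    return (
        LAO_BLOCK_START <= ch <= LAO_BLOCK_END
        and unicodedata.category(ch) == "Mn"
    )


def find_orphan_mark_lines(lines):
    """Return indices of non-empty lines made up solely of Lao marks.

    Such a line cannot occur in valid text, so it signals that the row
    projection split a mark away from its consonant.
    """
    return [
        i for i, line in enumerate(lines)
        if line and all(is_lao_mark(ch) for ch in line)
    ]
```

Any index returned here identifies a line to send back to the segmentation step, for instance by merging that row band with its neighbour before cutting columns again.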
The present invention proposes an optical character recognition method for printed Lao text based on convolutional neural networks. By combining digital image processing techniques, convolutional neural networks, and Lao language features, it performs optical character recognition on scanned Lao printed paper documents, greatly increases the speed of digitizing such documents, reduces the errors produced by manual entry, and provides basic support for Lao natural language processing research.
The embodiments of the present invention have been explained in detail above with reference to the drawings, but the present invention is not limited to these embodiments. Within the knowledge of a person of ordinary skill in the art, various changes can be made without departing from the concept of the present invention.

Claims (3)

1. An optical character recognition method for printed Lao text based on convolutional neural networks, characterized by comprising the following steps:
Step1: input a digital image that a computer can process; the input image is a scanned picture of printed matter obtained with a scanner;
Step2: preprocess the image. Using digital image processing techniques, convert the input image to a black-and-white binary image with a local adaptive binarization method, remove noise, and then correct image skew based on the minimum bounding rectangle;
Step3: analyze the binarized, distortion-free image with the projection histogram method, first horizontally and then vertically. Based on the peaks and valleys of the histogram, split at the valleys so that the whole image is cut into single-character images;
Step4: feed each single-character image into a four-branch parallel convolutional neural network model. Each branch identifies the character written at one position of a Lao character, such as the tone mark, upper vowel, consonant, or lower vowel, and the recognized characters are output;
Step5: examine the characters output by Step4 to determine whether an upper or lower vowel has been wrongly segmented as a separate line. If such an error exists, return to Step3 to correct it. Then correct the output text according to the fixed collocation conventions of Lao;
Step6: output the text corresponding to the input image.
2. The optical character recognition method for printed Lao text based on convolutional neural networks according to claim 1, characterized in that: in Step2, the RGB color image is first converted to a binary image with the local adaptive binarization method, noise is then reduced with a morphological opening operation, and image skew is corrected with the minimum bounding rectangle method, yielding a distortion-free binarized image.
3. The optical character recognition method for printed Lao text based on convolutional neural networks according to claim 1, characterized in that: each branch of the convolutional neural network in Step4 is built with the layer structure "input - batch normalization - convolution - convolution - pooling - dropout - convolution - pooling - dropout - convolution - pooling - batch normalization - flatten - fully connected - batch normalization - fully connected - output", and the resulting neural network model recognizes Lao characters.
CN201910156076.6A 2019-03-01 2019-03-01 Laotian block letter text optical character recognition methods based on convolutional neural networks Pending CN109993162A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910156076.6A CN109993162A (en) 2019-03-01 2019-03-01 Laotian block letter text optical character recognition methods based on convolutional neural networks

Publications (1)

Publication Number Publication Date
CN109993162A true CN109993162A (en) 2019-07-09

Family

ID=67129950

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910156076.6A Pending CN109993162A (en) 2019-03-01 2019-03-01 Laotian block letter text optical character recognition methods based on convolutional neural networks

Country Status (1)

Country Link
CN (1) CN109993162A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102456138A (en) * 2010-11-03 2012-05-16 汉王科技股份有限公司 Method and device for pre-processing block Arab characters
CN107305630A (en) * 2016-04-25 2017-10-31 腾讯科技(深圳)有限公司 Text sequence recognition methods and device
CN108520274A (en) * 2018-03-27 2018-09-11 天津大学 High reflecting surface defect inspection method based on image procossing and neural network classification
CN108664975A (en) * 2018-04-24 2018-10-16 新疆大学 A kind of hand-written Letter Identification Method of Uighur, system and electronic equipment
CN109034281A (en) * 2018-07-18 2018-12-18 中国科学院半导体研究所 The Chinese handwritten body based on convolutional neural networks is accelerated to know method for distinguishing
CN109190630A (en) * 2018-08-29 2019-01-11 摩佰尔(天津)大数据科技有限公司 Character identifying method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ITKARE: "Fashion Classification and object detection using CNN", 《INFORMATION AND COMMUNICATION TECHNOLOGY FOR COMPETITIVE STRATEGIES(ICTCS 2020)》 *
柴伟佳: "卷积神经网络的多字体汉字识别", 《中国图象图形学报》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111626292A (en) * 2020-05-09 2020-09-04 北京邮电大学 Character recognition method of building indication mark based on deep learning technology
CN111626292B (en) * 2020-05-09 2023-06-30 北京邮电大学 Text recognition method of building indication mark based on deep learning technology

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190709