CN109993162A

CN109993162A - Optical character recognition method for printed Lao text based on convolutional neural networks

Info

Publication number: CN109993162A
Application number: CN201910156076.6A
Authority: CN
Inventors: 周兰江; 郝永彬; 周枫; 张建安
Original assignee: Kunming University of Science and Technology
Current assignee: Kunming University of Science and Technology
Priority date: 2019-03-01
Filing date: 2019-03-01
Publication date: 2019-07-09

Abstract

The invention discloses an optical character recognition method for printed Lao text based on convolutional neural networks, belonging to the fields of natural language processing and machine learning. After a scanned image of printed text is input, the method first binarizes the image and corrects its rotation. Because Lao is written horizontally from left to right with vowels attached above and below the consonants, the preprocessed image is then segmented with the projection histogram method, first into rows and then into columns, so that the whole page is cut into individual Lao character combinations. Each segmented character is then fed into a four-branch parallel convolutional neural network model designed around the features of Lao script, which outputs the corresponding character text. Finally, the character sequence is post-processed according to Lao language rules to produce the final text output. The invention has application value in Lao character recognition and in the digitization of paper documents.

Description

Optical character recognition method for printed Lao text based on convolutional neural networks

Technical field

The present invention relates to an optical character recognition method for printed Lao text based on convolutional neural networks. It belongs to the field of optical character recognition for less widely used languages within natural language processing, and is suited to application scenarios in which Lao images are converted to text.

Background art

Optical character recognition is the automatic recognition by computer of text printed or handwritten on paper, and is an important branch of pattern recognition and natural language processing. At present, optical character recognition research worldwide mainly targets mainstream languages such as Chinese and English. In China there has also been some research on recognizing minority languages, mainly Mongolian, Tibetan, and similar scripts. One of the main problems studied in optical character recognition is the classification of character images.

Laos is a country along the Belt and Road that borders China, so research on its official language, Lao, is of considerable significance. However, Laos's economy is less developed, its information technology research and industry lag behind, and the digital corpora needed for Lao natural language processing research are lacking, so there is great demand for digitizing Lao paper documents. In the digitization of Lao paper documents, the existing input method is mostly manual typing, which suffers from a shortage of people proficient in Lao, slow input speed, and accuracy that depends on the skill of the typist. A text input method based on optical character recognition is therefore needed to speed up input and improve the accuracy of the entered text.

Summary of the invention

The technical problem to be solved by the present invention is to provide an optical character recognition method for printed Lao text based on convolutional neural networks, in order to solve the problem of digitizing and entering Lao paper documents.

The technical solution adopted by the present invention is an optical character recognition method for printed Lao text based on convolutional neural networks, comprising the following steps:

Step1: input a digital image that a computer can process; the input image is a scanned picture of printed matter obtained with a scanner;

Step2: preprocess the image, i.e. use digital image processing techniques to convert the input image to a binary black-and-white image with a local adaptive binarization method, remove noise, and then correct the image skew using the minimum bounding rectangle;

Step3: apply projection histogram analysis to the undistorted binarized image, first horizontally and then vertically; split the image at the histogram valleys according to its peak/valley features, cutting the whole image into single-character images;

Step4: input each single-character image into the four-branch parallel convolutional neural network model, in which each branch determines the character written at one position of the Lao character (tone mark, upper vowel, consonant, or lower vowel), and output the characters;

Step5: check the characters output by Step4 to determine whether an upper or lower vowel has been wrongly segmented as a separate row; if such an error exists, return to Step3 to correct it, and then correct the output text according to the fixed collocation conventions of Lao;

Step6: output the text corresponding to the input image.

Specifically, in step Step2 the RGB color image is first converted to a binary image with the local adaptive binarization method, noise is then reduced with a morphological opening operation, and image skew is corrected with the minimum bounding rectangle method to obtain an undistorted binarized image.

Specifically, each branch of the convolutional neural network in step Step4 is built with the layer structure "input-batch normalization-convolution-convolution-pooling-dropout-convolution-pooling-dropout-convolution-pooling-batch normalization-flatten-fully connected-batch normalization-fully connected-output", and the resulting neural network model recognizes Lao characters.

The beneficial effects of the present invention are:

(1) The method designs a four-branch parallel neural network around the features of Lao script, which greatly reduces the number of classes, lowers model complexity, and speeds up recognition.

(2) For a language such as Lao, in which the differences between characters are small, the method markedly improves recognition accuracy compared with template matching and other feature-based methods, and with traditional machine learning methods such as support vector machines.

(3) The method recognizes characters with convolutional neural networks and adds batch normalization layers, which gives better generalization; recognition remains good even when the character resolution is low.

Brief description of the drawings

Fig. 1 is the overall flow chart of the invention;

Fig. 2 is a schematic diagram of the histogram method used in the invention; the lower part of the figure is the pixel-count histogram, and the characters in the upper part are the Lao word for "northward";

Fig. 3 shows the Lao writing rules on which the invention is based: different characters have different writing positions; the word in the figure is the Lao word for "data";

Fig. 4 is the structure diagram of a single-branch convolutional neural network used in the invention. In the figure, BatchNormalization layers are batch normalization layers, Conv2D layers are convolution layers, MaxPooling layers are pooling layers, Dropout is the dropout layer, Flatten is the flattening layer, and Dense layers are fully connected layers. The stacked layers form the neural network model, which generates the corresponding Output character result from the Lao character image supplied at Input.

Fig. 5 lists the per-layer parameters of the single-branch convolutional neural network used in the invention: Layer is the layer name, Output Shape is the size of each layer's output tensor, and Kernel Size is the size of each layer's convolution kernel. A concrete neural network model can be defined from these parameters.

Specific embodiment

The present invention is further described below with reference to the drawings and specific embodiments.

Embodiment 1: as shown in Figs. 1-5, an optical character recognition method for printed Lao text based on convolutional neural networks comprises the following steps:

Step1: input a digital image that a computer can process; the input image is a scanned picture of printed matter obtained with a scanner. Specifically, this embodiment uses Python with the OpenCV library to read the scanned picture of the printed matter from local disk and store it in memory as an image;

Step2: preprocess the image, i.e. use digital image processing techniques to convert the input image to a binary black-and-white image with a local adaptive binarization method, remove noise, and then correct the image skew using the minimum bounding rectangle.

Step3: apply projection histogram analysis to the undistorted binarized image, first horizontally and then vertically. Split the image at the histogram valleys according to its peak/valley features, cutting the whole image into single-character images.

Step4: input each single-character image into the four-branch parallel convolutional neural network model, in which each branch determines the character written at one position of the Lao character (tone mark, upper vowel, consonant, or lower vowel), and output the characters.

Step5: check the characters output by Step4 to determine whether an upper or lower vowel has been wrongly segmented as a separate row; if such an error exists, return to Step3 to correct it, and then correct the output text according to the fixed collocation conventions of Lao.

Step6: output the text corresponding to the input image.

Further, because scanned pages inevitably contain noise and rotational distortion introduced during scanning, step Step2 preprocesses the input Lao text image with digital image processing techniques: the RGB color image is converted to a binary image with the local adaptive binarization method, noise is reduced with a morphological opening operation, and image skew is corrected with the minimum bounding rectangle method. The result is an undistorted, high-quality binarized image with prominent features.

Further, the projection histogram method in step Step3 analyzes the image at the row or column level and splits it at the histogram valleys according to the peak/valley features. It achieves good character segmentation on scanned Lao printed matter, cutting the whole image into single-character images. The images are then converted to arrays and stored together in a file.

As shown in Fig. 2, a schematic diagram of the histogram method used in the invention, the lower part of the figure is the pixel-count histogram and the characters in the upper part are the Lao word for "northward".
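The row-then-column projection can be illustrated with a small NumPy sketch. It assumes a binarized page in which foreground pixels are non-zero and treats every empty histogram bin as a valley; the function names are hypothetical, and a real page would likely need a tolerance for touching characters and small gaps.

```python
import numpy as np


def runs_above_zero(profile):
    """Return [start, end) index pairs of consecutive non-empty histogram bins."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i                      # a peak region begins
        elif value == 0 and start is not None:
            runs.append((start, i))        # a valley closes the region
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_page(binary):
    """Cut a binary page into (top, bottom, left, right) character boxes."""
    foreground = binary > 0
    boxes = []
    # Horizontal projection: count foreground pixels in each row.
    for top, bottom in runs_above_zero(foreground.sum(axis=1)):
        line = foreground[top:bottom]
        # Vertical projection inside the row: count pixels in each column.
        for left, right in runs_above_zero(line.sum(axis=0)):
            boxes.append((top, bottom, left, right))
    return boxes
```

Because upper and lower vowels sit in their own horizontal bands, a plain row projection like this one can split them off as separate rows, which is the error that Step5 checks for.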

Further, because the writing positions of Lao are divided into four parts, step Step4 uses a purpose-designed four-branch parallel convolutional neural network model that classifies each position separately. Compared with a conventional convolutional neural network, the model replaces a single multi-class classifier with four per-position classifiers, which greatly reduces the number of classes as well as the model's parameters and complexity.

In this embodiment, as shown in Fig. 4, each branch of the convolutional neural network in step Step4 is built with the layer structure "input-batch normalization-convolution-convolution-pooling-dropout-convolution-pooling-dropout-convolution-pooling-batch normalization-flatten-fully connected-batch normalization-fully connected-output", and the resulting neural network model recognizes Lao characters. The "convolution-pooling-dropout" blocks in the model compress and extract image features and shrink the feature maps. The batch normalization layers normalize their input data, help prevent overfitting, and improve classification performance.
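One branch with the layer order given above could be written with the Keras Sequential API roughly as follows. Only the layer sequence comes from the text; the input size, filter counts, kernel sizes, dropout rates, activations, optimizer, and loss are assumptions made for illustration, since the actual parameters are listed in Fig. 5 and are not reproduced in the text.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_branch(num_classes, input_shape=(32, 32, 1)):
    """Build one branch; all hyperparameters here are illustrative guesses."""
    model = keras.Sequential([
        keras.Input(shape=input_shape),
        layers.BatchNormalization(),
        layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
        layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        layers.Conv2D(128, (3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.BatchNormalization(),
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.BatchNormalization(),
        layers.Dense(num_classes, activation="softmax"),
    ])
    # Integer class labels are assumed, hence the sparse loss.
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


def build_four_branches(class_counts):
    """Build one independent model per writing position, keyed by position name."""
    return {position: build_branch(count)
            for position, count in class_counts.items()}
```

Calling `build_four_branches` with a dictionary such as `{"tone": 5, "upper_vowel": 8, "consonant": 30, "lower_vowel": 4}` (hypothetical class counts) yields four small classifiers, each covering only the characters that can appear at its position.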

The invention builds the neural network model with the Keras library on top of TensorFlow. Using the library's Sequential model API, the network is assembled layer by layer from predefined layers such as Conv2D, following the structure in Fig. 4 and the parameters in Fig. 5, and the model is then compiled and run. Before the network is used it must be trained: the single-character image file and the manually annotated label of each image are read, and the single-character images are fed in batches to the four-branch parallel convolutional neural network model, in which each branch determines the character written at one position of the Lao character (tone mark, upper vowel, consonant, or lower vowel). The network output is compared with the labels and the network parameters are adjusted by backpropagation. Once the output accuracy has stabilized, the model structure and parameters are saved. In the application stage, the model structure and parameters saved during training are loaded, the single-character image file is read and fed to the model, and the model outputs the recognition result for the characters in each image, which is stored in memory as an array.
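At inference time the four branch outputs have to be combined into one piece of text per character image. The text does not say how this is done, so the following is only a sketch under stated assumptions: each branch returns a probability vector, every non-consonant branch has an extra "nothing at this position" class (the empty string), and the label tables are tiny hypothetical samples rather than the full Lao inventory.

```python
import numpy as np

# Hypothetical label tables; index 0 of the optional positions means "absent".
LABELS = {
    "consonant": ["\u0e81", "\u0e82", "\u0e84"],   # KO, KHO SUNG, KHO TAM
    "upper_vowel": ["", "\u0eb4", "\u0eb5"],       # none, I, II
    "lower_vowel": ["", "\u0eb8", "\u0eb9"],       # none, U, UU
    "tone": ["", "\u0ec8", "\u0ec9"],              # none, MAI EK, MAI THO
}

# Assumed emission order: base consonant, then vowel signs, then the tone mark.
ORDER = ("consonant", "lower_vowel", "upper_vowel", "tone")


def decode_cluster(probabilities):
    """Pick the most probable class of each branch and join them into text."""
    parts = []
    for branch in ORDER:
        index = int(np.argmax(probabilities[branch]))
        parts.append(LABELS[branch][index])
    return "".join(parts)
```

Given one probability vector per branch, `decode_cluster` returns a short string made of a consonant followed by whichever marks the other branches detected.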

Further, step Step5 checks the characters output by Step4 in a single front-to-back traversal, combining the algorithm with linguistic features of Lao. Because vowels written above or below a consonant never occur on their own in Lao, the output is analyzed to determine whether an upper or lower vowel has been wrongly segmented as a separate row: a row consisting only of such vowels is a segmentation error, and processing returns to Step3 to correct it. Then, based on the fixed collocation conventions of Lao, 14 Lao compound vowel combinations were summarized and are used to correct the output text. Checking the neural network output against these fixed compound-vowel collocations corrects erroneous outputs and improves the overall accuracy of optical character recognition.
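The check for rows that contain only detached vowels can be sketched in plain Python. The approach below is an assumption, not the method stated in the text: it relies on the fact that Lao above-base and below-base vowel signs and tone marks are Unicode combining marks (general category Mn), while consonants and the vowels written on the baseline are letters. The function names are hypothetical, and the 14 compound vowel combinations are not listed in the text, so that correction step is not shown.

```python
import unicodedata


def is_detached_mark_row(row_text):
    """True if a recognized row holds only combining marks and no base letter."""
    chars = [ch for ch in row_text if not ch.isspace()]
    return bool(chars) and all(
        unicodedata.category(ch) == "Mn" for ch in chars)


def rows_to_resegment(rows):
    """Return indices of rows that should be sent back to the segmentation step."""
    return [index for index, row in enumerate(rows)
            if is_detached_mark_row(row)]
```

A row flagged by `rows_to_resegment` would typically be merged with the neighboring row before the column projection is repeated.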

The invention proposes an optical character recognition method for printed Lao text based on convolutional neural networks. Using digital image processing techniques, convolutional neural networks, and the linguistic features of Lao, it performs optical character recognition on scanned printed Lao documents, greatly speeds up the digitization of Lao paper documents, reduces the errors introduced by manual entry, and provides basic support for Lao natural language processing research.

The embodiments of the invention have been explained in detail above with reference to the drawings, but the invention is not limited to these embodiments; within the knowledge of a person of ordinary skill in the art, various changes may also be made without departing from the concept of the invention.

Claims

1. An optical character recognition method for printed Lao text based on convolutional neural networks, characterized by comprising the following steps:

Step1: input a digital image that a computer can process; the input image is a scanned picture of printed matter obtained with a scanner;

Step2: preprocess the image, i.e. use digital image processing techniques to convert the input image to a binary black-and-white image with a local adaptive binarization method, remove noise, and then correct the image skew using the minimum bounding rectangle;

Step3: apply projection histogram analysis to the undistorted binarized image, first horizontally and then vertically; split the image at the histogram valleys according to its peak/valley features, cutting the whole image into single-character images;

Step4: input each single-character image into the four-branch parallel convolutional neural network model, in which each branch determines the character written at one position of the Lao character (tone mark, upper vowel, consonant, or lower vowel), and output the characters;

Step5: check the characters output by Step4 to determine whether an upper or lower vowel has been wrongly segmented as a separate row; if such an error exists, return to Step3 to correct it, and then correct the output text according to the fixed collocation conventions of Lao;

Step6: output the text corresponding to the input image.

2. The optical character recognition method for printed Lao text based on convolutional neural networks according to claim 1, characterized in that: in step Step2, the RGB color image is first converted to a binary image with the local adaptive binarization method, noise is then reduced with a morphological opening operation, and image skew is corrected with the minimum bounding rectangle method to obtain an undistorted binarized image.

3. The optical character recognition method for printed Lao text based on convolutional neural networks according to claim 1, characterized in that: each branch of the convolutional neural network in step Step4 is built with the layer structure "input-batch normalization-convolution-convolution-pooling-dropout-convolution-pooling-dropout-convolution-pooling-batch normalization-flatten-fully connected-batch normalization-fully connected-output", and the resulting neural network model recognizes Lao characters.