CN108681735A

CN108681735A - Optical character recognition method based on a convolutional neural network deep learning model

Info

Publication number: CN108681735A
Application number: CN201810270374.3A
Authority: CN
Inventors: 陆成学
Original assignee: China Science And Technology Co Ltd (beijing) Technology Co Ltd
Current assignee: China Science And Technology Co Ltd (beijing) Technology Co Ltd
Priority date: 2018-03-28
Filing date: 2018-03-28
Publication date: 2018-10-19

Abstract

The present invention discloses an optical character recognition method based on a convolutional neural network deep learning model. The method comprises the following steps: collecting data sets of commonly used Chinese characters in different fonts, the 10 Arabic numerals, and the 26 English letters, and converting them into image format; applying slight distortion and rotation to the images to enhance the robustness of the model, thereby generating a model training database; establishing a deep learning model for optical character recognition; feeding the training set images into the model and, by supervised learning, using the convolutional neural network model to iteratively optimize the objective function and learn a multi-class classifier; and, for a new test sample, extracting its features with the model obtained in the previous step and applying the model's classifier to obtain the final classification result. The invention proposes a new model and method for applying deep learning based on convolutional neural networks to optical character recognition. The method can be applied to general pattern classification tasks, especially text recognition problems, and the proposed deep-learning-based optical character recognition model can significantly improve the accuracy of character recognition.

Description

Optical character recognition method based on a convolutional neural network deep learning model

Technical field

The present invention relates to technical fields such as computer vision, pattern recognition, and natural scene feature recognition, and in particular to an optical character recognition method based on a convolutional neural network deep learning model.

Background art

Optical character recognition (OCR) has attracted wide attention from scholars at home and abroad because of its practicality in real life. At present, applications of OCR are concentrated mainly on character recognition in scanned documents. OCR also has broad application prospects in street-view sign recognition, bank ID card information recognition, classroom blackboard-writing recognition, and similar tasks. OCR offers efficiency and convenience, and a large body of ongoing research continues to advance the field.

A character recognition system usually consists of several steps, such as character acquisition, character segmentation, feature extraction, and feature matching. Of these, feature extraction has the most important influence on recognition accuracy. When the most discriminative features are used to match and compare characters, a better recognition rate is usually obtained; otherwise, the accuracy of the character recognition system drops considerably. Research on character recognition is therefore concentrated mainly on methods of character feature extraction, and deep learning methods based on convolutional neural networks have a great advantage in detecting and extracting features automatically.

In recent years, deep learning models based on convolutional neural networks have received great attention because of their outstanding performance on numerous computer vision problems. The basic idea is to pass an original image through multiple convolution and pooling layers and automatically extract its most representative features. Deep learning has achieved great success in fields such as character recognition, image classification, and natural language processing. As the technology develops, how to learn a model suited to a particular problem (such as image classification or character recognition) has become a focus of researchers' attention.

With deep learning, a discriminative weight matrix and bias vector can be obtained through learning. The weights and biases constitute a classifier, and feeding a character to be tested into the classifier yields a classification result. Research under this theoretical framework focuses mainly on giving the learned model better discriminative performance.

However, in character recognition under practical application scenarios, the character images that can be obtained are usually not standard character images. Owing to factors such as illumination intensity and placement position, character images usually exhibit some degree of rotation or distortion. If only standard character images are used with the above model, the learned model will contain a large amount of weakly discriminative information that imposes very strict requirements on the character image, and the model's recognition accuracy will drop substantially. To obtain a good recognition result, it is then usually necessary to enlarge the character training set to expand its coverage.

Because deep learning models have good feature extraction ability, we propose to improve the existing optical character recognition model and to learn a classifier under a deep learning framework to complete character recognition. In this way, under a practical application environment, the whole process from the input of a character image (including but not limited to non-standard images) to its final recognition can be handled within one unified framework.

Summary of the invention

(1) Technical problem to be solved

To address the problem that input images in character recognition under real-world conditions may be non-standard, the present invention proposes an optical character recognition method based on a neural network deep learning model. Character feature extraction and character recognition are handled within one unified framework, so that the interaction of these two steps jointly improves the final character recognition accuracy.

(2) Technical solution

The technical solution of the optical character recognition method based on a neural network deep learning model proposed by the present invention is as follows:

Step S1: collect commonly used Chinese characters in different fonts, the 10 Arabic numerals, and the 26 English letters, and generate a data set in image format.

Step S2: apply slight rotation and distortion, as appropriate, to the acquired training set and test set samples.

Step S3: optimize the weight matrix parameters W and biases b of each layer of the classifier on the training set; minimize the objective function by stochastic gradient descent (SGD) to learn the optimal classifier parameters W and b.

Step S4: perform one forward pass, compute the probability that the test character belongs to each class, and obtain the classification result for the test character.
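The slight rotation of step S2 can be illustrated with a minimal sketch. The text does not specify how the rotation or distortion is implemented, so everything below is an assumption made for illustration: the function names `rotate_image` and `augment`, the nearest-neighbor sampling, the 10-degree angle limit, and the representation of an image as a list of pixel rows. Only rotation is shown; distortion is omitted.

```python
import math
import random


def rotate_image(image, angle_deg, fill=0):
    """Rotate a 2D grayscale image (list of rows) about its center.

    Uses inverse mapping with nearest-neighbor sampling; output pixels
    whose source falls outside the image are set to `fill`.
    """
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Map each output pixel back to its source location.
            dx, dy = x - cx, y - cy
            src_x = cos_t * dx + sin_t * dy + cx
            src_y = -sin_t * dx + cos_t * dy + cy
            sx, sy = int(round(src_x)), int(round(src_y))
            if 0 <= sx < width and 0 <= sy < height:
                rotated[y][x] = image[sy][sx]
    return rotated


def augment(image, max_angle_deg=10.0, copies=4, seed=0):
    """Return `copies` slightly rotated variants of `image`.

    The angle limit and number of copies are illustrative assumptions.
    """
    rng = random.Random(seed)
    return [
        rotate_image(image, rng.uniform(-max_angle_deg, max_angle_deg))
        for _ in range(copies)
    ]
```

Each rotated copy keeps the label of the character it was derived from, so the training database grows without any additional labeling.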

Beneficial effects of the present invention: the invention addresses character recognition under practical application conditions and can take non-standard character images directly as input for recognition. By solving character feature representation and character recognition within one unified model framework, it achieves a higher recognition rate and makes the algorithm more robust.

Description of the drawings

Fig. 1 is the system flowchart of the optical character recognition method based on a neural network deep learning model.

Detailed description of the embodiments

To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in more detail below with reference to a specific example and the accompanying drawing. The described embodiment is intended only to facilitate understanding of the invention and does not limit it in any way.

Fig. 1 is the flowchart of the method of the present invention. As shown in Fig. 1, the optical character recognition method based on a neural network deep learning model proposed by the present invention includes the following steps:

Step S1: collect commonly used Chinese characters in different fonts, the 10 Arabic numerals, and the 26 English letters, and generate a data set in image format.

Step S3: optimize the weight matrix parameters W of each layer of the classifier on the training set; minimize the objective function by stochastic gradient descent (SGD) to learn the optimal classifier parameters W and b.

S31: for the multiple convolution kernels of each convolutional layer, initialize the weight matrices from a Gaussian distribution on the training set. Next, alternate between forward propagation of the error and backward propagation of the gradient; the SGD algorithm yields the weights of every convolution kernel simultaneously. Repeat S32 and S33 until convergence or until the required number of iterations is reached.

This is the objective function of a typical classification problem; optimizing it yields a set of classification parameters W and b.

S32: forward propagation computes the value of the loss function:

S33: backpropagation computes the gradient of the loss function with respect to the parameters.

Here, f denotes the hidden layer.

Step S4: perform one forward pass and compute the probability that the sample belongs to each class.

Here, s = g(x_i; W, b).
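The quantities named in S32, S33, and S4 can be written out under the assumption of a softmax output layer trained with cross-entropy loss. This is a common choice for multi-class classifiers, but the text does not confirm it. Only s = g(x_i; W, b) comes from the text; the class probabilities p, the true label y_i, the number of classes C, the number of training samples N, and the learning rate η are symbols introduced here for illustration.

```latex
% Class scores and class probabilities (forward pass, S4)
s^{(i)} = g(x_i; W, b) \in \mathbb{R}^{C}, \qquad
p^{(i)}_j = \frac{\exp\big(s^{(i)}_j\big)}{\sum_{k=1}^{C} \exp\big(s^{(i)}_k\big)}

% Cross-entropy objective (S32)
L(W, b) = -\frac{1}{N} \sum_{i=1}^{N} \log p^{(i)}_{y_i}

% Gradient with respect to the scores (S33), propagated back to W and b by the chain rule
\frac{\partial L}{\partial s^{(i)}_j} = \frac{1}{N} \Big( p^{(i)}_j - \mathbf{1}[\, j = y_i \,] \Big)

% SGD update (S3)
W \leftarrow W - \eta \, \frac{\partial L}{\partial W}, \qquad
b \leftarrow b - \eta \, \frac{\partial L}{\partial b}

% Final classification: the class with the largest probability
\hat{y} = \arg\max_{j} \; p_j
```

Under this assumption the predicted class is the one with the largest probability, which matches the decision rule of choosing the result according to the size of the probability values.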

Example embodiment:

To describe the specific implementation of the present invention in detail and to verify its effectiveness, we apply the proposed method to a database of images generated from commonly used Chinese characters, the 10 Arabic numerals, and the 26 letters. The database includes images obtained with different degrees of rotation and distortion. In our embodiment, we first extract each character from the image and then use each extracted single character as the input feature for training and testing.
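As a rough illustration of how the classes in such a database might be indexed, the sketch below builds a mapping from characters to integer labels and lists one record per character and font. The sample of Chinese characters, the choice of uppercase letters, the font names, and the function names `build_label_map` and `enumerate_samples` are all hypothetical; the text does not list the characters or fonts it uses, and rendering the images themselves is not shown.

```python
import string

# Hypothetical sample only; a real data set would list every common character.
SAMPLE_CHINESE_CHARS = "中国人民身份证"


def build_label_map(chinese_chars=SAMPLE_CHINESE_CHARS):
    """Map each character class to a contiguous integer label.

    Order: the 10 digits, then 26 letters (uppercase is an assumption),
    then the Chinese characters in the order given, without duplicates.
    """
    classes = list(string.digits) + list(string.ascii_uppercase)
    for ch in chinese_chars:
        if ch not in classes:
            classes.append(ch)
    return {ch: idx for idx, ch in enumerate(classes)}


def enumerate_samples(label_map, fonts):
    """List one (character, font, label) record per character and font."""
    return [
        (ch, font, label)
        for ch, label in label_map.items()
        for font in fonts
    ]


label_map = build_label_map()
# Font names are placeholders for whatever fonts the data set covers.
records = enumerate_samples(label_map, fonts=["font_a", "font_b"])
```

Each record would then be rendered to an image and, per the embodiment, rotated and distorted to different degrees before training.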

Following step S3 of the technical details described above, we first feed all training set data into the model for training, with the training parameters W initialized from a Gaussian distribution with mean 0 and standard deviation 0.01. Training of the model is then completed according to steps S31, S32, and S33. A new test image is fed into the classifier according to step S4 to obtain the final classification result.
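The training procedure described here can be sketched end to end on toy data. To keep the example short and dependency-free, a single linear layer with a softmax output stands in for the convolutional network, so this is not the model of the invention. The Gaussian initialization (mean 0, standard deviation 0.01) follows the text; the cross-entropy loss, the zero-initialized bias, the learning rate, the epoch count, and the toy feature vectors are assumptions.

```python
import math
import random


def softmax(scores):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, W, b):
    """Forward pass: class probabilities for one input vector x."""
    scores = [
        sum(w * xi for w, xi in zip(row, x)) + bj
        for row, bj in zip(W, b)
    ]
    return softmax(scores)


def train(samples, num_classes, lr=0.5, epochs=200, seed=0):
    """Fit W and b with SGD on the cross-entropy loss.

    samples: list of (feature_vector, class_index) pairs.
    Returns W, b, and the mean loss recorded for each epoch.
    """
    rng = random.Random(seed)
    dim = len(samples[0][0])
    # S31: Gaussian initialization, mean 0 and standard deviation 0.01.
    W = [[rng.gauss(0.0, 0.01) for _ in range(dim)] for _ in range(num_classes)]
    b = [0.0] * num_classes  # zero bias is an assumption
    history = []
    for _ in range(epochs):
        order = samples[:]
        rng.shuffle(order)
        epoch_loss = 0.0
        for x, y in order:
            # S32: forward pass and loss value for this sample.
            probs = forward(x, W, b)
            epoch_loss += -math.log(max(probs[y], 1e-12))
            # S33: gradient of the loss, followed by the SGD update.
            for j in range(num_classes):
                grad = probs[j] - (1.0 if j == y else 0.0)
                for k in range(dim):
                    W[j][k] -= lr * grad * x[k]
                b[j] -= lr * grad
        history.append(epoch_loss / len(samples))
    return W, b, history


def predict(x, W, b):
    """S4: one forward pass, then pick the most probable class."""
    probs = forward(x, W, b)
    return max(range(len(probs)), key=probs.__getitem__)


# Toy data: three classes of flattened 2x2 "images".
toy = [
    ([1.0, 0.0, 0.0, 0.0], 0),
    ([0.0, 1.0, 0.0, 0.0], 1),
    ([0.0, 0.0, 1.0, 1.0], 2),
]
W, b, history = train(toy, num_classes=3)
```

In a real system the forward and backward passes would run through the convolution and pooling layers, but the loop structure (initialize, then repeat S32 and S33 until convergence or an iteration limit) stays the same.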

The specific embodiment described above further explains the purpose, technical solution, and beneficial effects of the present invention in detail. It should be understood that the above is only a specific embodiment of the invention and is not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. An optical character recognition method based on a convolutional neural network deep learning model, characterized in that the method comprises the following steps:

Step S1: collect commonly used Chinese characters in different fonts, the 10 Arabic numerals, and the 26 English letters, and generate a data set in image format.

Step S2: apply slight rotation and distortion processing, as appropriate, to the acquired training set and test set samples.

Step S3: optimize the weight matrix parameters W and biases b of each layer of the classifier on the training set; minimize the objective function by stochastic gradient descent (SGD) to learn the optimal classifier parameters W and b.

2. The optical character recognition method based on a convolutional neural network deep learning model according to claim 1, characterized in that, in step S1, all the commonly used Chinese characters that appear on identity cards are collected, together with the 10 Arabic numerals and the 26 English letters.

3. The optical character recognition method based on a convolutional neural network deep learning model according to claim 1, characterized in that, in step S2, mild distortion and rotation of varying degrees are applied to the training set and test set samples, and the processed images are used as input features.

4. The optical character recognition method based on a convolutional neural network deep learning model according to claim 1, characterized in that, in step S3, the deep learning model is optimized by an iterative stochastic gradient descent strategy, the process being summarized as follows:

S31: for the multiple convolution kernels of each convolutional layer, initialize the weight matrices from a Gaussian distribution on the training set. Next, alternate between forward propagation of the error and backward propagation of the gradient; the SGD algorithm yields the weights of every convolution kernel simultaneously. Repeat S32 and S33 until convergence or until the required number of iterations is reached.

S32: forward propagation computes the value of the loss function:

This is the objective function of a typical classification problem; optimizing it yields a set of classification parameters W and b.

S33: backpropagation computes the gradient of the loss function with respect to the parameters.

Here, f denotes the hidden layer.

5. The optical character recognition method based on a convolutional neural network deep learning model according to claim 1, characterized in that, in step S3, after model training, the value of a new test sample y_test is predicted with the currently learned deep learning model, the specific operation being as follows:

S4: perform one forward pass and compute the probability that the sample belongs to each class.

Here, s_i = g(x_i; W, b), i = 1, 2, ..., m.

6. The optical character recognition method based on a convolutional neural network deep learning model according to claim 5, characterized in that, in step 5, after the probability values of all classes are computed, the final classification result is determined according to the magnitude of the probability values.