CN110348280A - Water book character recognition method based on CNN artificial neural network - Google Patents

Water book character recognition method based on CNN artificial neural network

Info

Publication number
CN110348280A
CN110348280A (application number CN201910217488.6A; also written CN 110348280 A)
Authority
CN
China
Prior art keywords
box
grid
water book
classification
confidence level
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910217488.6A
Other languages
Chinese (zh)
Inventor
丁琼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Industry Polytechnic College
Original Assignee
Guizhou Industry Polytechnic College
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou Industry Polytechnic College filed Critical Guizhou Industry Polytechnic College
Priority to CN201910217488.6A priority Critical patent/CN110348280A/en
Publication of CN110348280A publication Critical patent/CN110348280A/en
Pending legal-status Critical Current

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G06V30/32 Digital ink
    • G06V30/333 Preprocessing; Feature extraction
    • G06V30/36 Matching; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a water book (Shui script) character recognition method based on a CNN artificial neural network, comprising: step 1, collecting water book text samples; step 2, performing feature extraction and classification on the water book text samples using a CNN-based artificial neural network model; step 3, locating and detecting water book characters using the YOLO algorithm. The method solves the problem that recognizing water book text with conventional text recognition techniques in the prior art suffers from low accuracy.

Description

Water book character recognition method based on CNN artificial neural network
Technical field
The invention belongs to the field of character recognition technology, and in particular relates to a water book character recognition method based on a CNN artificial neural network.
Background art:
In the field of computer vision, object recognition and localization have long been an important research direction, and the recognition of water book text falls within this scope. Water book is a pictographic, non-standardized script; the collected water book samples are almost all handwritten, and the same character written by different people can differ greatly. Moreover, water book text is often mixed with other scripts (such as Chinese characters), so the water book characters must be accurately recognized within a mixed document without misidentifying other text or patterns. This places very high demands on character segmentation, localization and detection, and on feature extraction and classification algorithms. In addition, because water book characters differ greatly in their features from natural images, while neural networks are usually pre-trained on general image classification datasets such as ImageNet, common neural networks struggle to generalize to these features. Recognizing water book text with prior-art character recognition methods therefore suffers from problems such as low accuracy.
Summary of the invention:
The technical problem to be solved by the present invention: to provide a water book character recognition method based on a CNN artificial neural network, so as to solve the problem that recognizing water book text with conventional text recognition techniques in the prior art suffers from low accuracy.
Technical solution of the present invention:
A water book character recognition method based on a CNN artificial neural network, comprising:
Step 1: collect water book text samples;
Step 2: perform feature extraction and classification on the water book text samples using a CNN-based artificial neural network model;
Step 3: locate and detect water book characters using the YOLO algorithm.
The method of locating and detecting water book characters with the YOLO algorithm in step 3 comprises:
Step 3.1: divide the input image into an S*S grid;
Step 3.2: each grid cell predicts B boxes and their confidences;
Step 3.3: each box comprises five predicted values: x, y, w, h and confidence, where x, y are the coordinates of the box center relative to the grid cell, w, h are the box width and height, and the confidence is the IOU between the predicted box and the ground-truth boxes;
Step 3.4: each grid cell also predicts C conditional class probabilities Pr(Class_i | Object), conditioned on the cell containing an object; one set of C conditional probabilities is predicted per cell, regardless of B;
Step 3.5: at detection time, the conditional class probability is multiplied by each box's confidence:
Pr(Class_i | Object) * Pr(Object) * IOU = Pr(Class_i) * IOU
This gives each box a confidence score for each class; the score encodes both the probability of the class and how well the predicted box fits the object.
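The class-score computation of step 3.5 can be sketched as a NumPy broadcast of the per-cell conditional class probabilities against the per-box confidences. The array shapes and names below are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def class_scores(cond_class_probs, box_confidences):
    """Combine Pr(Class_i | Object) with each box's Pr(Object) * IOU.

    cond_class_probs: (S, S, C) array, one class distribution per grid cell
    box_confidences:  (S, S, B) array, one confidence per predicted box
    returns:          (S, S, B, C) class-specific confidence scores,
                      i.e. Pr(Class_i) * IOU for every box/class pair
    """
    return box_confidences[..., :, None] * cond_class_probs[:, :, None, :]
```

Every box thus receives one score per class, combining class probability with box quality, as the text above describes.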
Beneficial effects of the present invention:
The convolutional neural network (CNN) used by the present invention is a special deep neural network model, a new type of network produced by combining BP (backpropagation) with deep learning. A convolutional neural network is a multilayer perceptron designed to recognize two-dimensional shapes, featuring local receptive fields, a hierarchical structure, and joint global training of feature extraction and classification. This network structure is highly invariant to translation, rotation, scaling, skew and other deformations. The CNN exploits the spatial information of the image to enhance image features, and because weights are shared the number of parameters is greatly reduced, so the CNN structure is less prone to overfitting during training and the generalization ability of the model is improved.
The YOLO (You Only Look Once) algorithm used by the present invention is a target detection algorithm that uses a deep convolutional neural network to learn features for detecting and locating objects, i.e., a deep-learning-based target detection and recognition algorithm. YOLO imitates the way humans recognize objects, drawing on prior knowledge to locate and identify objects quickly and accurately. YOLO casts detection as a regression problem: from the input image, a single neural network directly outputs bounding boxes and the class probability of each bounding box. The entire detection pipeline is a single network, so it can be optimized end to end.
YOLO divides the input image into an S*S grid; if the center of an object's ground-truth box falls into a grid cell, that cell is responsible for detecting the object. The advantages of YOLO are:
1) It is extremely fast. Whereas the R-CNN family uses candidate regions to turn detection into a classification problem, YOLO formulates detection directly as regression, so no complicated pipeline is needed: the image is fed into the network and the result is obtained directly.
2) It uses global information. At detection time YOLO reasons over the whole image rather than, as sliding-window and region-proposal methods do, over local regions only. It therefore implicitly encodes information about both objects and background, so it rarely mistakes background for objects: YOLO's background false detection rate is less than half that of Fast R-CNN.
3) It generalizes well. YOLO learns more general object features; when trained on natural images and then tested on pictures of man-made artwork, YOLO outperforms top detection algorithms such as DPM and R-CNN by a wide margin.
Since water book characters differ greatly in their features from natural images, and neural networks are usually pre-trained on general image classification datasets such as ImageNet, common neural networks struggle to generalize to these features; the present invention therefore selects the YOLO algorithm, which outputs position and class information directly from the test image. YOLO uses 24 convolutional layers followed by 2 fully connected layers, with 1*1 convolutional layers interleaved to reduce the feature space from the preceding layer. The last two fully connected layers are replaced with convolutional layers, so inputs of various sizes can be detected; compared with fully connected layers, a fully convolutional network also better preserves the spatial position information of the target. In addition, a residual structure is introduced, which greatly reduces the difficulty of training deep networks, so the network can be made deeper, with a clear improvement in accuracy.
The invention solves the problem that recognizing water book text with conventional text recognition techniques in the prior art suffers from low accuracy.
Specific embodiment:
Step 1: collect water book text samples.
To cover water book text together with its translations, 120 pages of the "Water Book Common Dictionary" were selected as the basic samples.
Step 2: perform feature extraction and classification on the water book text samples using a CNN-based artificial neural network model.
Image feature extraction obtains from the object itself the measurements or attributes useful for classification. Based on the extracted feature vector, each object is assigned a category label, so that the analyzed samples are divided into n classes; two objects are considered similar because they have similar features, and samples with similar features belong to the same category.
In conventional machine learning methods, features are mostly extracted from the image by hand and then fed into a common classifier (such as an SVM, a decision tree, or a random forest) to obtain per-class probability values and decide which class the image to be classified belongs to.
To address the insufficient generalization ability of existing character recognition networks, the present invention uses a CNN network model.
A convolutional neural network (CNN) is a special deep neural network model, a new type of network produced by combining BP (backpropagation) with deep learning. It is a multilayer perceptron designed to recognize two-dimensional shapes, featuring local receptive fields, a hierarchical structure, and joint global training of feature extraction and classification. This network structure is highly invariant to translation, rotation, scaling, skew and other deformations.
The CNN exploits the spatial information of the image to enhance image features, and because weights are shared the number of parameters is greatly reduced, so the CNN structure is less prone to overfitting during training and the generalization ability of the model is improved.
To recognize water book characters, a dataset must first be prepared. This embodiment classifies 17 water book characters; each character is expanded to 500 samples stored under its corresponding folder, with each picture saved in a fixed 50*50*1 format, packaged into a trainable data format and split into a training set and a test set. A network is then built: here only one convolutional layer is used, with 20 convolution kernels of size 3*3, followed by a ReLU activation and max pooling, and then a fully connected layer for classification. With the training parameters set, the network is trained and its accuracy reported: after 10 epochs the accuracy is 93.74%.
Considering that this CNN structure is relatively simple and there are only 17 target classes, and comparing with an MLP network's 90.4% accuracy on a 6-character classification task, the result shows that the CNN structure brings a considerable improvement in classification performance. Further gains in accuracy can be obtained by modifying the training hyperparameters and improving the generated samples.
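The single-convolutional-layer network described above (20 kernels of 3*3, ReLU, 2*2 max pooling on 50*50*1 inputs) would normally be built in a deep learning framework; as a framework-free sketch of the forward pass only, with the layer sizes mirroring the embodiment and everything else (variable names, the omitted fully connected classifier) being illustrative assumptions:

```python
import numpy as np

def conv_relu_pool(img, kernels):
    """Forward pass of the embodiment's single conv layer on one 50*50 image.

    img:     (50, 50) grayscale array
    kernels: (20, 3, 3) array of convolution kernels
    returns: (20, 24, 24) feature maps after 'valid' 3*3 convolution,
             ReLU, and 2*2 max pooling (the final fully connected
             classification layer is omitted here)
    """
    n, kh, kw = kernels.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    conv = np.zeros((n, oh, ow))
    for k in range(n):                      # slide each kernel over the image
        for i in range(oh):
            for j in range(ow):
                conv[k, i, j] = np.sum(img[i:i + kh, j:j + kw] * kernels[k])
    conv = np.maximum(conv, 0.0)            # ReLU activation
    # 2*2 max pooling with stride 2
    pooled = conv.reshape(n, oh // 2, 2, ow // 2, 2).max(axis=(2, 4))
    return pooled
```

The 50*50 input yields 48*48 valid-convolution maps, pooled down to 24*24, which a fully connected layer would then classify into the 17 character classes.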
Since larger network structures require larger datasets and longer training times, common practice is to fine-tune a pre-trained base network: take a network with trained weights, change only the structure of the final fully connected layer, and train with the earlier weights as initial values. By comparative testing, VGG16 was selected as the base network. VGG16 is composed of 13 convolutional layers and 3 fully connected layers, and has the following advantages:
1) As a base network, its classification performance is very good;
2) The network structure of VGG16 is very regular, which makes it relatively easy to modify;
3) Models trained on ImageNet have been published, so fine-tuning for other datasets can be carried out on that basis, and it adapts well to other datasets;
4) Many network structures in the object detection field use VGG16 as the base network, with similarly good results.
VGG16 is a network pre-trained on the ImageNet image library of a large number of real pictures; the trained VGG16 weights are transferred as the initial weights of our own convolutional neural network, so our network does not have to be trained from scratch on a large amount of data, which improves training speed.
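The fine-tuning scheme described above (freeze the pre-trained base, retrain only the final classifier) can be illustrated without VGG16 itself. In this sketch, a fixed random projection stands in for the frozen convolutional base, and only a softmax head is updated by gradient descent; all names, dimensions and hyperparameters are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the frozen VGG16 convolutional base: weights are fixed
# and never updated during training.
W_frozen = rng.normal(size=(2500, 64))

def features(x):
    """Frozen feature extractor: ReLU projection with unit-norm rows."""
    f = np.maximum(0.0, x @ W_frozen)
    return f / (np.linalg.norm(f, axis=1, keepdims=True) + 1e-9)

def train_head(X, y, n_classes, lr=0.5, steps=300):
    """Train only the final softmax classifier on top of frozen features."""
    F = features(X)
    W = np.zeros((F.shape[1], n_classes))
    for _ in range(steps):
        logits = F @ W
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)
        p[np.arange(len(y)), y] -= 1.0               # softmax cross-entropy gradient
        W -= lr * (F.T @ p) / len(y)                 # only the head is updated
    return W
```

The design point this mirrors is that the expensive feature extractor keeps its pre-trained weights, so only the small head needs data and training time.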
Step 3: locate and detect water book characters using the YOLO algorithm.
YOLO (You Only Look Once) is a target detection algorithm that uses a deep convolutional neural network to learn features for detecting and locating objects, i.e., a deep-learning-based target detection and recognition algorithm. YOLO imitates the way humans recognize objects, drawing on prior knowledge to locate and identify objects quickly and accurately. YOLO casts detection as a regression problem: from the input image, a single neural network directly outputs bounding boxes and the class probability of each bounding box. The entire detection pipeline is a single network, so it can be optimized end to end.
YOLO divides the input image into an S*S grid; if the center of an object's ground-truth box falls into a grid cell, that cell is responsible for detecting the object. The advantages of YOLO are:
1) It is extremely fast. Whereas the R-CNN family uses candidate regions to turn detection into a classification problem, YOLO formulates detection directly as regression, so no complicated pipeline is needed: the image is fed into the network and the result is obtained directly.
2) It uses global information. At detection time YOLO reasons over the whole image rather than, as sliding-window and region-proposal methods do, over local regions only. It therefore implicitly encodes information about both objects and background, so it rarely mistakes background for objects: YOLO's background false detection rate is less than half that of Fast R-CNN.
3) It generalizes well. YOLO learns more general object features; when trained on natural images and then tested on pictures of man-made artwork, YOLO outperforms top detection algorithms such as DPM and R-CNN by a wide margin.
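The grid-responsibility rule described above (the cell containing an object's center is the one that detects it) reduces to a floor computation over normalized center coordinates. S = 7 is YOLO's usual grid size and is assumed here for illustration:

```python
def responsible_cell(cx, cy, S=7):
    """Return the (row, col) of the S*S grid cell responsible for an object
    whose center lies at normalized image coordinates (cx, cy) in [0, 1]."""
    col = min(int(cx * S), S - 1)   # clamp so cx == 1.0 stays inside the grid
    row = min(int(cy * S), S - 1)
    return row, col
```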
Since water book characters differ greatly in their features from natural images, and neural networks are usually pre-trained on general image classification datasets such as ImageNet, common neural networks struggle to generalize to these features; the YOLO algorithm, which outputs position and class information directly from the test image, was therefore selected.
The YOLO target detection process:
1) Divide the input image into an S*S grid; if the center of an object falls in a grid cell, that cell is responsible for detecting the object.
2) Each grid cell predicts B boxes and their confidences, where confidence = Pr(Object) * IOU(truth, pred): if the cell contains no object, the confidence is 0; if it contains an object, the confidence is the intersection over union (IOU) of the predicted box and the ground-truth box.
3) Each box comprises five predicted values: x, y, w, h and confidence. x, y are the coordinates of the box center relative to the grid cell; w, h are the box width and height, normalized relative to the picture. The confidence is the IOU of the predicted box with the ground-truth boxes.
4) Each grid cell also predicts C conditional class probabilities Pr(Class_i | Object), conditioned on the cell containing an object. One set of C conditional probabilities is predicted per cell, regardless of B.
5) At detection time, the conditional class probability is multiplied by each box's confidence:
Pr(Class_i | Object) * Pr(Object) * IOU = Pr(Class_i) * IOU
This gives each box a confidence score for each class; the score encodes both the probability of the class and how well the predicted box fits the object.
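The IOU used in steps 2) and 3) above, the intersection over union of a predicted box and a ground-truth box, can be sketched as follows, with boxes represented by their (x1, y1, x2, y2) corners (a representation assumed here for illustration, not specified by the patent):

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)   # overlap area, 0 if disjoint
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```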
YOLO uses 24 convolutional layers followed by 2 fully connected layers, with 1*1 convolutional layers interleaved to reduce the feature space from the preceding layer. The last two fully connected layers are replaced with convolutional layers, so inputs of various sizes can be detected; compared with fully connected layers, a fully convolutional network also better preserves the spatial position information of the target. In addition, a residual structure is introduced, which greatly reduces the difficulty of training deep networks, so the network can be made deeper, with a clear improvement in accuracy.
To improve the training effect of the network and increase its generalization ability and recognition robustness, training samples are randomly scaled, rotated, and adjusted in tone, contrast and distortion to expand the sample set.
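The random augmentations just mentioned could be sketched as follows. This is a simplified stand-in (brightness/contrast jitter plus 90-degree rotations rather than the arbitrary-angle rotation and distortion the patent implies), and all parameter ranges are assumptions:

```python
import numpy as np

def augment(img, rng):
    """Return a randomly perturbed copy of a grayscale character image.

    Simplified augmentation: contrast/tone jitter plus a random 90-degree
    rotation. Real small-angle rotation, scaling, and elastic distortion
    would need an image library such as OpenCV or PIL, and for character
    images the rotation range must stay small enough not to change the label.
    """
    out = img.astype(float)
    out = out * rng.uniform(0.8, 1.2) + rng.uniform(-10.0, 10.0)  # contrast / tone
    out = np.rot90(out, k=int(rng.integers(0, 4)))                # coarse rotation
    return np.clip(out, 0.0, 255.0)
```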

Claims (2)

1. A water book character recognition method based on a CNN artificial neural network, comprising:
Step 1: collect water book text samples;
Step 2: perform feature extraction and classification on the water book text samples using a CNN-based artificial neural network model;
Step 3: locate and detect water book characters using the YOLO algorithm.
2. The water book character recognition method based on a CNN artificial neural network according to claim 1, characterized in that the method of locating and detecting water book characters with the YOLO algorithm in step 3 comprises:
Step 3.1: divide the input image into an S*S grid;
Step 3.2: each grid cell predicts B boxes and their confidences;
Step 3.3: each box comprises five predicted values: x, y, w, h and confidence, where x, y are the coordinates of the box center relative to the grid cell, w, h are the box width and height, and the confidence is the IOU between the predicted box and the ground-truth boxes;
Step 3.4: each grid cell also predicts C conditional class probabilities Pr(Class_i | Object), conditioned on the cell containing an object; one set of C conditional probabilities is predicted per cell, regardless of B;
Step 3.5: at detection time, the conditional class probability is multiplied by each box's confidence:
Pr(Class_i | Object) * Pr(Object) * IOU = Pr(Class_i) * IOU
This gives each box a confidence score for each class; the score encodes both the probability of the class and how well the predicted box fits the object.
CN201910217488.6A 2019-03-21 2019-03-21 Water book character recognition method based on CNN artificial neural network Pending CN110348280A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910217488.6A CN110348280A (en) 2019-03-21 2019-03-21 Water book character recognition method based on CNN artificial neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910217488.6A CN110348280A (en) 2019-03-21 2019-03-21 Water book character recognition method based on CNN artificial neural network

Publications (1)

Publication Number Publication Date
CN110348280A true CN110348280A (en) 2019-10-18

Family

ID=68174344

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910217488.6A Pending CN110348280A (en) 2019-03-21 2019-03-21 Water book character recognition method based on CNN artificial neural network

Country Status (1)

Country Link
CN (1) CN110348280A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110909734A (en) * 2019-10-29 2020-03-24 福建两岸信息技术有限公司 Document character detection and identification method
CN111126128A (en) * 2019-10-29 2020-05-08 福建两岸信息技术有限公司 Method for detecting and dividing document layout area
CN111310868A (en) * 2020-03-13 2020-06-19 厦门大学 Water-based handwritten character recognition method based on convolutional neural network
CN111401371A (en) * 2020-06-03 2020-07-10 中邮消费金融有限公司 Text detection and identification method and system and computer equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016033710A1 (en) * 2014-09-05 2016-03-10 Xiaoou Tang Scene text detection system and method
CN108520254A (en) * 2018-03-01 2018-09-11 腾讯科技(深圳)有限公司 A kind of Method for text detection, device and relevant device based on formatted image
CN109241904A (en) * 2018-08-31 2019-01-18 平安科技(深圳)有限公司 Text region model training, character recognition method, device, equipment and medium

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016033710A1 (en) * 2014-09-05 2016-03-10 Xiaoou Tang Scene text detection system and method
CN108520254A (en) * 2018-03-01 2018-09-11 腾讯科技(深圳)有限公司 A kind of Method for text detection, device and relevant device based on formatted image
CN109241904A (en) * 2018-08-31 2019-01-18 平安科技(深圳)有限公司 Text region model training, character recognition method, device, equipment and medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
郑伊 et al., "An International Phonetic Alphabet character recognition method using a YOLO network with variable candidate-box density", 《计算机应用》 (Journal of Computer Applications) *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110909734A (en) * 2019-10-29 2020-03-24 福建两岸信息技术有限公司 Document character detection and identification method
CN111126128A (en) * 2019-10-29 2020-05-08 福建两岸信息技术有限公司 Method for detecting and dividing document layout area
CN111310868A (en) * 2020-03-13 2020-06-19 厦门大学 Water-based handwritten character recognition method based on convolutional neural network
CN111401371A (en) * 2020-06-03 2020-07-10 中邮消费金融有限公司 Text detection and identification method and system and computer equipment
CN111401371B (en) * 2020-06-03 2020-09-08 中邮消费金融有限公司 Text detection and identification method and system and computer equipment

Similar Documents

Publication Publication Date Title
Alonso et al. Adversarial generation of handwritten text images conditioned on sequences
Khan et al. KNN and ANN-based recognition of handwritten Pashto letters using zoning features
Mishra et al. Top-down and bottom-up cues for scene text recognition
Yao et al. Strokelets: A learned multi-scale representation for scene text recognition
CN110348280A (en) Water book character recognition method based on CNN artificial neural network
Bai et al. Strokelets: A Learned Multi-Scale Representation for Scene Text Recognition
CN103942550B (en) A kind of scene text recognition methods based on sparse coding feature
Rioux-Maldague et al. Sign language fingerspelling classification from depth and color images using a deep belief network
CN106408030B (en) SAR image classification method based on middle layer semantic attribute and convolutional neural networks
Bhowmik et al. Recognition of Bangla handwritten characters using an MLP classifier based on stroke features
CN107729865A (en) A kind of handwritten form mathematical formulae identified off-line method and system
CN110555475A (en) few-sample target detection method based on semantic information fusion
Sun et al. Robust text detection in natural scene images by generalized color-enhanced contrasting extremal region and neural networks
Burie et al. ICFHR2016 competition on the analysis of handwritten text in images of balinese palm leaf manuscripts
Hu et al. MST-based visual parsing of online handwritten mathematical expressions
CN107704859A (en) A kind of character recognition method based on deep learning training framework
CN104834941A (en) Offline handwriting recognition method of sparse autoencoder based on computer input
Pacha et al. Towards self-learning optical music recognition
Tian et al. Natural scene text detection with MC–MR candidate extraction and coarse-to-fine filtering
Patel et al. Gujarati handwritten character recognition using hybrid method based on binary tree-classifier and k-nearest neighbour
CN109034281A (en) The Chinese handwritten body based on convolutional neural networks is accelerated to know method for distinguishing
CN112069900A (en) Bill character recognition method and system based on convolutional neural network
Hajič Jr et al. Detecting noteheads in handwritten scores with convnets and bounding box regression
Antony et al. Haar features based handwritten character recognition system for Tulu script
CN109902692A (en) A kind of image classification method based on regional area depth characteristic coding

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20191018)