CN110287839A - Handwritten numeral image recognition classification method - Google Patents

Handwritten numeral image recognition classification method

Info

Publication number
CN110287839A
Authority
CN
China
Prior art keywords
classification
data
result
image
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910521269.7A
Other languages
Chinese (zh)
Inventor
常敏
陈果
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Shanghai for Science and Technology
Original Assignee
University of Shanghai for Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Shanghai for Science and Technology
Priority to CN201910521269.7A
Publication of CN110287839A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/30 Writer recognition; Reading and verifying signatures
    • G06V40/37 Writer recognition; Reading and verifying signatures based only on signature signals such as velocity or pressure, e.g. dynamic signature recognition
    • G06V40/394 Matching; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Character Discrimination (AREA)

Abstract

The present invention relates to a handwritten numeral image recognition and classification method. Data are randomly selected from a data set, preprocessed, and fed into a BP neural network for training; the trained neural network is called a weak classifier. Training is repeated to obtain several weak classifiers, which are combined using an improved ADABOOST algorithm to form a strong classifier; the strong classifier is then used to classify and recognize the data. The invention optimizes the decision rule in ADABOOST so that its structure is more reasonable and intelligent. Classifying handwritten digits with the improved algorithm greatly strengthens its discriminating ability and reduces classification errors caused by personnel inexperience, carelessness, or illegible handwriting, thereby improving recognition accuracy and reducing false positives and missed detections.

Description

Handwritten numeral image recognition classification method
Technical field
The present invention relates to image recognition technology, and in particular to a handwritten numeral image recognition and classification method.
Background art
Handwritten digit recognition is a branch of symbol recognition. Although it only has to distinguish ten digits, it has great practical value. In daily life a large volume of documents must be processed every day, such as tax receipts, bank checks, money orders and credit statements, as well as letters sorted at the post office. How to use computer character recognition and document processing technology to free people from this heavy manual labor has become a problem in urgent need of a solution. Although there are only ten classes of handwritten digits, the required recognition accuracy is often very high, and because everyone's handwriting is different, accurate recognition remains difficult. Moreover, in practical applications the accuracy required of handwritten digit recognition is much stricter than that required for Chinese characters, because digit recognition is frequently used in fields such as finance and accounting.
Summary of the invention
Addressing the problems of handwritten digit recognition, the present invention proposes a handwritten numeral image recognition and classification method to serve as an auxiliary interpretation method. It avoids misjudgments caused by personnel inexperience, carelessness, or illegible handwriting, thereby improving recognition accuracy and reducing false positives and missed detections.
The technical solution of the present invention is a handwritten numeral image recognition and classification method comprising the following steps:
1) From the data set, 60,000 images are randomly selected as the training set. The training set contains 6,000 images of each of the ten handwritten digits 0 to 9, and each image in the data set is 28*28 in size;
2) The selected images are binarized, converting each image to be recognized into a binary image. Each image is then rescaled to a uniform size of 10*10, and finally the 10*10 image is rearranged into a 1*100 array;
3) The 60,000 1*100 arrays of the training set produced in step 2) are shuffled and fed into a BP neural network for training, yielding a trained neural network, which is called a weak classifier. The data are then shuffled again and fed into a neural network for training to obtain a second weak classifier. Several weak classifiers are obtained in the same way;
4) The weak classifiers are combined using an improved ADABOOST algorithm to form a strong classifier, and the strong classifier is finally used to classify and recognize the data.
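The preprocessing in step 2) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the patent's implementation: the binarization threshold of 128, the nearest-neighbour style row and column sampling used for rescaling, and the function name `preprocess` are all choices made here, since the text does not specify them.

```python
import numpy as np

def preprocess(image, threshold=128, out_size=10):
    """Binarize a 28*28 grayscale image, rescale it to 10*10, flatten to 1*100.

    threshold and the sampling scheme are assumptions for illustration.
    """
    image = np.asarray(image)
    # Binarization: pixels at or above the threshold become 1, the rest 0.
    binary = (image >= threshold).astype(np.uint8)
    # Rescaling: pick out_size evenly spaced rows and columns.
    rows = np.arange(out_size) * binary.shape[0] // out_size
    cols = np.arange(out_size) * binary.shape[1] // out_size
    small = binary[np.ix_(rows, cols)]
    # Rearrangement: 10*10 image -> 1*100 array.
    return small.reshape(1, out_size * out_size)

# Toy input: a vertical stroke on a black background.
demo = np.zeros((28, 28), dtype=np.uint8)
demo[4:24, 12:16] = 255
vec = preprocess(demo)
print(vec.shape)
```

Sampling rows and columns keeps the output strictly binary; an interpolating resize would need a second thresholding pass.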
The improvement to the ADABOOST algorithm in step 4) is as follows:
4.1) The decision rule of the original ADABOOST algorithm is improved. The classification function in the original ADABOOST algorithm is:
In the formula, y(x) denotes the result obtained from BP network classification, x is the input image data, a_t denotes the prediction training weight, and T is the number of BP networks included in ADABOOST. a_t is defined as:
In the formula, e_t is the sum of the distribution weights, defined as:
In the formula, D_t(i) is the distribution weight, which is initialized to 1/m, where m is the number of training data groups;
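For orientation, the textbook AdaBoost expressions that match these symbol definitions are sketched below. This is an assumption, not the patent's own notation: the indicator function $I(\cdot)$, the true label $l_i$ of sample $i$, and the subscript on $y_t$ (the output of the $t$-th BP network) are introduced here for illustration, and the sign form applies to the two-class case.

```latex
h(x) = \operatorname{sign}\!\left(\sum_{t=1}^{T} a_t\, y_t(x)\right), \qquad
a_t = \frac{1}{2}\ln\frac{1 - e_t}{e_t}, \qquad
e_t = \sum_{i=1}^{m} D_t(i)\, I\big(y_t(x_i) \neq l_i\big), \qquad
D_1(i) = \frac{1}{m}
```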
4.2) h(x) is modified to optimize it further, as follows:
4.2.1) According to the results obtained in step 3), the outputs for the 6,000 images of each of the digits "0", "1", "2", "3", "4", "5", "6", "7", "8" and "9" are collected separately, forming ten 1*6000 matrices, uniformly denoted y_n(x_j), where j = 1 to 6000 and n = 0 to 9 indexes the results for digits 0 to 9;
4.2.2) The results in each class are multiplied by the prediction training weight a_t, giving ten new 1*6000 matrices, denoted w_n(x), by the following formula:
4.2.3) The average of each of the result matrices w_n(x) for digits 0 to 9 is calculated, and the ten averages are placed in order of digit 0 to 9 into a single matrix, denoted ave(x), which therefore holds the averages arranged from 0 to 9;
4.2.4) The averages of adjacent entries of ave(x) are then calculated: the values for "0" and "1" are averaged, then those for "1" and "2", and so on, giving 9 averages, denoted c1, c2, ..., c9 in order. The final classification function is:
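Steps 4.2.3 and 4.2.4 reduce to two averaging passes, which the following minimal sketch illustrates. The function names, the toy data, and in particular the decision rule in `classify` are assumptions introduced here for illustration: the rule treats c1 to c9 as boundaries between neighbouring digits, and it only works as written if the averages in ave(x) increase with the digit.

```python
import numpy as np

def adjacent_midpoints(w):
    """w has shape (10, n); row k holds the weighted outputs w_k(x_j) for digit k."""
    ave = w.mean(axis=1)              # step 4.2.3: one average per digit, ordered 0..9
    c = (ave[:-1] + ave[1:]) / 2.0    # step 4.2.4: averages of neighbours, c1..c9
    return ave, c

def classify(score, c):
    """Hypothetical decision rule: the digit is the number of boundaries reached."""
    return int(np.searchsorted(c, score, side="right"))

# Toy weighted outputs: digit k scatters symmetrically around the value k.
w_demo = np.arange(10)[:, None] + np.linspace(-0.2, 0.2, 6)[None, :]
ave, c = adjacent_midpoints(w_demo)
print(classify(3.2, c))
```

Because each boundary sits halfway between two class averages, a score is assigned to whichever neighbouring digit average it lies closer to.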
The beneficial effects of the present invention are as follows. The handwritten numeral image recognition and classification method of the present invention improves the original ADABOOST algorithm by optimizing its decision rule, making its structure more reasonable and intelligent. Classifying handwritten digits with the improved algorithm greatly strengthens its discriminating ability and reduces classification errors caused by personnel inexperience, carelessness, or illegible handwriting, thereby improving recognition accuracy and reducing false positives and missed detections.
Brief description of the drawings
Fig. 1 is a flow diagram of the handwritten numeral image recognition and classification method of the present invention;
Fig. 2 is a flow diagram of the improved ADABOOST algorithm of the present invention.
Detailed description of the embodiments
The handwritten numeral image recognition and classification method, whose flow is shown in Fig. 1, comprises the following steps:
1. From the data set, 60,000 images are randomly selected as the training set and the remaining 10,000 are used as the test set. The training set contains 6,000 images of each of the ten handwritten digits 0 to 9, and the test set contains 1,000 images of each digit. Each image in the data set is 28*28 in size.
2. The selected images are binarized, converting each image to be recognized into a binary image. Each image is then rescaled to a uniform size of 10*10, and finally the 10*10 image is rearranged into a 1*100 array.
3. The 60,000 1*100 arrays of the training set produced in step 2 are shuffled and fed into a BP neural network for training, yielding a trained neural network, which is called a weak classifier. The data are then shuffled again and fed into a neural network for training to obtain a second weak classifier. The same procedure is repeated to obtain several weak classifiers;
4. The weak classifiers are combined using the improved ADABOOST algorithm to form a strong classifier. Finally, this combined, strengthened network is used to classify the test set, and the class of each sample is determined from the classification result.
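Step 3 can be sketched as a loop that reshuffles the training arrays and trains one small BP network per shuffle. This is a minimal illustration under stated assumptions: the network size, learning rate, mini-batch size, sigmoid activations, squared-error loss and all function names are choices made here, not details given in the text, and the demo uses random toy data rather than real digit images.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_bp(X, y, n_classes, hidden=8, epochs=50, lr=0.5, batch=8, seed=0):
    """Train a one-hidden-layer network by mini-batch backpropagation."""
    rng = np.random.default_rng(seed)
    targets = np.eye(n_classes)[y]                    # one-hot targets
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    W2 = rng.normal(0.0, 0.5, (hidden, n_classes))
    for _ in range(epochs):
        for s in range(0, len(X), batch):
            xb, tb = X[s:s + batch], targets[s:s + batch]
            H = sigmoid(xb @ W1)                      # hidden activations
            out = sigmoid(H @ W2)                     # network outputs
            d_out = (out - tb) * out * (1.0 - out)    # output-layer delta
            d_hid = (d_out @ W2.T) * H * (1.0 - H)    # hidden-layer delta
            W2 -= lr * H.T @ d_out / len(xb)
            W1 -= lr * xb.T @ d_hid / len(xb)
    return W1, W2

def predict(model, X):
    W1, W2 = model
    return np.argmax(sigmoid(sigmoid(X @ W1) @ W2), axis=1)

def train_weak_classifiers(X, y, n_classes, count=3, seed=0):
    """Reshuffle the data before each training run, as in step 3."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(count):
        order = rng.permutation(len(X))               # new sample order
        models.append(train_bp(X[order], y[order], n_classes))
    return models

# Toy stand-in for the 1*100 binary arrays produced by step 2.
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.integers(0, 2, size=(40, 100)).astype(float)
y_demo = demo_rng.integers(0, 10, size=40)
models = train_weak_classifiers(X_demo, y_demo, n_classes=10, count=3)
preds = predict(models[0], X_demo)
```

Mini-batch updates are used so that the sample order actually influences the trained weights; with full-batch gradient descent a reshuffle alone would leave the result essentially unchanged.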
The flow of the improved ADABOOST algorithm is shown in Fig. 2. The improvement is as follows:
4.1 The decision rule of the original ADABOOST algorithm is improved. The classification function in the original ADABOOST algorithm is:
In the formula, y(x) denotes the result obtained from BP network classification, x is the input image data, a_t denotes the prediction training weight, and T is the number of BP networks included in ADABOOST. a_t is defined as:
In the formula, e_t is the sum of the distribution weights, defined as:
In the formula, D_t(i) is the distribution weight, which is initialized to 1/m, where m is the number of training data groups.
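The quantities e_t, a_t and D_t(i) can be made concrete with one boosting round in code. The sketch below follows the textbook AdaBoost update, which is an assumption here: the text defines the symbols but not the exact expressions, and the function name, the clipping constant and the toy labels are illustrative.

```python
import numpy as np

def adaboost_round(D, predictions, labels):
    """One textbook AdaBoost round: returns e_t, a_t and the next distribution."""
    miss = predictions != labels
    # e_t: sum of the distribution weights of the misclassified samples.
    e_t = float(D[miss].sum())
    e_t = min(max(e_t, 1e-10), 1.0 - 1e-10)   # keep the logarithm finite
    # a_t: weight given to this weak classifier.
    a_t = 0.5 * np.log((1.0 - e_t) / e_t)
    # Raise the weight of misclassified samples, lower the others, renormalize.
    D_next = D * np.exp(np.where(miss, a_t, -a_t))
    return e_t, a_t, D_next / D_next.sum()

m = 5
D1 = np.full(m, 1.0 / m)                      # distribution weights start at 1/m
labels = np.array([0, 1, 2, 3, 4])
preds = np.array([0, 1, 2, 3, 9])             # one of five samples is wrong
e1, a1, D2 = adaboost_round(D1, preds, labels)
print(e1, a1, D2)
```

In this toy round one of five equally weighted samples is misclassified, so e_1 = 0.2, and after renormalization the misclassified sample carries half of the total weight.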
4.2 h(x) is modified to optimize it further, as follows:
4.2.1 According to the results obtained in step 3, the outputs for the 6,000 images of each of the digits "0", "1", "2", "3", "4", "5", "6", "7", "8" and "9" are collected separately, forming ten 1*6000 matrices, uniformly denoted y_n(x_j), where j = 1 to 6000 and n = 0 to 9 indexes the results for digits 0 to 9;
4.2.2 The results in each class are multiplied by the prediction training weight a_t, giving ten new 1*6000 matrices, denoted w_n(x), by the following formula:
4.2.3 The average of each of the result matrices w_n(x) for digits 0 to 9 is calculated, and the ten averages are placed in order of digit 0 to 9 into a single matrix, denoted ave(x), which therefore holds the averages arranged from 0 to 9;
4.2.4 The averages of adjacent entries of ave(x) are then calculated: the values for "0" and "1" are averaged, then those for "1" and "2", and so on, giving 9 averages, denoted c1, c2, ..., c9 in order. The classification function then becomes:
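One reading of this classification function that is consistent with steps 4.2.3 and 4.2.4 is a piecewise threshold rule. It is offered as an assumption rather than as the patent's own formula: $w(x)$ here stands for the weighted output of the combined classifiers for input $x$, and assigning boundary values to the higher digit is an arbitrary choice.

```latex
h(x) =
\begin{cases}
0, & w(x) < c_1 \\
k, & c_k \le w(x) < c_{k+1}, \quad k = 1, \dots, 8 \\
9, & w(x) \ge c_9
\end{cases}
```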

Claims (2)

1. A handwritten numeral image recognition and classification method, characterized by comprising the following steps:
1) From the data set, 60,000 images are randomly selected as the training set; the training set contains 6,000 images of each of the ten handwritten digits 0 to 9, and each image in the data set is 28*28 in size;
2) The selected images are binarized, converting each image to be recognized into a binary image; each image is then rescaled to a uniform size of 10*10, and finally the 10*10 image is rearranged into a 1*100 array;
3) The 60,000 1*100 arrays of the training set produced in step 2) are shuffled and fed into a BP neural network for training, yielding a trained neural network, which is called a weak classifier; the data are then shuffled again and fed into a neural network for training to obtain a second weak classifier; several weak classifiers are obtained in the same way;
4) The weak classifiers are combined using an improved ADABOOST algorithm to form a strong classifier, and the strong classifier is finally used to classify and recognize the data.
2. The handwritten numeral image recognition and classification method according to claim 1, characterized in that the improvement to the ADABOOST algorithm in step 4) is as follows:
4.1) The decision rule of the original ADABOOST algorithm is improved. The classification function in the original ADABOOST algorithm is:
In the formula, y(x) denotes the result obtained from BP network classification, x is the input image data, a_t denotes the prediction training weight, and T is the number of BP networks included in ADABOOST. a_t is defined as:
In the formula, e_t is the sum of the distribution weights, defined as:
In the formula, D_t(i) is the distribution weight, which is initialized to 1/m, where m is the number of training data groups;
4.2) h(x) is modified to optimize it further, as follows:
4.2.1) According to the results obtained in step 3), the outputs for the 6,000 images of each of the digits "0", "1", "2", "3", "4", "5", "6", "7", "8" and "9" are collected separately, forming ten 1*6000 matrices, uniformly denoted y_n(x_j), where j = 1 to 6000 and n = 0 to 9 indexes the results for digits 0 to 9;
4.2.2) The results in each class are multiplied by the prediction training weight a_t, giving ten new 1*6000 matrices, denoted w_n(x), by the following formula:
4.2.3) The average of each of the result matrices w_n(x) for digits 0 to 9 is calculated, and the ten averages are placed in order of digit 0 to 9 into a single matrix, denoted ave(x), which therefore holds the averages arranged from 0 to 9;
4.2.4) The averages of adjacent entries of ave(x) are then calculated: the values for "0" and "1" are averaged, then those for "1" and "2", and so on, giving 9 averages, denoted c1, c2, ..., c9 in order. The final classification function is:

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910521269.7A CN110287839A (en) 2019-06-17 2019-06-17 Handwritten numeral image recognition classification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910521269.7A CN110287839A (en) 2019-06-17 2019-06-17 Handwritten numeral image recognition classification method

Publications (1)

Publication Number Publication Date
CN110287839A true CN110287839A (en) 2019-09-27

Family

ID=68005100

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910521269.7A Pending CN110287839A (en) 2019-06-17 2019-06-17 Handwritten numeral image recognition classification method

Country Status (1)

Country Link
CN (1) CN110287839A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07302305A (en) * 1994-05-10 1995-11-14 Sony Corp Device for recognizing hand-written character
CN107153810A (en) * 2016-03-04 2017-09-12 中国矿业大学 A kind of Handwritten Numeral Recognition Method and system based on deep learning
CN106022273A (en) * 2016-05-24 2016-10-12 华东理工大学 Handwritten form identification system of BP neural network based on dynamic sample selection strategy
CN106960217A (en) * 2017-02-27 2017-07-18 浙江工业大学 The Forecasting Methodology of injector performance based on the BP artificial neural networks using depth Adaboost algorithm
CN108734168A (en) * 2018-05-18 2018-11-02 天津科技大学 A kind of recognition methods of handwritten numeral
CN109657707A (en) * 2018-12-04 2019-04-19 浙江大学 A kind of image classification method based on observing matrix transformation dimension

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
叶晓波等: "一种改进的Adaboost-BP算法在手写数字识别中的研究", 《大理大学学报》 *
张红等: "改进BP神经网络在手写数字识别中的性能研究", 《信息与电脑(理论版)》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190927