CN110287839A - Handwritten numeral image recognition classification method - Google Patents
- Publication number
- CN110287839A (application CN201910521269.7A)
- Authority
- CN
- China
- Prior art keywords
- classification
- data
- result
- image
- matrix
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/30—Writer recognition; Reading and verifying signatures
- G06V40/37—Writer recognition; Reading and verifying signatures based only on signature signals such as velocity or pressure, e.g. dynamic signature recognition
- G06V40/394—Matching; Classification
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Character Discrimination (AREA)
Abstract
The present invention relates to a handwritten numeral image recognition classification method. Data are randomly selected from a data set, preprocessed, and fed into a BP neural network for training; the trained network is called a weak classifier. Repeating the training yields several weak classifiers, which are combined by an improved ADABOOST algorithm into a strong classifier, and the strong classifier is finally used to classify and recognize the data. The invention optimizes the decision scheme of ADABOOST so that its structure is more reasonable. Classifying handwritten numerals with the improved algorithm strengthens its discriminating ability and reduces the classification errors caused by insufficient experience or carelessness of personnel and by blurred, hard-to-recognize handwriting, thereby improving detection accuracy and reducing misjudgments and missed judgments.
Description
Technical field
The present invention relates to image recognition technology, and in particular to a handwritten numeral image recognition classification method.
Background art
Handwritten digit recognition is a branch of symbol recognition. Although it only has to distinguish 10 digits, it has great practical value. A large volume of documents must be processed every day, for example tax receipts, bank checks, money orders, and credit statements, as well as letters sorted at post offices. Using computer character recognition and document processing technology to free people from this heavy manual labor has become an urgent problem. Although there are only 10 classes of handwritten digits, many applications demand very high recognition accuracy, and because everyone's handwriting is different, accurate recognition remains difficult. Moreover, in practical applications the accuracy requirement for handwritten digit recognition is much stricter than for Chinese characters, because digit recognition is frequently used in financial fields.
Summary of the invention
To address the problems of handwritten digit recognition, the present invention proposes a handwritten numeral image recognition classification method to serve as an auxiliary interpretation method. It avoids the misclassification caused by insufficient experience or carelessness of personnel and by blurred, hard-to-recognize handwriting, thereby improving detection accuracy and reducing misjudgments and missed judgments.
The technical solution of the present invention is a handwritten numeral image recognition classification method comprising the following steps:
1) From the data set, 60,000 pictures are randomly selected as the training set. The training set contains 6,000 pictures of each of the 10 handwritten digits 0 to 9, and the pictures in the data set are 28*28 in size;
2) The selected images are binarized, converting each image to be recognized into a binary image. Each image is then scaled to a uniform size of 10*10, and finally the 10*10 image is rearranged into a 1*100 array;
3) The 60,000 1*100 arrays produced in step 2) are shuffled and fed into a BP neural network for training; the trained network is called a weak classifier. The data are then shuffled again and fed into a neural network for training to obtain a second weak classifier. Several weak classifiers are obtained in the same way;
4) The weak classifiers are combined using the improved ADABOOST algorithm to form a strong classifier, which is finally used to classify and recognize the data.
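The preprocessing in step 2) can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the binarization threshold of 128, the nearest-neighbour scaling, and the function names are all assumptions, since the text does not specify them.

```python
def binarize(image, threshold=128):
    """Map each grayscale pixel to 0 or 1 (the threshold value is an assumption)."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in image]


def resize_nearest(image, new_h=10, new_w=10):
    """Scale an image by nearest-neighbour sampling (the interpolation is an assumption)."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def flatten(image):
    """Rearrange a 2-D image row by row into a single 1 x N list."""
    return [pixel for row in image for pixel in row]


def preprocess(image):
    """Binarize a 28*28 image, scale it to 10*10, and flatten it to 1*100."""
    return flatten(resize_nearest(binarize(image)))


# Hypothetical 28*28 input: a bright vertical stroke on a dark background.
sample = [[255 if 12 <= c < 16 else 0 for c in range(28)] for r in range(28)]
features = preprocess(sample)
```

Each call to `preprocess` returns a list of 100 values in {0, 1}, which is the form of input that step 3) feeds into the BP network.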
The ADABOOST algorithm in step 4) is improved as follows:
4.1) The decision scheme of the original ADABOOST algorithm is improved. The classification function of the original ADABOOST algorithm is:
In the formula, y(x) denotes the result of BP network classification, x is the input image data, a_t denotes the prediction training weight, and T is the number of BP networks included in ADABOOST. The weight a_t is defined as:
In the formula, e_t is the sum of distribution weights, defined as:
In the formula, D_t(i) is the distribution weight, initialized to 1/m, where m is the number of training data groups;
4.2) h(x) is modified to optimize it further, as follows:
4.2.1) From the results obtained in step 3), the outputs for the 6,000 pictures of each of "0", "1", "2", "3", "4", "5", "6", "7", "8", and "9" are collected separately, forming 10 matrices of size 1*6000, uniformly denoted y_n(x_j), where j = 1 to 6000 and n = 0 to 9 indexes the results for digits 0 to 9;
4.2.2) The results in each class are multiplied by the prediction training weight a_t, giving 10 new 1*6000 matrices, denoted w_n(x), by the following formula:
4.2.3) The average of each result matrix w_n(x) for 0 to 9 is calculated, and the 10 averages are placed in a single matrix in the order of digits 0 to 9, denoted ave(x), giving a matrix arranged in order from 0 to 9;
4.2.4) Adjacent pairs of values in ave(x) are averaged again: the values for "0" and "1" in ave(x) are averaged, then those for "1" and "2", and so on. This gives 9 averages, denoted c1, c2, ..., c9 in order, and the final classification function is:
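The formulas referred to in 4.1) and 4.2) are not reproduced in this text. The block below is an assumed reconstruction, not the patent's own notation: the first three lines are the standard AdaBoost forms that match the symbol definitions given above, and the remaining lines are one plausible reading of steps 4.2.2) to 4.2.4), in which c1 to c9 act as thresholds on the weighted output. The index t on y and the symbol w(x) for the weighted output of a new image are added for illustration.

```latex
% Assumed reconstruction: standard AdaBoost forms for 4.1)
\begin{align}
h(x) &= \operatorname{sign}\Big(\sum_{t=1}^{T} a_t\, y_t(x)\Big) \\
a_t &= \tfrac{1}{2}\ln\frac{1 - e_t}{e_t} \\
e_t &= \sum_{i:\ \text{sample } i \text{ misclassified}} D_t(i), \qquad D_1(i) = \frac{1}{m} \\
% Assumed reading of steps 4.2.2) to 4.2.4)
w_n(x_j) &= \sum_{t=1}^{T} a_t\, y_n^{(t)}(x_j) \\
\mathrm{ave}_n &= \frac{1}{6000}\sum_{j=1}^{6000} w_n(x_j), \qquad n = 0,\dots,9 \\
c_k &= \frac{\mathrm{ave}_{k-1} + \mathrm{ave}_k}{2}, \qquad k = 1,\dots,9 \\
h(x) &= \begin{cases}
0, & w(x) < c_1 \\
k, & c_k \le w(x) < c_{k+1},\ k = 1,\dots,8 \\
9, & w(x) \ge c_9
\end{cases}
\end{align}
```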
The beneficial effects of the present invention are as follows. The handwritten numeral image recognition classification method improves the original ADABOOST algorithm by optimizing its decision scheme so that its structure is more reasonable. Classifying handwritten numerals with the improved algorithm strengthens its discriminating ability and reduces the classification errors caused by insufficient experience or carelessness of personnel and by blurred, hard-to-recognize handwriting, thereby improving detection accuracy and reducing misjudgments and missed judgments.
Description of the drawings
Fig. 1 is a flow diagram of the handwritten numeral image recognition classification method of the present invention;
Fig. 2 is a flow diagram of the improved ADABOOST algorithm of the present invention.
Specific embodiment
As shown in the flow diagram of Fig. 1, the handwritten numeral image recognition classification method comprises the following steps:
1. From the data set, 60,000 pictures are randomly selected as the training set, and the remaining 10,000 are used as the test set. The training set contains 6,000 pictures of each of the 10 handwritten digits 0 to 9, and the test set contains 1,000 pictures of each digit. The pictures in the data set are 28*28 in size.
2. The selected images are binarized, converting each image to be recognized into a binary image. Each image is then scaled to a uniform size of 10*10, and finally the 10*10 image is rearranged into a 1*100 array.
3. The 60,000 1*100 arrays produced in step 2 are shuffled and fed into a BP neural network for training; the trained network is called a weak classifier. The data are then shuffled again and fed into a neural network for training to obtain a second weak classifier. Repeating the same procedure yields several weak classifiers;
4. The weak classifiers are combined using the improved ADABOOST algorithm to form a strong classifier. The combined strong network is finally used to classify the test set data, and the class of each sample is determined from the classification result.
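Step 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the one-hidden-layer network, its size, the learning rate, the scalar output trained toward the digit value, and the names `train_bp_network` and `train_weak_classifiers` are all hypothetical. The toy data stand in for the 1*100 arrays.

```python
import math
import random


def train_bp_network(samples, labels, hidden=4, epochs=20, lr=0.05, rng=None):
    """Train a one-hidden-layer network by backpropagation; return a predictor."""
    rng = rng or random.Random()
    n_in = len(samples[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        # Sigmoid hidden layer followed by a linear scalar output.
        h = [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, x)) + b)))
             for row, b in zip(w1, b1)]
        return h, sum(w * v for w, v in zip(w2, h)) + b2

    for _ in range(epochs):
        for x, target in zip(samples, labels):
            h, out = forward(x)
            err = out - target  # gradient of 0.5 * (out - target) ** 2
            for k in range(hidden):
                grad_h = err * w2[k] * h[k] * (1.0 - h[k])
                w2[k] -= lr * err * h[k]
                b1[k] -= lr * grad_h
                for i in range(n_in):
                    w1[k][i] -= lr * grad_h * x[i]
            b2 -= lr * err

    return lambda x: forward(x)[1]


def train_weak_classifiers(samples, labels, count=3, seed=0):
    """Reshuffle the training data before each run to get `count` weak classifiers."""
    rng = random.Random(seed)
    classifiers = []
    for _ in range(count):
        order = list(range(len(samples)))
        rng.shuffle(order)  # a new sample order for every weak classifier
        shuffled_x = [samples[i] for i in order]
        shuffled_y = [labels[i] for i in order]
        classifiers.append(train_bp_network(shuffled_x, shuffled_y, rng=rng))
    return classifiers


# Toy stand-in for the preprocessed training set (hypothetical data).
samples = [[0, 0], [0, 1], [1, 0], [1, 1]]
labels = [0, 0, 1, 1]
classifiers = train_weak_classifiers(samples, labels, count=3)
outputs = [clf([1, 0]) for clf in classifiers]
```

Because each run starts from different random weights and sees the samples in a different order, the resulting networks differ slightly, which is what gives the ensemble in step 4 something to combine.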
As shown in the flow diagram of the improved ADABOOST algorithm in Fig. 2, the improvement is as follows:
4.1 The decision scheme of the original ADABOOST algorithm is improved. The classification function of the original ADABOOST algorithm is:
In the formula, y(x) denotes the result of BP network classification, x is the input image data, a_t denotes the prediction training weight, and T is the number of BP networks included in ADABOOST. The weight a_t is defined as:
In the formula, e_t is the sum of distribution weights, defined as:
In the formula, D_t(i) is the distribution weight, initialized to 1/m, where m is the number of training data groups.
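The formulas for a_t and e_t are not reproduced in this text. The sketch below assumes the standard AdaBoost definitions, which match the symbols described in 4.1: e_t sums the distribution weights of misclassified samples, and a_t = 0.5 * ln((1 - e_t) / e_t). The distribution update and all function names are likewise assumptions taken from standard AdaBoost, not from the patent.

```python
import math


def weighted_error(distribution, predictions, labels):
    """e_t: sum of the distribution weights D_t(i) of misclassified samples."""
    return sum(d for d, p, y in zip(distribution, predictions, labels) if p != y)


def classifier_weight(error):
    """a_t = 0.5 * ln((1 - e_t) / e_t); assumes 0 < e_t < 1 (standard AdaBoost form)."""
    return 0.5 * math.log((1.0 - error) / error)


def update_distribution(distribution, predictions, labels, alpha):
    """Raise the weight of misclassified samples, lower the rest, then renormalize."""
    raw = [d * math.exp(alpha if p != y else -alpha)
           for d, p, y in zip(distribution, predictions, labels)]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical example with m = 5 training samples, one of them misclassified.
m = 5
distribution = [1.0 / m] * m  # D_1(i) = 1/m
labels = [3, 1, 4, 1, 5]
predictions = [3, 1, 4, 1, 9]
e_t = weighted_error(distribution, predictions, labels)
a_t = classifier_weight(e_t)
next_distribution = update_distribution(distribution, predictions, labels, a_t)
```

With one error in five equally weighted samples, e_t is 0.2 and a_t is ln 2, and after the update the misclassified sample carries half of the total weight.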
4.2 h(x) is modified to optimize it further, as follows:
4.2.1 From the results obtained in step 3, the outputs for the 6,000 pictures of each of "0", "1", "2", "3", "4", "5", "6", "7", "8", and "9" are collected separately, forming 10 matrices of size 1*6000, uniformly denoted y_n(x_j), where j = 1 to 6000 and n = 0 to 9 indexes the results for digits 0 to 9;
4.2.2 The results in each class are multiplied by the prediction training weight a_t, giving 10 new 1*6000 matrices, denoted w_n(x), by the following formula:
4.2.3 The average of each result matrix w_n(x) for 0 to 9 is calculated, and the 10 averages are placed in a single matrix in the order of digits 0 to 9, denoted ave(x). This gives a matrix arranged in order from 0 to 9;
4.2.4 Adjacent pairs of values in ave(x) are averaged again: the values for "0" and "1" in ave(x) are averaged, then those for "1" and "2", and so on. This gives 9 averages, denoted c1, c2, ..., c9 in order, and the classification function becomes:
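The classification function itself is not reproduced in this text. The following is a minimal sketch of steps 4.2.1 to 4.2.4, assuming that each BP network outputs a scalar, that w_n(x) is the a_t-weighted sum of the networks' outputs, that the class averages increase with the digit, and that c1 to c9 act as thresholds splitting the weighted output into ten intervals. The function names and the toy outputs are hypothetical.

```python
from bisect import bisect_right


def weighted_outputs(outputs_per_classifier, alphas):
    """w(x_j): sum over classifiers t of a_t * y_t(x_j), for every sample j."""
    return [sum(a * y for a, y in zip(alphas, column))
            for column in zip(*outputs_per_classifier)]


def class_averages(weighted_by_digit):
    """ave(x): mean weighted output of each digit, ordered 0 to 9."""
    return [sum(values) / len(values) for values in weighted_by_digit]


def thresholds(averages):
    """c1..c9: mean of each pair of adjacent entries of ave(x)."""
    return [(low + high) / 2.0 for low, high in zip(averages, averages[1:])]


def classify(weighted_output, cuts):
    """Return the digit whose threshold interval contains the weighted output.

    Assumes `cuts` is sorted in ascending order.
    """
    return bisect_right(cuts, weighted_output)


# Toy stand-in: two weak classifiers, two training samples per digit.
alphas = [0.6, 0.4]
weighted_by_digit = []
for n in range(10):
    outputs = [[n - 0.1, n + 0.1], [n + 0.2, n - 0.2]]  # one row per classifier
    weighted_by_digit.append(weighted_outputs(outputs, alphas))

ave = class_averages(weighted_by_digit)
cuts = thresholds(ave)
digit = classify(6.7, cuts)
```

Placing each threshold midway between adjacent class averages means a new image is assigned to the digit whose average weighted output is nearest, rather than being decided by a sign function as in two-class ADABOOST.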
Claims (2)
1. A handwritten numeral image recognition classification method, characterized by comprising the following steps:
1) From the data set, 60,000 pictures are randomly selected as the training set. The training set contains 6,000 pictures of each of the 10 handwritten digits 0 to 9, and the pictures in the data set are 28*28 in size;
2) The selected images are binarized, converting each image to be recognized into a binary image; each image is then scaled to a uniform size of 10*10, and finally the 10*10 image is rearranged into a 1*100 array;
3) The 60,000 1*100 arrays produced in step 2) are shuffled and fed into a BP neural network for training; the trained network is called a weak classifier. The data are then shuffled again and fed into a neural network for training to obtain a second weak classifier. Several weak classifiers are obtained in the same way;
4) The weak classifiers are combined using the improved ADABOOST algorithm to form a strong classifier, which is finally used to classify and recognize the data.
2. The handwritten numeral image recognition classification method according to claim 1, characterized in that the ADABOOST algorithm in step 4) is improved as follows:
4.1) The decision scheme of the original ADABOOST algorithm is improved. The classification function of the original ADABOOST algorithm is:
In the formula, y(x) denotes the result of BP network classification, x is the input image data, a_t denotes the prediction training weight, and T is the number of BP networks included in ADABOOST. The weight a_t is defined as:
In the formula, e_t is the sum of distribution weights, defined as:
In the formula, D_t(i) is the distribution weight, initialized to 1/m, where m is the number of training data groups;
4.2) h(x) is modified to optimize it further, as follows:
4.2.1) From the results obtained in step 3), the outputs for the 6,000 pictures of each of "0", "1", "2", "3", "4", "5", "6", "7", "8", and "9" are collected separately, forming 10 matrices of size 1*6000, uniformly denoted y_n(x_j), where j = 1 to 6000 and n = 0 to 9 indexes the results for digits 0 to 9;
4.2.2) The results in each class are multiplied by the prediction training weight a_t, giving 10 new 1*6000 matrices, denoted w_n(x), by the following formula:
4.2.3) The average of each result matrix w_n(x) for 0 to 9 is calculated, and the 10 averages are placed in a single matrix in the order of digits 0 to 9, denoted ave(x), giving a matrix arranged in order from 0 to 9;
4.2.4) Adjacent pairs of values in ave(x) are averaged again: the values for "0" and "1" in ave(x) are averaged, then those for "1" and "2", and so on. This gives 9 averages, denoted c1, c2, ..., c9 in order, and the final classification function is:
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910521269.7A CN110287839A (en) | 2019-06-17 | 2019-06-17 | Handwritten numeral image recognition classification method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910521269.7A CN110287839A (en) | 2019-06-17 | 2019-06-17 | Handwritten numeral image recognition classification method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110287839A true CN110287839A (en) | 2019-09-27 |
Family
ID=68005100
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910521269.7A Pending CN110287839A (en) | 2019-06-17 | 2019-06-17 | Handwritten numeral image recognition classification method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110287839A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH07302305A (en) * | 1994-05-10 | 1995-11-14 | Sony Corp | Device for recognizing hand-written character |
CN106022273A (en) * | 2016-05-24 | 2016-10-12 | 华东理工大学 | Handwritten form identification system of BP neural network based on dynamic sample selection strategy |
CN106960217A (en) * | 2017-02-27 | 2017-07-18 | 浙江工业大学 | The Forecasting Methodology of injector performance based on the BP artificial neural networks using depth Adaboost algorithm |
CN107153810A (en) * | 2016-03-04 | 2017-09-12 | 中国矿业大学 | A kind of Handwritten Numeral Recognition Method and system based on deep learning |
CN108734168A (en) * | 2018-05-18 | 2018-11-02 | 天津科技大学 | A kind of recognition methods of handwritten numeral |
CN109657707A (en) * | 2018-12-04 | 2019-04-19 | 浙江大学 | A kind of image classification method based on observing matrix transformation dimension |
Non-Patent Citations (2)
Title |
---|
叶晓波等: "一种改进的Adaboost-BP算法在手写数字识别中的研究", 《大理大学学报》 * |
张红等: "改进BP神经网络在手写数字识别中的性能研究", 《信息与电脑(理论版)》 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20190927 |