CN104866867B - A multinational banknote serial number character recognition method based on a sorting machine - Google Patents

Info

Publication number
CN104866867B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510253055.8A
Other languages
Chinese (zh)
Other versions
CN104866867A (en)
Inventor
于慧敏
施成燕
李天豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN201510253055.8A
Publication of CN104866867A
Application granted
Publication of CN104866867B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Abstract

The embodiment of the invention discloses a multinational banknote serial number character recognition method based on a sorting machine. The serial number image is segmented to obtain images of multiple characters, and the image of each character is size-normalized. On this basis, the normalized character image x is processed according to the following steps: x is binarized to obtain the binarization matrix x' of the character image, which is converted into a binarization vector $\hat{x}$

Description

A multinational banknote serial number character recognition method based on a sorting machine
Technical Field
The invention belongs to the field of automatic identification technology, and in particular relates to a multinational banknote serial number character recognition method based on a sorting machine. A logistic regression model is used in the template training part.
Background
Character template generation is an important step in banknote serial number character recognition. Serial number recognition first produces character templates by pre-training, and then obtains the recognition result by matching the input character image against the character templates. The template generation step therefore has a large influence on the subsequent recognition result.
Conventional character templates are usually produced by manual tuning. The tuning method first obtains a statistical matrix for each character, where m is the number of samples and each element of the statistical matrix ranges from a minimum of 0 to a maximum of m. Where the statistical matrix takes higher values, i.e., the dense part of the character, the template W takes a larger positive value; where it takes smaller values, W takes a smaller positive value; and where it takes the value 0, i.e., the sparse part of the character, W takes a negative value.
This generation method is not only time-consuming and laborious but also inaccurate, which greatly reduces the accuracy of the subsequent matching and recognition.
In view of these defects of the prior art, further study is needed to provide a scheme that overcomes them and improves the speed and accuracy of template generation.
Summary of the Invention
To solve the above problems, the object of the present invention is to provide a multinational banknote serial number character recognition method based on a sorting machine. The method uses a logistic regression model and solves the problems that existing banknote serial number character template generation is time-consuming, laborious, and of low accuracy.
To achieve the above object, the technical scheme of the present invention is a multinational banknote serial number character recognition method based on a sorting machine, as follows. The serial number image I is first segmented to obtain images of multiple characters, and the image of each character is size-normalized to m × n; that is, the normalized character image is $x \in \mathbb{R}^{m \times n}$, where $\mathbb{R}^{m \times n}$ denotes the real matrices with m rows and n columns. On this basis, the normalized character image x is processed according to the following steps:
Step 1: The character image x is first binarized to obtain its binarization matrix $x'$. This binarization matrix $x'$ is then converted into the binarization vector $\hat{x}$. The threshold used in binarization is calculated by the bimodal method.
Step 2: The binarization vector $\hat{x}$ is matched against each subtemplate in the template set W. To match, $\hat{x}$ is multiplied element-wise with each subtemplate in turn, and the elements of the resulting product are summed to give the element sum r. If the element sum r reaches its maximum when $\hat{x}$ is multiplied with the matching subtemplate $W_k$ of character k, i.e., $k = \arg\max_{c} \sum_{i} (W_c)_i \, \hat{x}_i$, then k is the recognition result.
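Steps 1 and 2 above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names are hypothetical, the input is assumed to be an 8-bit grayscale image with strokes darker than the background, and the bimodal threshold is taken at the histogram valley between the two main peaks, which is one common reading of the bimodal method.

```python
import numpy as np


def bimodal_threshold(gray):
    """Return a gray-level threshold at the valley between the two main
    histogram peaks (assumes 8-bit gray levels in [0, 255])."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    # Light smoothing so isolated noisy bins do not count as peaks.
    smooth = np.convolve(hist, np.ones(5) / 5.0, mode="same")
    idx = np.arange(256)
    p1 = int(np.argmax(smooth))
    # Second peak: tall and far from the first (a heuristic assumed here).
    p2 = int(np.argmax(smooth * (idx - p1) ** 2))
    lo, hi = sorted((p1, p2))
    segment = smooth[lo:hi + 1]
    # Take the middle of the lowest stretch between the two peaks.
    valley = np.flatnonzero(segment == segment.min())
    return lo + int(valley[len(valley) // 2])


def binarize_to_vector(gray, dark_text=True):
    """Step 1: binarize the normalized m x n image and flatten it.
    dark_text=True (an assumption) marks dark stroke pixels as 1."""
    gray = np.asarray(gray, dtype=float)
    t = bimodal_threshold(gray)
    mask = gray < t if dark_text else gray >= t
    return mask.astype(np.int64).reshape(-1)


def match(x_hat, templates):
    """Step 2: element-wise product with each subtemplate, summed to r;
    the character with the largest r is the recognition result."""
    scores = {c: float(np.sum(np.asarray(w) * x_hat))
              for c, w in templates.items()}
    return max(scores, key=scores.get), scores
```

Here `templates` is assumed to be a dictionary mapping each character to a weight vector of length m·n; `match` returns the best character together with all element sums so that close calls can be inspected.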
Further, each subtemplate in the template set W in Step 2 is obtained as follows:
(1) N character images are input and the binarization vector of each image is obtained. The binarization vector is obtained as follows: each input character image is first binarized to obtain its binarization matrix $x'_j$, $j = 1, \dots, N$; this binarization matrix $x'_j$ is then converted into the binarization vector $\hat{x}_j$. The threshold used in binarization is calculated by the bimodal method. The binarization vectors of these N character images are taken as the elements of the training set X, i.e., $X = \{\hat{x}_1, \hat{x}_2, \dots, \hat{x}_N\}$.
(2) From the training set X, the pre-matching template $W_c$ of any character c is obtained as:
$$W_c = \arg\max_{\theta} l(\theta)$$
where $l(\theta)$ is the logistic regression log-likelihood, which in its standard form is
$$l(\theta) = \sum_{j=1}^{N} \left[ Y_j \log h_\theta(X_j) + (1 - Y_j) \log\left(1 - h_\theta(X_j)\right) \right]$$
$Y_j$ is the true label value, 0 or 1: the label of character c is 1 and the label of every other character is 0. $X_j$ is the j-th element of the training set X, i.e., $X_j = \hat{x}_j$. The function $h_\theta$ is the logistic function $h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}}$.
The optimal parameter $W_c$ can be solved iteratively using the gradient descent method. In each iteration, the parameter θ is updated according to the following formula until convergence:
$$\theta := \theta + \alpha \frac{\partial l(\theta)}{\partial \theta} \hat{x}_j$$
where α is the learning rate and $\frac{\partial l(\theta)}{\partial \theta}$ is the gradient term.
(3) A fixed-point operation is applied to the pre-matching template $W_c$ of character c to obtain the fixed-point template $W_c^F$ of character c, as follows:
$$W_c^F = (W_c - \min(W_c)) \mathbin{./} (\max(W_c) - \min(W_c)) * (2^p - 1)$$
where ./ is element-wise division and p is the number of integer bits into which each element of the template $W_c$ is converted during the fixed-point operation.
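The training and fixed-point steps (1) to (3) can be sketched as below. This is a hedged illustration, not the patent's code: `train_template` and `to_fixed_point` are hypothetical names, the per-sample update $\theta := \theta + \alpha (Y_j - h_\theta(\hat{x}_j)) \hat{x}_j$ is an assumed reading of the update rule for a standard logistic regression model, no bias term is used, the fixed epoch count stands in for a convergence test, and rounding to the nearest integer after scaling is an added assumption.

```python
import numpy as np


def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))


def train_template(X, y, alpha=0.1, epochs=200):
    """Fit the pre-matching template W_c for one character c.

    X: (N, m*n) array of binarization vectors.
    y: (N,) labels, 1 for samples of character c and 0 otherwise.
    alpha and epochs are illustrative values, not taken from the patent.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_j, y_j in zip(X, y):
            # Per-sample ascent step on the log-likelihood (assumed form).
            theta += alpha * (y_j - sigmoid(theta @ x_j)) * x_j
    return theta


def to_fixed_point(w, p=8):
    """Rescale W_c to the integer range [0, 2**p - 1].

    Follows (W - min) ./ (max - min) * (2**p - 1); the final rounding to
    integers is an assumption added here.
    """
    w = np.asarray(w, dtype=float)
    span = w.max() - w.min()
    if span == 0:
        # Degenerate template: every weight is identical.
        return np.zeros(w.shape, dtype=np.int64)
    scaled = (w - w.min()) / span * (2 ** p - 1)
    return np.rint(scaled).astype(np.int64)
```

One template is trained per character by relabeling the same training set, so a set of K characters needs K calls to `train_template`, each followed by `to_fixed_point`.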
The beneficial effects of the invention are:
(1) Using a logistic regression model, the input training character samples are trained automatically to produce the template of each character. Compared with the previous manual tuning method, generation speed is greatly improved.
(2) Because a logistic regression model is used, many samples can be trained at once, which greatly improves template accuracy compared with the previous manual tuning method. The method is flexible and suitable for the main currencies now in circulation, including the renminbi, US dollar, euro, Hong Kong dollar, and Japanese yen.
Brief Description of the Drawings
Fig. 1 is a flowchart of the steps of the multinational banknote serial number character recognition method based on a sorting machine according to an embodiment of the invention;
Fig. 2 is a schematic diagram of the binarization matrix of the character "3" in the multinational banknote serial number character recognition method based on a sorting machine according to an embodiment of the invention.
Detailed Description
To make the purpose, technical scheme, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.
On the contrary, the invention covers any replacement, modification, equivalent method, and scheme made within the spirit and scope of the invention as defined by the claims. Further, to give the public a better understanding of the invention, some specific details are described at length below. A person skilled in the art can fully understand the invention even without these details.
Fig. 1 shows a flowchart of the steps of the multinational banknote serial number character recognition method based on a sorting machine according to an embodiment of the invention.
The serial number image I is first segmented to obtain images of multiple characters, and the image of each character is size-normalized to m × n. On this basis, the normalized character image x is processed according to the following steps:
Step 1: $x \in \mathbb{R}^{m \times n}$, where $\mathbb{R}^{m \times n}$ denotes the real matrices with m rows and n columns. The character image x is first binarized to obtain its binarization matrix $x'$; Fig. 2 shows the binarization matrix of the character "3". This binarization matrix $x'$ is then converted into the binarization vector $\hat{x}$. The threshold used in binarization is calculated by the bimodal method.
Step 2: Each subtemplate in the template set W is obtained as follows:
(2.1) N character images are input and the binarization vector of each image is obtained. The binarization vector is obtained as follows: each input character image is first binarized to obtain its binarization matrix $x'_j$, $j = 1, \dots, N$; this binarization matrix $x'_j$ is then converted into the binarization vector $\hat{x}_j$. The threshold used in binarization is calculated by the bimodal method. The binarization vectors of these N character images are taken as the elements of the training set X, i.e., $X = \{\hat{x}_1, \hat{x}_2, \dots, \hat{x}_N\}$.
(2.2) From the training set X, the pre-matching template $W_c$ of any character c is obtained as:
$$W_c = \arg\max_{\theta} l(\theta)$$
where $l(\theta)$ is the logistic regression log-likelihood, which in its standard form is
$$l(\theta) = \sum_{j=1}^{N} \left[ Y_j \log h_\theta(X_j) + (1 - Y_j) \log\left(1 - h_\theta(X_j)\right) \right]$$
$Y_j$ is the true label value, 0 or 1: the label of character c is 1 and the label of every other character is 0. $X_j$ is the j-th element of the training set X, i.e., $X_j = \hat{x}_j$. The function $h_\theta$ is the logistic function $h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}}$.
The optimal parameter $W_c$ can be solved iteratively using the gradient descent method. In each iteration, the parameter θ is updated according to the following formula until convergence:
$$\theta := \theta + \alpha \frac{\partial l(\theta)}{\partial \theta} \hat{x}_j$$
where α is the learning rate and $\frac{\partial l(\theta)}{\partial \theta}$ is the gradient term.
(2.3) A fixed-point operation is applied to the pre-matching template $W_c$ of character c to obtain the fixed-point template $W_c^F$ of character c, as follows:
$$W_c^F = (W_c - \min(W_c)) \mathbin{./} (\max(W_c) - \min(W_c)) * (2^p - 1)$$
where ./ is element-wise division and p is the number of integer bits into which each element of the template $W_c$ is converted during the fixed-point operation.
Step 3: The binarization vector $\hat{x}$ obtained in Step 1 is matched against each subtemplate in the template set W. To match, $\hat{x}$ is multiplied element-wise with each subtemplate in turn, and the elements of the resulting product are summed to give the element sum r. If the element sum r reaches its maximum when $\hat{x}$ is multiplied with the matching subtemplate $W_k$ of character k, i.e., $k = \arg\max_{c} \sum_{i} (W_c)_i \, \hat{x}_i$, then k is the recognition result.
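As a worked check on the update rule in step (2.2), the gradient can be derived under the assumption that $l(\theta)$ is the standard logistic regression log-likelihood with a sigmoid hypothesis. The symbol $h_\theta$ and the per-sample form of the last line are assumptions of this sketch, not statements recovered from the patent.

```latex
% Assumed standard logistic regression definitions
h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}}, \qquad
l(\theta) = \sum_{j=1}^{N} \Big[ Y_j \log h_\theta(\hat{x}_j)
          + (1 - Y_j) \log\big(1 - h_\theta(\hat{x}_j)\big) \Big]

% The sigmoid satisfies \partial h_\theta(x) / \partial\theta
%   = h_\theta(x)\,(1 - h_\theta(x))\,x, so each summand contributes
%   Y_j (1 - h_\theta)\hat{x}_j - (1 - Y_j) h_\theta \hat{x}_j:
\frac{\partial l(\theta)}{\partial \theta}
  = \sum_{j=1}^{N} \big( Y_j - h_\theta(\hat{x}_j) \big)\, \hat{x}_j

% Per-sample ascent step with learning rate \alpha
\theta := \theta + \alpha \big( Y_j - h_\theta(\hat{x}_j) \big)\, \hat{x}_j
```

Under these assumptions each step adds a scalar multiple of $\hat{x}_j$ to θ, which matches the shape of the update in step (2.2): a scalar term times the binarization vector $\hat{x}_j$. The plus sign means the iteration climbs $l(\theta)$, consistent with $W_c = \arg\max_\theta l(\theta)$.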
The above is only a preferred embodiment of the invention and does not limit the invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (1)

1. A multinational banknote serial number character recognition method based on a sorting machine, characterized in that the method is as follows: the serial number image I is first segmented to obtain images of multiple characters, and the image of each character is size-normalized to m × n, that is, the normalized character image is $x \in \mathbb{R}^{m \times n}$, where $\mathbb{R}^{m \times n}$ denotes the real matrices with m rows and n columns; on this basis, the normalized character image x is processed according to the following steps:
Step 1: the character image x is first binarized to obtain its binarization matrix $x'$; this binarization matrix $x'$ is then converted into the binarization vector $\hat{x}$; the threshold used in binarization is calculated by the bimodal method;
Step 2: the binarization vector $\hat{x}$ is matched against each subtemplate in the template set W; to match, $\hat{x}$ is multiplied element-wise with each subtemplate in turn, and the elements of the resulting product are summed to give the element sum r; if the element sum r reaches its maximum when $\hat{x}$ is multiplied with the matching subtemplate $W_k$ of character k, i.e., $k = \arg\max_{c} \sum_{i} (W_c)_i \, \hat{x}_i$, then k is the recognition result;
each subtemplate in the template set W in Step 2 is obtained as follows:
(2.1) N character images are input and the binarization vector of each image is obtained; the binarization vector is obtained as follows: each input character image is first binarized to obtain its binarization matrix $x'_j$, $j = 1, \dots, N$; this binarization matrix $x'_j$ is then converted into the binarization vector $\hat{x}_j$; the threshold used in binarization is calculated by the bimodal method; the binarization vectors of these N character images are taken as the elements of the training set X, i.e., $X = \{\hat{x}_1, \hat{x}_2, \dots, \hat{x}_N\}$;
(2.2) from the training set X, the pre-matching template $W_c$ of any character c is obtained as:
$$W_c = \arg\max_{\theta} l(\theta)$$
where $l(\theta)$ is the logistic regression log-likelihood, which in its standard form is
$$l(\theta) = \sum_{j=1}^{N} \left[ Y_j \log h_\theta(X_j) + (1 - Y_j) \log\left(1 - h_\theta(X_j)\right) \right]$$
$Y_j$ is the true label value, 0 or 1: the label of character c is 1 and the label of every other character is 0; $X_j$ is the j-th element of the training set X, i.e., $X_j = \hat{x}_j$; the function $h_\theta$ is the logistic function $h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}}$;
the optimal parameter $W_c$ can be solved iteratively using the gradient descent method; in each iteration, the parameter θ is updated according to the following formula until convergence:
$$\theta := \theta + \alpha \frac{\partial l(\theta)}{\partial \theta} \hat{x}_j$$
where α is the learning rate and $\frac{\partial l(\theta)}{\partial \theta}$ is the gradient term;
(2.3) a fixed-point operation is applied to the pre-matching template $W_c$ of character c to obtain the fixed-point template $W_c^F$ of character c, as follows:
$$W_c^F = (W_c - \min(W_c)) \mathbin{./} (\max(W_c) - \min(W_c)) * (2^p - 1)$$
where ./ is element-wise division and p is the number of integer bits into which each element of the template $W_c$ is converted during the fixed-point operation.
CN201510253055.8A 2015-05-15 2015-05-15 A multinational banknote serial number character recognition method based on a sorting machine Active CN104866867B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510253055.8A CN104866867B (en) 2015-05-15 2015-05-15 A multinational banknote serial number character recognition method based on a sorting machine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510253055.8A CN104866867B (en) 2015-05-15 2015-05-15 A multinational banknote serial number character recognition method based on a sorting machine

Publications (2)

Publication Number Publication Date
CN104866867A (en) 2015-08-26
CN104866867B (en) 2017-12-05

Family

ID=53912688

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510253055.8A Active CN104866867B (en) 2015-05-15 2015-05-15 A multinational banknote serial number character recognition method based on a sorting machine

Country Status (1)

Country Link
CN (1) CN104866867B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106803091B (en) * 2015-11-25 2020-04-28 深圳怡化电脑股份有限公司 Method and system for identifying currency value of paper money
CN105957238B (en) 2016-05-20 2019-02-19 聚龙股份有限公司 A kind of paper currency management method and its system
CN106056751B (en) * 2016-05-20 2019-04-12 聚龙股份有限公司 The recognition methods and system of serial number
CN106447905B (en) * 2016-09-12 2019-04-09 深圳怡化电脑股份有限公司 A kind of bank note currency type recognition methods and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101923741A (en) * 2010-08-11 2010-12-22 西安理工大学 Paper currency number identification method based on currency detector
CN102194275A (en) * 2010-03-15 2011-09-21 党力 Automatic ticket checking method for train tickets
CN103218613A (en) * 2013-04-10 2013-07-24 苏州大学 Method and device for identifying handwritten form figures

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Feature extraction methods for cartoon character recognition;Masayuki ARAI 等;《2012 5th International Congress on Image and Signal Processing》;20130225;445-448 *
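Research on Algorithms for a Banknote Number Recognition System;焦杏艳;《China Master's Theses Full-text Database, Information Science and Technology》;20090615(No. 6);I138-899 *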

Also Published As

Publication number Publication date
CN104866867A (en) 2015-08-26

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by SIPO to initiate substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant