CN102722713B - Handwritten numeral recognition method based on lie group structure data and system thereof - Google Patents

Publication number
CN102722713B
Authority
CN
China
Prior art keywords
lie group
structured data
class
training
group structured
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201210041116.0A
Other languages
Chinese (zh)
Other versions
CN102722713A (en
Inventor
张莉
王晓乾
杨季文
何书萍
李凡长
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University
Original Assignee
Suzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou University filed Critical Suzhou University
Priority to CN201210041116.0A priority Critical patent/CN102722713B/en
Publication of CN102722713A publication Critical patent/CN102722713A/en
Application granted granted Critical
Publication of CN102722713B publication Critical patent/CN102722713B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention provide a handwritten numeral recognition method based on Lie group structured data, and a corresponding system. The method extracts Lie group structured data from original handwritten numeral image data; constructs a matrix Gaussian kernel function and uses a support vector machine algorithm to train a classifier model; and inputs the Lie group structured data corresponding to the handwritten numeral image data to be recognized into the trained classifier model to obtain the corresponding numeral class. Because the nonlinear features of the Lie group structured data are captured, handwritten numeral recognition is improved.

Description

A handwritten numeral recognition method and system based on Lie group structured data
Technical field
The present invention relates to the technical field of handwritten numeral recognition, and more particularly to a handwritten numeral recognition method and system based on Lie group structured data.
Background art
In recent years, with the rapid development of computer technology and digital image processing, handwritten numeral recognition has been widely applied in large-scale data statistics, mail sorting, finance and taxation. At the same time, with the spread of machine learning techniques, many physicists and chemists have begun to use Lie group theory to study data in their fields. Accordingly, Lie group structured data, with its good mathematical structure, is widely used in handwritten numeral recognition.
At present, handwritten numeral recognition based on Lie group structured data generally builds a classifier model with a classification algorithm, classifies the Lie group structured data of the handwritten numeral images to obtain the classifier output, and then derives the recognition result from that output. The classification algorithm commonly used in the prior art is the Lie group Fisher algorithm, which applies a linear projection to the original Lie group structured data so that data of the same class are projected as close together as possible and data of different classes as far apart as possible. Although the projected data are well separated, the Lie group Fisher algorithm is a linear classification method and cannot capture the nonlinear features of Lie group structured data, which is a defect whenever such features must be handled.
Summary of the invention
In view of this, the invention provides a handwritten numeral recognition method and system based on Lie group structured data, to overcome the defect of existing handwritten numeral recognition techniques in handling the nonlinear features of Lie group structured data and to achieve nonlinear processing of such data.
To achieve the above object, the invention provides the following technical solution:
A handwritten numeral recognition method based on Lie group structured data, comprising the steps of:
A. extracting a corresponding number of Lie group structured data from the original handwritten numeral image data;
B. taking each Lie group structured datum together with the class label of its handwritten numeral image as a training sample, obtaining the training sample set corresponding to the extracted Lie group structured data, and constructing the matrix Gaussian kernel function for Lie group structured data:
$k(z_a, z_b) = e^{-p\,\|z_a - z_b\|_F^2}$, where $z_a$ and $z_b$ denote any two Lie group structured data, $p > 0$ is the kernel parameter, and $\|\cdot\|_F$ is the matrix (Frobenius) norm;
C. using a support vector machine algorithm with the matrix Gaussian kernel function as its kernel, inputting the training samples and training to obtain a classifier model;
D. inputting the Lie group structured data corresponding to the handwritten numeral image data to be recognized into the trained classifier model to obtain the corresponding numeral class.
The invention also provides a handwritten numeral recognition system based on Lie group structured data, comprising:
a Lie group structured data extraction module, for extracting a corresponding number of Lie group structured data from the original handwritten numeral image data;
a preprocessing module, for taking each Lie group structured datum together with the class label of its handwritten numeral image as a training sample, obtaining the training sample set corresponding to the extracted Lie group structured data, and constructing the matrix Gaussian kernel function for Lie group structured data:
$k(z_a, z_b) = e^{-p\,\|z_a - z_b\|_F^2}$, where $z_a$ and $z_b$ denote any two Lie group structured data with $a \neq b$, $p > 0$ is the kernel parameter, and $\|\cdot\|_F$ is the matrix (Frobenius) norm;
a model training module, for using a support vector machine algorithm with the matrix Gaussian kernel function as its kernel, inputting the training samples and training to obtain a classifier model;
a classification module, for inputting the Lie group structured data corresponding to the handwritten numeral image data to be recognized into the trained classifier model to obtain the corresponding numeral class.
Based on the above technical solution, the embodiments of the invention construct a matrix Gaussian kernel function and use a support vector machine algorithm to process Lie group structured data. By exploiting the strengths of support vector machines in small-sample, nonlinear and high-dimensional pattern recognition, nonlinear processing of Lie group structured data is achieved.
Brief description of the drawings
To explain the embodiments of the invention or the prior-art solutions more clearly, the drawings used in describing them are briefly introduced below. The drawings described below are only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a handwritten numeral recognition method based on Lie group structured data according to the invention;
Fig. 2 compares the classification performance of the Lie group mean algorithm, the Lie group Fisher algorithm and the method of the invention on the digits 1 and 7;
Fig. 3 compares the classification performance of the Lie group mean algorithm and the method of the invention on the digits 1, 7 and 9;
Fig. 4 compares the classification performance of the Lie group mean algorithm and the method of the invention on the digits 1, 2, 7 and 9;
Fig. 5 is a structural block diagram of a handwritten numeral recognition system based on Lie group structured data according to the invention;
Fig. 6 is a structural block diagram of the model training module of the invention;
Fig. 7 is a structural block diagram of the classification module of the invention.
Detailed description of the embodiments
Through research, the inventors found that the support vector machine algorithm offers structural risk minimization and good generalization ability. Applying it to handwritten numeral recognition based on Lie group structured data can solve the recognition problem under small-sample, nonlinear and high-dimensional conditions, and thereby overcome the defect of existing techniques in capturing the nonlinear features of Lie group structured data. The inventors also found, however, that Lie group structured data are matrix data rather than vector data, and that the standard support vector machine algorithm does not support matrix data; the standard method therefore cannot process Lie group structured data directly.
After further research, the inventors found that by constructing a matrix Gaussian kernel function, a support vector machine algorithm can be used to build a classifier model that classifies Lie group structured data, which achieves the object of the invention.
In line with this idea, the technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention.
Fig. 1 is a flowchart of a handwritten numeral recognition method based on Lie group structured data according to the invention. Referring to Fig. 1, the method may comprise:
Step S100: extract a corresponding number of Lie group structured data from the original handwritten numeral image data.
For ease of description, in the embodiments x denotes handwritten numeral image data and z denotes Lie group structured data. Let the number of original handwritten numeral images be l (l an integer, l ≥ 1); the original image data are $x_1, \dots, x_l$ and the corresponding Lie group structured data are $z_1, \dots, z_l$.
Taking the extraction from one handwritten numeral image as an example: let rv be a reference vector, and randomly take k points on the stroke region of the image x to form k vectors $v_i = r_i n_i$, $i = 1, \dots, k$, where $r_i$ is the modulus (length) of the vector and $n_i$ is the direction of $v_i$, with $\|n_i\| = 1$ and $n_i = M_i \times rv$, $M_i = \begin{pmatrix} e^{a_i} & e^{-\theta_i} \\ e^{\theta_i} & e^{a_i} \end{pmatrix}$, where $\|\cdot\|$ denotes the modulus of a vector. The Lie group structured data z corresponding to the handwritten numeral image sample x is then obtained as: [formula not reproduced in the source text]
Step S200: take each Lie group structured datum together with the class label of its handwritten numeral image as a training sample, obtain the training sample set corresponding to the extracted Lie group structured data, and construct the matrix Gaussian kernel function for Lie group structured data.
Let y be the class label of the handwritten numeral image x, with $y \in \{1, \dots, c\}$, where c is the number of classes. Combining the extracted Lie group structured data $z_1, \dots, z_l$ with the class labels $y_1, \dots, y_l$ of the corresponding images gives the training sample set $\{(z_1, y_1), \dots, (z_l, y_l)\}$, which records the correspondence between each z and its label y.
The standard support vector machine algorithm does not support Lie group structured data, so its existing kernel functions are not suitable for the invention. To apply the support vector machine algorithm to Lie group structured data, a matrix Gaussian kernel function is constructed that makes the two compatible. It is defined as follows:
$k(z_a, z_b) = e^{-p\,\|z_a - z_b\|_F^2}$, where $z_a$ and $z_b$ denote any two Lie group structured data, $a, b \in \{1, \dots, l\}$ are integers with $a \neq b$, $p > 0$ is the kernel parameter, and $\|\cdot\|_F$ is the matrix (Frobenius) norm.
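The matrix Gaussian kernel above can be illustrated with a minimal sketch. It uses only the Python standard library and represents each Lie group structured datum as a list of rows; the function names and the 2x2 example matrices are assumptions made for illustration and do not come from the patent.

```python
import math

def frobenius_sq(za, zb):
    """Squared Frobenius norm of the difference of two equally sized matrices."""
    return sum((x - y) ** 2 for ra, rb in zip(za, zb) for x, y in zip(ra, rb))

def matrix_gaussian_kernel(za, zb, p):
    """k(za, zb) = exp(-p * ||za - zb||_F^2), with kernel parameter p > 0."""
    if p <= 0:
        raise ValueError("kernel parameter p must be positive")
    return math.exp(-p * frobenius_sq(za, zb))

# Hypothetical 2x2 matrices standing in for Lie group structured data.
z1 = [[1.0, 0.0], [0.0, 1.0]]
z2 = [[1.0, 1.0], [0.0, 1.0]]
print(matrix_gaussian_kernel(z1, z1, p=0.5))  # identical inputs give 1.0
print(matrix_gaussian_kernel(z1, z2, p=0.5))  # equals exp(-0.5), since the squared distance is 1
```

Because the kernel depends only on the entrywise difference of the two matrices, it takes matrix inputs directly, with no need to design a separate vector feature representation.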
Step S300: using a support vector machine algorithm with the matrix Gaussian kernel function as its kernel, input the training samples and train to obtain a classifier model.
Those skilled in the art know the conventional way of training with a support vector machine algorithm. The embodiments of the invention provide a training method that supports multi-class problems on Lie group structured data. Specifically: each Lie group structured datum is input to each of the $\binom{c}{2}$ (c choose 2) classifier models, yielding $\binom{c}{2}$ classifier outputs; for each of the c classes, the value with which the datum is assigned to that class is accumulated over these outputs; the maximum of these values is found, and the class attaining the maximum is taken as the numeral class of the corresponding handwritten numeral image.
To make the training method of the embodiments easier to follow, the concrete training process is given below.
The training sample set $\{(z_1, y_1), \dots, (z_l, y_l)\}$ has c classes. Take from it the samples of any two class labels: the samples taken out carry only these two labels, and no sample left in the training set carries either of them. Treating the samples of two class labels as one combination gives $\binom{c}{2}$ combinations. For convenience, the training of a classifier model is described below for one such combination, the samples of the two class labels i, j ($i, j \in \{1, \dots, c\}$, $i \neq j$). The training process is as follows:
After extracting the samples of classes i and j from the training sample set $\{(z_1, y_1), \dots, (z_l, y_l)\}$, re-index them as $\{(z_m^{ij}, \bar{y}_m^{ij})\}_{m=1}^{l_{ij}}$, where the index ij marks data related to classes i and j, the subscript m is an index, $z_m^{ij}$ is a Lie group structured datum of class i or j, $l_{ij}$ is the total number of samples in the two classes, and $\bar{y}_m^{ij}$ is the corresponding class label: $\bar{y}_m^{ij} = -1$ when $y_m^{ij} = i$, and $\bar{y}_m^{ij} = +1$ when $y_m^{ij} = j$.
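The extraction and relabeling of the two-class samples can be sketched as follows. The function name `pairwise_subset` and the toy training set are hypothetical; strings stand in for the matrices z so that the example stays short.

```python
def pairwise_subset(samples, i, j):
    """Keep samples labeled i or j; map label i to -1 and label j to +1."""
    if i == j:
        raise ValueError("i and j must be different classes")
    subset = []
    for z, y in samples:
        if y == i:
            subset.append((z, -1))
        elif y == j:
            subset.append((z, +1))
    return subset

# Hypothetical training set: strings stand in for the matrices z.
training_set = [("z1", 1), ("z2", 7), ("z3", 9), ("z4", 1), ("z5", 7)]
pair_1_7 = pairwise_subset(training_set, 1, 7)
print(pair_1_7)  # [('z1', -1), ('z2', 1), ('z4', -1), ('z5', 1)]
```

The length of the returned list plays the role of $l_{ij}$, the number of samples in the two selected classes.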
The invention recognizes handwritten numeral image data by a support vector machine method based on Lie group structured data. To classify the two classes i, j of handwritten numeral image data with the support vector machine method, the following optimization problem must be solved:
$$\max \sum_{m=1}^{l_{ij}} \beta_m^{ij} - \frac{1}{2} \sum_{m=1}^{l_{ij}} \sum_{n=1}^{l_{ij}} \bar{y}_m^{ij}\, \bar{y}_n^{ij}\, \beta_m^{ij}\, \beta_n^{ij}\, k(z_m^{ij}, z_n^{ij})$$
$$\text{s.t.} \quad \sum_{m=1}^{l_{ij}} \bar{y}_m^{ij} \beta_m^{ij} = 0, \qquad 0 \le \beta_m^{ij} \le S,$$
where m and n are integer indices with $m, n \in \{1, \dots, l_{ij}\}$, $\bar{y}_m^{ij}$ is the corresponding class label, $\beta_m^{ij}$ are the model coefficients produced by support vector machine training, and S is the regularization parameter required for training. Solving this optimization problem yields the following classifier model:
$$f_{ij}(z) = \operatorname{sgn}\Big\{ \sum_{m=1}^{l_{ij}} \beta_m^{ij}\, \bar{y}_m^{ij}\, k(z, z_m^{ij}) + b_{ij} \Big\}, \qquad i, j = 1, \dots, c, \; i \neq j.$$
In the above formula, $\operatorname{sgn}(\cdot)$ denotes the sign function and $b_{ij}$ is the model threshold, which can be computed as:
$$b_{ij} = \bar{y}_{sv} - \sum_{m=1}^{l_{ij}} \beta_m^{ij}\, \bar{y}_m^{ij}\, k(z_{sv}, z_m^{ij}),$$ where $z_{sv}$ is a sample whose coefficient satisfies $0 < \beta_{sv} < S$ and $\bar{y}_{sv}$ is its label.
The above gives the classifier model for the samples of classes i and j. The classifier models for the remaining combinations drawn from the training samples are trained in the same way and are not described again here.
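As a worked illustration of the threshold and decision formulas, the sketch below evaluates them on a toy two-sample problem. The helper names, the sample matrices and the choice p = 1 are assumptions. The coefficient value is the dual optimum for this particular toy problem, derived by hand (it requires S of at least about 1.16), and the convention sgn(0) = +1 is also an assumption, since the text does not define it.

```python
import math

def kernel(za, zb, p):
    """Matrix Gaussian kernel exp(-p * ||za - zb||_F^2)."""
    diff_sq = sum((x - y) ** 2 for ra, rb in zip(za, zb) for x, y in zip(ra, rb))
    return math.exp(-p * diff_sq)

def threshold(z_sv, y_sv, zs, ys, betas, p):
    """b_ij = y_sv - sum_m beta_m * y_m * k(z_sv, z_m)."""
    return y_sv - sum(b * y * kernel(z_sv, z, p) for z, y, b in zip(zs, ys, betas))

def decide(z, zs, ys, betas, b, p):
    """f_ij(z) = sgn(sum_m beta_m * y_m * k(z, z_m) + b_ij); sgn(0) taken as +1."""
    score = sum(bm * y * kernel(z, zm, p) for zm, y, bm in zip(zs, ys, betas)) + b
    return 1 if score >= 0 else -1

# Toy problem: one sample per class, labels already mapped to -1 / +1.
p = 1.0
zs = [[[0.0, 0.0], [0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]]]
ys = [-1, +1]
beta = 1.0 / (1.0 - math.exp(-2.0))  # hand-derived dual optimum for this toy case
betas = [beta, beta]                 # satisfies sum_m y_m * beta_m = 0

b = threshold(zs[1], ys[1], zs, ys, betas, p)
print(decide(zs[0], zs, ys, betas, b, p), decide(zs[1], zs, ys, betas, b, p))
```

In this symmetric example the threshold comes out numerically at zero, and each training sample is assigned its own label.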
Step S400: input the Lie group structured data corresponding to the handwritten numeral image data to be recognized into the trained classifier model to obtain the corresponding numeral class.
Once step S300 has produced the classifier models, the corresponding Lie group structured data can be extracted from the handwritten numeral image data to be recognized, so that the images can be classified and the corresponding numeral class obtained. Note that the original handwritten numeral image data in step S100 serve to train the classifier model and can be regarded as a large handwritten numeral image database, whereas the handwritten numeral image data in step S400 are the objects to be recognized by the method, that is, the images of the handwritten numerals that need to be identified.
The classifier models trained in the embodiments can handle multi-class problems on Lie group structured data. Classification can proceed as follows: the Lie group structured datum corresponding to the image to be recognized is input to each of the $\binom{c}{2}$ classifier models, yielding $\binom{c}{2}$ classifier outputs; for each of the c classes, the value with which the datum is assigned to that class is accumulated over these outputs; the maximum of these values is found, and the class attaining the maximum is taken as the numeral class of the corresponding handwritten numeral image.
The classifier output can be written $f_{ij}(z)$, $i, j = 1, \dots, c$, $i \neq j$. Specifically, the value with which the Lie group structured datum is assigned to class i is computed as:
$$\sum_{j=1,\, j \neq i}^{c} f_{ij}(z), \qquad i = 1, \dots, c.$$
This formula yields c values, one for each class i. The maximum of these c values, $\max_{i \in \{1, \dots, c\}} \sum_{j=1,\, j \neq i}^{c} f_{ij}(z)$, is then found, and the numeral class corresponding to the maximum is the class of the handwritten numeral image corresponding to this Lie group structured datum.
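One common way to combine pairwise outputs is ordinary one-vs-one majority voting, sketched below. This is a standard technique substituted here for illustration and is not necessarily identical to the per-class sum given in the text. The sketch keeps the label convention of the training step (an output of -1 means class i, +1 means class j); the stub classifiers acting on a scalar z are hypothetical.

```python
from itertools import combinations

def vote(z, classes, classifiers):
    """Majority vote over pairwise classifiers.

    classifiers[(i, j)](z) returns -1 for class i and +1 for class j.
    Ties are resolved in favor of the class listed first.
    """
    tally = {c: 0 for c in classes}
    for i, j in combinations(classes, 2):
        winner = i if classifiers[(i, j)](z) == -1 else j
        tally[winner] += 1
    best = max(tally, key=tally.get)
    return best, tally

# Hypothetical stub classifiers on a scalar z, for illustration only.
classes = [1, 7, 9]
classifiers = {
    (1, 7): lambda z: -1 if z < 4 else 1,
    (1, 9): lambda z: -1 if z < 5 else 1,
    (7, 9): lambda z: -1 if z < 8 else 1,
}
best, tally = vote(6.5, classes, classifiers)
print(best, tally)  # 7 {1: 0, 7: 2, 9: 1}
```

With c classes, each call evaluates c(c - 1)/2 pairwise classifiers, which matches the number of trained models.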
By constructing a matrix Gaussian kernel function and using a support vector machine algorithm to process the Lie group structured data corresponding to handwritten numeral image data, the embodiments of the invention exploit the strengths of support vector machines in small-sample, nonlinear and high-dimensional pattern recognition and capture the nonlinear features of Lie group structured data.
In addition, the multi-class problem on Lie group structured data is reduced to multiple two-class problems, each handled by the support vector machine algorithm. This solves the multi-class problem on Lie group structured data and so improves handwritten numeral recognition.
The beneficial effects of the invention are verified by the following experiment:
A handwritten numeral database generally stores 10 classes of handwritten numeral images. Four of them, the digits 1, 2, 7 and 9, are selected for the experiment. For each class, the first 200 samples are taken from the training set and from the test set respectively, so that every class has 200 training samples and 200 test samples. Parameters are then selected on the training samples by ten-fold cross-validation, where the regularization parameter ranges over $\{2^{-1}, 2^{0}, \dots, 2^{4}\}$ and the matrix Gaussian kernel parameter over $\{2^{-10}, 2^{-9}, \dots, 2^{-6}\}$. A model is then retrained with the selected parameters, and its recognition rate is measured on the test set. In addition, the influence of the number of sampled points (the point cloud size) on the recognition rate is considered: the number of points takes values in {5, 10, 15, 20, 25, 30, 35, 40, 50}. Because the points are generated randomly, the experiment is repeated 5 times and the average result is reported. Figs. 2 to 4 show the experimental results obtained with the Lie group mean algorithm, the Lie group Fisher algorithm and the technical solution of the invention.
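The parameter search described above can be sketched as a grid over the two value ranges combined with a ten-fold split of the training samples. The helper name `k_fold_indices` and the contiguous, unshuffled fold layout are assumptions; in practice the samples would usually be shuffled first, because they are ordered by class.

```python
from itertools import product

S_values = [2.0 ** e for e in range(-1, 5)]    # regularization parameter: 2^-1 ... 2^4
p_values = [2.0 ** e for e in range(-10, -5)]  # kernel parameter: 2^-10 ... 2^-6
grid = list(product(S_values, p_values))

def k_fold_indices(n, k=10):
    """Split range(n) into k contiguous folds of near-equal size."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for f in range(k):
        size = base + (1 if f < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

folds = k_fold_indices(800)  # 4 classes x 200 training samples
print(len(grid), len(folds), len(folds[0]))  # 30 10 80
```

Each of the 30 parameter pairs would be scored by training on nine folds and validating on the remaining one, after which the best pair is used to retrain on all training samples.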
Fig. 2 compares the classification performance of the Lie group mean algorithm, the Lie group Fisher algorithm and the method of the invention on the digits 1 and 7. Referring to Fig. 2, the classification results of the invention are clearly better than those of the Lie group mean algorithm and the Lie group Fisher algorithm, and the recognition rate tends to increase as the number of points taken per sample increases. Fig. 3 compares the classification performance of the Lie group mean algorithm and the method of the invention on the digits 1, 7 and 9, and Fig. 4 does the same for the digits 1, 2, 7 and 9. Referring to Figs. 3 and 4, the multi-class results of the invention are clearly better than those of the Lie group mean algorithm.
Fig. 5 is a structural block diagram of a handwritten numeral recognition system based on Lie group structured data according to the invention. Referring to Fig. 5, the system may comprise:
a Lie group structured data extraction module 100, for extracting a corresponding number of Lie group structured data from the original handwritten numeral image data;
a preprocessing module 200, for taking each Lie group structured datum together with the class label of its handwritten numeral image as a training sample, obtaining the training sample set corresponding to the extracted Lie group structured data, and constructing the matrix Gaussian kernel function for Lie group structured data:
$k(z_a, z_b) = e^{-p\,\|z_a - z_b\|_F^2}$, where $z_a$ and $z_b$ denote any two Lie group structured data with $a \neq b$, $p > 0$ is the kernel parameter, and $\|\cdot\|_F$ is the matrix (Frobenius) norm;
a model training module 300, for using a support vector machine algorithm with the matrix Gaussian kernel function as its kernel, inputting the training samples and training to obtain a classifier model;
a classification module 400, for inputting the Lie group structured data corresponding to the handwritten numeral image data to be recognized into the trained classifier model to obtain the corresponding numeral class.
The structure of the model training module 300 may be as shown in Fig. 6 and comprise:
a combination acquiring unit 301, for taking from the training sample set the samples of any two class labels, obtaining $\binom{c}{2}$ (c choose 2) combinations, where c is the number of classes of handwritten numeral image data;
a loop training unit 302, for taking each combination as a unit and, using the support vector machine algorithm with the matrix Gaussian kernel function as its kernel, inputting the samples of each combination and training to obtain $\binom{c}{2}$ classifier models.
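The combinations handled by these two units can be enumerated with a short sketch. The function name `class_pairs` is hypothetical; class labels are taken to be 1 to c as in the description.

```python
from itertools import combinations
from math import comb

def class_pairs(c):
    """All unordered pairs (i, j) with 1 <= i < j <= c, one per pairwise classifier."""
    return list(combinations(range(1, c + 1), 2))

pairs = class_pairs(4)
print(pairs)                  # [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
print(len(pairs), comb(4, 2))  # 6 6
```

For the full ten-digit problem this gives 45 pairwise classifiers, so the loop training unit would run the two-class training procedure 45 times.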
Further, the loop training unit 302 may comprise:
a training subunit (not shown), for extracting the combination containing the samples of classes i and j, with $i, j \in \{1, \dots, c\}$ and $i \neq j$, and executing the classifier model training flow: let the extracted samples be $\{(z_m^{ij}, \bar{y}_m^{ij})\}_{m=1}^{l_{ij}}$, drawn from the training sample set $\{(z_1, y_1), \dots, (z_l, y_l)\}$, where l is the number of handwritten numeral images, x denotes handwritten numeral image data, z denotes Lie group structured data, y is the class label of the image x with $y \in \{1, \dots, c\}$, the index ij marks data related to classes i and j, the subscript m is an index, $z_m^{ij}$ is a Lie group structured datum of class i or j, $l_{ij}$ is the total number of samples in the two classes, and $\bar{y}_m^{ij}$ is the corresponding class label, with $\bar{y}_m^{ij} = -1$ when $y_m^{ij} = i$ and $\bar{y}_m^{ij} = +1$ when $y_m^{ij} = j$; and solve
$$\max \sum_{m=1}^{l_{ij}} \beta_m^{ij} - \frac{1}{2} \sum_{m=1}^{l_{ij}} \sum_{n=1}^{l_{ij}} \bar{y}_m^{ij}\, \bar{y}_n^{ij}\, \beta_m^{ij}\, \beta_n^{ij}\, k(z_m^{ij}, z_n^{ij}) \quad \text{s.t.} \quad \sum_{m=1}^{l_{ij}} \bar{y}_m^{ij} \beta_m^{ij} = 0, \; 0 \le \beta_m^{ij} \le S,$$
where m and n are integer indices with $m, n \in \{1, \dots, l_{ij}\}$, $\beta_m^{ij}$ are the model coefficients produced by support vector machine training, and S is the regularization parameter required for training. From the solution, the classifier model $f_{ij}(z) = \operatorname{sgn}\big\{ \sum_{m=1}^{l_{ij}} \beta_m^{ij}\, \bar{y}_m^{ij}\, k(z, z_m^{ij}) + b_{ij} \big\}$ is obtained, where $\operatorname{sgn}(\cdot)$ denotes the sign function and $b_{ij}$ is the model threshold;
a loop subunit (not shown), for extracting another combination after the training subunit completes the above training flow and executing the training flow again, until $\binom{c}{2}$ classifier models have been obtained.
The structure of the classification module 400 may be as shown in Fig. 7 and comprise:
a computing unit 401, for inputting the Lie group structured datum corresponding to the handwritten numeral image to be recognized into each of the $\binom{c}{2}$ classifier models, so that one Lie group structured datum yields $\binom{c}{2}$ classifier outputs;
a statistics unit 402, for accumulating over these outputs the value with which the Lie group structured datum is assigned to each of the c classes, and finding the maximum of these values;
a determining unit 403, for taking the class attaining the maximum found by the statistics unit as the numeral class of the handwritten numeral image corresponding to this Lie group structured datum.
Further, the statistics unit 402 may comprise:
a class value statistics subunit (not shown), for computing, according to the formula $\sum_{j=1,\, j \neq i}^{c} f_{ij}(z)$, $i \in \{1, \dots, c\}$, the value with which the Lie group structured datum is assigned to class i, where class i is any one of the c classes to be counted;
a maximum search subunit (not shown), for finding, according to the formula
$\max_{i \in \{1, \dots, c\}} \sum_{j=1,\, j \neq i}^{c} f_{ij}(z)$, the maximum among the values computed by the class value statistics subunit.
The handwritten numeral recognition system based on Lie group structured data corresponds to the handwritten numeral recognition method based on Lie group structured data; for the concrete implementation of the system functions, refer to the corresponding method, which is not repeated here.
A person skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms of their function. Whether these functions are carried out in hardware or software depends on the particular application and the design constraints of the technical solution. A skilled person may implement the described functions in different ways for each particular application, but such implementations should not be regarded as going beyond the scope of the invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein can be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above description of the disclosed embodiments enables a person skilled in the art to make or use the invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. The invention is therefore not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (6)

1. A handwritten numeral recognition method based on Lie group structured data, characterized in that it comprises the steps of:
A. extracting a corresponding quantity of Lie group structured data from original handwritten digit image data;
B. taking the correspondence between each item of Lie group structured data and the class label of its corresponding handwritten digit image data as a training sample, obtaining a training sample set corresponding to said quantity of Lie group structured data, and at the same time constructing a matrix Gaussian kernel function for processing Lie group structured data:
$$k(z_a, z_b) = e^{-p \times \|z_a - z_b\|_F^2},$$
where $z_a$ and $z_b$ denote any two items of Lie group structured data, $p > 0$ is the kernel function parameter, and $\|\cdot\|_F$ is the matrix norm;
C. using the support vector machine algorithm, with said matrix Gaussian kernel function as the kernel function, inputting the training samples, and training to obtain classifier models;
wherein step C is specifically: selecting from said training sample set the samples corresponding to any two class labels, yielding $\binom{c}{2}$ pairwise combinations, where $c$ is the number of classes of handwritten digit image data and each combination contains the samples corresponding to two class labels; and, taking each combination as a unit, using the support vector machine algorithm with said matrix Gaussian kernel function as the kernel function, inputting the samples corresponding to each combination, and training to obtain $\binom{c}{2}$ classifier models;
D. inputting the Lie group structured data corresponding to the handwritten digit image data to be tested into each of the trained classifier models to obtain the corresponding digit class;
wherein step D is specifically: inputting the Lie group structured data corresponding to the handwritten digit image data to be tested into each of said $\binom{c}{2}$ classifier models, so that one item of Lie group structured data yields $\binom{c}{2}$ classifier outputs; tallying, from said outputs, the value with which this Lie group structured data is assigned to a given class among the $c$ classes; finding the maximum among these values; and determining said maximum as the digit class of the handwritten digit image data corresponding to this Lie group structured data.
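The matrix Gaussian kernel of step B can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: it assumes that the matrix norm with subscript F is the Frobenius norm, represents each item of Lie group structured data as a list of rows, and uses the hypothetical names `matrix_gaussian_kernel` and `gram_matrix` and an arbitrary default `p = 0.5`.

```python
import math


def matrix_gaussian_kernel(z_a, z_b, p=0.5):
    """Return k(z_a, z_b) = exp(-p * ||z_a - z_b||_F^2) for equally sized matrices."""
    if p <= 0:
        raise ValueError("kernel parameter p must be positive")
    # Squared Frobenius norm: sum of squared entrywise differences (assumed norm).
    sq_dist = sum(
        (x - y) ** 2
        for row_a, row_b in zip(z_a, z_b)
        for x, y in zip(row_a, row_b)
    )
    return math.exp(-p * sq_dist)


def gram_matrix(samples, p=0.5):
    """Kernel matrix K[m][n] = k(z_m, z_n) over a list of matrix samples."""
    return [[matrix_gaussian_kernel(a, b, p) for b in samples] for a in samples]
```

A kernel matrix built this way could be handed to any support vector machine solver that accepts a precomputed kernel.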
2. The method according to claim 1, characterized in that said step C comprises:
C1. selecting from said training sample set the samples corresponding to any two class labels, yielding $\binom{c}{2}$ pairwise combinations;
C2. extracting the combination containing the samples of two classes $i$ and $j$, where $i, j \in \{1, \dots, c\}$, $i \neq j$, and $c$ is the number of classes of handwritten digit image data, and carrying out the classifier model training procedure, in which $l$ denotes the number of handwritten digit image data items, $z$ denotes Lie group structured data, and $y$ is the class label of the handwritten digit image data, $y \in \{1, \dots, c\}$; the superscript $ij$ denotes data relating to the two classes $i$ and $j$, and the subscript $m$ is an index; $z_m^{ij}$ denotes Lie group structured data relating to classes $i$ and $j$, $l_{ij}$ is the total number of samples of classes $i$ and $j$, and $\bar{y}_m^{ij}$ is the corresponding class label, with $\bar{y}_m^{ij} = -1$ when $y_m^{ij} = i$ and $\bar{y}_m^{ij} = +1$ when $y_m^{ij} = j$; and solving
$$\max \; \sum_{m=1}^{l_{ij}} \beta_m^{ij} - \frac{1}{2} \sum_{m=1}^{l_{ij}} \sum_{n=1}^{l_{ij}} \bar{y}_m^{ij} \bar{y}_n^{ij} \beta_m^{ij} \beta_n^{ij} \, k(z_m^{ij}, z_n^{ij}),$$
where $\bar{y}_m^{ij}$ is the corresponding class label, $m$ and $n$ are integer indices with $m, n \in \{1, \dots, l_{ij}\}$, $\beta_m^{ij}$ are the model coefficients produced by support vector machine training, and $S$ is the regularization parameter required for support vector machine training; and obtaining from the above solution the classifier model
$$f_{ij}(z) = \operatorname{sgn}\left\{ \sum_{m=1}^{l_{ij}} \beta_m^{ij} \bar{y}_m^{ij} \, k(z, z_m^{ij}) + b_{ij} \right\},$$
where $\operatorname{sgn}(\cdot)$ is the sign function and $b_{ij}$ is the model threshold;
C3. extracting another combination and carrying out the above classifier model training procedure, until $\binom{c}{2}$ classifier models are obtained.
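To make the quantities in step C2 concrete, the sketch below evaluates the objective and the resulting decision function for coefficients that are already given; it does not solve the optimization itself, which would normally be left to a quadratic programming or SMO solver. The function names, the Frobenius-norm kernel, and the choice to map a score of exactly zero to +1 are assumptions made for illustration.

```python
import math


def kernel(z_a, z_b, p):
    """Matrix Gaussian kernel, assuming the Frobenius norm."""
    sq = sum((x - y) ** 2 for ra, rb in zip(z_a, z_b) for x, y in zip(ra, rb))
    return math.exp(-p * sq)


def dual_objective(beta, y_bar, samples, p):
    """sum_m beta_m - 1/2 * sum_m sum_n y_m y_n beta_m beta_n k(z_m, z_n)."""
    l_ij = len(samples)
    quad = sum(
        y_bar[m] * y_bar[n] * beta[m] * beta[n] * kernel(samples[m], samples[n], p)
        for m in range(l_ij)
        for n in range(l_ij)
    )
    return sum(beta) - 0.5 * quad


def decide(z, beta, y_bar, samples, b_ij, p):
    """f_ij(z) = sgn(sum_m beta_m y_m k(z, z_m) + b_ij), returned as -1 or +1."""
    score = sum(
        b_m * y_m * kernel(z, z_m, p)
        for b_m, y_m, z_m in zip(beta, y_bar, samples)
    ) + b_ij
    # Assumption: a score of exactly zero is mapped to +1.
    return 1 if score >= 0 else -1
```

With the label mapping of step C2, an output of -1 corresponds to class i and +1 to class j.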
3. method according to claim 1 and 2, is characterized in that, the value that in the described Output rusults of described statistics, this Lie group structured data is divided into a certain class in c class is specially:
According to formula i ∈ 1 ... c} adds up the value that this Lie group structured data in described Output rusults is divided into i class;
Described therefrom maximizing is specially:
According to formula f ( z ) = max i = 1 . . . c &Sigma; i = 1 , i &NotEqual; j c f ij ( z ) Maximizing.
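One common way to realize this tally is majority voting over the pairwise outputs. The sketch below takes that reading, which is an interpretation and not the claimed formula verbatim: each classifier output casts one vote, with -1 counted for class i and +1 for class j (the label mapping of claim 2). The function name `vote` and the tie-breaking rule are hypothetical.

```python
from collections import Counter


def vote(pairwise_outputs):
    """Pick a class from outputs given as {(i, j): f_ij(z)} with values -1 or +1."""
    votes = Counter()
    for (i, j), out in pairwise_outputs.items():
        # -1 stands for class i, +1 for class j (mapping taken from claim 2).
        votes[i if out == -1 else j] += 1
    best = max(votes.values())
    # Assumption: ties are broken toward the smallest class label.
    return min(cls for cls, count in votes.items() if count == best)
```

For c = 3, the outputs `{(1, 2): -1, (1, 3): -1, (2, 3): 1}` give class 1 two votes and class 3 one vote, so class 1 is selected.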
4. A handwritten numeral recognition system based on Lie group structured data, characterized in that it comprises:
a Lie group structured data extraction module, for extracting a corresponding quantity of Lie group structured data from original handwritten digit image data;
a preprocessing module, for taking the correspondence between each item of Lie group structured data and the class label of its corresponding handwritten digit image data as a training sample, obtaining a training sample set corresponding to said quantity of Lie group structured data, and at the same time constructing a matrix Gaussian kernel function for processing Lie group structured data:
$$k(z_a, z_b) = e^{-p \times \|z_a - z_b\|_F^2},$$
where $z_a$ and $z_b$ denote any two items of Lie group structured data, $a \neq b$, $p > 0$ is the kernel function parameter, and $\|\cdot\|_F$ is the matrix norm;
a model training module, for using the support vector machine algorithm, with said matrix Gaussian kernel function as the kernel function, inputting the training samples, and training to obtain classifier models;
wherein said model training module comprises: a combination acquisition unit, for selecting from said training sample set the samples corresponding to any two class labels, yielding $\binom{c}{2}$ pairwise combinations, where $c$ is the number of classes of handwritten digit image data; and a loop training unit, for taking each combination as a unit, using the support vector machine algorithm with said matrix Gaussian kernel function as the kernel function, inputting the samples corresponding to each combination, and training to obtain $\binom{c}{2}$ classifier models;
a classification module, for inputting the Lie group structured data corresponding to the handwritten digit image data to be tested into each of the trained classifier models to obtain the corresponding digit class;
wherein said classification module comprises: a computing unit, for inputting the Lie group structured data corresponding to the handwritten digit image data to be tested into each of said $\binom{c}{2}$ classifier models, so that one item of Lie group structured data yields $\binom{c}{2}$ classifier outputs; a statistics unit, for tallying, from said outputs, the value with which this Lie group structured data is assigned to a given class among the $c$ classes, and for finding the maximum among these values; and a determining unit, for determining the maximum found by said statistics unit as the digit class of the handwritten digit image data corresponding to this Lie group structured data.
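The combination acquisition unit can be illustrated as follows. This is a sketch under assumptions: the function name `pairwise_subsets` is hypothetical, samples are opaque objects, and the relabelling to -1 and +1 follows the mapping stated in the dependent claims (class i to -1, class j to +1).

```python
from itertools import combinations


def pairwise_subsets(samples, labels):
    """Split a labelled training set into one two-class subset per class pair."""
    classes = sorted(set(labels))
    subsets = {}
    for i, j in combinations(classes, 2):
        z_ij, y_bar_ij = [], []
        for z, y in zip(samples, labels):
            if y == i:
                z_ij.append(z)
                y_bar_ij.append(-1)  # class i is relabelled -1
            elif y == j:
                z_ij.append(z)
                y_bar_ij.append(1)  # class j is relabelled +1
        subsets[(i, j)] = (z_ij, y_bar_ij)
    return subsets
```

For the ten digit classes this produces 10 × 9 / 2 = 45 subsets, so the loop training unit would train 45 pairwise classifiers.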
5. The system according to claim 4, characterized in that said loop training unit comprises:
a training subunit, for extracting the combination containing the samples of two classes $i$ and $j$, where $i, j \in \{1, \dots, c\}$, $i \neq j$, and $c$ is the number of classes of handwritten digit image data, and carrying out the classifier model training procedure, in which $l$ denotes the number of handwritten digit image data items, $z$ denotes Lie group structured data, and $y$ is the class label of the handwritten digit image data, $y \in \{1, \dots, c\}$; the superscript $ij$ denotes data relating to the two classes $i$ and $j$, and the subscript $m$ is an index; $z_m^{ij}$ denotes Lie group structured data relating to classes $i$ and $j$, $l_{ij}$ is the total number of samples of classes $i$ and $j$, and $\bar{y}_m^{ij}$ is the corresponding class label, with $\bar{y}_m^{ij} = -1$ when $y_m^{ij} = i$ and $\bar{y}_m^{ij} = +1$ when $y_m^{ij} = j$; and solving
$$\max \; \sum_{m=1}^{l_{ij}} \beta_m^{ij} - \frac{1}{2} \sum_{m=1}^{l_{ij}} \sum_{n=1}^{l_{ij}} \bar{y}_m^{ij} \bar{y}_n^{ij} \beta_m^{ij} \beta_n^{ij} \, k(z_m^{ij}, z_n^{ij}),$$
where $\bar{y}_m^{ij}$ is the corresponding class label, $m$ and $n$ are integer indices with $m, n \in \{1, \dots, l_{ij}\}$, $\beta_m^{ij}$ are the model coefficients produced by support vector machine training, and $S$ is the regularization parameter required for support vector machine training; and obtaining from the above solution the classifier model
$$f_{ij}(z) = \operatorname{sgn}\left\{ \sum_{m=1}^{l_{ij}} \beta_m^{ij} \bar{y}_m^{ij} \, k(z, z_m^{ij}) + b_{ij} \right\},$$
where $\operatorname{sgn}(\cdot)$ is the sign function and $b_{ij}$ is the model threshold;
a loop subunit, for extracting another combination after said training subunit has completed the above classifier model training procedure, and then carrying out the above classifier model training procedure again, until $\binom{c}{2}$ classifier models are obtained.
6. The system according to claim 4 or 5, characterized in that said statistics unit comprises:
a class-value statistics subunit, for tallying, according to the formula, for $i \in \{1, \dots, c\}$, the value with which this Lie group structured data is assigned to class $i$ in said outputs, where class $i$ is the assumed class, among the $c$ classes, that is to be tallied;
a maximum search subunit, for finding, according to the formula $f(z) = \max_{i = 1, \dots, c} \sum_{i = 1, \, i \neq j}^{c} f_{ij}(z)$, the maximum among the values tallied by said statistics subunit.
CN201210041116.0A 2012-02-22 2012-02-22 Handwritten numeral recognition method based on lie group structure data and system thereof Expired - Fee Related CN102722713B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210041116.0A CN102722713B (en) 2012-02-22 2012-02-22 Handwritten numeral recognition method based on lie group structure data and system thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210041116.0A CN102722713B (en) 2012-02-22 2012-02-22 Handwritten numeral recognition method based on lie group structure data and system thereof

Publications (2)

Publication Number Publication Date
CN102722713A CN102722713A (en) 2012-10-10
CN102722713B true CN102722713B (en) 2014-07-16

Family

ID=46948463

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210041116.0A Expired - Fee Related CN102722713B (en) 2012-02-22 2012-02-22 Handwritten numeral recognition method based on lie group structure data and system thereof

Country Status (1)

Country Link
CN (1) CN102722713B (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102982343B (en) * 2012-11-12 2015-03-25 信阳师范学院 Handwritten number recognition and incremental type obscure support vector machine method
CN103218613B (en) * 2013-04-10 2016-04-20 苏州大学 Handwritten Numeral Recognition Method and device
CN103164701B (en) * 2013-04-10 2016-06-01 苏州大学 Handwritten Numeral Recognition Method and device
CN103258211A (en) * 2013-05-31 2013-08-21 苏州大学 Handwriting digital recognition method and system
CN103310217B (en) * 2013-06-20 2016-06-01 苏州大学 Based on Handwritten Numeral Recognition Method and the device of image covariance feature
CN103310237B (en) * 2013-07-09 2016-08-24 苏州大学 Handwritten Numeral Recognition Method and system
CN103400161A (en) * 2013-07-18 2013-11-20 苏州大学 Handwritten numeral recognition method and system
CN108647670A (en) * 2018-05-22 2018-10-12 哈尔滨理工大学 A kind of characteristic recognition method of the lateral vehicle image based on support vector machines
CN109002461B (en) * 2018-06-04 2023-04-18 平安科技(深圳)有限公司 Handwriting model training method, text recognition method, device, equipment and medium
CN109978064A (en) * 2019-03-29 2019-07-05 苏州大学 Lie group dictionary learning classification method based on image set
CN111026897A (en) * 2019-11-19 2020-04-17 武汉大学 Scene classification method and system based on Lie-Fisher remote sensing image
CN111062417A (en) * 2019-11-19 2020-04-24 武汉大学 Lie-Mean-based flat shell defect detection method and system
CN111191618A (en) * 2020-01-02 2020-05-22 武汉大学 KNN scene classification method and system based on matrix group

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1624712A (en) * 2004-12-09 2005-06-08 上海交通大学 Hand writing number identification method based on kernel function

Non-Patent Citations (9)

* Cited by examiner, † Cited by third party
Title
Gradient-Based Adaptation of General Gaussian Kernels;Tobias Glasmachers et al.;《Neural Computation》;20051231;2099-2105 *
Tobias Glasmachers et al..Gradient-Based Adaptation of General Gaussian Kernels.《Neural Computation》.2005,2099-2105.
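Application of a new Lie group classifier to handwritten digits; Wang Xiaoqian et al.; Computer Engineering & Science; 20130228; Vol. 35 (No. 2); 85-90 *
Research on a linear classification algorithm for Lie group machine learning; Chen Ming et al.; Microelectronics & Computer; 20091031; Vol. 26 (No. 10); 170-173 *
Li Fanchang. A theoretical framework for machine learning based on Lie groups. Journal of Yunnan Nationalities University (Natural Sciences Edition). 2004, Vol. 13 (No. 4), 251-255. *
Research on Lie group kernel learning algorithms; Gao Cong et al.; Journal of Frontiers of Computer Science and Technology; 20121115; Vol. 6 (No. 11); 1026-1037 *
Wang Xiaoqian et al. Application of a new Lie group classifier to handwritten digits. Computer Engineering & Science. 2013, Vol. 35 (No. 2), 85-90.
Chen Ming et al. Research on a linear classification algorithm for Lie group machine learning. Microelectronics & Computer. 2009, Vol. 26 (No. 10), 170-173.
Gao Cong et al. Research on Lie group kernel learning algorithms. Journal of Frontiers of Computer Science and Technology. 2012, Vol. 6 (No. 11), 1026-1037.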

Also Published As

Publication number Publication date
CN102722713A (en) 2012-10-10

Similar Documents

Publication Publication Date Title
CN102722713B (en) Handwritten numeral recognition method based on lie group structure data and system thereof
CN109388712A (en) A kind of trade classification method and terminal device based on machine learning
CN102982349B (en) A kind of image-recognizing method and device
CN112016605B (en) Target detection method based on corner alignment and boundary matching of bounding box
CN110348384B (en) Small target vehicle attribute identification method based on feature fusion
CN105389583A (en) Image classifier generation method, and image classification method and device
CN105224951A (en) A kind of vehicle type classification method and sorter
CN104573130A (en) Entity resolution method based on group calculation and entity resolution device based on group calculation
CN104951791A (en) Data classification method and apparatus
CN103258210A (en) High-definition image classification method based on dictionary learning
CN108764302A (en) A kind of bill images sorting technique based on color characteristic and bag of words feature
CN111460927A (en) Method for extracting structured information of house property certificate image
CN103971136A (en) Large-scale data-oriented parallel structured support vector machine classification method
CN103473308B (en) High-dimensional multimedia data classifying method based on maximum margin tensor study
CN103473275A (en) Automatic image labeling method and automatic image labeling system by means of multi-feature fusion
CN111737477A (en) Intellectual property big data-based intelligence investigation method, system and storage medium
CN114663002A (en) Method and equipment for automatically matching performance assessment indexes
CN102360436B (en) Identification method for on-line handwritten Tibetan characters based on components
CN106203508A (en) A kind of image classification method based on Hadoop platform
CN102298703B (en) Classification method based on projection residual errors
US20220215679A1 (en) Method of determining a density of cells in a cell image, electronic device, and storage medium
CN104318224A (en) Face recognition method and monitoring equipment
CN110399432A (en) A kind of classification method of table, device, computer equipment and storage medium
CN111553442B (en) Optimization method and system for classifier chain tag sequence
CN112508000B (en) Method and equipment for generating OCR image recognition model training data

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CB03 Change of inventor or designer information

Inventor after: Zhang Li

Inventor after: Wang Xiaoqian

Inventor after: Yang Jiwen

Inventor after: He Shuping

Inventor after: Li Fanchang

Inventor after: Zhang Zhao

Inventor before: Zhang Li

Inventor before: Wang Xiaoqian

Inventor before: Yang Jiwen

Inventor before: He Shuping

Inventor before: Li Fanchang

COR Change of bibliographic data
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20140716