CN104992166B - A handwritten character recognition method and system based on a robust metric
- Publication number
- CN104992166B (application CN201510450358.9A)
- Authority
- CN
- China
- Prior art keywords
- sample
- projection matrix
- label
- classification
- test
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/32—Digital ink
- G06V30/333—Preprocessing; Feature extraction
- G06V30/36—Matching; Classification
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
- Character Discrimination (AREA)
Abstract
The invention discloses a handwritten character recognition method and system based on a robust metric. Similarity learning is performed on handwritten training samples to construct a weighted similarity graph, so that the local characteristics of all training samples are preserved while the local within-class scatter is compacted and the local between-class scatter is separated. To make the handwriting description more robust, the 1-norm metric is applied to a semi-supervised feature learning model, giving a robust handwritten character recognition method and system that outputs a projection matrix P usable for feature extraction from both in-sample and out-of-sample handwritten images. Induction of out-of-sample images is performed by projecting the test sample onto the projection matrix P; the extracted features are then fed into an efficient label propagation classifier, and the position of the maximum probability in the resulting soft class label determines the class of the test sample, giving the most accurate character recognition result. In addition, a ratio model is established, which reduces the number of model parameters, and the projection matrix P satisfies orthogonality.
Description
Technical field
The present invention relates to the technical fields of computer vision and image recognition, and in particular to a handwritten character recognition method and system based on a robust metric.
Background art
We live in an era of information explosion, and daily life is full of valuable high-dimensional multimedia information. Offline handwriting recognition is one example of extracting and exploiting features from such high-dimensional information. A paper document is digitized by a computer to obtain stored character images; features are then extracted from the images and classified by a series of machine learning methods, and the characters are finally recognized. An efficient and accurate character recognition method could be applied to fields such as office automation and machine translation and would bring substantial social and economic benefit. However, because extracting effective features from handwritten images is difficult, offline handwritten character recognition (referred to simply as handwriting in the present invention) still falls short of practical requirements. Most current research concentrates on feature extraction from handwritten images and has achieved some results. Handwritten images collected from the real world, however, often contain noise, outliers, or missing data, as well as irregular strokes caused by differing writing styles, so more robust algorithms are needed for feature extraction.
In recent years, several robust models based on the 1-norm have been proposed, such as 1-norm principal component analysis (PCA-L1) and 1-norm discriminant locality preserving projections (DLPP-L1). The idea behind these robust algorithms is that traditional distance metrics based on the 2-norm are sensitive to noise and outliers in the data, whereas distance metrics based on the 1-norm overcome this weakness and make the model more robust. These algorithms do produce more robust results, but because only unsupervised and fully supervised versions currently exist, they cannot make full use of both labeled and unlabeled data, so their accuracy still has considerable room for improvement. In addition, some empirical parameters of these algorithms are difficult to set optimally.
Therefore, providing a handwritten character recognition method and system based on a robust metric that extracts handwritten character image features robustly while improving the representation ability and recognition accuracy for handwritten character images is a problem that those skilled in the art urgently need to solve.
Summary of the invention
In view of this, the present invention provides a handwritten character recognition method and system based on a robust metric, which extracts handwritten character image features robustly while improving the representation ability and recognition accuracy for handwritten character images, so as to overcome the limitation of the prior art, which uses only labeled or only unlabeled data and does not fully consider the characteristics of real-world data.
To solve the above technical problem, the present invention provides a handwritten character recognition method based on a robust metric. The method is based on the idea of a 1-norm projection that preserves the discriminability of the labeled data and the locality of all samples, and includes:
Performing similarity learning on handwritten training samples and constructing a weighted similarity graph, so that the local characteristics of all training samples are preserved while the local within-class scatter is compacted and the local between-class scatter is separated; building a robust semi-supervised handwritten character image feature learning model based on the 1-norm metric, the optimization of which outputs a projection matrix P usable for both in-sample and out-of-sample image feature extraction; and establishing a ratio model, which reduces the number of model parameters, such that the optimized projection matrix P satisfies orthogonality;
Performing feature extraction on handwritten test samples using the projection matrix P, where induction of out-of-sample images is accomplished mainly by mapping the test sample onto the projection matrix P;
Classifying the dimension-reduced test sample features using a label propagation classifier, outputting the soft class label of the test sample, and taking the position of the maximum probability in the soft class label to determine the class of the test sample, thereby obtaining the character recognition result;
Wherein the values in the soft class label represent the probabilities that the test sample belongs to each class.
Optionally, in the above method, building the robust semi-supervised handwritten character image feature learning model based on the 1-norm metric, the optimization of which outputs a projection matrix P usable for both in-sample and out-of-sample image feature extraction, includes:
Given an original training sample set that may contain noise, $X=[x_1,x_2,\dots,x_N]\in\mathbb{R}^{n\times N}$, where n is the dimension of the training samples and N is the number of training samples. The training set contains a labeled sample set covering c classes and a sample set without any labels [set definitions not reproduced in the source text], where c is an integer greater than 2 and the numbers of labeled samples l and unlabeled samples u satisfy l+u=N. The l labeled samples carry labels [label set not reproduced in the source text], the label of sample $x_i$ being $y_i$ ($i\le l$);
A projection matrix $P\in\mathbb{R}^{n\times d}$ with both discriminative and locality-preserving properties is computed from the original training sample set. The projection matrix P, which can extract features from out-of-sample handwritten character images, is obtained by solving the following optimization problem:
[objective function not reproduced in the source text]
s.t. $P^{T}P=I_d$
Where $\|\cdot\|_1$ is the 1-norm, defined as $\|S\|_1=\sum_{i,j}|S_{i,j}|$, $S_{i,j}$ is the (i, j)-th element of the matrix S, and W, together with a second matrix whose symbol is not reproduced in the source text, are weight coefficient matrices.
Optionally, in the above method, performing feature extraction on handwritten test sample images using the projection matrix P, where induction of out-of-sample images is accomplished mainly by mapping the test sample image onto the projection matrix P, includes:
Projecting the training samples and the test samples using the projection matrix P to complete handwritten character image feature extraction.
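The projection step can be sketched as follows. This is a minimal illustration that assumes the extracted feature of a sample x is the product P^T x (the source text does not reproduce the exact formula); the helper names `project` and `is_orthonormal` and the toy matrix are hypothetical.

```python
def project(P, x):
    """Return the d-dimensional feature P^T x for an n-by-d matrix P."""
    n, d = len(P), len(P[0])
    return [sum(P[i][k] * x[i] for i in range(n)) for k in range(d)]


def is_orthonormal(P, tol=1e-9):
    """Check the constraint P^T P = I_d up to a numerical tolerance."""
    n, d = len(P), len(P[0])
    for a in range(d):
        for b in range(d):
            gram = sum(P[i][a] * P[i][b] for i in range(n))
            if abs(gram - (1.0 if a == b else 0.0)) > tol:
                return False
    return True


# Hypothetical 3-by-2 projection matrix and 3-dimensional samples.
P = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
train_features = [project(P, x) for x in [[3.0, 4.0, 5.0]]]
test_features = [project(P, x) for x in [[1.0, 2.0, 9.0]]]
```

The same matrix is applied to training and test samples, which is what allows features to be extracted from images not seen during training.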
The present invention also provides a handwriting recognition system based on a robust metric, including:
A training module, configured to perform similarity learning on handwritten training samples and construct a weighted similarity graph, so that the local characteristics of all training samples are preserved while the local within-class scatter is compacted and the local between-class scatter is separated; to build a robust semi-supervised handwritten character image feature learning model based on the 1-norm metric, the optimization of which outputs a projection matrix P usable for both in-sample and out-of-sample image feature extraction; and to establish a ratio model, which reduces the number of model parameters, such that the optimized projection matrix P satisfies orthogonality;
A test preprocessing module, configured to perform feature extraction on handwritten test samples using the projection matrix P, where induction of out-of-sample images is accomplished mainly by mapping the test sample onto the projection matrix P;
A test module, configured to classify the dimension-reduced test sample features using a label propagation classifier, output the soft class label of the test sample, and take the position of the maximum probability in the soft class label to determine the class of the test sample, thereby obtaining the character recognition result;
Wherein the values in the soft class label represent the probabilities that the test sample belongs to each class.
As can be seen from the above technical solution, compared with the prior art, the invention discloses a handwritten character recognition method and system based on a robust metric. Similarity learning is performed on handwritten training samples to construct a weighted similarity graph, so that the local characteristics of all training samples are preserved while the local within-class scatter is compacted and the local between-class scatter is separated. To make the handwriting description more robust, a robust semi-supervised handwritten character image feature learning model based on the 1-norm metric is built, the optimization of which outputs a projection matrix P usable for both in-sample and out-of-sample image feature extraction. Induction of out-of-sample images is performed by projecting the test sample onto the projection matrix P; the extracted features are then fed into an efficient label propagation classifier, and the position of the maximum probability in the resulting soft class label determines the class of the test sample, giving the most accurate character recognition result. In addition, a ratio model is established, which reduces the number of model parameters, and the projection matrix P satisfies orthogonality.
Description of the drawings
To explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of a handwritten character recognition method based on a robust metric provided by an embodiment of the present invention;
Fig. 2 is a structural block diagram of a handwriting recognition system based on a robust metric provided by an embodiment of the present invention;
Fig. 3 is a recognition schematic diagram of a handwritten character recognition method based on a robust metric provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The core of the present invention is to provide a handwritten character recognition method and system based on a robust metric that extracts handwritten character image features robustly while improving the representation ability and recognition accuracy for handwritten character images, so as to overcome the limitation of the prior art, which uses only labeled or only unlabeled data and does not fully consider the characteristics of real-world data.
The invention discloses a handwritten character recognition method and system based on a robust metric. Similarity learning is performed on handwritten training samples to construct a weighted similarity graph, so that the local characteristics of all training samples are preserved while the local within-class scatter is compacted and the local between-class scatter is separated. To make the handwriting description more robust, a robust semi-supervised handwritten character image feature learning model based on the 1-norm metric is built, the optimization of which outputs a projection matrix P usable for both in-sample and out-of-sample image feature extraction. Induction of out-of-sample images is performed by projecting the test sample onto the projection matrix P; the extracted features are then fed into an efficient label propagation classifier, and the position of the maximum probability in the resulting soft class label determines the class of the test sample, giving the most accurate character recognition result. In addition, a ratio model is established, which reduces the number of model parameters, and the projection matrix P satisfies orthogonality.
To help those skilled in the art better understand the solution of the present invention, the invention is described in further detail below with reference to the drawings and specific embodiments.
The present invention was tested on three handwritten character image databases: CAS, USPS, and ORHD. CAS is the handwriting database of the Institute of Automation, Chinese Academy of Sciences, containing 3755 Chinese characters and 171 letters, digits, or symbols. USPS is the handwritten digit database of the United States Postal Service, containing 9298 handwritten digits 0-9. ORHD is a database from the University of California, Irvine (UCI) machine learning repository, containing 5620 digit samples whose feature values are integers in the range 0-16. These databases were collected from varied sources, so the test results are generally representative.
A handwritten sample set is given and divided into a training set and a test set, containing the original training samples and the test samples respectively.
Referring to Fig. 1, which shows a flow chart of a handwritten character recognition method based on a robust metric provided by an embodiment of the present invention. The method is based on the idea of a 1-norm projection that preserves the discriminability of the labeled data and the locality of all samples, and may include the following steps:
Step S100: perform similarity learning on handwritten training samples and construct a weighted similarity graph, so that the local characteristics of all training samples are preserved while the local within-class scatter is compacted and the local between-class scatter is separated. To make the system more robust, step S100 also builds a robust semi-supervised handwritten character image feature learning model based on the 1-norm metric, the optimization of which outputs a projection matrix P usable for both in-sample and out-of-sample image feature extraction. In addition, a ratio model is established, which reduces the number of model parameters, and the optimized projection matrix P satisfies orthogonality;
Step S101: perform feature extraction on handwritten test samples using the projection matrix P obtained in step S100, where induction of out-of-sample images is accomplished mainly by mapping the test sample onto the projection matrix P;
Step S102: classify the dimension-reduced test sample features using a label propagation classifier, output the soft class label of the test sample, and take the position of the maximum probability in the soft class label to determine the class of the test sample, thereby obtaining the character recognition result. The values in the soft class label represent the probabilities that the test sample belongs to each class.
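The final decision in step S102, taking the position of the maximum probability in the soft class label, amounts to an argmax. The sketch below is illustrative only; the function name `predict_class` and the example probabilities are assumptions.

```python
def predict_class(soft_label):
    """Return the index of the largest probability in a soft class label."""
    best = 0
    for k, prob in enumerate(soft_label):
        if prob > soft_label[best]:
            best = k
    return best


# Hypothetical soft label over four classes for one test sample.
soft_label = [0.05, 0.10, 0.70, 0.15]
print(predict_class(soft_label))  # 2
```

Ties are resolved in favor of the lowest class index here, which is a design choice of this sketch rather than something the source specifies.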
In the present invention, discriminative and geometry-preserving learning is performed on the handwritten training images according to step S100 above, and a robust semi-supervised feature learning model based on the 1-norm metric is proposed, the optimization of which outputs a projection matrix P usable for feature extraction from out-of-sample test images. The detailed process is as follows:
Given an original set of handwritten vectors that may contain noise, $X=[x_1,x_2,\dots,x_N]\in\mathbb{R}^{n\times N}$, called the original training sample set (where n is the dimension of a handwritten sample and N is the number of samples), the training set contains a labeled sample set covering c classes and a sample set without any labels [set definitions not reproduced in the source text], where c is an integer greater than 2 and the numbers of labeled samples l and unlabeled samples u satisfy l+u=N. The l labeled samples carry labels [label set not reproduced in the source text], the label of sample $x_i$ being $y_i$ ($i\le l$).
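A projection matrix $P\in\mathbb{R}^{n\times d}$ with both discriminative and locality-preserving properties is computed from the original training set. Specifically, the projection matrix P, which can extract features from out-of-sample handwritten character images, is obtained by solving the following optimization problem:
[objective function not reproduced in the source text]
s.t. $P^{T}P=I_d$
Where $\|\cdot\|_1$ is the 1-norm, defined as follows:
$\|S\|_1=\sum_{i,j}|S_{i,j}|$
Where $S_{i,j}$ is the (i, j)-th element of the matrix S. The weight coefficient matrices, including W, are defined as follows:
[definition not reproduced in the source text]
Where M is the reconstruction coefficient matrix and γ ∈ [0,1] is a trade-off parameter between feature extraction from labeled data and from unlabeled data. A further matrix, whose symbol is not reproduced in the source text, is defined as follows:
[definition not reproduced in the source text]
The reconstruction coefficient matrix M can be obtained by solving the following optimization problem:
[optimization problem not reproduced in the source text]
Where $\|\cdot\|$ is the 2-norm, defined as follows:
[definition not reproduced in the source text]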
In the computation, this embodiment uses an iterative method that approaches a locally optimal solution step by step. Consider first the case of reducing the dimension to 1. The reconstruction coefficient matrix M is first obtained by solving the following optimization problem with an existing method:
[optimization problem not reproduced in the source text]
The weight coefficient matrices are then computed:
[formulas not reproduced in the source text]
This gives the weight coefficient matrix W of the problem:
[formula not reproduced in the source text]
The 1-norm optimization is solved next. Define the sign function:
[definition not reproduced in the source text]
Substituting it into the original objective function gives:
[formula not reproduced in the source text]
Then define the increment δ(t):
[formula not reproduced in the source text]
Then update p(t+1) = p(t) + βδ(t), where β is a small positive number. If the value of F(p(t+1)) no longer increases noticeably, output p* = p(t+1); otherwise continue iterating until convergence.
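The update rule p(t+1) = p(t) + βδ(t) can be illustrated with a sign-based ascent. Because the patent's ratio objective and its increment formula are not reproduced in this text, the sketch below swaps in the simpler stand-in objective F(p) = Σ_i |p^T x_i| with ‖p‖ = 1 (the PCA-L1 objective mentioned in the background), for which the increment is δ(t) = Σ_i sign(p^T x_i) x_i. All function names, the renormalization step, and the toy samples are assumptions for illustration only.

```python
import math


def sign(v):
    """Sign function used to linearize |v| around the current iterate."""
    return 1.0 if v >= 0 else -1.0


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def normalize(p):
    norm = math.sqrt(dot(p, p))
    return [pi / norm for pi in p]


def objective(p, samples):
    """Stand-in objective F(p) = sum_i |p^T x_i| (not the patent's ratio model)."""
    return sum(abs(dot(p, x)) for x in samples)


def l1_ascent(samples, p0, beta=0.01, eps=1e-6, max_iter=10000):
    """Iterate p <- normalize(p + beta * delta) until F stops growing by eps."""
    p = normalize(p0)
    f_old = objective(p, samples)
    for _ in range(max_iter):
        # Increment delta(t) = sum_i sign(p^T x_i) * x_i.
        delta = [0.0] * len(p)
        for x in samples:
            s = sign(dot(p, x))
            for j, xj in enumerate(x):
                delta[j] += s * xj
        p_new = normalize([pj + beta * dj for pj, dj in zip(p, delta)])
        f_new = objective(p_new, samples)
        if f_new - f_old < eps:
            # Growth is no longer noticeable: stop, keeping the better iterate.
            return p_new if f_new >= f_old else p
        p, f_old = p_new, f_new
    return p


# Hypothetical 2-D samples spread mainly along the first axis.
samples = [[2.0, 0.1], [-3.0, -0.1], [4.0, 0.2], [-1.0, 0.0]]
p_star = l1_ascent(samples, [1.0, 1.0])
```

The step size `beta` and tolerance `eps` reuse the values β = 0.01 and ε = 10^-6 listed in the algorithm's initialization; the source states that β is reduced during the iteration, which this sketch does not model.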
The above describes the case of reducing to one dimension, i.e. d = 1. The case of reducing to multiple dimensions, i.e. d > 1, is described next:
First set $p_0=0$ and $(x_i)_0=x_i$ (i = 1, 2, ..., N). Then, for each i = 1, 2, ..., N, compute the following:
[formula not reproduced in the source text]
Substitute $(x_i)_k$ into the iterative method above to compute $p_k$.
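The per-sample update for the multi-dimensional case is missing from this text. Greedy 1-norm methods such as PCA-L1 typically use a deflation step, (x_i)_k = (x_i)_{k-1} − p_{k-1} p_{k-1}^T (x_i)_{k-1}, which removes from each sample the component along the direction just found; the sketch below assumes that form, which may differ from the patent's actual formula. The function name `deflate` is hypothetical.

```python
def deflate(samples, p):
    """Remove from every sample its component along the unit vector p.

    Assumed form: x <- x - p * (p^T x). With p = 0 (the initial p_0),
    the samples are returned unchanged, matching (x_i)_0 = x_i.
    """
    deflated = []
    for x in samples:
        coeff = sum(pi * xi for pi, xi in zip(p, x))
        deflated.append([xi - coeff * pi for pi, xi in zip(p, x)])
    return deflated


# Hypothetical example: after extracting direction (1, 0), only the
# second coordinate of each sample remains.
print(deflate([[3.0, 4.0], [-1.0, 2.0]], [1.0, 0.0]))  # [[0.0, 4.0], [0.0, 2.0]]
```

Because every deflated sample is orthogonal to the directions already extracted, the next direction found from them is orthogonal as well, which is consistent with the stated constraint that P satisfies orthogonality.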
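The specific algorithm is as follows:
Handwriting feature extraction algorithm based on a robust metric
Input: raw data matrix [symbol not reproduced in the source text]; control parameters γ, β, d.
Output: projection matrix P*.
Initialization: $d_0=0$, $p_0=0$, $(x_i)_0=x_i$, k=0, γ=0.2, β=0.01, ε=$10^{-6}$
step1: Solve for and compute [formula not reproduced in the source text]
step2: Compute: [formula not reproduced in the source text]
step3: While $d_0<d$: k ← k+1, and for each i = 1, 2, ..., N compute [formula not reproduced in the source text]
Otherwise output P* = P
step4: Let [formula not reproduced in the source text] ($m_i$ denotes the mean of the samples of class i), and normalize [formula not reproduced in the source text]
step5: while not converged do
Compute [formula not reproduced in the source text]
Compute the increment [formula not reproduced in the source text]
Update [formula not reproduced in the source text]
Check for convergence:
If [condition not reproduced in the source text], stop and set P(:, $d_0$) = p(t+1);
Otherwise t = t+1
end while
step6: Set $d_0$ ← $d_0$+1 and continue with step3.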
The iteration parameters in this example are chosen as γ = 0.2 and β = 0.01, where β is an initial value that is reduced continually during the iteration.
This yields the handwritten character image feature extraction matrix P.
In the present invention, step S101, performing feature extraction on handwritten test samples using the projection matrix P, where induction of out-of-sample images is accomplished mainly by mapping the test sample onto the projection matrix P, specifically includes:
Extracting the features of the handwritten test image samples using the projection matrix P obtained above to generate a new test set. Specifically, based on the training set, the training samples and test samples are embedded into the projection space through the projection matrix [symbol not reproduced in the source text], which completes handwritten character image feature extraction and generates the training set and test set after feature extraction.
The feature extraction results for a training sample $x_{train}$ and a test sample $x_{test}$ are expressed as follows: [formula not reproduced in the source text], where [symbols not reproduced in the source text] are the feature extraction results of the original training sample and test sample, respectively.
In the present invention, step S102, classifying the dimension-reduced test sample features using a label propagation classifier, outputting the soft class label of the test sample, and taking the position of the maximum probability in the soft class label to determine the class of the test sample, thereby obtaining the character recognition result, specifically includes:
After the character image features of the original training set and test set have been computed in step S101, the handwritten test sample set and training set after feature extraction [set definitions not reproduced in the source text] are easily constructed, where each extracted feature corresponds to an original sample $x_i$. The test set samples are then classified using the label propagation classifier, giving the classification results for the test set.
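The source does not specify which label propagation classifier is used. As a hedged illustration, the sketch below implements one common variant (iterating F ← αSF + (1 − α)Y on a Gaussian affinity graph with symmetric normalization); the function name `label_propagation`, the parameters `alpha` and `sigma`, and the toy features are assumptions, not the patent's classifier.

```python
import math


def label_propagation(features, labels, n_classes, alpha=0.9, sigma=1.0, n_iter=200):
    """Return one soft label (list of class probabilities) per sample.

    features: list of feature vectors (training and test samples together).
    labels:   class index for labeled samples, None for unlabeled ones.
    """
    n = len(features)
    # Gaussian affinities with a zero diagonal.
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
                W[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    # Symmetric normalization S = D^(-1/2) W D^(-1/2).
    deg = [sum(row) for row in W]
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    # Initial label matrix: one-hot rows for labeled samples, zeros otherwise.
    Y = [[1.0 if labels[i] == k else 0.0 for k in range(n_classes)] for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(n_iter):
        F = [[alpha * sum(S[i][j] * F[j][k] for j in range(n)) + (1.0 - alpha) * Y[i][k]
              for k in range(n_classes)] for i in range(n)]
    # Normalize each row so the scores can be read as probabilities.
    soft = []
    for row in F:
        total = sum(row)
        soft.append([v / total if total > 0 else 1.0 / n_classes for v in row])
    return soft


# Hypothetical 1-D features: two labeled samples and two unlabeled ones.
features = [[0.0], [0.1], [5.0], [5.1]]
labels = [0, None, 1, None]
soft_labels = label_propagation(features, labels, n_classes=2)
```

Each row of `soft_labels` sums to 1, and the position of its largest entry gives the predicted class of the corresponding sample.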
Compared with the prior art, the invention discloses a handwritten character recognition method and system based on a robust metric. Similarity learning is performed on handwritten training samples to construct a weighted similarity graph, so that the local characteristics of all training samples are preserved while the local within-class scatter is compacted and the local between-class scatter is separated. To make the handwriting description more robust, a robust semi-supervised handwritten character image feature learning model based on the 1-norm metric is built, the optimization of which outputs a projection matrix P usable for both in-sample and out-of-sample image feature extraction. Induction of out-of-sample images is performed by projecting the test sample onto the projection matrix P; the extracted features are then fed into an efficient label propagation classifier, and the position of the maximum probability in the resulting soft class label determines the class of the test sample, giving the most accurate character recognition result. In addition, a ratio model is established, which reduces the number of model parameters, and the projection matrix P satisfies orthogonality.
Corresponding to the handwritten character recognition method based on a robust metric disclosed in the above embodiment, an embodiment of the present invention also provides a handwriting recognition system based on a robust metric. Referring to Fig. 2, the system may include the following:
Training module 201, configured to perform similarity learning on handwritten training samples and construct a weighted similarity graph, so that the local characteristics of all training samples are preserved while the local within-class scatter is compacted and the local between-class scatter is separated. It also builds a robust semi-supervised handwritten character image feature learning model based on the 1-norm metric, the optimization of which outputs a projection matrix P usable for both in-sample and out-of-sample image feature extraction. In addition, a ratio model is established, which reduces the number of model parameters, and the optimized projection matrix P satisfies orthogonality;
Test preprocessing module 202, configured to perform feature extraction on handwritten test samples using the projection matrix P, where induction of out-of-sample images is accomplished mainly by mapping the test sample onto the projection matrix P;
Test module 203, configured to classify the dimension-reduced test sample features using a label propagation classifier, output the soft class label of the test sample, and take the position of the maximum probability in the soft class label to determine the class of the test sample, thereby obtaining the character recognition result;
Wherein the values in the soft class label represent the probabilities that the test sample belongs to each class.
Table 1 compares the recognition results of the method of the present invention with those of the semi-supervised maximum margin criterion algorithm (SSMMC), the semi-supervised linear discriminant analysis algorithm (SSLDA), and the 1-norm discriminant locality preserving projections algorithm (DLPP-L1), giving the average and highest recognition rates of each method in the experiments. In this example, the SSMMC, SSLDA, and DLPP-L1 methods used for comparison each extract test sample features with their own computed projection matrix P, and all use the label propagation classifier for classification.
Table 1. Comparison of recognition results of the present invention with the SSMMC, SSLDA, and DLPP-L1 methods
Experimental results on three real data sets, namely (a) CAS Offline Chinese Handwriting Digits, (b) USPS, and (c) Optical Recognition of Handwritten Digits, show that the method of the present invention can be used effectively for automatic feature extraction from handwriting.
Referring to Fig. 3, which shows a recognition schematic diagram of a handwritten character recognition method based on a robust metric provided by an embodiment of the present invention.
The experimental results show that the handwritten character image feature extraction and recognition performance of the present invention is clearly better than that of the related SSMMC, SSLDA, and DLPP-L1 methods, and that it is also more stable.
In summary, the invention discloses a handwritten character recognition method and system based on a robust metric. Similarity learning is performed on handwritten training samples to construct a weighted similarity graph, so that the local characteristics of all training samples are preserved while the local within-class scatter is compacted and the local between-class scatter is separated. To make the handwriting description more robust, a robust semi-supervised handwritten character image feature learning model based on the 1-norm metric is built, the optimization of which outputs a projection matrix P usable for both in-sample and out-of-sample image feature extraction. Induction of out-of-sample images is performed by projecting the test sample onto the projection matrix P; the extracted features are then fed into an efficient label propagation classifier, and the position of the maximum probability in the resulting soft class label determines the class of the test sample, giving the most accurate character recognition result. In addition, a ratio model is established, which reduces the number of model parameters, and the projection matrix P satisfies orthogonality.
It should be noted that the embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments may be referred to one another. Because the system embodiment is basically similar to the method embodiment, its description is relatively brief; for relevant details, see the description of the method embodiment.
The handwritten character recognition method and system based on a robust metric provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention; the description of the above embodiments is intended only to help in understanding the method of the invention and its core idea. It should be noted that those of ordinary skill in the art may make several improvements and modifications to the invention without departing from its principle, and these improvements and modifications also fall within the protection scope of the claims of the invention.
Claims (4)
1. A handwritten character recognition method based on a robust metric, characterized in that, based on the idea of a 1-norm projection that combines discrimination of the labeled data with local preservation of all samples, the method comprises:
performing similarity learning on handwritten training samples and constructing a weighted similarity graph, so as to preserve the local characteristics of all training samples while making the local intra-class scatter compact and separating the local inter-class scatter; building a robust semi-supervised handwritten character image feature learning model based on the 1-norm metric, the optimization of which outputs a projection matrix P usable for feature extraction from both in-sample and out-of-sample images; and, by formulating a ratio model, reducing the model parameters, the projection matrix P output by the optimization satisfying the orthogonality property;
performing feature extraction on handwritten test samples using the projection matrix P, wherein out-of-sample images are handled by mapping the test sample onto the projection matrix P;
using a label propagation classifier to classify the dimension-reduced test sample features, outputting the class soft label of the test sample, and taking the position of the maximum probability in the class soft label to determine the class of the test sample, thereby obtaining a character recognition result;
wherein the values in the class soft label represent the probabilities that the test sample belongs to each class.
2. The method according to claim 1, characterized in that building the robust semi-supervised handwritten character image feature learning model based on the 1-norm metric, the optimization of which outputs a projection matrix P usable for feature extraction from both in-sample and out-of-sample images, comprises:
given an original training sample set containing noise [formula missing in source], where n is the dimension of the training samples and N is the number of training samples, the training sample set comprising a labeled sample set with c class labels [formula missing in source] and a sample set without any labels [formula missing in source], where c is an integer greater than 2 and the sample counts satisfy l + u = N; letting [formula missing in source] be the labels of the l labeled samples, the label of sample x_i being y_i (i ≤ l);
computing, from the original training sample set, a projection matrix [formula missing in source] having both discriminative and locality-preserving properties, including solving the following optimization problem to obtain a projection matrix P capable of extracting features from out-of-sample handwritten character images:
[objective function missing in source]
s.t. P^T P = I_d
where ‖·‖_1 is the 1-norm, defined as ‖S‖_1 = Σ_{i,j} |S_{i,j}|, S_{i,j} denotes the (i, j)-th element of the matrix S, and [symbol missing in source] and W are weight coefficient matrices.
3. The method according to claim 2, characterized in that performing feature extraction on handwritten test sample images using the projection matrix P, wherein out-of-sample images are handled by mapping the test sample image onto the projection matrix P, comprises:
projecting the training samples and the test samples using the projection matrix P to complete handwritten character image feature extraction.
4. A handwritten character recognition system based on a robust metric, characterized by comprising:
a training module, configured to perform similarity learning on handwritten training samples and construct a weighted similarity graph, so as to preserve the local characteristics of all training samples while making the local intra-class scatter compact and separating the local inter-class scatter; to build a robust semi-supervised handwritten character image feature learning model based on the 1-norm metric, the optimization of which outputs a projection matrix P usable for feature extraction from both in-sample and out-of-sample images; and, by formulating a ratio model, to reduce the model parameters, the projection matrix P output by the optimization satisfying the orthogonality property;
a test preprocessing module, configured to perform feature extraction on handwritten test samples using the projection matrix P, wherein out-of-sample images are handled by mapping the test sample onto the projection matrix P;
a test module, configured to use a label propagation classifier to classify the dimension-reduced test sample features, output the class soft label of the test sample, and take the position of the maximum probability in the class soft label to determine the class of the test sample, thereby obtaining a character recognition result;
wherein the values in the class soft label represent the probabilities that the test sample belongs to each class.
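The two quantities that survive intact in claim 2, the entrywise 1-norm ‖S‖_1 = Σ_{i,j} |S_{i,j}| and the constraint P^T P = I_d, can be checked numerically with a short sketch. The helper names `entrywise_l1_norm` and `is_orthonormal`, the tolerance, and the example matrices are all hypothetical; the objective function itself is not reproduced in the text and is not attempted here.

```python
def entrywise_l1_norm(S):
    """||S||_1 = sum over (i, j) of |S[i][j]|, as defined in claim 2."""
    return sum(abs(value) for row in S for value in row)


def is_orthonormal(P, tol=1e-9):
    """Check P^T P = I_d for an n x d matrix P given as a list of rows."""
    n, d = len(P), len(P[0])
    for a in range(d):
        for b in range(d):
            # (P^T P)[a][b] is the dot product of columns a and b of P.
            dot = sum(P[i][a] * P[i][b] for i in range(n))
            expected = 1.0 if a == b else 0.0
            if abs(dot - expected) > tol:
                return False
    return True


# Example matrix chosen only for illustration: 1 + 2 + 0.5 + 3.
S = [[1.0, -2.0], [0.5, 3.0]]
norm_value = entrywise_l1_norm(S)

# Toy 3 x 2 matrix whose columns are orthonormal, so the constraint holds.
P = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
constraint_ok = is_orthonormal(P)
```

Because the 1-norm sums absolute values rather than squares, a single large entry contributes linearly instead of quadratically, which is the usual motivation for preferring it when the training samples may contain noise.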
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510450358.9A CN104992166B (en) | 2015-07-28 | 2015-07-28 | A kind of Manuscripted Characters Identification Method and system based on robust measurement |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104992166A CN104992166A (en) | 2015-10-21 |
CN104992166B true CN104992166B (en) | 2018-09-11 |
Family
ID=54303979
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510450358.9A Active CN104992166B (en) | 2015-07-28 | 2015-07-28 | A kind of Manuscripted Characters Identification Method and system based on robust measurement |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104992166B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105528620B (en) * | 2015-12-11 | 2019-12-06 | 苏州大学 | method and system for combined robust principal component feature learning and visual classification |
CN106845358B (en) * | 2016-12-26 | 2020-11-10 | 苏州大学 | Method and system for recognizing image features of handwritten characters |
CN109472370B (en) * | 2018-09-30 | 2021-09-10 | 深圳市元征科技股份有限公司 | Method and device for classifying maintenance plants |
CN111753930B (en) * | 2020-06-01 | 2024-02-13 | 安徽理工大学 | Handwriting digital recognition method based on double-view label elastic feature learning |
CN115830599B (en) * | 2023-02-08 | 2023-04-21 | 成都数联云算科技有限公司 | Industrial character recognition method, model training method, device, equipment and medium |
CN117671704B (en) * | 2024-01-31 | 2024-04-26 | 常熟理工学院 | Handwriting digital recognition method, handwriting digital recognition device and computer storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103793704A (en) * | 2014-03-11 | 2014-05-14 | 苏州大学 | Supervising neighborhood preserving embedding face recognition method and system and face recognizer |
CN103955676A (en) * | 2014-05-12 | 2014-07-30 | 苏州大学 | Human face identification method and system |
CN104504412A (en) * | 2014-11-28 | 2015-04-08 | 苏州大学 | Method and system for extracting and identifying handwriting stroke features |
CN104751191A (en) * | 2015-04-23 | 2015-07-01 | 重庆大学 | Sparse self-adaptive semi-supervised manifold learning hyperspectral image classification method |
CN104794489A (en) * | 2015-04-23 | 2015-07-22 | 苏州大学 | Deep label prediction based inducing type image classification method and system |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8457409B2 (en) * | 2008-05-22 | 2013-06-04 | James Ting-Ho Lo | Cortex-like learning machine for temporal and hierarchical pattern recognition |
2015-07-28 | CN | CN201510450358.9A | patent CN104992166B | Active
Also Published As
Publication number | Publication date |
---|---|
CN104992166A (en) | 2015-10-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104992166B (en) | A kind of Manuscripted Characters Identification Method and system based on robust measurement | |
CN106845358B (en) | Method and system for recognizing image features of handwritten characters | |
Yao et al. | Strokelets: A learned multi-scale representation for scene text recognition | |
Cai et al. | Effective active skeleton representation for low latency human action recognition | |
Bai et al. | Strokelets: A Learned Multi-Scale Representation for Scene Text Recognition | |
CN106599913B (en) | A kind of multi-tag imbalance biomedical data classification method based on cluster | |
US20170076152A1 (en) | Determining a text string based on visual features of a shred | |
CN108664975B (en) | Uyghur handwritten letter recognition method and system and electronic equipment | |
CN105844223A (en) | Face expression algorithm combining class characteristic dictionary learning and shared dictionary learning | |
CN104834941A (en) | Offline handwriting recognition method of sparse autoencoder based on computer input | |
CN104008375A (en) | Integrated human face recognition mehtod based on feature fusion | |
Djeddi et al. | Artificial immune recognition system for Arabic writer identification | |
Zhang et al. | Automatic discrimination of text and non-text natural images | |
CN110472652A (en) | A small amount of sample classification method based on semanteme guidance | |
CN103593674A (en) | Cervical lymph node ultrasonoscopy feature selection method | |
CN103020167A (en) | Chinese text classification method for computer | |
CN108805061A (en) | Hyperspectral image classification method based on local auto-adaptive discriminant analysis | |
CN104268507A (en) | Manual alphabet identification method based on RGB-D image | |
Li et al. | Hierarchical shape primitive features for online text-independent writer identification | |
CN115937873A (en) | Online handwriting verification system and method based on recognizable single character | |
CN103839074B (en) | Image classification method based on matching of sketch line segment information and space pyramid | |
Gu et al. | Unsupervised and semi-supervised robust spherical space domain adaptation | |
CN102855488A (en) | Three-dimensional gesture recognition method and system | |
Li et al. | Fuzzy bag of words for social image description | |
CN104978569A (en) | Sparse representation based incremental face recognition method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||