CN102184395B - String-kernel-based hand-drawn sketch recognition method - Google Patents

Publication number
CN102184395B
Authority
CN
China
Prior art keywords
character
sketch
string
sampling
character string
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 201110151853
Other languages
Chinese (zh)
Other versions
CN102184395A (en)
Inventor
廖士中
段孟华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN 201110151853
Publication of CN102184395A
Application granted
Publication of CN102184395B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Character Discrimination (AREA)

Abstract

The invention discloses a string-kernel-based hand-drawn sketch recognition method. The method comprises the following steps: first, a hand-drawn sketch is mapped to a feature string on the basis of a region-filling concept; second, a support vector machine (SVM) with a string kernel is trained on the training samples to obtain a classifier; finally, the trained classifier classifies and recognizes the sketch to be recognized, mapping the fuzzy, irregular hand-drawn sketch to a precise geometric shape. Compared with the prior art, the method is independent of the position, size, and drawing manner of the sketch, so users can draw in their own habitual way; its recognition accuracy is relatively high, and it is easy to implement.

Description

Hand-drawn sketch recognition method based on a string kernel
Technical field
The present invention relates to hand-drawn sketch recognition based on kernel methods.
Background art
The background art relevant to the present invention covers the following three aspects:
I. Hand-drawn sketch recognition
Hand-drawn sketch recognition maps the fuzzy sketch obtained through pen-based interaction to a precise graphical representation. That is, it mines shape constraints from the ever-growing sketch information produced during human-computer interaction and, on the basis of understanding the user's original intent, recognizes free, irregular sketches as regular, precise geometric shapes.
At present there are three main approaches to sketch recognition: methods based on stroke representation; methods based on primitives such as straight lines, arcs, and curves; and methods based on geometric features.
1. Stroke-based representation
In general, a sketch obtained by a sketch recognition system through pen-based interaction appears as a number of strokes, each consisting of the sequence of sample points recorded between pen-down and pen-up. Strokes are the key information from which the shape of a figure is composed. Early systems treated a stroke as a symbol with a specific assigned meaning, so that recognizing the stroke was equivalent to recognizing the figure. Among single-stroke recognizers, the classic example is the Rubine method. It is simple and effective, but it requires the user to draw each figure with a fixed stroke pattern, and it handles only relatively simple figures.
2. Primitive-based representation
Primitive-based representation describes a figure as a combination of primitives such as straight lines, arcs, and curves under certain spatial relations. Primitive-based methods usually include steps such as segmentation, fitting, regularization, and recognition. A sketch drawn by the user can be decomposed automatically into primitives and their relations, training of graphic templates requires no user intervention, the template library is easy to extend, and adaptability is strong. However, training the templates and testing figures against all of their defined attributes inevitably increases system overhead.
3. Recognition based on geometric features
Methods based on geometric features treat the figure directly as the recognition unit and extract geometric features of the sketch directly for classification. Usually the sketch is first segmented in some way; extracting geometric features of the figure is, in essence, a dimensionality reduction of the stroke information. Because figures are complex, however, geometric features of fixed dimension can hardly retain the complete stroke information, and part of the information is lost during dimensionality reduction. As a result, the accuracy of geometric-feature-based recognition is not high, and visually different types of figures may have similar geometric feature representations.
II. String kernel
Kernel methods are a family of related machine learning and data mining algorithms. Their key component is the kernel function, which measures the similarity of input data. Based on a kernel function, tasks such as classification and regression can be carried out with a support vector machine (SVM).
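To make concrete how a kernel function enters SVM classification, the standard soft-margin formulation for two classes can be written as below. This is textbook material rather than a formula taken from the patent; the symbols \alpha_i (dual coefficients), y_i \in \{-1, +1\} (training labels), s_i (training inputs, here strings), b (bias), and \hat{y} (predicted label) are notation introduced only for this illustration, while C is the penalty factor and K the kernel function.

```latex
% Dual training problem: only kernel values K(s_i, s_j) are needed.
\max_{\alpha}\; \sum_{i=1}^{m} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m}
    \alpha_i \alpha_j \, y_i y_j \, K(s_i, s_j)
\quad \text{subject to} \quad
0 \le \alpha_i \le C, \qquad \sum_{i=1}^{m} \alpha_i y_i = 0 .

% Decision function for a new input s.
\hat{y}(s) = \operatorname{sign}\!\Big( \sum_{i=1}^{m} \alpha_i \, y_i \, K(s_i, s) + b \Big)
```

Because the inputs appear only inside K, the same training procedure applies to strings once a string kernel is supplied.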
A string kernel handles input data in the form of character strings; the string kernel function measures the similarity of two input strings.
String kernels can be divided into several kinds, for example: spectrum-like string kernels, which count the common substrings of the two input strings; string kernels based on alignment; and string kernels derived from probabilistic models.
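As an illustration of the first kind, the following is a minimal C++ sketch of a p-spectrum kernel, which counts the contiguous substrings of length p that two strings share. It is background illustration only and is not the kernel used by the invention; the function names `countSubstrings` and `spectrumKernel` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Count every contiguous substring of length p that occurs in s.
std::unordered_map<std::string, long> countSubstrings(const std::string& s, std::size_t p) {
    assert(p > 0);
    std::unordered_map<std::string, long> counts;
    for (std::size_t i = 0; i + p <= s.size(); ++i) {
        ++counts[s.substr(i, p)];
    }
    return counts;
}

// p-spectrum kernel: inner product of the two substring-count vectors.
long spectrumKernel(const std::string& a, const std::string& b, std::size_t p) {
    const auto ca = countSubstrings(a, p);
    const auto cb = countSubstrings(b, p);
    long k = 0;
    for (const auto& entry : ca) {
        const auto it = cb.find(entry.first);
        if (it != cb.end()) {
            k += entry.second * it->second;  // shared substring contributes count_a * count_b
        }
    }
    return k;
}
```

Strings with no common substring of length p get a kernel value of 0, and the value grows with the number of shared substrings.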
String kernels have been applied in fields such as protein homology detection and text classification.
III. Support vector machine (SVM)
The support vector machine (SVM) is a well-known system based on kernel methods.
The SVM is a tool that solves machine learning problems by means of optimization methods. It was invented by Vapnik and his co-workers and was introduced to the machine learning community at the 1992 conference on computational learning theory, after which it attracted wide attention. In recent years, breakthroughs have been made in both its theory and its algorithmic implementation, and it has become a powerful means of overcoming traditional difficulties such as the "curse of dimensionality" and "over-learning" (overfitting). The theoretical framework of the SVM covers a wide range of topics, such as dual representation, feature spaces, learning theory, optimization theory, and algorithms. SVMs have been applied with good results in fields such as text classification, handwriting recognition, image classification, and bioinformatics.
The algorithm terminates when the set L becomes empty or when λ is sufficiently small.
Summary of the invention
Based on the above prior art, the present invention proposes a hand-drawn sketch recognition method based on a string kernel. It combines three existing techniques (hand-drawn sketch recognition, string kernels, and support vector machines) to realize a new sketch recognition method that supports continuous training and accumulation.
The present invention proposes a hand-drawn sketch recognition method based on a string kernel, comprising the following steps:
Step 1: map the hand-drawn sketch to a feature string. The input sketch is sampled at equal distances; the sampling distance threshold is chosen empirically as 5 pixels, and continuous sampling within the time threshold of 0.7 seconds is treated as one sketch. Mapping the sampled sketch to a feature string further comprises the following steps:
Take a positive integer n. Compute the bounding rectangle of the sampled sketch and divide it into n² parts. Each small rectangle obtained by the division can be denoted by two-dimensional coordinates (x, y), where 1 ≤ x, y ≤ n and x, y ∈ N. A small rectangle is considered filled if and only if its center point falls inside the region enclosed by the sketch. In this way the sketch is mapped to a feature string of length n².
Step 2: using the sampled sketches as training samples, train a support vector machine with the string kernel to obtain a classifier. Grid search is used to tune the parameters and select the best penalty factor C and gamma; the whole training set is then trained with the best C and gamma obtained and the positive integer n selected in Step 1, yielding the support vector machine model. The string kernel measures the similarity of two strings by the edit distance between them. The edit distance is the minimum number of character operations needed to transform the first string into the second, where the character operations are: (1) deleting a character; (2) inserting a character; (3) replacing a character with another character.
Step 3: classify and recognize the sketch to be recognized with the trained classifier, mapping the fuzzy, irregular hand-drawn sketch to a precise geometric shape.
The best values of the penalty factor C and gamma are 5 and 0.01, respectively, and the positive integer n is 9.
Compared with the prior art, the string-kernel-based sketch recognition method of the present invention is independent of the position, size, and drawing manner of the sketch, and allows users to draw sketches in the way they are used to. The method achieves relatively high recognition accuracy and is simple to implement.
Description of drawings
Fig. 1 is a schematic diagram of the feature string mapping of a hand-drawn triangle (n = 5);
Fig. 2 is a schematic diagram of the experimental figure set;
Fig. 3 shows hand-drawn sketches and their recognition results;
Fig. 4 is the overall flow chart of the string-kernel-based hand-drawn sketch recognition method of the present invention.
Embodiment
First, based on the idea of region filling, the hand-drawn sketch is mapped to a feature string. Next, a support vector machine (SVM) with a string kernel is trained on the training samples to obtain a classifier. The trained classifier then classifies and recognizes the sketch to be recognized, mapping the fuzzy, irregular hand-drawn sketch to a precise geometric shape.
The system of the present invention runs under the Visual C++ 6.0 environment and is developed in C++ on top of the libsvm software package.
First, a single-document/view project is created in Visual C++ 6.0; the libsvm package with the string kernel added is ported into the project, and the code of the feature string mapping module is written. Preparing the training data: 1150 samples were collected, of which 1000 serve as training data and 150 as test data; the sketches are mapped to feature strings, and the resulting feature strings are written to a file. Training the classifier: the classifier is trained on the feature strings, and the parameters (the penalty factor C, gamma, and n) are tuned by grid search.
The details are as follows:
I. Sampling and recognition of the hand-drawn sketch used as input to the present invention
In this step, the sampled sketch to be recognized is mapped to a feature string and classified with the trained classifier, which completes the recognition of the hand-drawn sketch.
1) Preprocessing
The input sketch is sampled at equal distances, with the sampling distance threshold chosen empirically as 5 pixels, and continuous sampling within the time threshold (0.7 seconds) is treated as one sketch. No requirement is placed on the order or direction of the input strokes. If the interval between a pen-down and the preceding pen-up exceeds 0.7 seconds, the system considers the user's previous sketch finished and a new sketch begun. The 5-pixel value serves as the gap for resampling and was obtained empirically.
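The preprocessing above can be sketched in C++ as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: equidistant sampling is interpreted here as resampling along the drawn path, and the types `Point` and `Stroke` and the functions `resampleEquidistant` and `groupIntoSketches` are hypothetical names. The values 5 pixels and 0.7 seconds from the text are passed in as `spacing` and `maxPause`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { double x; double y; };

struct Stroke {
    double penDownTime;         // seconds
    double penUpTime;           // seconds
    std::vector<Point> points;  // raw sample points between pen-down and pen-up
};

// Resample a polyline so that consecutive output points are `spacing` pixels
// apart along the drawn path. The first input point is always kept.
std::vector<Point> resampleEquidistant(const std::vector<Point>& pts, double spacing) {
    assert(spacing > 0.0);
    std::vector<Point> out;
    if (pts.empty()) return out;
    out.push_back(pts.front());
    double carried = 0.0;  // path length walked since the last emitted point
    for (std::size_t i = 1; i < pts.size(); ++i) {
        Point a = pts[i - 1];
        const Point b = pts[i];
        double seg = std::hypot(b.x - a.x, b.y - a.y);
        while (carried + seg >= spacing) {
            const double t = (spacing - carried) / seg;  // fraction of the remaining segment
            a = Point{a.x + t * (b.x - a.x), a.y + t * (b.y - a.y)};
            out.push_back(a);
            seg = std::hypot(b.x - a.x, b.y - a.y);
            carried = 0.0;
        }
        carried += seg;
    }
    return out;
}

// Group strokes into sketches: a new sketch starts when the pause between the
// previous pen-up and the next pen-down exceeds `maxPause` seconds.
std::vector<std::vector<Stroke>> groupIntoSketches(const std::vector<Stroke>& strokes,
                                                   double maxPause) {
    std::vector<std::vector<Stroke>> sketches;
    for (const Stroke& s : strokes) {
        const bool startNew = sketches.empty() ||
            s.penDownTime - sketches.back().back().penUpTime > maxPause;
        if (startNew) sketches.emplace_back();
        sketches.back().push_back(s);
    }
    return sketches;
}
```

Calling `resampleEquidistant(stroke.points, 5.0)` on each stroke and `groupIntoSketches(strokes, 0.7)` on the stroke list would correspond to the thresholds given in the text. Strokes are assumed to arrive in chronological order.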
Mapping the sampled sketch to a feature string comprises the following steps:
First, take a positive integer n.
Compute the bounding rectangle of the sampled sketch and divide it into n² parts. Each small rectangle obtained by the division can be denoted by two-dimensional coordinates (x, y), where 1 ≤ x, y ≤ n and x, y ∈ N. A small rectangle is considered filled if and only if its center point falls inside the region enclosed by the sketch. Fig. 1 shows the feature string mapping of a hand-drawn triangle for n = 5 (shaded cells are filled). The feature string of the triangle in the figure is S = 0100002222033004400050000. The mapping is computed as follows. First, define the function
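$$f:\mathbb{N}^2 \to \{0, 1, 2, \dots, n\}, \qquad f(x, y) = \begin{cases} x, & \text{if the small rectangle } (x, y) \text{ is filled} \\ 0, & \text{otherwise} \end{cases}$$
(The piecewise form is reconstructed from the example string S for Fig. 1, in which every filled cell in group x carries the digit x and every unfilled cell carries 0; the original formula is an image, Figure GDA00001881932000061.)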
Second, the feature string is computed as follows:
Initialize: x = 1, y = 1, n = k, k ∈ N+, S = "";
for (x = 1; x <= n; x++)
    for (y = 1; y <= n; y++)
    {
        int tmp = f(x, y);
        S = strcat(S, tmp);   // append the digit tmp to the end of S
    }
Output S;
At this point the hand-drawn sketch has been mapped to a feature string of length n².
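The mapping steps above can be sketched in C++ as follows. This is a hedged illustration rather than the patent's code. It assumes that the sampled points are treated as a closed polygon, that "inside the region enclosed by the sketch" is decided by even-odd ray casting, that x indexes columns and y indexes rows of the grid, and that f(x, y) equals x for a filled cell and 0 otherwise (the form suggested by the example string for Fig. 1). The names `Point`, `insidePolygon`, and `featureString` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Point { double x; double y; };

// Even-odd ray casting: is (px, py) inside the closed polygon through `poly`?
bool insidePolygon(const std::vector<Point>& poly, double px, double py) {
    bool inside = false;
    for (std::size_t i = 0, j = poly.size() - 1; i < poly.size(); j = i++) {
        const Point& a = poly[i];
        const Point& b = poly[j];
        const bool crosses = (a.y > py) != (b.y > py);
        if (crosses && px < (b.x - a.x) * (py - a.y) / (b.y - a.y) + a.x) {
            inside = !inside;
        }
    }
    return inside;
}

// Map a sampled sketch to a feature string of length n * n.
std::string featureString(const std::vector<Point>& sketch, int n) {
    assert(n >= 1 && n <= 9);  // one decimal digit per cell
    assert(!sketch.empty());
    double minX = sketch[0].x, maxX = minX, minY = sketch[0].y, maxY = minY;
    for (const Point& p : sketch) {  // bounding rectangle of the sketch
        minX = std::min(minX, p.x); maxX = std::max(maxX, p.x);
        minY = std::min(minY, p.y); maxY = std::max(maxY, p.y);
    }
    const double cellW = (maxX - minX) / n;
    const double cellH = (maxY - minY) / n;
    std::string s;
    for (int x = 1; x <= n; ++x) {
        for (int y = 1; y <= n; ++y) {
            const double cx = minX + (x - 0.5) * cellW;  // center of cell (x, y)
            const double cy = minY + (y - 0.5) * cellH;
            const int value = insidePolygon(sketch, cx, cy) ? x : 0;  // f(x, y)
            s += static_cast<char>('0' + value);
        }
    }
    return s;
}
```

Because the grid is laid over the bounding rectangle, the resulting string does not change when the sketch is translated or scaled, which matches the position and size independence claimed in the text. An open stroke such as a straight line encloses no area and maps to a string of zeros in this sketch.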
The preprocessing of the present invention includes collecting samples of various kinds of hand-drawn sketches. For example, six common geometric figures, including the straight line, triangle, rectangle, and ellipse, were selected as the experimental figure set.
As shown in Fig. 2, sample data were collected for the above figure set in the experiments. Users were free to draw the figures as they wished, and a total of 1000 samples are used. They serve mainly to compare the effect of different sample counts on training time and recognition accuracy.
II. Training the classifier by a support vector machine with the string kernel
The classifier is trained using the string kernel. Grid search is used for parameter tuning to select the best penalty factor C, gamma, and n. The string kernel function in this system measures the similarity of two strings by the edit distance between them.
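The grid search mentioned above can be sketched generically in C++. The source does not state which candidate values or which scoring criterion were used, so this sketch assumes the caller supplies both: the candidate lists and a `score` callback (for example, cross-validated accuracy of an SVM trained with the given parameters). `GridResult` and `gridSearch` are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <vector>

struct GridResult {
    double C;      // penalty factor
    double gamma;  // kernel parameter
    int n;         // grid size used for the feature string
    double score;  // value returned by the scoring callback
};

// Exhaustive grid search: evaluate every (C, gamma, n) combination and keep
// the one with the highest score. Ties keep the earliest combination.
GridResult gridSearch(const std::vector<double>& Cs,
                      const std::vector<double>& gammas,
                      const std::vector<int>& ns,
                      const std::function<double(double, double, int)>& score) {
    assert(!Cs.empty() && !gammas.empty() && !ns.empty());
    GridResult best{Cs[0], gammas[0], ns[0], score(Cs[0], gammas[0], ns[0])};
    for (double C : Cs) {
        for (double gamma : gammas) {
            for (int n : ns) {
                const double s = score(C, gamma, n);
                if (s > best.score) best = GridResult{C, gamma, n, s};
            }
        }
    }
    return best;
}
```

Since changing n changes the feature strings themselves, a scoring callback for this search would need to regenerate the feature strings for each n before training.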
Definition of edit distance:
Let A and B be two strings. String A is to be converted into string B using the fewest character operations. The character operations are:
(1) deleting a character;
(2) inserting a character;
(3) replacing a character with another character.
The minimum number of character operations needed to transform string A into string B is called the edit distance from A to B, denoted K(A, B).
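The edit distance defined above is the Levenshtein distance and can be computed by dynamic programming, as in the following C++ sketch. The text does not say how the distance is turned into a kernel value; the function `editDistanceKernel` below assumes the form exp(-gamma * d), by analogy with the RBF kernel and the gamma parameter mentioned in the text. That form is an assumption of this sketch, and such a kernel is not guaranteed to be positive semidefinite. Both function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Minimum number of deletions, insertions, and substitutions that turn a into b.
std::size_t editDistance(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;  // "" -> b[0..j): j insertions
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = i;  // a[0..i) -> "": i deletions
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t substitute = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            cur[j] = std::min({prev[j] + 1,     // delete a[i-1]
                               cur[j - 1] + 1,  // insert b[j-1]
                               substitute});    // replace (or keep) a[i-1]
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

// ASSUMPTION: similarity = exp(-gamma * editDistance). The source only states
// that the kernel is based on the edit distance, not this exact form.
double editDistanceKernel(const std::string& a, const std::string& b, double gamma) {
    assert(gamma > 0.0);
    return std::exp(-gamma * static_cast<double>(editDistance(a, b)));
}
```

Under this assumed form, identical feature strings have similarity 1, and the similarity decays toward 0 as more edits are needed.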
The experimental results are shown in Fig. 3, and the following experimental data were obtained:
Table 1. Recognition precision and recall for figures of different shapes (n = 9, C = 5, gamma = 0.01)
Figure GDA00001881932000071
Table 2. Recognition accuracy of hand-drawn sketches for different training set sizes (C = 5, gamma = 0.01)
Figure GDA00001881932000072
Figure GDA00001881932000081

Claims (1)

1. A hand-drawn sketch recognition method based on a string kernel, comprising the following steps:
Step 1: map the hand-drawn sketch to a feature string. The input sketch is sampled at equal distances; the sampling distance threshold is chosen empirically as 5 pixels, and continuous sampling within the time threshold of 0.7 seconds is treated as one sketch. Mapping the sampled sketch to a feature string further comprises the following steps:
Take a positive integer n. Compute the bounding rectangle of the sampled sketch and divide it into n² parts. Each small rectangle obtained by the division can be denoted by two-dimensional coordinates (x, y), where 1 ≤ x, y ≤ n and x, y ∈ N. A small rectangle is considered filled if and only if its center point falls inside the region enclosed by the sketch. In this way the sketch is mapped to a feature string of length n².
Step 2: using the sampled sketches as training samples, train a support vector machine with the string kernel to obtain a classifier. Grid search is used to tune the parameters and select the best penalty factor C and gamma; the whole training set is then trained with the best C and gamma obtained and the positive integer n selected in Step 1, yielding the support vector machine model. The string kernel measures the similarity of two strings by the edit distance between them. The edit distance is the minimum number of character operations needed to transform the first string into the second, where the character operations are: (1) deleting a character; (2) inserting a character; (3) replacing a character with another character.
Step 3: classify and recognize the sketch to be recognized with the trained classifier, mapping the fuzzy, irregular hand-drawn sketch to a precise geometric shape.
The best values of the penalty factor C and gamma are 5 and 0.01, respectively, and the positive integer n is 9.
CN 201110151853 2011-06-08 2011-06-08 String-kernel-based hand-drawn sketch recognition method Expired - Fee Related CN102184395B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110151853 CN102184395B (en) 2011-06-08 2011-06-08 String-kernel-based hand-drawn sketch recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201110151853 CN102184395B (en) 2011-06-08 2011-06-08 String-kernel-based hand-drawn sketch recognition method

Publications (2)

Publication Number Publication Date
CN102184395A CN102184395A (en) 2011-09-14
CN102184395B true CN102184395B (en) 2012-12-19

Family

ID=44570569

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110151853 Expired - Fee Related CN102184395B (en) 2011-06-08 2011-06-08 String-kernel-based hand-drawn sketch recognition method

Country Status (1)

Country Link
CN (1) CN102184395B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20110914

Assignee: Tianjin University Urban Planning & Design Research Institute

Assignor: Tianjin University

Contract record no.: 2013120000016

Denomination of invention: String-kernel-based hand-drawn sketch recognition method

Granted publication date: 20121219

License type: Exclusive License

Record date: 20130319

LICC Enforcement, change and cancellation of record of contracts on the licence for exploitation of a patent or utility model
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20121219

Termination date: 20210608
