CN101299236B - Method for recognizing Chinese hand-written phrase - Google Patents

Method for recognizing Chinese hand-written phrase

Info

Publication number
CN101299236B
Authority
CN
China
Prior art keywords
pen section
phrase
pen
stroke
section
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2008100290085A
Other languages
Chinese (zh)
Other versions
CN101299236A (en)
Inventor
金连文
龙腾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN2008100290085A priority Critical patent/CN101299236B/en
Publication of CN101299236A publication Critical patent/CN101299236A/en
Application granted granted Critical
Publication of CN101299236B publication Critical patent/CN101299236B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention provides a method for recognizing Chinese handwritten phrases, comprising the following steps: correcting the rotation direction of the input handwritten phrase; extracting the stroke-segment information of the handwritten phrase; splitting the stroke segments produced by connected (cursive) writing; recombining all stroke segments; and searching character-merging paths over the stroke-segment combinations using both recognition information and dictionary information, then ranking the phrase results of the search paths by score to obtain the phrase recognition result. The invention lets a computer automatically recognize Chinese phrases written at an arbitrary inclination. It also handles the typical problems of unconstrained cursive phrases, such as connected strokes and strokes of adjacent characters that touch or even partly overlap. Users can therefore input Chinese by writing phrases freely, which is faster and less restrictive, and recognition accuracy is higher because the context information of the phrase is used.

Description

A method for recognizing Chinese handwritten phrases
Technical field
The invention belongs to the field of computer pattern recognition and relates in particular to a handwriting recognition method.
Background art
Handwritten Chinese character recognition generally refers to the technology in which a user writes Chinese characters on a handwriting input device (such as a handwriting pad, touch screen, or mouse) while a computer converts the writing trajectory collected by the device into the corresponding internal character codes. Traditional handwriting recognition usually relies on single-character input: one character is written, then one character is recognized. This requires the computer to know when a character has been finished, and the following methods are commonly used to decide that:
1. The user clicks a specific button to tell the computer that writing is finished;
2. The user lifts the pen after writing and waits; when the waiting time exceeds a threshold, the computer decides that writing is finished;
3. After writing one character, the user leaves a gap before writing the next; when the distance between the pen-lift and the next pen-down exceeds a threshold, the computer decides that the previous character is finished;
4. Two or more writing boxes are provided; when the user finishes writing in one box and moves to another, the computer decides that the character in the previous box is finished.
All of these methods have drawbacks to varying degrees. The first requires a button click, which costs time and effort and feels unnatural. The second requires a pause after each character; it takes no effort, but it takes time and directly slows input. The third and fourth require the user to move some distance after each character before starting the next, which does not match natural writing habits. Writing several characters in succession in a cursive, connected hand is the most natural and fastest way to input Chinese by handwriting, but none of the methods above can handle multi-character connected input.
Summary of the invention
The object of the invention is to overcome the shortcomings of the handwritten Chinese character recognition methods described above and to provide a freer handwriting input mode, namely rotation-independent recognition of unconstrained cursive handwritten Chinese phrases.
The technical solution adopted by the invention is as follows:
A method for recognizing Chinese handwritten phrases comprises the following steps:
(1) correcting the rotation direction of the handwritten phrase input;
(2) extracting the stroke-segment ("pen section") information of the handwritten phrase;
(3) splitting the stroke segments produced by connected writing;
(4) recombining all stroke segments;
(5) searching character-merging paths over the stroke-segment combinations using recognition information and dictionary information, and ranking the phrase results obtained from the search paths by score to obtain the phrase recognition result.
The rotation direction correction in step (1) uses a method based on center-of-gravity balance, which comprises the following steps:
(11) dividing the phrase into left and right parts by the vertical line through its center of gravity, computing the center of gravity of each part, and computing the distance D_1 between the projections of the two centers of gravity on the horizontal direction;
(12) dividing the phrase into upper and lower parts by the horizontal line through its center of gravity, computing the center of gravity of each part, and computing the distance D_2 between the projections of the two centers of gravity on the vertical direction;
(13) if D_1 < D_2, rotating the phrase 90° clockwise;
(14) dividing the phrase into left and right parts by the vertical line through its center of gravity, and computing the center of gravity of each part;
(15) joining the two centers of gravity with a line and computing the angle θ between this line and the horizontal direction; if θ < τ or (180° - θ) < τ, going to step (17), where τ is a predefined threshold;
(16) if θ > τ and θ < 90°, rotating the phrase clockwise by θ; if θ > 90°, rotating the phrase counterclockwise by 180° - θ; then returning to step (14);
(17) dividing the phrase into left and right parts by the vertical line through its center of gravity; if the starting point of the first stroke written lies in the right part, rotating the whole phrase by 180°.
The center of gravity in steps (11) to (17) is the center of gravity of the pixels of the written strokes.
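The coarse orientation test of steps (11) to (13) can be sketched as follows. This is only an illustration under stated assumptions: the phrase is given as a list of (x, y) stroke pixels, and the function names `centroid`, `half_centroid_distances` and `needs_quarter_turn` are hypothetical, not taken from the patent.

```python
def centroid(points):
    """Center of gravity of a non-empty list of (x, y) stroke pixels."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def half_centroid_distances(points):
    """Return (D_1, D_2) as described in steps (11) and (12)."""
    gx, gy = centroid(points)
    left = [p for p in points if p[0] < gx]
    right = [p for p in points if p[0] >= gx]
    lower = [p for p in points if p[1] < gy]
    upper = [p for p in points if p[1] >= gy]
    # An empty half (all pixels on one line) is treated as distance 0 (assumption).
    d1 = abs(centroid(right)[0] - centroid(left)[0]) if left else 0.0
    d2 = abs(centroid(upper)[1] - centroid(lower)[1]) if lower else 0.0
    return d1, d2


def needs_quarter_turn(points):
    """Step (13): the phrase is rotated by 90 degrees when D_1 < D_2."""
    d1, d2 = half_centroid_distances(points)
    return d1 < d2
```

For a phrase that is wider than it is tall, D_1 exceeds D_2 and no quarter turn is requested; for the same pixels with x and y swapped, the test fires.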
In step (2), stroke-segment extraction first smooths each input stroke to obtain a sequence of sample points {(X_i, Y_i) | i = 1, ..., M}, where M is the total number of points in the stroke. Let θ_i be the direction angle of the stroke at point i, and let point t be the previous inflection point or the starting point of the stroke. A point p is regarded as an inflection point if it satisfies the following formula:
$$\left|\theta_p - \frac{\sum_{i=t}^{p-1}\theta_i}{p-t}\right| > \frac{\pi}{4}$$
After all inflection points in the input strokes have been found, the start and end points of each stroke are also treated as inflection points. The part of a stroke between two consecutive inflection points is extracted as a stroke segment; of its two end points, the one written first is the segment start point and the one written last is the segment end point.
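A minimal sketch of this inflection test is given below. It assumes that the direction angle θ_i is the angle of the vector from sample i to sample i+1 and, like the formula as written, it does not unwrap angles across ±π; the smoothing step is omitted and the function names are hypothetical.

```python
import math


def direction_angles(points):
    """theta_i: direction of the vector from sample i to sample i + 1 (assumption)."""
    return [math.atan2(y1 - y0, x1 - x0)
            for (x0, y0), (x1, y1) in zip(points, points[1:])]


def find_inflections(points, threshold=math.pi / 4):
    """Indices of inflection points, including the stroke's start and end points."""
    theta = direction_angles(points)
    inflections = [0]
    t = 0  # previous inflection point or stroke start
    for p in range(1, len(theta)):
        mean = sum(theta[t:p]) / (p - t)
        if abs(theta[p] - mean) > threshold:
            inflections.append(p)
            t = p
    inflections.append(len(points) - 1)
    return inflections


def split_into_segments(points):
    """Cut one stroke into stroke segments between consecutive inflection points."""
    idx = find_inflections(points)
    return [points[a:b + 1] for a, b in zip(idx, idx[1:]) if b > a]
```

An L-shaped stroke with a right-angle corner is cut at the corner, while a straight stroke yields a single segment.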
In step (3), the stroke segments produced by connected writing are split as follows: the direction angle from start point to end point is computed for every stroke segment, and each stroke segment whose direction angle lies between -10° and 85° is split at its middle into two stroke segments.
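The splitting rule can be sketched as below. The sketch assumes a y axis pointing up (so that "toward the upper right" gives a positive angle; with screen coordinates the sign of the y difference would be flipped), and it takes the "middle" to be the middle sample index rather than the arc-length midpoint. Both choices, and the function names, are assumptions.

```python
import math


def direction_deg(segment):
    """Direction angle, in degrees, from the segment start point to its end point."""
    (x0, y0), (x1, y1) = segment[0], segment[-1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0))


def split_connecting_segments(segments, low=-10.0, high=85.0):
    """Split every segment whose direction lies in [low, high] at its middle sample."""
    result = []
    for seg in segments:
        if len(seg) >= 3 and low <= direction_deg(seg) <= high:
            mid = len(seg) // 2
            result.append(seg[:mid + 1])
            result.append(seg[mid:])
        else:
            result.append(seg)
    return result
```

A segment rising at 45° is cut into two halves that share the middle sample, while a segment running straight down is left untouched.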
The recombination of all stroke segments in step (4) comprises the following steps:
(41) normalizing the size of the handwriting sample: it is enlarged by a factor of V using linear normalization, where
[Equation image G2008100290085D00041 (definition of V) is not reproduced in this text.]
W and H are constants, and width and height are the original width and height of the phrase;
(42) combining short connected stroke segments: if, of two connected stroke segments, one is shorter than L_1 or their combined length is less than L_2, they are merged; a merged combination of stroke segments is called a segment group; L_1 and L_2 are constants with L_1 < L_2;
(43) for any two stroke segments or segment groups, combining them if the width of the bounding box of their combination is less than L_3; this step is repeated after every combination until no stroke segments or segment groups can be combined; L_3 is a constant;
(44) for any two stroke segments or segment groups, combining them if the bounding box of one lies inside the bounding box of the other; this step is repeated after every combination until no stroke segments or segment groups can be combined;
(45) for any two stroke segments or segment groups, combining them if their horizontal projections overlap by more than 2/3; this step is repeated after every combination until no stroke segments or segment groups can be combined;
(46) for any two stroke segments or segment groups, combining them if the width of the bounding box of their combination is less than L_4 and the overlap area of their two bounding boxes is greater than 1/2 of the bounding-box area of either of them; this step is repeated after every combination until no stroke segments or segment groups can be combined; L_4 is a constant;
(47) for any two stroke segments or segment groups, combining them if the width of the bounding box of their combination is less than L_5 and the overlap area of their two bounding boxes is greater than 1/2 of the bounding-box area of either of them; this step is repeated after every combination until no stroke segments or segment groups can be combined; L_5 is a constant equal to 2 times L_4;
(48) if the width of the leftmost stroke segment or segment group is less than W_1, or its bottom ordinate is less than B_1, or its top ordinate is greater than T_1, combining it with the leftmost of the remaining stroke segments or segment groups; likewise, if the width of the rightmost stroke segment or segment group is less than W_1, or its bottom ordinate is less than B_1, or its top ordinate is greater than T_1, combining it with the rightmost of the remaining stroke segments or segment groups; W_1, B_1 and T_1 are constants with W_1 < B_1 < T_1.
In steps (43) to (48), the bounding box ("pen section frame") of a stroke segment or segment group is the smallest rectangle that contains it.
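The repeat-until-stable merging of steps (43) to (47) can be sketched on bounding boxes alone. The sketch below represents each stroke segment or segment group only by its box (x_min, y_min, x_max, y_max). It assumes that the 2/3 projection overlap is measured against the narrower box and that the 1/2 area test is satisfied when it holds for either box; the default limits 20, 50 and 100 are the example values given in the embodiment, and all function names are hypothetical.

```python
def width(b):
    return b[2] - b[0]

def area(b):
    return (b[2] - b[0]) * (b[3] - b[1])

def union(a, b):
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

def contains(a, b):
    return a[0] <= b[0] and a[1] <= b[1] and a[2] >= b[2] and a[3] >= b[3]

def x_overlap(a, b):
    return max(0.0, min(a[2], b[2]) - max(a[0], b[0]))

def overlap_area(a, b):
    return x_overlap(a, b) * max(0.0, min(a[3], b[3]) - max(a[1], b[1]))

def rule_width(limit):           # step (43)
    return lambda a, b: width(union(a, b)) < limit

def rule_inside(a, b):           # step (44)
    return contains(a, b) or contains(b, a)

def rule_projection(a, b):       # step (45), overlap relative to the narrower box
    return 3 * x_overlap(a, b) > 2 * min(width(a), width(b))

def rule_area(limit):            # steps (46) and (47)
    return lambda a, b: (width(union(a, b)) < limit
                         and 2 * overlap_area(a, b) > min(area(a), area(b)))

def merge_until_stable(boxes, should_merge):
    """Merge the first qualifying pair, then start over, until nothing merges."""
    boxes = list(boxes)
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                if should_merge(boxes[i], boxes[j]):
                    boxes[i] = union(boxes[i], boxes[j])
                    del boxes[j]
                    merged = True
                    break
            if merged:
                break
    return boxes

def recombine(boxes, l3=20, l4=50, l5=100):
    for rule in (rule_width(l3), rule_inside, rule_projection,
                 rule_area(l4), rule_area(l5)):
        boxes = merge_until_stable(boxes, rule)
    return boxes
```

Applying the rules one after another, each to a fixed point, mirrors the order of the steps; a real implementation would also carry the stroke points along with each box.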
In step (5), the character-merging path search over the stroke-segment combinations, using recognition information and dictionary information, proceeds as follows. First, all merging paths of the stroke segments or segment groups obtained in step (4) are enumerated: for the T stroke segments or segment groups ordered from left to right, let the set of all merging paths be {W^φ}. For each stroke segment or segment group w_i^φ in a merging path W^φ, where N is the total number of stroke segments and segment groups in the path and i = 1, 2, ..., N, the single-character classifier yields a candidate sequence L_i = L_{i1}, ..., L_{iK}. For each path, if a combination of recognition candidates, the phrase {L_{1R(1)}, ..., L_{NR(N)}}, can be found in the dictionary, the score of this candidate phrase R for the path is computed by the following formula:
$$S(R) = \frac{\sum_{i=0}^{N}\left[K - R(i)\right]^{2}}{N}$$
Here R(i) is the position of the i-th Chinese character of candidate phrase R in the candidate list L_i, ranging from 0 to K-1. If the phrase cannot be found in the dictionary, its score is 0. Finally, all candidate phrases are sorted in descending order of score to give the final phrase recognition result.
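As a small worked sketch of the score S(R): the function below takes the ranks R(i) of the N characters of a candidate phrase (0 is the top candidate) and the candidate list length K. It assumes the sum runs over exactly the N characters of the phrase; the name `phrase_score` is hypothetical.

```python
def phrase_score(ranks, k, in_dictionary=True):
    """S(R): mean of [K - R(i)]^2 over the characters, or 0 if not in the dictionary."""
    if not in_dictionary or not ranks:
        return 0.0
    return sum((k - r) ** 2 for r in ranks) / len(ranks)
```

With K = 10, a two-character phrase built from both top candidates scores (100 + 100) / 2 = 100, while one built from a top candidate and a fourth candidate (rank 3) scores (100 + 49) / 2 = 74.5, so phrases assembled from higher-ranked candidates sort first.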
The basic principle of the invention is as follows. Chinese phrases are written quite evenly, and the centers of gravity of the characters in a phrase usually lie approximately on one horizontal line. Even after the phrase has been rotated, it is therefore enough to find this line and rotate it back to the horizontal in order to bring the phrase back to the horizontal direction. The corrected phrase may still be upside down; since writing normally runs from left to right, the writing direction of the phrase is used to detect and correct an upside-down phrase. Once the rotated phrase has been brought to the horizontal position, stroke-segment extraction separates the strokes written in connected form. In addition, in most phrases the connecting stroke from the pen-lift of one character to the pen-down of the next points toward the upper right, so splitting the stroke segments with this property effectively breaks the connecting strokes between characters in the middle. The whole input phrase is thus broken into a series of stroke segments. Using their geometric positions, stroke segments that very probably belong to the same character are combined. For combinations where it cannot be decided whether the segments belong to the same character, single-character recognition information and dictionary information are used together to determine which way of combining them yields a reasonable and highly credible phrase recognition result. In this way the method can correctly recognize cursive Chinese phrases even when the user writes them connected and without constraint.
Compared with existing handwriting recognition methods, the invention has the following advantages and beneficial effects:
(1) Because phrases are input by handwriting, with no restriction on connected or cursive writing, input is much faster than single-character handwriting input; this unconstrained phrase input is also more natural than traditional single-character input and more readily accepted by users.
(2) Because a rotation correction mechanism is used, the invention recognizes phrases correctly even when the user writes at a slant on a handheld device; input methods of this kind that support connected phrase writing are currently rare.
(3) The phrase segmentation method of the invention is mainly an offline segmentation method that places no restriction on stroke order, so it copes well with abnormal stroke order and segments better than phrase segmentation methods that rely on stroke order.
(4) Segmentation uses dictionary information as well as single-character recognition information, so the resulting phrases are more accurate and reliable.
(5) Many Chinese characters resemble one another, but similar phrases are comparatively few, so phrase input produces fewer candidates than single-character recognition, which saves the user time when selecting a candidate.
Description of drawings
Fig. 1 is a block diagram of the system structure of the invention;
Fig. 2 is a flow chart of the rotation direction correction method of the invention;
Fig. 3 shows an example of the rotation correction steps for a specific handwritten phrase sample;
Fig. 4 shows an example of stroke-segment extraction for a specific handwritten phrase sample;
Fig. 5 is a flow chart of the stroke-segment recombination method of the invention;
Fig. 6 shows an example of the merging path search over segment bounding boxes for the handwritten phrase sample "Microsoft";
Fig. 7 shows an example of the results achieved by the invention.
Embodiment
The invention is described further below with reference to the drawings. To implement the invention, a handwriting pad can be used for writing Chinese characters, a computer for recognition, and a flat-screen display for showing the graphical user interface; the various processing programs can be written in C.
The system structure of the invention is shown in Fig. 1. After the time-ordered points of the strokes of a handwritten Chinese phrase are input, the rotation direction of the handwritten phrase is corrected first. Stroke-segment information is then extracted, and stroke segments that may come from connected writing are split. All stroke segments are then recombined to obtain a series of segment groups. Next, character-merging paths are searched over the stroke-segment combinations using recognition information and dictionary information, and finally the phrase results obtained from the search paths are ranked by score to give the phrase recognition result.
The rotation direction of the phrase is corrected by a method based on center-of-gravity balance. Its flow chart is shown in Fig. 2, and the concrete steps are as follows:
(A) Divide the phrase into left and right parts by the vertical line through its center of gravity. Compute the center of gravity of each part. Compute the distance D_1 between the projections of the two centers of gravity on the horizontal X axis;
(B) Divide the phrase into upper and lower parts by the horizontal line through its center of gravity. Compute the center of gravity of each part. Compute the distance D_2 between the projections of the two centers of gravity on the vertical Y axis;
(C) If D_1 < D_2, rotate the phrase 90° clockwise;
(D) Divide the phrase into left and right parts by the vertical line through its center of gravity. Compute the center of gravity of each part;
(E) Join the two centers of gravity with a line and compute the angle θ between this line and the horizontal direction. If θ < τ or (180° - θ) < τ, go to step (G). Here τ is a predefined threshold, normally a very small angle;
(F) If θ > τ and θ < 90°, rotate the phrase clockwise by θ; otherwise rotate it counterclockwise by (180° - θ). Return to step (D);
(G) Divide the phrase into left and right parts by the vertical line through its center of gravity. If the starting point of the first stroke written lies in the right part, rotate the whole phrase by 180°; otherwise the algorithm ends.
The center of gravity mentioned in the steps above is the center of gravity of the pixels of the written strokes.
Fig. 3 shows the rotation correction process for a specific handwritten phrase sample, "afternoon", written at a slant. As can be seen, the phrase is brought to the horizontal position after two correcting rotations. The small circles in the middle of the left and right parts mark the centers of gravity of the two parts. For the phrase "afternoon" shown in Fig. 3, the steps above proceed as follows:
At step (C), because D_1 > D_2, the algorithm goes straight on to the computation in step (D) and then to the test in step (E). The condition in step (E) is not satisfied, so step (F) is carried out and the first rotation is performed. The algorithm returns to step (D), reaches step (F) again and performs the second rotation. It returns to step (D) once more; this time the test θ < τ in step (E) holds, so it jumps to step (G). Judging from the start and end positions of the writing, no 180° rotation is needed, and the algorithm ends.
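The levelling loop of steps (D) to (G) can be sketched as follows. The sketch makes several assumptions that are not in the original: the phrase is a list of (x, y) points in writing order with the y axis pointing up, the first point is the start of the first stroke, the phrase has some horizontal extent so that both halves are non-empty, and the threshold `tau` of 2° and the iteration cap `max_iter` are purely illustrative values. All function names are hypothetical.

```python
import math


def centroid(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def rotate(points, degrees, about):
    """Rotate counterclockwise by `degrees` about `about` (y axis pointing up)."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    cx, cy = about
    return [(cx + (x - cx) * c - (y - cy) * s,
             cy + (x - cx) * s + (y - cy) * c) for x, y in points]


def tilt_angle(points):
    """Steps (D) and (E): angle in [0, 180) of the line joining the half centroids."""
    gx, _ = centroid(points)
    left = [p for p in points if p[0] < gx]
    right = [p for p in points if p[0] >= gx]
    (lx, ly), (rx, ry) = centroid(left), centroid(right)
    return math.degrees(math.atan2(ry - ly, rx - lx)) % 180.0


def level_phrase(points, tau=2.0, max_iter=20):
    for _ in range(max_iter):
        theta = tilt_angle(points)
        if theta < tau or 180.0 - theta < tau:                    # step (E)
            break
        if theta < 90.0:                                          # step (F)
            points = rotate(points, -theta, centroid(points))     # clockwise
        else:
            points = rotate(points, 180.0 - theta, centroid(points))
    g = centroid(points)
    if points[0][0] > g[0]:                                       # step (G)
        points = rotate(points, 180.0, g)
    return points
```

For a straight row of points tilted by 30° a single rotation is enough; handwriting is less regular, which is why the steps loop back to (D) and why the example in Fig. 3 needs two rotations.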
Stroke-segment extraction is implemented as follows. Each input stroke is first smoothed to obtain a sequence of sample points {(X_i, Y_i) | i = 1, ..., M}, where M is the total number of points in the stroke. Let θ_i be the direction angle of the stroke at point i, and let point t be the previous inflection point or the starting point of the stroke. A point p is regarded as an inflection point if it satisfies the following formula:
$$\left|\theta_p - \frac{\sum_{i=t}^{p-1}\theta_i}{p-t}\right| > \frac{\pi}{4}$$
After all inflection points in the input strokes have been found, and the start and end points of each stroke have also been treated as inflection points, the part of a stroke between two consecutive inflection points is extracted as a stroke segment. Of the two end points of a stroke segment, the one written first is taken as the segment start point and the one written last as the segment end point. The Chinese phrase "article" input in Fig. 4, for example, has 16 inflection points, marked by circles.
Splitting of connected-writing stroke segments is implemented by computing the direction angle from start point to end point for every stroke segment and splitting each stroke segment whose direction angle lies between -10° and 85° at its middle into two stroke segments.
The flow chart of stroke-segment recombination is shown in Fig. 5, and the concrete implementation steps are as follows:
(I) Normalize the size of the handwriting sample: enlarge it by a factor of V using linear normalization, where
[Equation image G2008100290085D00092 (definition of V) is not reproduced in this text.]
Here W can be chosen as 320 and H as 80; width and height are the original width and height of the phrase;
(II) Combine short connected stroke segments: if, of two connected stroke segments, one is shorter than L_1 or their combined length is less than L_2, merge them; a merged combination of stroke segments is called a segment group. L_1 can be chosen as 8 and L_2 as 15;
(III) For any two stroke segments or segment groups, combine them if the width of the bounding box of their combination is less than L_3; repeat this step after every combination until no stroke segments or segment groups can be combined. Here L_3 can be chosen as 20;
(IV) For any two stroke segments or segment groups, combine them if the bounding box of one lies inside the bounding box of the other; repeat this step after every combination until no stroke segments or segment groups can be combined;
(V) For any two stroke segments or segment groups, combine them if their horizontal projections overlap by more than 2/3; repeat this step after every combination until no stroke segments or segment groups can be combined;
(VI) For any two stroke segments or segment groups, combine them if the width of the bounding box of their combination is less than L_4 and the overlap area of their two bounding boxes is greater than 1/2 of the bounding-box area of either of them; repeat this step after every combination until no stroke segments or segment groups can be combined. Here L_4 can be chosen as 50;
(VII) For any two stroke segments or segment groups, combine them if the width of the bounding box of their combination is less than L_5 and the overlap area of their two bounding boxes is greater than 1/2 of the bounding-box area of either of them; repeat this step after every combination until no stroke segments or segment groups can be combined. Here L_5 can be chosen as 100;
(VIII) If the width of the leftmost stroke segment or segment group is less than W_1, or its bottom ordinate is less than B_1, or its top ordinate is greater than T_1, combine it with the leftmost of the remaining stroke segments or segment groups. Likewise, if the width of the rightmost stroke segment or segment group is less than W_1, or its bottom ordinate is less than B_1, or its top ordinate is greater than T_1, combine it with the rightmost of the remaining stroke segments or segment groups. Here B_1 can be chosen as 25, T_1 as 55 and W_1 as 15.
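Step (II), the merging of short connected stroke segments, can be sketched with the example values L_1 = 8 and L_2 = 15. The sketch rests on assumptions the text does not spell out: stroke segments arrive in writing order, two segments count as "connected" when the end point of one equals the start point of the next, lengths are polyline lengths in the normalized coordinates, and a segment group that has already been merged is measured by its total length. The function names are hypothetical.

```python
import math


def polyline_length(points):
    return sum(math.hypot(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def merge_short_segments(segments, l1=8.0, l2=15.0):
    """Greedy left-to-right merge; returns segment groups as lists of segments."""
    groups = []
    for seg in segments:
        if groups and groups[-1][-1][-1] == seg[0]:   # connected to previous segment
            prev_len = sum(polyline_length(s) for s in groups[-1])
            cur_len = polyline_length(seg)
            if prev_len < l1 or cur_len < l1 or prev_len + cur_len < l2:
                groups[-1].append(seg)
                continue
        groups.append([seg])
    return groups
```

In the usage below, a short hook of length 3 is absorbed by the long segment that follows it, whereas two long connected segments and a separately written stroke stay apart.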
The character-merging path search over the stroke-segment combinations, using recognition information and dictionary information, proceeds as follows. First, all merging paths of the recombined stroke segments or segment groups are enumerated: for the T stroke segments or segment groups ordered from left to right, let the set of all merging paths be {W^φ}. For each segment group w_i^φ in a merging path W^φ (whose defining expression, equation image G2008100290085D00111, is not reproduced in this text), the single-character classifier yields a candidate sequence L_i = L_{i1}, ..., L_{iK}. For each path, if a combination of recognition candidates, the phrase {L_{1R(1)}, ..., L_{NR(N)}} (where R(i) is the position of the i-th Chinese character of candidate phrase R in the candidate list L_i, ranging from 0 to K-1), can be found in the dictionary, the score of this candidate phrase R for the path is computed by the following formula:
$$S(R) = \frac{\sum_{i=0}^{N}\left[K - R(i)\right]^{2}}{N}$$
Otherwise its score is 0. Finally, all candidate phrases are sorted in descending order of score to give the final phrase recognition result.
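The search can be sketched end to end with a stand-in classifier. The code below assumes that a merging path is a division of the T left-to-right segment groups into contiguous runs, each run being one character hypothesis; `classify` is a hypothetical callback that returns a ranked candidate list for a run given as a (start, end) index pair, and the dictionary is a plain set of strings. When a phrase is reachable through several paths, the sketch keeps its best score, which is also an assumption.

```python
from itertools import product


def merge_paths(t):
    """Yield every division of t groups into contiguous runs of (start, end) indices."""
    for cuts in product((False, True), repeat=t - 1):
        path, start = [], 0
        for i, cut in enumerate(cuts, start=1):
            if cut:
                path.append((start, i))
                start = i
        path.append((start, t))
        yield path


def recognise_phrase(t, classify, dictionary, k):
    """Return (phrase, score) pairs found in the dictionary, best score first."""
    results = {}
    for path in merge_paths(t):
        candidates = [classify(span)[:k] for span in path]
        for ranks in product(*(range(len(c)) for c in candidates)):
            phrase = "".join(c[r] for c, r in zip(candidates, ranks))
            if phrase in dictionary:
                score = sum((k - r) ** 2 for r in ranks) / len(ranks)
                results[phrase] = max(score, results.get(phrase, 0.0))
    return sorted(results.items(), key=lambda item: -item[1])
```

A toy run with three segment groups, where letters stand in for characters and a lookup table stands in for the classifier:

```python
table = {(0, 2): ["W", "V"], (2, 3): ["R", "P"], (0, 1): ["I"],
         (1, 2): ["L"], (0, 3): ["M"], (1, 3): ["K"]}
ranking = recognise_phrase(3, lambda span: table.get(span, []), {"WR"}, k=2)
```

Enumerating all 2^(T-1) divisions is exponential in T, which is presumably one reason the geometric recombination first reduces the number of groups.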
Fig. 6 shows an example of stroke-segment combination and path search for the handwritten phrase sample "Microsoft". Three stroke-segment combinations and paths are shown, and for each segment group in each path the candidate sequence from single-character recognition is given. The phrase formed from the first recognition candidates of the two segment groups in the second path obtains the highest score, so the final phrase recognition result is "Microsoft".
Fig. 7 shows an example of the results achieved by the invention. As can be seen, the invention recognizes unconstrained cursive phrases written at arbitrary angles well. Below each handwritten phrase sample is the recognition result produced by the invention (some phrases have two candidate results).

Claims (6)

1. A method for recognizing Chinese handwritten phrases, characterized by comprising the following steps:
(1) correcting the rotation direction of the handwritten phrase input based on center-of-gravity balance;
(2) extracting the stroke-segment information of the handwritten phrase based on feature point detection;
(3) splitting the stroke segments produced by connected writing according to their direction angle;
(4) recombining all stroke segments according to their geometric information;
(5) searching character-merging paths over the stroke-segment combinations using single-character recognition information and dictionary information, and ranking the phrase results obtained from the search paths by score to obtain the phrase recognition result.
2. The method for recognizing Chinese handwritten phrases according to claim 1, characterized in that the rotation direction correction of the handwritten phrase input in step (1) uses a method based on center-of-gravity balance, which comprises the following steps:
(11) dividing the phrase into left and right parts by the vertical line through its center of gravity, computing the center of gravity of each part, and computing the distance D_1 between the projections of the two centers of gravity on the horizontal direction;
(12) dividing the phrase into upper and lower parts by the horizontal line through its center of gravity, computing the center of gravity of each part, and computing the distance D_2 between the projections of the two centers of gravity on the vertical direction;
(13) if D_1 < D_2, rotating the phrase 90° clockwise;
(14) dividing the phrase into left and right parts by the vertical line through its center of gravity, and computing the center of gravity of each part;
(15) joining the two centers of gravity with a line and computing the angle θ between this line and the horizontal direction; if θ < τ or (180° - θ) < τ, going to step (17), where τ is a predefined threshold;
(16) if θ > τ and θ < 90°, rotating the phrase clockwise by θ; if θ > 90°, rotating the phrase counterclockwise by 180° - θ; then returning to step (14);
(17) dividing the phrase into left and right parts by the vertical line through its center of gravity; if the starting point of the first stroke written lies in the right part, rotating the whole phrase by 180°, and otherwise ending the rotation correction process;
wherein the center of gravity in steps (11) to (17) is the center of gravity of the pixels of the written strokes.
3. The method for recognizing Chinese handwritten phrases according to claim 1, characterized in that the stroke-segment extraction in step (2) first smooths each input stroke to obtain a sequence of sample points {(X_i, Y_i) | i = 1, ..., M}, where M is the total number of points in the stroke; θ_i is the direction angle of the stroke at point i, and point t is the previous inflection point or the starting point of the stroke; a point p is regarded as an inflection point if it satisfies the following formula:
$$\left|\theta_p - \frac{\sum_{i=t}^{p-1}\theta_i}{p-t}\right| > \frac{\pi}{4}$$
after all inflection points in the input strokes have been found, the start and end points of each stroke are also treated as inflection points; the part of a stroke between two consecutive inflection points is extracted as a stroke segment; of its two end points, the one written first is the segment start point and the one written last is the segment end point.
4. The method for recognizing Chinese handwritten phrases according to claim 1, characterized in that in step (3) the stroke segments produced by connected writing are split by computing the direction angle from start point to end point for every stroke segment and splitting each stroke segment whose direction angle lies between -10° and 85° at its middle into two stroke segments.
5. The Chinese handwritten phrase recognition method according to claim 1, characterized in that the recombination of all pen sections in step (4) comprises the following steps:
(41) normalize the size of the handwriting sample by linear normalization, scaling it by a factor V, where V is given by
Figure F2008100290085C00031
in which W and H are constants, and width and height are the original width and height of the phrase, respectively;
(42) combine short connected pen sections: if, of two connected pen sections, one has a length less than L_1, or the length after combination is less than L_2, merge them; a merged combination of pen sections is called a pen section group; L_1 and L_2 are constants, with L_1 < L_2;
(43) for any two pen sections or pen section groups, if the width of the pen section frame after combining them is less than L_3, combine them; repeat this step after each combination until no pen sections or pen section groups can be combined; L_3 is a constant;
(44) for any two pen sections or pen section groups, if the pen section frame of one lies inside the pen section frame of the other, combine them; repeat this step after each combination until no pen sections or pen section groups can be combined;
(45) for any two pen sections or pen section groups, if their horizontal projections overlap by more than 2/3, combine them; repeat this step after each combination until no pen sections or pen section groups can be combined;
(46) for any two pen sections or pen section groups, if the width of the pen section frame after combination is less than L_4, and the overlapping area of their pen section frames is greater than 1/2 of the area of either one of the pen section frames, combine them; repeat this step after each combination until no pen sections or pen section groups can be combined; L_4 is a constant;
(47) for any two pen sections or pen section groups, if the width of the pen section frame after combination is less than L_5, and the overlapping area of their pen section frames is greater than 1/2 of the area of either one of the pen section frames, combine them; repeat this step after each combination until no pen sections or pen section groups can be combined; L_5 is a constant equal to 2 times L_4;
(48) if the width of the leftmost pen section or pen section group is less than W_1, or its bottom ordinate is less than B_1, or its top ordinate is greater than T_1, combine it with the leftmost of the remaining pen sections or pen section groups; if the width of the rightmost pen section or pen section group is less than W_1, or its bottom ordinate is less than B_1, or its top ordinate is greater than T_1, combine it with the rightmost of the remaining pen sections or pen section groups; W_1, B_1 and T_1 are constants, with W_1 < B_1 < T_1.
In steps (43) to (48) above, the pen section frame means the smallest rectangle that can contain the pen section or pen section group.
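The repeat-until-stable merging used by steps (43) to (45) can be sketched in Python as follows. This is an illustrative sketch, not the patented implementation: each pen section or pen section group is represented only by its frame `(x_min, y_min, x_max, y_max)`, all function names are hypothetical, any threshold value passed in is made up, and the 2/3 overlap of step (45) is measured against the narrower of the two projections, which is an assumption because the claim does not say what the fraction is relative to.

```python
def frame(points):
    """Smallest axis-aligned rectangle containing the points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def union(a, b):
    """Frame of the combination of two pen sections or groups."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def width(f):
    return f[2] - f[0]


def contains(outer, inner):
    """Step (44) test: inner frame lies inside outer frame."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def horizontal_overlap_ratio(a, b):
    """Step (45) test value, relative to the narrower projection (assumed)."""
    overlap = min(a[2], b[2]) - max(a[0], b[0])
    narrower = min(width(a), width(b))
    if narrower == 0:
        return 1.0 if overlap >= 0 else 0.0
    return max(overlap, 0) / narrower


def merge_until_stable(frames, should_merge):
    """Combine any qualifying pair, then start over, until no pair qualifies."""
    frames = list(frames)
    merged = True
    while merged:
        merged = False
        for i in range(len(frames)):
            for j in range(i + 1, len(frames)):
                if should_merge(frames[i], frames[j]):
                    frames[i] = union(frames[i], frames[j])
                    del frames[j]
                    merged = True
                    break
            if merged:
                break
    return frames
```

Each step then becomes one call with its own predicate, for example `merge_until_stable(frames, lambda a, b: width(union(a, b)) < L3)` for step (43), where `L3` stands for the constant L_3 and its value has to be supplied by the caller.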
6. The Chinese handwritten phrase recognition method according to claim 1, characterized in that step (5) performs the character merging path search on the pen section combinations, using recognition information and dictionary information, as follows: first, all merging paths of the pen sections or pen section groups obtained in step (4) are enumerated; that is, for the T pen sections or pen section groups ordered from left to right, let the set of all merging paths be {W_φ}; for a merging path in this set,
Figure F2008100290085C00041
each pen section or pen section group w_i^φ in it, where N is the total number of pen sections and pen section groups in the merging path and i = 1, 2, ..., N, is passed to the single-character recognition classifier to obtain a candidate sequence L_i = L_{i1}, ..., L_{iK}, where K is a positive integer constant; for each path, if the resulting combination of recognition candidates, the phrase {L_{1R(1)}, ..., L_{NR(N)}}, can be found in the dictionary, the score of the recognition candidate phrase R for this path is calculated by the following formula:
$$S(R) = \frac{\sum_{i=0}^{N} \left[ K - R(i) \right]^2}{N}$$
Here R(i) is the position of the i-th Chinese character of candidate phrase R in the candidate list L_i, ranging from 0 to K−1; if the phrase cannot be found in the dictionary, its score is 0; finally, all candidate phrases are sorted in descending order of score to give the final phrase recognition result: the phrase with the highest score is the preferred recognition result, and the rest are candidate recognition results.
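The path search and scoring of claim 6 can be sketched in Python as below. This is a brute-force illustration under assumptions, not the patented implementation: the names are hypothetical, `recognize` stands in for the single-character classifier and is assumed to return candidates best first, the score is read as the sum of squared (K − R(i)) over the N characters of the phrase divided by N, and a phrase reachable through several paths keeps its best score. The last two points are assumptions about details the claim text leaves open.

```python
from itertools import combinations, product


def merge_paths(units):
    """Yield every way to cut a left-to-right list into consecutive groups."""
    n = len(units)
    for r in range(n):
        for cuts in combinations(range(1, n), r):
            bounds = (0, *cuts, n)
            yield [tuple(units[a:b]) for a, b in zip(bounds, bounds[1:])]


def score(ranks, k):
    """Mean of (K - R(i))^2 over the characters of one candidate phrase."""
    return sum((k - r) ** 2 for r in ranks) / len(ranks)


def recognize_phrase(units, recognize, dictionary, k):
    """Return (phrase, score) pairs found in the dictionary, best first."""
    results = {}
    for path in merge_paths(units):
        candidate_lists = [recognize(group)[:k] for group in path]
        # Try every combination of one candidate per merged character.
        for ranks in product(*(range(len(c)) for c in candidate_lists)):
            phrase = "".join(c[r] for c, r in zip(candidate_lists, ranks))
            if phrase in dictionary:
                results[phrase] = max(results.get(phrase, 0.0), score(ranks, k))
    return sorted(results.items(), key=lambda item: item[1], reverse=True)
```

With T units there are 2^(T−1) merging paths and up to K^N candidate combinations per path, so a practical system would need to prune the search; the sketch enumerates everything only to keep the logic visible.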
CN2008100290085A 2008-06-25 2008-06-25 Method for recognizing Chinese hand-written phrase Expired - Fee Related CN101299236B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008100290085A CN101299236B (en) 2008-06-25 2008-06-25 Method for recognizing Chinese hand-written phrase

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2008100290085A CN101299236B (en) 2008-06-25 2008-06-25 Method for recognizing Chinese hand-written phrase

Publications (2)

Publication Number Publication Date
CN101299236A CN101299236A (en) 2008-11-05
CN101299236B true CN101299236B (en) 2010-06-09

Family

ID=40079060

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008100290085A Expired - Fee Related CN101299236B (en) 2008-06-25 2008-06-25 Method for recognizing Chinese hand-written phrase

Country Status (1)

Country Link
CN (1) CN101299236B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012024829A1 (en) * 2010-08-24 2012-03-01 Nokia Corporation Method and apparatus for segmenting strokes of overlapped handwriting into one or more groups

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101814142B (en) * 2009-02-24 2013-06-05 阿尔派株式会社 Handwriting character input device and character processing method
CN102023964B (en) * 2010-11-04 2012-10-10 广东因豪信息科技有限公司 Efficient handwritten sample acquisition method and device
CN102073884A (en) * 2010-12-31 2011-05-25 北京捷通华声语音技术有限公司 Handwriting recognition method, system and handwriting recognition terminal
CN104205126B (en) * 2012-03-23 2018-06-08 微软技术许可有限责任公司 The identification without spin of classifying hand-written characters
JP5717691B2 (en) * 2012-05-28 2015-05-13 株式会社東芝 Handwritten character search device, method and program
CN102750272B (en) * 2012-07-02 2015-01-14 安徽科大讯飞信息科技股份有限公司 Method and system for optimizing hand-input candidate item of character
CN103745202B (en) * 2014-01-08 2017-04-05 宇龙计算机通信科技(深圳)有限公司 Chinese characters recognition method and device during handwriting input
US9524440B2 (en) * 2014-04-04 2016-12-20 Myscript System and method for superimposed handwriting recognition technology
US9460359B1 (en) * 2015-03-12 2016-10-04 Lenovo (Singapore) Pte. Ltd. Predicting a target logogram
CN104933408B (en) * 2015-06-09 2019-04-05 深圳先进技术研究院 The method and system of gesture identification
CN105426075B (en) * 2015-10-31 2021-03-16 Oppo广东移动通信有限公司 User terminal control method and user terminal
CN105426090B (en) * 2015-10-31 2020-04-10 Oppo广东移动通信有限公司 Display picture control method and user terminal
CN107330430B (en) * 2017-06-27 2020-12-04 司马大大(北京)智能系统有限公司 Tibetan character recognition device and method
CN113095171A (en) * 2021-03-29 2021-07-09 Oppo广东移动通信有限公司 Method and device for recognizing written characters, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1156741C (en) * 1998-04-16 2004-07-07 国际商业机器公司 Chinese handwriting identifying method and device
CN1652138A (en) * 2005-02-08 2005-08-10 华南理工大学 Method for identifying hand-writing characters
CN1896934A (en) * 2005-07-15 2007-01-17 英华达(上海)电子有限公司 Intelligent handwritten phrase information inputting method for manual equipment
CN1333366C (en) * 2005-04-01 2007-08-22 清华大学 On-line hand-written Chinese characters recognition method based on statistic structural features

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
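Liu Qingxiang, Xiong Jie. Research and Application of a Handwritten Chinese Character Recognition System (手写汉字识别系统的研究与应用). 高等函授学报, 2006, 19(4), 3-5, 12.
Liu Qingxiang, Xiong Jie. Research and Application of a Handwritten Chinese Character Recognition System (手写汉字识别系统的研究与应用). 高等函授学报, 2006, 19(4), 3-5, 12. *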

Also Published As

Publication number Publication date
CN101299236A (en) 2008-11-05

Similar Documents

Publication Publication Date Title
CN101299236B (en) Method for recognizing Chinese hand-written phrase
Roy et al. HMM-based Indic handwritten word recognition using zone segmentation
Mouchère et al. ICFHR2016 CROHME: Competition on recognition of online handwritten mathematical expressions
Tripathy et al. Handwriting segmentation of unconstrained Oriya text
JP5071914B2 (en) Recognition graph
CN100412861C (en) Apparatus and method for searching for digital ink query
Saabni et al. Language-independent text lines extraction using seam carving
Alaei et al. A new scheme for unconstrained handwritten text-line segmentation
JP4787275B2 (en) Segmentation-based recognition
Zahour et al. Arabic hand-written text-line extraction
US7302099B2 (en) Stroke segmentation for template-based cursive handwriting recognition
US20080240569A1 (en) Character input apparatus and method and computer readable storage medium
CN105260751B (en) A kind of character recognition method and its system
Bag et al. Recognition of Bangla compound characters using structural decomposition
JPH0844826A (en) Collating method of handwritten input
CN102208039B (en) Method and device for recognizing multi-language mixed handwriting text lines
Roy et al. Morphology based handwritten line segmentation using foreground and background information
Bhattacharya et al. An end-to-end system for Bangla online handwriting recognition
CN101452531B (en) Identification method for handwriting latin letter
CN102156889A (en) Method and device for identifying language type of handwritten text line
CN101697200B (en) Handwritten Chinese grass-style phrase identification method irrelevant to rotation
Bhattacharya et al. A semi-automatic annotation scheme for Bangla online mixed cursive handwriting samples
Ladwani et al. Novel approach to segmentation of handwritten Devnagari word
Roy et al. Online Bangla handwriting recognition system
Mahmood et al. A novel segmentation technique for urdu type-written text

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20100609

Termination date: 20140625

EXPY Termination of patent right or utility model