CN1020213C - Hand-written charactor recognition apparatus - Google Patents
- Publication number
- CN1020213C
- Authority
- CN
- China
- Prior art keywords
- character
- stroke
- degree
- approximation
- input
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Character Discrimination (AREA)
Abstract
To make the dictionary smaller and speed up retrieval, a degree of approximation is obtained between each input stroke and predefined basic constituent elements of characters, and the input character is evaluated by fuzzy evaluation from a feature vector whose components are these degrees of approximation. The data for one stroke are supplied through a temporary buffer memory 3 to approximation degree arithmetic circuits 401-426, which calculate the degree of approximation to templates T1-T26. The calculated degrees of approximation are supplied to a feature vector storing buffer 6, which has 26 memory areas in the row direction, corresponding to the templates, and memory areas in the column direction corresponding to the maximum stroke count of the characters to be recognized. The feature vector of each stroke is thus stored stroke by stroke. The feature vectors for one character and the character codes of a feature dictionary 7 are processed in a detecting circuit 8, and the code number of the input character is output.
Description
The present invention relates to an apparatus for recognizing handwritten input characters.
In the handwritten character recognition apparatus of the present invention, the degree of approximation between each input stroke and predefined basic structural elements of characters is obtained, a feature vector is formed with these degrees of approximation as its components, and the input character is evaluated from it by fuzzy evaluation, which improves various recognition characteristics.
The prior-art method of recognizing handwritten input characters, shown in Fig. 6, is as follows:
1. The input handwriting of one handwritten stroke (Fig. 6A) is approximated by a polyline (Fig. 6B) formed from the sampling points P_0, P_1, ..., P_n on the input handwriting and their time-series information.
2. The polyline of item 1 is compared with predefined basic handwriting shapes, i.e. "basic stroke types", to perform "stroke recognition".
3. According to the result of item 2, the input stroke is converted into the code number of the closest basic stroke type.
4. Items 1 to 3 are repeated for all strokes of one character.
5. With reference to a dictionary, the character that holds the code numbers of item 3 in stroke order is judged to be the input character.
This method has been widely used.
With this method, each input stroke is replaced by a single basic stroke type, so almost all of the information in the sampling points P_0, P_1, ..., P_n of the input handwriting can be discarded, apart from the data needed for the subsequent recognition process. Character recognition is therefore possible even on a device with small memory capacity. In addition, the dictionary holds the code numbers of the basic stroke types arranged in stroke order, and the input character is recognized by comparing them in order with the code numbers of the basic stroke types of the input character, so the dictionary can be made small and the time needed for comparison can be shortened.
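The conventional flow of items 1 to 5 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the cited system: the dictionary contents, the character labels and the `nearest_type` helper are hypothetical, and each stroke is assumed to arrive as a list of similarity scores, one per basic stroke type.

```python
# Hypothetical sketch of the conventional method: each stroke is replaced by
# the code of a single basic stroke type, then the code sequence is looked up.

# Assumed dictionary: code sequence (in stroke order) -> character label.
DICTIONARY = {
    (1, 7): "char_A",
    (2, 2, 5): "char_B",
}


def nearest_type(scores):
    """Return the 1-based code of the basic stroke type with the best score."""
    return max(range(len(scores)), key=lambda j: scores[j]) + 1


def recognize_conventional(per_stroke_scores):
    """Map every stroke to one code, then look up the code sequence."""
    codes = tuple(nearest_type(scores) for scores in per_stroke_scores)
    return DICTIONARY.get(codes)  # None if the sequence is not registered
```

Because only one code per stroke survives, a stroke whose best-scoring type is slightly off makes the whole lookup fail, which is the weakness the description goes on to discuss.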
Reference: "Nikkei", issue of December 5, 1983.
With this method, however, when the basic shape of the input handwriting changes or is deformed because of input noise or the writer's habits, errors occur in the stroke recognition of item 2, and recognition accuracy drops significantly as a result. For example, when writing "one", i.e. a single horizontal stroke, if there is a mark at the start of the stroke, as at the place circled in Fig. 7A, the stroke may be recognized as the basic stroke type shown in Fig. 7B.
For this reason, the codes of stroke types that are easily misrecognized have conventionally been handled by describing them in the dictionary in parallel, in the form "stroke code numbers C_1, C_2, ..., C_n".
Doing so, however, both enlarges the dictionary and lengthens retrieval time. Moreover, when a user wants to add a character that is not registered in the dictionary, such additional registration is very difficult, because the stroke type codes that need to be registered cannot be determined from the input strokes alone.
The object of the present invention is to solve the above problems.
To this end, in the present invention, the degree of approximation between the input stroke information and predefined basic structural elements of characters is obtained, the input stroke information is converted into a feature vector composed of these degrees of approximation, and at evaluation the degree of evaluation is changed according to a qualifier.
Because character recognition is performed with fuzziness in this way, the dictionary can be made smaller and retrieval faster without lowering the recognition rate.
First, an outline of the present invention and its embodiment is described.
In the present invention:
1. Suitable template patterns (basic stroke types) T_1 to T_26, for example those shown in Fig. 2, are prepared.
2. The information S_i of the i-th stroke of each handwritten input character is compared in turn with the templates T_1 to T_26, and the degree of approximation E_{i,j} to each template T_j (j = 1 to 26) is calculated.
For example, suppose the input character is a katakana character whose first stroke S_1 is "Pie" (a left-falling stroke). Its degree of approximation to templates T_1, T_2, T_3 and so on is high, and its degree of approximation to T_7 and so on is low, so that E_{1,01} = 90%, E_{1,02} = 80%, E_{1,03} = 95%, ..., E_{1,07} = 0%, ..., E_{1,26} = 0%. Similarly, because the second stroke S_2 is "Dian" (a dot), E_{2,01} = 5%, E_{2,02} = 0%, E_{2,03} = 0%, ..., E_{2,07} = 95%, ..., E_{2,26} = 0%. (The values are assumed for the sake of explanation.)
3. For each character, the result of item 2 is stored for each stroke S_i as a feature vector V_i = (E_{i,01}, E_{i,02}, ..., E_{i,26}). In the example:
V_1 = (90, 80, 95, ..., 0, ..., 0)
V_2 = (5, 0, 0, ..., 95, ..., 0)
4. When this katakana character is written correctly, its first stroke S_1 "substantially" matches template T_3 and its second stroke S_2 "certainly" matches template T_7.
Therefore, the dictionary entry for this character describes its JIS code together with T_3 = "substantially", T_7 = "certainly".
That is, for each character the dictionary describes the code number of the character, the template T_j closest to the i-th stroke S_i of the character, and a qualifier expressing the degree of this approximation (match). The template T_j and the qualifier are described in stroke order, once for each stroke of the character.
The characters are also classified by their total stroke count.
5. A function chart with fuzziness, as shown in Fig. 3, is also prepared.
6. From the character data of item 4, the data of the first character listed under the total stroke count of the input character are taken out.
In the example above, the total stroke count of the character is 2, so the first character data are taken from the 2-stroke group.
7. For simplicity, suppose that the character data obtained in item 6 are those of the example katakana character. For the first stroke S_1 the entry is T_3 = "substantially", so the "substantially" function curve in Fig. 3 is selected, and the degree of approximation to template T_3, 95% (= E_{1,03}), is taken from the feature vector V_1 obtained in item 3.
This degree of approximation of 95% is then converted, according to the "substantially" function curve in Fig. 3, into a degree of fitness G_1, for example G_1 = 96%.
Likewise, for the second stroke S_2 the entry is T_7 = "certainly", so the "certainly" function curve in Fig. 3 is selected, the degree of approximation to template T_7, 95% (= E_{2,07}), is taken from the feature vector V_2, and it is converted into a degree of fitness G_2, for example G_2 = 98%.
That is, when character data are taken out in item 6, a function curve of Fig. 3 is selected for each stroke S_i, and the corresponding degree of approximation E_{i,j} in the feature vector V_i is converted into the degree of fitness G_i defined by the selected function curve.
8. The smallest of the degrees of fitness G_i obtained in item 7 is taken as the character fitness G_m of the code number represented by this character data.
In the example above, G_1 = 96% and G_2 = 98%, so the character fitness G_m of the input for this character is 96% (= G_1).
9. Items 7 and 8 are then repeated for all character data with the matching stroke count.
10. When item 9 is complete, the character with the highest fitness among the resulting character fitness values G_m (there are as many of them as there are character data) is taken as the first candidate for the input character, and its code number is output.
The template patterns T_1 to T_26 are determined with the following points in mind:
1. The structural elements of Chinese characters, such as "horizontal stroke", "vertical stroke", "left-falling stroke" and "bend", are limited in kind.
2. Even strokes that look identical vary with the way the pen is moved, for example a "left-falling stroke" written as a deformed "bend". Moreover, for strokes such as the "left-falling stroke", no fixed length or angle is prescribed. For strokes that can be deformed in different ways, separate templates are therefore prepared for the different deformation types, for example template patterns T_1 to T_3.
3. Very complicated basic shapes occur extremely rarely among all Chinese characters, so no templates are defined for them and they are handled by another recognition method.
In addition, the stroke portions drawn with dashed lines in the template patterns T_1 to T_26 indicate portions whose evaluation may be reduced or ignored when the degree of approximation E_{i,j} is obtained.
An example of the structure of the present invention is described below.
Fig. 1 is a system diagram of one embodiment of the present invention.
Figs. 2 to 7 are explanatory diagrams for it.
In Fig. 1, (1) denotes a coordinate input means such as a graphics tablet. The coordinate sequence P_0 to P_n of one stroke entered through the input means (1) is sent to a polyline compression circuit (2), which compresses it into a sequence of polyline information and end-point information. For example, if the input stroke (coordinate sequence) is preprocessed into the polyline segments #1 to #4 shown in Fig. 4B, the angle (direction) of each segment #1 to #4 is quantized into one of the 8 direction numbers shown in Fig. 4A, and the length of each segment and the coordinates of its start point and end point are converted, giving data such as those shown in Fig. 4C.
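The conversion of a preprocessed polyline into per-segment data can be sketched as below. The numbering of the 8 directions in Fig. 4A is not given in the text, so the scheme used here (0 for the positive x direction, increasing counter-clockwise in 45-degree steps, with the y axis pointing up) is an assumption, as are the field names. The sketch also assumes the points passed in are already the polyline vertices.

```python
import math


def quantize_direction(start, end):
    """Return a direction number 0..7 (assumed: 0 = +x, counter-clockwise)."""
    angle = math.atan2(end[1] - start[1], end[0] - start[0])
    return round(angle / (math.pi / 4)) % 8


def compress_polyline(vertices):
    """Describe each polyline segment by direction, length and end points."""
    segments = []
    for start, end in zip(vertices, vertices[1:]):
        segments.append({
            "direction": quantize_direction(start, end),
            "length": math.dist(start, end),
            "start": start,
            "end": end,
        })
    return segments
```

For instance, `compress_polyline([(0, 0), (10, 0), (10, 10)])` yields two segments of length 10 with direction numbers 0 and 2 under the assumed numbering.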
The data for one stroke are sent through a buffer memory (3) to degree-of-approximation calculation circuits (401) to (426), which calculate the degrees of approximation E_{i,j} to the template patterns T_1 to T_26 (item 2 above). The degree of approximation E_{i,j} is calculated for each template pattern T_j independently and in parallel, according to the algorithms described in rule storage circuits (501) to (526).
The calculated degrees of approximation E_{i,j} are then sent to a feature vector buffer memory (6). The figure shows the structure of this buffer memory (6) schematically: it has 26 memory areas in the row direction, corresponding to the template patterns T_1 to T_26, and K memory areas in the column direction, corresponding to the maximum stroke count K of the characters to be recognized. The feature vector V_i of each stroke S_i of a character is thus stored in the buffer memory (6) stroke by stroke (item 3 above).
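A software analogue of the feature vector buffer could look like the sketch below. The class and method names are hypothetical, and the layout (one list of 26 values per stroke, at most K strokes) is one possible reading of the schematic structure described above.

```python
NUM_TEMPLATES = 26  # templates T_1 to T_26


class FeatureVectorBuffer:
    """Holds one 26-component feature vector per stroke, up to K strokes."""

    def __init__(self, max_strokes):
        self.max_strokes = max_strokes  # K in the description
        self.rows = [[0.0] * NUM_TEMPLATES for _ in range(max_strokes)]
        self.count = 0  # number of strokes stored so far

    def store(self, degrees):
        """Store the degrees of approximation of the next stroke."""
        if len(degrees) != NUM_TEMPLATES:
            raise ValueError("expected one degree per template")
        if self.count >= self.max_strokes:
            raise OverflowError("more strokes than the buffer can hold")
        self.rows[self.count] = list(degrees)
        self.count += 1

    def vectors(self):
        """Return the feature vectors of the strokes entered so far."""
        return self.rows[:self.count]
```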
The feature vectors V_i of the character and the character data (item 4 above) from the feature dictionary (7) are processed in an evaluation circuit (8) according to items 5 to 10 above, and the code number with the highest fitness for the input character is output.
Fig. 5 shows an example of the rule by which the degree-of-approximation calculation circuit (401) calculates the degree of approximation E_{i,01} of an input stroke S_i to template T_1.
Fig. 5A shows, in exaggerated form, the "Pie" (left-falling stroke) contained in characters such as "right" and "five". For this stroke, in the case of template T_1, the degree of approximation is calculated from the lengths L_1 to L_4, L_h and L_w measured as shown in Fig. 5B:
E_{i,01} = (a L_h - b L_w - c L_1 + d L_4 + e L_3) / L_2
where E_{i,01} = 1 if E_{i,01} > 1, and E_{i,01} = 0 if E_{i,01} < 0.
Here a to e are predetermined constants, and the degree of approximation E_{i,01} is expressed as a decimal fraction.
In template T_1, the weight of the stroke portion drawn with a dashed line is zero or very small, so the constants e and d applied to the values L_3 and L_4 are smaller than the other constants a to c.
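The rule for template T_1 (a weighted combination of the measured lengths, divided by L_2 and limited to the range 0 to 1) can be written out as a short function. The function name is hypothetical, and no values for the constants a to e are given in the text, so they are passed in as parameters.

```python
def degree_of_approximation_t1(L1, L2, L3, L4, Lh, Lw, a, b, c, d, e):
    """Degree of approximation of a stroke to template T_1, clamped to 0..1.

    L1..L4, Lh and Lw are the lengths measured as in Fig. 5B; a..e are the
    weighting constants (their actual values are not given in the source).
    """
    if L2 == 0:
        raise ValueError("L2 must be non-zero")
    raw = (a * Lh - b * Lw - c * L1 + d * L4 + e * L3) / L2
    return min(1.0, max(0.0, raw))  # values above 1 become 1, below 0 become 0
```

Choosing small values for d and e reduces the influence of L_4 and L_3, which matches the low weight given to the dashed part of the template.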
Likewise, the calculation circuits (402) to (426) calculate the degrees of approximation E_{i,02} to E_{i,26} according to formulas defined individually for the corresponding templates T_2 to T_26.
When handwritten input characters are recognized according to the present invention as described above, the degree of approximation E_{i,j} between the input stroke S_i and the predefined templates T_1 to T_26 is obtained, and character recognition is based on this degree of approximation E_{i,j} and the qualifiers, so the recognition rate is not lowered by natural variation or deformation of the handwriting. Furthermore, because the templates T_j include ones such as T_1 to T_3 that also cover partial deformations, the recognition rate is improved and tolerance of variation and deformation of the handwriting is strengthened.
Moreover, the dictionary (7) basically needs only one pair, consisting of a representative template T_j and a qualifier, for each stroke S_i, so the dictionary (7) can be made small and retrieval from it fast.
Registration of undefined characters can also be realized as follows: each time the user enters a stroke, the shape of the template with the highest degree of approximation is displayed as an image, and the user confirms interactively whether this template has the correct shape.
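The interactive registration described here might be sketched as follows. The text only states that the best-matching template is shown and confirmed; falling back to the next-best template after a rejection is an assumption of this sketch, as are the function name and the `confirm` callback, which stands in for the dialogue with the user. How the qualifier of each stroke would be chosen is not covered.

```python
def propose_entry(vectors, confirm):
    """Build the template list of a new dictionary entry, stroke by stroke.

    vectors: one feature vector (degrees of approximation) per stroke.
    confirm: hypothetical callback confirm(stroke_no, template_no) -> bool,
             standing in for the user accepting or rejecting a template.
    """
    entry = []
    for i, vector in enumerate(vectors, start=1):
        # Template indices ordered from the highest degree to the lowest.
        ranked = sorted(range(len(vector)), key=lambda j: vector[j], reverse=True)
        for j in ranked:
            if confirm(i, j + 1):  # template numbers are 1-based
                entry.append(j + 1)
                break
        else:
            raise ValueError(f"no template accepted for stroke {i}")
    return entry
```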
Claims (1)
1. A handwritten character recognition apparatus comprising: a handwriting input circuit including a coordinate input device and a polyline compression circuit that approximates with a polyline each handwriting trace entered through the coordinate input device,
a rule storage circuit storing in advance a plurality of pieces of template information corresponding to basic structural elements of characters,
and a feature dictionary memory storing in advance, for each character, the template information closest to each of its strokes and a qualifier expressing the degree of approximation,
the handwritten character recognition apparatus being characterized in that it further comprises:
a degree-of-approximation calculation circuit for calculating the degree of approximation between each piece of stroke information supplied from the handwriting input circuit and each of the plurality of pieces of template information stored in the rule storage circuit,
and an evaluation circuit that evaluates the handwritten character from the degree of approximation of each stroke calculated by the degree-of-approximation calculation circuit, with reference to the feature dictionary memory, and that, at evaluation, changes the degree of character judgment by qualifying the degree of approximation with the qualifier.
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP204144/88 | 1988-08-17 | ||
JPP204144/88 | 1988-08-17 | ||
JP63204144A JP2762472B2 (en) | 1988-08-17 | 1988-08-17 | Character recognition method and character recognition device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1040447A CN1040447A (en) | 1990-03-14 |
CN1020213C true CN1020213C (en) | 1993-03-31 |
Family
ID=16485566
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN89106539A Expired - Fee Related CN1020213C (en) | 1988-08-17 | 1989-08-17 | Hand-written charactor recognition apparatus |
Country Status (3)
Country | Link |
---|---|
JP (1) | JP2762472B2 (en) |
KR (1) | KR0128733B1 (en) |
CN (1) | CN1020213C (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE69227174T2 (en) * | 1991-12-19 | 1999-04-29 | Texas Instruments Inc | Character recognition |
TW274135B (en) * | 1994-09-14 | 1996-04-11 | Hitachi Seisakusyo Kk | |
JPH10124505A (en) * | 1996-10-25 | 1998-05-15 | Hitachi Ltd | Character input device |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0632083B2 (en) * | 1988-06-10 | 1994-04-27 | サン電子株式会社 | Handwritten character recognition system by fuzzy reasoning |
JPH0632084B2 (en) * | 1988-07-25 | 1994-04-27 | サン電子株式会社 | Handwritten character recognition method by fuzzy reasoning |
-
1988
- 1988-08-17 JP JP63204144A patent/JP2762472B2/en not_active Expired - Fee Related
-
1989
- 1989-08-16 KR KR1019890011629A patent/KR0128733B1/en not_active IP Right Cessation
- 1989-08-17 CN CN89106539A patent/CN1020213C/en not_active Expired - Fee Related
Also Published As
Publication number | Publication date |
---|---|
JP2762472B2 (en) | 1998-06-04 |
CN1040447A (en) | 1990-03-14 |
KR0128733B1 (en) | 1998-04-15 |
KR900003771A (en) | 1990-03-27 |
JPH0253193A (en) | 1990-02-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN1167030C (en) | Handwriteen character recognition using multi-resolution models | |
CN107330127B (en) | Similar text detection method based on text picture retrieval | |
CN1107283C (en) | Method and apparatus for character recognition of handwriting input | |
CN1488120A (en) | Method, device and computer program for recognition of a handwritten character | |
CN1131302A (en) | Method for performing string matching | |
CN1017663B (en) | Printer | |
CN1163841C (en) | On-line hand writing Chinese character distinguishing device | |
US6035063A (en) | Online character recognition system with improved standard strokes processing efficiency | |
CN1755706A (en) | Image construction method, fingerprint image construction apparatus, and program | |
CN113076465A (en) | Universal cross-modal retrieval model based on deep hash | |
JPH0139154B2 (en) | ||
JP2673871B2 (en) | Method and device for pattern recognition by neural network | |
CN1051633A (en) | target identification system | |
CN114745553A (en) | Image data storage method based on big data | |
JPH10214267A (en) | Handwritten character and symbol processor and medium recording control program for the processor | |
JPH0981730A (en) | Method and device for pattern recognition and computer controller | |
CN115914640A (en) | Data compression method for Internet of vehicles | |
CN1227373A (en) | Handwriting verification device | |
CN1020213C (en) | Hand-written charactor recognition apparatus | |
CN1035844C (en) | Method of sorting out candidate characters in character recognition system | |
CN115526310A (en) | Network model quantification method, device and equipment | |
CN110941730B (en) | Retrieval method and device based on human face feature data migration | |
KR100248601B1 (en) | On-line character recognition method and device | |
CN1019699B (en) | Hand-written charactor recognition apparatus | |
CN1303563C (en) | Method and system for compressing hand-written character template |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
C15 | Extension of patent right duration from 15 to 20 years for appl. with date before 31.12.1992 and still valid on 11.12.2001 (patent law change 1993) | ||
OR01 | Other related matters | ||
C19 | Lapse of patent right due to non-payment of the annual fee | ||
CF01 | Termination of patent right due to non-payment of annual fee |