JPH02224084A - Discriminating method for capital letter, small letter and character with shape similar to kanji (chinese character) and kana (japanese syllabary) - Google Patents

Discriminating method for capital letter, small letter and character with shape similar to kanji (chinese character) and kana (japanese syllabary)

Info

Publication number
JPH02224084A
Authority
JP
Japan
Prior art keywords
character
characters
kanji
kana
size
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP1196619A
Other languages
Japanese (ja)
Other versions
JP2930605B2 (en)
Inventor
Taiji Mori
泰二 森
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuji Electric Co Ltd
Fuji Facom Corp
Original Assignee
Fuji Electric Co Ltd
Fuji Facom Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuji Electric Co Ltd, Fuji Facom Corp
Publication of JPH02224084A
Application granted
Publication of JP2930605B2
Anticipated expiration
Expired - Fee Related

Landscapes

  • Character Input (AREA)
  • Character Discrimination (AREA)

Abstract

PURPOSE: To reduce erroneous discrimination, when the standard character size is almost the same regardless of character type, by discriminating uppercase from lowercase characters using not only the size of a character but also its center coordinates. CONSTITUTION: It is determined whether a recognized character is one that has both an uppercase and a lowercase form. If it is, outline features comprising its character width, its character height, and their product are compared with a threshold for judging the character uppercase and a threshold for judging it lowercase, both defined relative to its standard character, and the character is confirmed as uppercase or lowercase. For a character that is confirmed as neither, that is, one that cannot be decided from the outline features, a center line L of the line is derived from the center coordinates of the recognition results for that line, and the difference between the center coordinate of the character and the coordinate of the center line L is compared with a predetermined threshold to discriminate uppercase from lowercase. Erroneous discrimination is thereby reduced.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of Industrial Application]

The present invention relates to a method for discriminating character type (for example, uppercase or lowercase) in a character recognition device that recognizes characters such as hiragana and katakana. Examples of kana characters that have both uppercase and lowercase forms are shown in FIG. 6.

[Prior Art]

Conventionally, a known approach to discriminating uppercase from lowercase characters provides only a threshold for judging a character lowercase and compares that threshold with the outline features of the character.

[Problem to Be Solved by the Invention]

With this approach, however, there is a difficulty: katakana are generally smaller than kanji, and the ratio also varies with the typeface.

As a result, a given character size may correspond to an uppercase character in one typeface and to a lowercase character in another.

Accordingly, the object of the invention is to improve discrimination accuracy by providing separate decision criteria for uppercase and lowercase characters and, for characters that fall between the two criteria, by deciding on the basis of a threshold applied to the offset of the character's center coordinate from the center of the line.

[Means for Solving the Problem]

The sizes of target characters, whose standard size is substantially the same regardless of character type, are normalized, and the characters are recognized using the same standard pattern for uppercase and lowercase. For each recognized character, the center coordinates of its circumscribing frame are stored, and it is determined whether the character is one that has both an uppercase and a lowercase form. If it is, outline features comprising the character width, the character height, and the product of the width and the height are obtained. These outline features are compared with the respective thresholds for judging uppercase and for judging lowercase, which are predetermined for each character relative to its standard character, and the character is confirmed as uppercase or lowercase. When the character cannot be confirmed on the basis of these thresholds, it is tagged with information indicating that it is undetermined. Each time the confirmation work for one line is finished, the center line of the character line is obtained from the center coordinates of the characters in the line, including the undetermined characters, and each undetermined character is discriminated by comparing the difference between its center coordinate and the coordinate of the center line with a predetermined threshold.

[Operation]

It is determined whether a recognized character is one that has both an uppercase and a lowercase form. If it is, outline features comprising its character width, its character height, and their product are compared with the threshold for judging it uppercase and the threshold for judging it lowercase, each defined relative to its standard character, and the character is judged (confirmed) as uppercase or lowercase. For a character that is neither, that is, one that cannot be judged from the outline features, the center line of the line is obtained from the center coordinates of the recognition results for that line, and the difference between the center of the character and the center line is compared with a predetermined threshold to discriminate uppercase from lowercase. This reduces erroneous discrimination and improves discrimination accuracy.

[Embodiment]

FIG. 1 is a flowchart showing an embodiment of the invention, and FIG. 2 is an explanatory diagram illustrating an example of a horizontally written character group and its center line.

First, character image data is extracted by known image processing, and the target character is recognized by a likewise known technique. The center coordinate information of the character is then stored. Next, it is determined from the recognition result whether the target character is one that has both an uppercase and a lowercase form. If it is, its character width, its height, and the product of the width and the height are obtained. These values are compared with one or more thresholds, predetermined for the target character relative to its standard character, for judging it uppercase. If the result indicates uppercase, the character is confirmed as uppercase. Otherwise the values are compared with the thresholds for judging it lowercase, and if the result indicates lowercase, the character is confirmed as lowercase. If the character cannot be confirmed as either, information indicating that it is undetermined is attached to it.
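The three-way decision described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the names `CaseThresholds`, `outline_features`, and `decide_case` are hypothetical, the threshold values are invented, and the rule that every feature must pass its threshold is an assumption, since the text does not say how the three features are combined.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Features = Tuple[float, float, float]  # (width, height, width * height)


@dataclass
class CaseThresholds:
    """Hypothetical thresholds for one character, relative to its standard character."""
    upper: Features  # every feature at or above these values: uppercase
    lower: Features  # every feature at or below these values: lowercase


def outline_features(width: float, height: float) -> Features:
    """Width, height, and their product, as named in the description."""
    return (width, height, width * height)


def decide_case(width: float, height: float, thr: CaseThresholds) -> Optional[str]:
    """Return 'upper', 'lower', or None when the character stays undetermined."""
    feats = outline_features(width, height)
    # Assumption: all three features must clear the uppercase threshold.
    if all(f >= t for f, t in zip(feats, thr.upper)):
        return "upper"
    # Assumption: all three features must fall under the lowercase threshold.
    if all(f <= t for f, t in zip(feats, thr.lower)):
        return "lower"
    return None  # falls between the two criteria; resolved later per line


thr = CaseThresholds(upper=(30, 30, 900), lower=(22, 22, 484))
results = [decide_case(32, 32, thr), decide_case(20, 20, thr), decide_case(26, 26, thr)]
```

Keeping two separate thresholds leaves a band in between where no decision is forced, which is what allows the later line-based step to handle the ambiguous characters.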

These steps are repeated until the recognition result for one line is obtained. If the line contains an undetermined character, an approximate expression for the center line of the line,

    Y = ax + b

is obtained from the center coordinates of the horizontally written characters in the line by a known technique such as the least-squares method. The X-direction center coordinate Xc of the undetermined character is substituted into this expression to obtain the Y-direction coordinate YL shown in FIG. 2. The "×" marks in FIG. 2 indicate the center position of each character. The difference between this coordinate YL and the Y-direction center coordinate Yc of the undetermined character is then obtained and compared with a threshold predetermined for the standard character, and the character is discriminated as uppercase or lowercase from the result.

In other words, the discrimination exploits the fact that the difference between YL and Yc is larger for a lowercase character than for an uppercase character.
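A minimal sketch of this line-based step follows, assuming ordinary least squares over the character centers. The function names and the sample threshold are hypothetical, and using the absolute value of the offset is an assumption, because the text does not state the sign convention of the difference between YL and Yc.

```python
from typing import List, Tuple


def fit_center_line(centers: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Least-squares fit of Y = a*x + b through the character centers (x, y)."""
    n = len(centers)
    sx = sum(x for x, _ in centers)
    sy = sum(y for _, y in centers)
    sxx = sum(x * x for x, _ in centers)
    sxy = sum(x * y for x, y in centers)
    denom = n * sxx - sx * sx
    if denom == 0:
        # All centers share one x position: fall back to a horizontal line.
        return 0.0, sy / n
    a = (n * sxy - sx * sy) / denom
    b = (sy - a * sx) / n
    return a, b


def resolve_by_offset(xc: float, yc: float, a: float, b: float,
                      offset_threshold: float) -> str:
    """Classify an undetermined character from its offset to the center line."""
    y_line = a * xc + b          # YL: center-line height at the character's Xc
    offset = abs(y_line - yc)    # assumption: magnitude of the difference is used
    # A lowercase (small) kana sits farther from the center line.
    return "lower" if offset > offset_threshold else "upper"


# Centers of all characters in one line, undetermined ones included.
line_centers = [(0.0, 10.0), (10.0, 10.0), (20.0, 10.0), (30.0, 10.0)]
a, b = fit_center_line(line_centers)
label = resolve_by_offset(15.0, 14.0, a, b, offset_threshold=2.0)
```

Fitting a line rather than averaging the Y coordinates lets the center line follow a skewed or slightly rotated text line.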

The above discriminates uppercase from lowercase on the assumption that the standard character size does not vary with character type.

In printed documents and the like, however, the standard size often differs by character type (for example, in printed documents kanji are generally larger than kana). Examples of standard sizes are shown in FIG. 7. There are also characters whose shapes are similar between kanji and kana (hereinafter also called kanji-kana similar-shape characters). Examples are shown in FIG. 8.

The method described above cannot handle such cases, so the following method is used instead. FIG. 3 is a flowchart illustrating the method for such cases.

First, as in the case of FIG. 1, character image data is extracted by known image processing, and the target character is recognized by a likewise known technique.

Next, the character code and size obtained from the recognition result are stored sequentially in the format shown in FIG. 4. From the character code, the character type is determined as kanji, katakana, hiragana, alphabetic, and so on. By referring to an attribute table T, set in advance for each character code in a format such as that shown in FIG. 5, it is determined whether the character has the standard size of its character type, is a character that has a lowercase form, such as "や", "ゆ", and "よ" shown in FIG. 6, or is a kanji-kana similar-shape character, such as "力" and "夕" shown in FIG. 8 (which resemble the katakana "カ" and "タ"). If the character has a lowercase form or is a kanji-kana similar-shape character, the stored character is marked. If the character has the standard size, its size is tallied for each character type by a suitable method such as frequency-distribution calculation or mean-value calculation. This continues until the recognition result for one document is obtained.

From the tallied results, the standard size of each character type is calculated, for example by taking the most frequent size in the frequency distribution. The marked characters are then retrieved from the stored characters. A threshold is obtained, for example by multiplying the calculated (confirmed) standard size for the marked character's type by a preset ratio, and this threshold is compared with the actual size of the character to discriminate uppercase from lowercase.

Furthermore, the differences among the calculated standard sizes of kanji, hiragana, and katakana are compared with a preset threshold to check whether the sizes differ. If they do, then for each previously marked kanji-kana similar-shape character, and for every character similar to it, the size of the character is estimated by multiplying the standard size calculated for its character type by a ratio set in advance in a ratio table, for example in table T of FIG. 5, which gives the ratio of the character's size to the standard size of the character type to which it belongs. Each estimate is compared with the actual size of the character, and the character whose estimated size is closest is taken as the candidate. These steps are repeated until the document ends. In determining whether a kanji-kana similar-shape character is kanji or kana, it is desirable to also use a method that examines the character types before and after it. In the above, the discrimination between uppercase and lowercase and the discrimination of whether a kanji-kana similar-shape character is kanji or kana are carried out together, but it is of course also possible to carry out only one of them.
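The candidate selection for kanji-kana similar-shape characters can be sketched as follows. The standard sizes, the ratio table entries, and the function names are all hypothetical values chosen for illustration; the text only says that such ratios are preset per character.

```python
from typing import Dict, List, Tuple


def sizes_differ(std: Dict[str, float], threshold: float) -> bool:
    """Check whether the per-type standard sizes differ by more than a threshold."""
    values = list(std.values())
    return max(values) - min(values) > threshold


def pick_candidate(actual_size: float,
                   candidates: List[Tuple[str, str]],
                   std: Dict[str, float],
                   ratio_table: Dict[str, float]) -> Tuple[str, str]:
    """Return the (char, char_type) whose estimated size is nearest the actual size."""
    def estimated(candidate: Tuple[str, str]) -> float:
        char, char_type = candidate
        # Estimated size = standard size of the type * preset ratio for the character.
        return std[char_type] * ratio_table[char]

    return min(candidates, key=lambda c: abs(estimated(c) - actual_size))


std = {"kanji": 32.0, "katakana": 26.0}       # hypothetical measured standard sizes
ratio_table = {"力": 0.95, "カ": 0.90}          # hypothetical preset ratios
candidates = [("力", "kanji"), ("カ", "katakana")]

if sizes_differ(std, threshold=3.0):
    choice = pick_candidate(24.0, candidates, std, ratio_table)
```

When the standard sizes of the character types do not differ enough, size alone carries no information, which is why the text also recommends looking at the character types on either side.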

[Effects of the Invention]

According to the invention, when the standard character size is substantially the same regardless of character type, uppercase and lowercase are discriminated using not only the size of a character but also its center coordinates. This reduces erroneous discrimination and improves discrimination accuracy.

When the standard character size differs by character type, the standard size is calculated (confirmed) for each character type. This improves the accuracy of discriminating uppercase characters, lowercase characters, and kanji-kana similar-shape characters.

[Brief Description of the Drawings]

FIG. 1 is a flowchart showing one embodiment of the invention. FIG. 2 is an explanatory diagram illustrating an example of a horizontally written character group and its center line. FIG. 3 is a flowchart showing another embodiment of the invention. FIG. 4 is an explanatory diagram illustrating how recognition results are stored. FIG. 5 is a configuration diagram showing an example of a character attribute table. FIG. 6 is an explanatory diagram illustrating examples of characters whose uppercase and lowercase forms have similar shapes. FIG. 7 is an explanatory diagram illustrating examples of the standard size for each character. FIG. 8 is an explanatory diagram illustrating examples of kanji-kana similar-shape characters.

Explanation of reference signs: L: center line; P: center position of an undetermined character; T: character attribute table.

Claims (1)

[Claims]

1) A method for discriminating uppercase and lowercase characters, characterized by: normalizing the sizes of target characters whose standard size is substantially the same regardless of character type, and recognizing the characters using the same standard pattern for uppercase and lowercase; then, for each recognized character, storing the center coordinates of its circumscribing frame and determining whether it is a character that has both an uppercase and a lowercase form; if it is, obtaining outline features comprising the character width, the character height, and the product of the character width and the character height, and comparing the outline features with respective thresholds, predetermined for each character relative to its standard character, for judging uppercase and lowercase, thereby confirming the character as uppercase or lowercase; when confirmation based on these thresholds is not possible, attaching to the character information indicating that it is undetermined; and, each time the confirmation work for one line is finished, obtaining the center line of the character line from the center coordinates of the characters in the line, including the undetermined character, and discriminating the undetermined character by comparing the difference between its center coordinate and the coordinate of the center line with a predetermined threshold.

2) A method for discriminating uppercase and lowercase characters, characterized by: normalizing the sizes of target characters whose standard size differs by character type, and recognizing the characters using the same standard pattern for uppercase and lowercase; then, for each recognized character, sequentially storing its character code and size while determining its character type from the character code, and determining, by referring to a table set in advance for each character code, whether the character has the standard size or is a character that has a lowercase form of similar shape; marking the stored character if it has a lowercase form, while tallying the actual sizes of the standard-size characters for each character type until the recognition result for one document is obtained; determining the standard size for each character type by obtaining a frequency distribution or a mean value from the tallied sizes measured for each character type; and, for each previously marked character, setting a predetermined threshold on the determined standard size for its character type and discriminating whether the character is uppercase or lowercase.

3) A method for discriminating kanji-kana similar-shape characters, characterized by: normalizing the sizes of target characters whose standard size differs by character type, and recognizing the characters using the same standard pattern for uppercase and lowercase; then, for each recognized character, sequentially storing its character code and size while determining its character type from the character code, and determining, by referring to a table set in advance for each character code, whether the character has the standard size or is a character whose shape is similar between kanji and kana (a kanji-kana similar-shape character); marking the stored character if it is a kanji-kana similar-shape character, while tallying the actual sizes of the standard-size characters for each character type until the recognition result for one document is obtained; determining the standard size for each character type by obtaining a frequency distribution or a mean value from the tallied sizes measured for each character type; and, for each previously marked character, setting a predetermined threshold on the determined standard size for its character type and discriminating whether the kanji-kana similar-shape character is kanji or kana.

4) A method for discriminating uppercase characters, lowercase characters, and kanji-kana similar-shape characters, characterized by: normalizing the sizes of target characters whose standard size differs by character type, and recognizing the characters using the same standard pattern for uppercase and lowercase; then, for each recognized character, sequentially storing its character code and size while determining its character type from the character code, and determining, by referring to a table set in advance for each character code, whether the character has the standard size, is a character that has a lowercase form of similar shape, or is a character whose shape is similar between kanji and kana (a kanji-kana similar-shape character); marking the stored character if it has a lowercase form or is a kanji-kana similar-shape character, while tallying the actual sizes of the standard-size characters for each character type until the recognition result for one document is obtained; determining the standard size for each character type by obtaining a frequency distribution or a mean value from the tallied sizes measured for each character type; and, for each previously marked character, setting a predetermined threshold on the determined standard size for its character type and discriminating whether the character is uppercase or lowercase, or whether the kanji-kana similar-shape character is kanji or kana.

5) The method for discriminating uppercase and lowercase characters according to claim 4), characterized in that, in determining whether a kanji-kana similar-shape character is kanji or kana, the combination of the preceding and following character types is also examined.

6) The method for discriminating uppercase characters, lowercase characters, and kanji-kana similar-shape characters according to claim 5), characterized in that, in determining whether a kanji-kana similar-shape character is kanji or kana, the combination of the preceding and following character types is also examined.
JP1196619A 1988-11-30 1989-07-31 How to distinguish between uppercase, lowercase and Kanji Kana-like characters Expired - Fee Related JP2930605B2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP63-300692 1988-11-30
JP30069288 1988-11-30

Publications (2)

Publication Number Publication Date
JPH02224084A (en) 1990-09-06
JP2930605B2 (en) 1999-08-03

Family

ID=17887928

Family Applications (1)

Application Number Title Priority Date Filing Date
JP1196619A Expired - Fee Related JP2930605B2 (en) 1988-11-30 1989-07-31 How to distinguish between uppercase, lowercase and Kanji Kana-like characters

Country Status (1)

Country Link
JP (1) JP2930605B2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5729630A (en) * 1990-05-14 1998-03-17 Canon Kabushiki Kaisha Image processing method and apparatus having character recognition capabilities using size or position information

Also Published As

Publication number Publication date
JP2930605B2 (en) 1999-08-03

Similar Documents

Publication Publication Date Title
US7437001B2 (en) Method and device for recognition of a handwritten pattern
JP4787275B2 (en) Segmentation-based recognition
US7349576B2 (en) Method, device and computer program for recognition of a handwritten character
US7630551B2 (en) Method and system for line extraction in digital ink
JPH02266485A (en) Information recognizing device
EP0810542A2 (en) Bitmap comparison apparatus and method
JPH02224084A (en) Discriminating method for capital letter, small letter and character with shape similar to kanji (chinese character) and kana (japanese syllabary)
US6934404B2 (en) Stamp detecting device, stamp detecting method, letter processing apparatus and letter processing method
US8045803B2 (en) Handwriting recognition system and methodology for use with a latin derived alphabet universal computer script
JP2004046723A (en) Method for recognizing character, program and apparatus used for implementing the method
JP2761679B2 (en) Online handwritten character recognition device
JP2671984B2 (en) Information recognition device
JP3911942B2 (en) Character recognition device
JP2510722B2 (en) How to distinguish uppercase and lowercase letters in English
JP2002163608A (en) Handwriting character recognizing device
JPH10162103A (en) Character recognition device
JP3266687B2 (en) Mark recognition method
JP2003030583A (en) Method and device for identifying chart classification, and method and device for identifying format classification
JP4092847B2 (en) Character recognition device and character recognition method
JPH076211A (en) On-line character recognition device
JPH0210480A (en) Character deciding method
JPH09106440A (en) Feature point detecting method for handwritten character recognition
JPH01114991A (en) Method for discriminating capital letter/small letter
JPS60186980A (en) Recognition processing system for on-line handwritten character
JPS63301383A (en) Handwritten character recognition device

Legal Events

Date Code Title Description
R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20090521

Year of fee payment: 10

LAPS Cancellation because of no payment of annual fees