JPS6378287A

JPS6378287A - Character recognizing device

Info

Publication number: JPS6378287A
Application number: JP61222023A
Authority: JP
Inventors: Hiroe Fujiwara; 藤原　啓惠
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1986-09-22
Filing date: 1986-09-22
Publication date: 1988-04-08

Abstract

PURPOSE: To discriminate small characters from ordinary characters by determining the recognition result for a recognition target character using the size ratio and positional relationship of the target character image to the surrounding character images. CONSTITUTION: A character cutting-out section 2 cuts out a recognition target character image in a rectangular form from an image input by an image input section 1. A recognition candidate extracting section 3 obtains, for instance, the direction, number, and position of the strokes of the target character image obtained by the cutting-out section 2 as the feature quantity of the target character, collates it with the feature quantity of each character registered beforehand in a dictionary, and extracts recognition candidate characters in order starting from the nearest one. Meanwhile, a character attribute determining section 4 determines the attribute of the target character based on the size ratio and positional relationship of the target character image, obtained by the cutting-out section 2, to the surrounding character images. A recognition result determining section 5 determines the recognition result from the candidate characters obtained by the extracting section 3 by using the attribute of the target character determined by the determining section 4.

Description

DETAILED DESCRIPTION OF THE INVENTION (Field of Industrial Application) The present invention relates to a character recognition device that recognizes printed and handwritten characters, such as those in newspapers and magazines, and converts them into pieces of information such as JIS codes.

(Prior Art) Conventional character recognition devices normalize a character pattern by stretching or shrinking the circumscribed rectangular area of the character part into a square of fixed size, then obtain feature quantities and perform recognition (for example, Tomoo Harada et al., "Handwritten Kanji and Hiragana Recognition Using Weighted Direction Index Histograms and the Pseudo-Bayes Discrimination Method (II)," Institute of Electronics and Communication Engineers General National Conference, Showa 59 (1984)).

In a device using this method, for example, both the input image shown in FIG. 5(a) and the input image shown in FIG. 5(b) are normalized as shown in FIG. 5(c), which has the advantage that the recognition result is not easily affected by the size of the target character.
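The size normalization described above can be illustrated with a minimal sketch. It assumes a binary image stored as a list of rows and nearest-neighbor resampling; the helper names (`bounding_box`, `normalize`, `make_square`) and the 8 x 8 target size are hypothetical and are not taken from the cited method.

```python
def bounding_box(img):
    """Return (top, left, bottom, right) of the ink pixels, inclusive."""
    rows = [r for r, row in enumerate(img) if any(row)]
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    return rows[0], cols[0], rows[-1], cols[-1]


def normalize(img, size=8):
    """Stretch the circumscribed rectangle of the ink to a size x size square.

    Nearest-neighbor sampling is an assumption made for this sketch.
    """
    top, left, bottom, right = bounding_box(img)
    h, w = bottom - top + 1, right - left + 1
    return [[img[top + (y * h) // size][left + (x * w) // size]
             for x in range(size)]
            for y in range(size)]


def make_square(canvas, side, offset):
    """Draw a filled square of the given side on an empty canvas x canvas image."""
    img = [[0] * canvas for _ in range(canvas)]
    for y in range(offset, offset + side):
        for x in range(offset, offset + side):
            img[y][x] = 1
    return img


large = make_square(16, 12, 2)   # a "normal-sized" shape
small = make_square(16, 4, 10)   # the same shape written small and low
same_after_normalization = normalize(large) == normalize(small)
```

In this toy case both shapes normalize to the same pattern, so the size and position of the original are no longer visible to the later recognition stages.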

(Problems to Be Solved by the Invention) However, in a device using the above method, an image such as that shown in FIG. 6(a) is normalized as shown in FIG. 6(b). For characters that are written small, such as characters representing geminate consonants (sokuon) and moraic nasals (hatsuon), and punctuation marks, normalization removes the size difference from normal-sized characters, which makes them difficult to distinguish.

In view of the above, an object of the present invention is to provide a character recognition device that can distinguish characters written small from normal-sized characters.

(Means for Solving the Problems) To solve the above problems, the present invention is configured to determine the recognition result for a recognition target character using the size ratio and positional relationship of the recognition target character image with respect to N character images surrounding it.

(Operation) With the above configuration, information on the size and position of a character is reflected in the recognition result, so that characters written small can be distinguished from normal-sized characters.

(Embodiment) An embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a block diagram showing an embodiment of the character recognition device of the present invention. In FIG. 1, reference numeral 1 denotes an image input section, which inputs an image containing the characters to be recognized.

Reference numeral 2 denotes a character cutting section, which cuts out recognition target character images as rectangles from the image input by the image input section 1. Reference numeral 3 denotes a recognition candidate extracting section, which obtains, for example, the direction, number, and position of the strokes of the recognition target character image obtained by the character cutting section 2 as the feature quantities of the target character, collates them with the feature quantities of each character registered in advance in a dictionary, and extracts recognition candidate characters in order starting from the closest. Reference numeral 4 denotes a character attribute determining section, which determines the attribute of the recognition target character based on the size ratio and positional relationship of the recognition target character image, obtained by the character cutting section 2, with respect to the surrounding character images. Reference numeral 5 denotes a recognition result determining section, which determines the recognition result from the recognition candidate characters obtained by the recognition candidate extracting section 3, using the attribute of the recognition target character determined by the character attribute determining section 4. Reference numeral 6 denotes a display section, which displays the recognition result obtained by the recognition result determining section 5.

The operation of the character recognition device of this embodiment, configured as described above, is explained below, taking as an example the case where a horizontally written image is input.
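As a rough illustration of the dictionary matching performed by the recognition candidate extracting section 3, the sketch below ranks dictionary entries by their distance to an input feature vector. The Euclidean distance, the three-dimensional feature vectors, the dictionary contents, and the name `extract_candidates` are all assumptions made for this example; the text only states that candidates are extracted in order of closeness.

```python
import math


def extract_candidates(feature, dictionary, top_k=3):
    """Return the top_k dictionary characters, closest feature vector first.

    Euclidean distance is an assumption; the source does not name a metric.
    """
    ranked = sorted(dictionary.items(),
                    key=lambda item: math.dist(feature, item[1]))
    return [char for char, _ in ranked[:top_k]]


# Hypothetical feature vectors; real features would encode the direction,
# number, and position of strokes.
dictionary = {
    "ア": (2.0, 1.0, 0.5),
    "ァ": (2.0, 1.0, 0.6),
    "マ": (2.0, 2.0, 0.5),
    "ヌ": (3.0, 2.0, 1.0),
}

candidates = extract_candidates((2.0, 1.0, 0.52), dictionary)
```

Because the features here are computed after size normalization, a normal-sized character and its small-written form end up with nearly identical vectors, which is why both appear near the top of the candidate list.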

The image input section 1 inputs an image such as that shown in FIG. 2(a). The character cutting section 2 cuts out the recognition target character images C_i from the image input by the image input section 1, each as a rectangle of height l_i and width w_i, as shown in FIG. 2(b).

The recognition candidate extracting section 3 examines, for each pixel of each recognition target character image, whether M or more pixels (M is set in advance), including the pixel of interest, are connected in each of the directions shown in FIG. 3(b), and assigns a direction code to the pixel of interest. It then examines the connectivity of pixels for each direction code and extracts strokes. For example, extracting the strokes of the recognition target character image C_3 in FIG. 2(b) gives the result shown in FIG. 3(a). The direction, number, position, and so on of the strokes are then obtained as feature quantities and collated with the feature quantities of each character registered in advance in the dictionary, and recognition candidate characters are extracted in order starting from the smallest distance. FIG. 4 shows the recognition candidate characters for the above recognition target character images (arranged from left to right, starting with the first candidate).

The character attribute determining section 4 compares the height l_i and width w_i of the recognition target character image C_i with the average height L and average width W of the surrounding character images C_{i-2}, C_{i-1}, C_{i+1}, C_{i+2}. When l_i < n × L and w_i < m × W (0 < n < 1, 0 < m < 1), the attribute of the recognition target character is determined as follows. If the position of C_i is lower than the positions of the surrounding character images C_{i-2}, C_{i-1}, C_{i+1}, C_{i+2}, the character is one written small in the lower position, such as a character representing a geminate consonant (sokuon) or moraic nasal (hatsuon), a punctuation mark, or "」". If the position of C_i is higher than the positions of the surrounding character images, the character is one written small in the upper position, such as "「".
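The attribute decision made by the character attribute determining section 4 can be sketched as follows. This is only an illustration under stated assumptions: the `Box` type, the attribute labels, and the use of the vertical center of each rectangle as its "position" are hypothetical, since the text does not say how positions are compared. Image coordinates are assumed to grow downward.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Box:
    """Rectangle cut out around one character (y grows downward)."""
    top: int
    left: int
    height: int
    width: int

    @property
    def center_y(self) -> float:
        return self.top + self.height / 2


def character_attribute(target, neighbors, n=2/3, m=2/3):
    """Classify target as 'small_lower', 'small_upper', or 'normal'.

    neighbors are the surrounding boxes (e.g. C_{i-2}, C_{i-1}, C_{i+1}, C_{i+2}).
    """
    avg_height = mean(b.height for b in neighbors)   # L in the text
    avg_width = mean(b.width for b in neighbors)     # W in the text
    # Size test: l_i < n * L and w_i < m * W
    if not (target.height < n * avg_height and target.width < m * avg_width):
        return "normal"
    # Position test; comparing vertical centers is an assumption of this sketch.
    ref_y = mean(b.center_y for b in neighbors)
    if target.center_y > ref_y:
        return "small_lower"
    if target.center_y < ref_y:
        return "small_upper"
    return "normal"  # equal positions are not covered by the text; fallback


neighbors = [Box(0, 0, 30, 30), Box(0, 40, 30, 30),
             Box(0, 120, 30, 30), Box(0, 160, 30, 30)]
low_small = character_attribute(Box(18, 88, 14, 14), neighbors)
high_small = character_attribute(Box(0, 88, 14, 14), neighbors)
full_size = character_attribute(Box(0, 80, 30, 30), neighbors)
```

With n = m = 2/3 and neighbors that are 30 pixels high and wide, a 14 x 14 rectangle passes the size test, and its vertical position then decides between the lower and upper small-character attributes.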

For example, with n = m = 2/3, the recognition target character image C_3 satisfies l_3 < n × L and w_3 < m × W, as shown in FIG. 2(b), and the position of C_3 is lower than the positions of C_1, C_2, C_4, and C_5, so the attribute of this recognition target character is determined to be a character written small in the lower position. The recognition result determining section 5 determines the recognition result from the recognition candidate characters extracted by the recognition candidate extracting section 3, using the attribute of the recognition target character determined by the character attribute determining section 4.

For example, for the recognition target character image C_3, the recognition result is determined by taking, from the recognition candidate characters for C_3 shown in FIG. 4, the "ア" that is written small (that is, "ァ") as the first candidate.
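The re-ranking step performed by the recognition result determining section 5 can be sketched as below. The attribute table `CHAR_ATTRIBUTE`, the attribute labels, the candidate list, and the fallback to the original first candidate are assumptions made for illustration; the actual candidates of FIG. 4 are not reproduced in the text.

```python
# Hypothetical table of characters that are written small, with the position
# in which they appear in horizontal writing. Anything not listed is "normal".
CHAR_ATTRIBUTE = {
    "ァ": "small_lower",
    "ッ": "small_lower",
    "、": "small_lower",
    "。": "small_lower",
    "」": "small_lower",
    "「": "small_upper",
}


def decide_result(candidates, attribute):
    """Return the highest-ranked candidate whose attribute matches.

    candidates is ordered from the first candidate downward. Falling back to
    the original first candidate when nothing matches is an assumption.
    """
    for char in candidates:
        if CHAR_ATTRIBUTE.get(char, "normal") == attribute:
            return char
    return candidates[0]


candidates = ["ア", "ァ", "マ"]          # hypothetical candidate list
result_small = decide_result(candidates, "small_lower")
result_normal = decide_result(candidates, "normal")
```

The feature-based ranking is left untouched; the attribute only selects among candidates that the dictionary matching alone cannot separate.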

The above description of operation dealt with a horizontally written image. When a vertically written image is input, the recognition result can be determined in the same way by judging whether the recognition target character image lies to the left or to the right of the surrounding character images.

(Effects of the Invention) As explained above, the present invention makes it possible to identify characters that are written small, such as punctuation marks and characters representing geminate consonants (sokuon) and moraic nasals (hatsuon), and thereby improves recognition accuracy.

[Brief Description of the Drawings]

FIG. 1 is a block diagram of a character recognition device according to an embodiment of the present invention; FIG. 2 is an explanatory diagram of an example of the method for determining the recognition result in the embodiment; FIG. 3 is an explanatory diagram of the character recognition method; FIG. 4 is a diagram showing recognition candidate characters for recognition target characters in the embodiment; and FIGS. 5 and 6 are explanatory diagrams of an example of the method for normalizing a recognition target character image in the prior art.

1: image input section; 2: character cutting section; 3: recognition candidate extracting section; 4: character attribute determining section; 5: recognition result determining section; 6: display section.

Patent applicant: Matsushita Electric Industrial Co., Ltd.

Claims

A character recognition device comprising: an image input section for inputting an image including a recognition target character; a character cutting section for cutting out a recognition target character image in a rectangular shape from the image input by the image input section; a recognition candidate extracting section that obtains the feature quantity of the recognition target character from the recognition target character image obtained by the character cutting section, collates it with the feature quantity of each character registered in advance in a dictionary, and extracts recognition candidate characters for the recognition target character; a character attribute determining section that determines an attribute of the recognition target character based on the size ratio and positional relationship of the recognition target character image, obtained by the character cutting section, with respect to N character images surrounding the recognition target character image; and a recognition result determining section that determines a recognition result from the recognition candidate characters extracted by the recognition candidate extracting section, using the attribute of the recognition target character determined by the character attribute determining section.