JPH10187886A

JPH10187886A - Character recognizing device and method

Info

Publication number: JPH10187886A
Application number: JP8347354A
Authority: JP
Inventors: Shunji Ariyoshi; 俊二有吉; Tsutomu Sano; 力佐野; Michiko Kojima; 美知子小島
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1996-12-26
Filing date: 1996-12-26
Publication date: 1998-07-21

Abstract

PROBLEM TO BE SOLVED: To normalize the character line width to a nearly constant value, and to recognize characters accurately even when the line width varies widely, by calculating a feature vector from contour line direction data whose line width has been normalized and recognizing the character based on that feature vector. SOLUTION: In a character recognition unit 5, the contour line direction is detected at each position of a character image to obtain contour line direction data. The contour line direction data are then divided equally into seven parts both vertically and horizontally, for example, giving an image divided into 7×7 blocks. A feature vector is calculated by counting the direction data contained in each block of the divided image. The feature vector is then compared with reference patterns (standard patterns) that have been created in advance and stored in a reference pattern file 6, and character recognition is performed. The recognition result is output and stored in a recognition result file 7.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to a character recognition device and a character recognition method that recognize characters using the distribution of the contour line directions of a character as the feature.

[0002]

[Prior art] In character recognition devices that recognize characters on an object to be read, methods that use the distribution of the contour line directions of a character as the feature have been widely used. One example is the technique disclosed in Japanese Unexamined Patent Publication (Kokai) No. H1-183793.

[0003] In brief, this conventional technique applies a 2×2 logical filter to the input character image to detect the contour line direction at each position of the character image and obtain contour line direction data. The contour line direction data are divided into seven equal parts both vertically and horizontally, giving an image divided into a plurality of blocks, and a feature vector is obtained by counting the direction data in each block. The feature vector is then compared with reference patterns created in advance, and character recognition is performed by the composite similarity method. Because this contour-direction distribution feature is relatively stable against deformation of characters, it is widely used in recent character recognition devices.

[0004]

[Problems to be solved by the invention] With conventional character recognition devices, only ballpoint pens, pencils, and mechanical pencils were used as writing instruments for handwritten characters. In recent years, however, as the range of applications of character recognition devices has widened, it has become necessary to read handwritten characters written with brushes, brush pens, Magic Ink markers, and the like.

[0005] Likewise, in the recognition of printed type, it used to be sufficient to recognize a limited number of typefaces, but in recent years it has become necessary to recognize a very wide variety of typefaces. As a result, the line widths of the characters to be read now vary widely, from thin to thick.

[0006] However, the conventional character recognition method that uses the contour line direction distribution as the feature cannot adequately cope with such large variation in line width. For example, when a character with thick lines such as the one shown in FIG. 10 is input, the positions of its contour lines are greatly displaced compared with the character of ordinary line width shown in FIG. 3, and the contour lines often move into different blocks. As a result, the values of the feature vector also change greatly, and the likelihood of an incorrect recognition result increases.

[0007] Accordingly, an object of the present invention is to provide a character recognition device and a character recognition method that, when recognizing characters using the distribution of contour line directions as the feature, can normalize the character line width to a nearly constant value by a simple method and can recognize characters accurately even when the character line width varies widely.

[0008]

[Means for solving the problems] The character recognition device of the present invention recognizes characters using the distribution of the contour line directions of a character as the feature, and comprises: contour line direction detecting means for detecting the contour line direction at each position of an input character image to obtain contour line direction data; character line width calculating means for obtaining a character line width from the input character image; character line width normalizing means for normalizing the character line width to a constant value by moving the contour line direction data obtained from the contour line direction detecting means in a predetermined direction by an amount corresponding to the character line width obtained by the character line width calculating means; feature vector calculating means for obtaining a feature vector from the contour line direction data whose character line width has been normalized by the character line width normalizing means; and recognition means for performing character recognition using the feature vector obtained by the feature vector calculating means.

[0009] The character recognition device of the present invention also comprises: input means for inputting an image of an object to be read; line extracting means for extracting images of the character lines on the object to be read from the image input by the input means; character extracting means for extracting individual character images from the character line images extracted by the line extracting means; contour line direction detecting means for applying a predetermined logical filter to each character image extracted by the character extracting means, thereby detecting the contour line direction at each position of the character image and obtaining contour line direction data; character line width calculating means for measuring the total number of black pixels and the total contour length of the character image extracted by the character extracting means and obtaining a character line width from these two measurements; character line width normalizing means for obtaining a contour line movement amount from the character line width obtained by the character line width calculating means and a predetermined normalized character line width, and normalizing the character line width to a constant value by moving the contour line direction data obtained from the contour line direction detecting means, by that movement amount, in the direction orthogonal to the direction data; feature vector calculating means for dividing the contour line direction data whose character line width has been normalized into a plurality of blocks and obtaining a feature vector by counting the direction data in each block; and recognition means for performing character recognition using the feature vector obtained by the feature vector calculating means.

[0010] Further, the character recognition method of the present invention recognizes characters using the distribution of the contour line directions of a character as the feature, and is characterized by: detecting the contour line direction at each position of an input character image to obtain contour line direction data; measuring the total number of black pixels and the total contour length of the input character image, and obtaining the character line width of the character image from these two measurements; normalizing the character line width to a constant value by moving the obtained contour line direction data, by an amount corresponding to the obtained character line width, in the direction orthogonal to the direction data; dividing the contour line direction data whose character line width has been normalized into a plurality of blocks; obtaining a feature vector by counting the direction data in each block; and performing character recognition using the obtained feature vector.

[0011] According to the present invention, the character line width is normalized by moving the contour line direction data at each position of the character image, in the direction orthogonal to the direction data, by an amount corresponding to the estimated character line width (the character line width obtained from the input character image). When characters are recognized using the distribution of contour line directions as the feature, the character line width can thus be normalized to a nearly constant value by a simple method, and accurate character recognition is possible even when the character line width varies widely.

[0012]

[Embodiments of the invention] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 schematically shows the configuration of a character recognition device according to this embodiment. The device comprises: an image input unit 1, consisting of a photoelectric converter or the like, serving as input means for inputting an image of a form P, the object to be read; an image file 2 serving as storage means for temporarily storing the image input by the image input unit 1; a line extraction unit 3 serving as line extracting means for extracting images of the character lines on the form P from the image in the image file 2; a character extraction unit 4 serving as character extracting means for extracting individual character images from the character line images extracted by the line extraction unit 3; a character recognition unit 5 that performs character recognition on the character images extracted by the character extraction unit 4; a reference pattern file 6 serving as storage means that stores the reference patterns used by the character recognition unit 5 during recognition; and a recognition result file 7 that stores the recognition results of the character recognition unit 5.

[0013] The overall operation of this configuration is briefly as follows. First, the image input unit 1 inputs an image of the form P and stores it in the image file 2. Next, the line extraction unit 3 performs predetermined image processing on the image in the image file 2 to extract the images of the character lines written on the form P. Next, the character extraction unit 4 extracts individual characters from the extracted character line images. Next, the character recognition unit 5 performs character recognition on each extracted character image using the composite similarity method and outputs the recognition result to the recognition result file 7, such as a magnetic disk.

[0014] The recognition processing in the character recognition unit 5 is described in detail below with reference to the flowchart of FIG. 2. First, when a character image such as that shown in FIG. 3 is input (S1), a 2×2 logical filter such as that shown in FIG. 4, for example, is applied to the input character image to detect the contour line direction at each position of the character image and obtain contour line direction data (S2). That is, a vertical contour line is regarded as present at any position that matches the logical filter of FIG. 4(a). Similarly, FIG. 4(b), (c), and (d) indicate contour lines rising to the right, rising to the left, and running horizontally, respectively. Applying this logical filter operation over the entire input image detects the contour line direction at each position of the character image and yields contour line direction data such as that shown in FIG. 5. In that figure, the contour line direction at each position is indicated by an arrow.
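
The filter patterns of FIG. 4 are not reproduced in this text, so the following Python sketch only illustrates the idea of step S2 under assumed patterns. It slides a 2×2 window over a binary image (1 = black) and labels a window as vertical when its two columns differ, horizontal when its two rows differ, and diagonal when exactly one pixel differs from the other three. The function name `detect_contour_directions`, the direction codes, and the patterns themselves are assumptions made for illustration.

```python
import numpy as np

# Direction codes in the order of FIG. 4(a)-(d); the numeric values are an assumption.
VERTICAL, RISING_RIGHT, RISING_LEFT, HORIZONTAL = 0, 1, 2, 3
NO_CONTOUR = -1


def detect_contour_directions(img):
    """Label every 2x2 window of a binary image with a contour direction.

    Returns an array of shape (H + 1, W + 1); entry (r, c) describes the
    window whose bottom-right pixel is image pixel (r, c). The image is
    padded with white so that strokes touching the border still get a contour.
    """
    img = (np.asarray(img) != 0).astype(np.uint8)
    p = np.pad(img, 1)                      # one-pixel white border
    tl, tr = p[:-1, :-1], p[:-1, 1:]        # top-left, top-right of each window
    bl, br = p[1:, :-1], p[1:, 1:]          # bottom-left, bottom-right
    total = tl + tr + bl + br               # number of black pixels per window

    out = np.full(tl.shape, NO_CONTOUR, dtype=np.int8)
    # Left column differs from right column: vertical edge.
    out[(tl == bl) & (tr == br) & (tl != tr)] = VERTICAL
    # Top row differs from bottom row: horizontal edge.
    out[(tl == tr) & (bl == br) & (tl != bl)] = HORIZONTAL
    # Exactly one pixel differs from the other three: diagonal edge.
    odd = (total == 1) | (total == 3)
    out[odd & (tr == bl)] = RISING_RIGHT    # odd pixel is top-left or bottom-right
    out[odd & (tl == br)] = RISING_LEFT     # odd pixel is top-right or bottom-left
    return out
```

Windows that are uniformly black or white, and checkerboard windows, are left unlabeled in this sketch.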

[0015] Next, the total number of black pixels and the total contour length of the input character image are measured (S3, S4). The average character line width is then calculated from these two measurements by the following equation (S5).

[0016]
(average character line width) = 2 × (total number of black pixels) / (total contour length)

Next, using the average character line width obtained above, the contour line movement amount is calculated by the following equation (S6). Here, the normalized character line width is a predetermined constant; the normalization process sets the character line width to this value.

[0017]
(contour line movement amount) = {(average character line width) − (normalized character line width)} / 2

In this case, the contour line movement amount is positive when the lines of the input character image are thick (wider than the normalized character line width) and negative when they are thin.
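
The two formulas of steps S5 and S6 can be sketched in Python as follows. The text does not say how the total contour length is measured, so this sketch assumes it is the number of pixel edges that separate a black pixel from a white one; the function names are likewise illustrative. The factor 2 in the first formula reflects that a long stroke of width w and length L has area wL and contour length of roughly 2L, and the division by 2 in the second reflects that each of the two edges of a stroke absorbs half of the width difference.

```python
import numpy as np


def average_stroke_width(img):
    """Estimate the average line width as 2 * (black pixels) / (contour length).

    The contour length is taken here as the count of black/white pixel
    edges, which is an assumption; the source does not define the measure.
    """
    black = np.asarray(img) != 0
    p = np.pad(black, 1)                                  # white border
    horizontal_edges = int((p[:, 1:] != p[:, :-1]).sum())
    vertical_edges = int((p[1:, :] != p[:-1, :]).sum())
    contour_length = horizontal_edges + vertical_edges
    if contour_length == 0:                               # blank image
        return 0.0
    return 2.0 * int(black.sum()) / contour_length


def contour_shift_amount(average_width, normalized_width):
    """Positive for strokes thicker than the target width, negative for thinner."""
    return (average_width - normalized_width) / 2.0
```

For a horizontal bar 3 pixels thick and 100 pixels long, this estimate comes out slightly below 3 because the two short ends of the bar also count toward the contour length.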

[0018] Next, the contour line direction data at each position of the character image obtained in step S2 are moved by the contour line movement amount obtained in step S6 (S7). Each item of direction data is moved in the direction orthogonal to that direction data, that is, in the direction of the white arrows in FIG. 6. The directions shown in FIG. 6 apply when the contour line movement amount is positive; when it is negative, the data are moved in the opposite direction (rotated 180 degrees).
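
As a minimal sketch of step S7, the code below moves each contour point along a normal that is orthogonal to its contour direction. FIG. 6 is not reproduced in this text, so the sketch assumes that every contour point carries a unit normal pointing into the stroke, so that a positive movement amount thins the stroke and a negative one thickens it. The `ContourPoint` type and `shift_contour` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ContourPoint:
    x: float
    y: float
    direction: int   # contour direction code, unchanged by the move
    nx: float        # unit normal pointing into the stroke (assumed to be known)
    ny: float


def shift_contour(points: List[ContourPoint], shift: float) -> List[ContourPoint]:
    """Move every contour point by `shift` along its inward normal.

    shift > 0 moves edges toward the stroke centre (thinning);
    shift < 0 moves them the opposite way (thickening).
    """
    return [
        ContourPoint(p.x + shift * p.nx, p.y + shift * p.ny, p.direction, p.nx, p.ny)
        for p in points
    ]
```

For example, a vertical stroke whose edges lie at x = 2 and x = 8 (width 6) and a normalized width of 2 give a movement amount of (6 − 2) / 2 = 2, which moves the edges to x = 4 and x = 6.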

[0019] Moving each item of contour line direction data in this way converts the contour line direction data from the state of FIG. 7(a) to that of FIG. 7(b). As FIG. 7 makes clear, the thick character lines of the input character image are thinned, and the data are converted into contour line direction data having the predetermined character line width (the normalized character line width). When the lines of the input character image are thin, the same processing yields contour line direction data thickened to the constant character line width.

[0020] Next, the contour line direction data of FIG. 7(b) obtained in step S7 are divided into seven equal parts both vertically and horizontally, giving an image divided into 7×7 blocks as shown in FIG. 8 (S8).

[0021] Next, as shown in FIG. 9, the counts of each direction contained in each block are arranged to form the feature vector (S9). That is, the feature vector is calculated by counting the items of direction data of each direction contained in each block of the image of FIG. 8 obtained in step S8. In this example there are four directions and 7×7 blocks, so the feature vector has 4 × 7 × 7 = 196 dimensions.
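
Steps S8 and S9 can be sketched as a block-wise histogram. The sketch below assumes the contour direction data are given as `(x, y, direction)` triples with direction codes 0 to 3, and that points pushed outside the image by the line width normalization are clamped into the nearest border block; both choices, and the function name, are assumptions.

```python
import numpy as np


def block_direction_histogram(points, width, height, grid=7, n_dirs=4):
    """Count direction data per block and flatten to a feature vector.

    points: iterable of (x, y, direction) with direction in range(n_dirs).
    Returns a vector of length grid * grid * n_dirs (196 for the defaults),
    ordered by block row, then block column, then direction.
    """
    counts = np.zeros((grid, grid, n_dirs), dtype=np.int32)
    for x, y, direction in points:
        col = min(max(int(x * grid / width), 0), grid - 1)
        row = min(max(int(y * grid / height), 0), grid - 1)
        counts[row, col, direction] += 1
    return counts.ravel()
```

With a 70×70 character image each block covers 10×10 pixels, so a point at (35, 35) falls into block row 3, column 3.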

[0022] Next, character recognition is performed by comparing the feature vector obtained in step S9 with the reference patterns (standard patterns) that have been created in advance and stored in the reference pattern file 6 (S10). The composite similarity method, for example, may be used as the recognition method. The reference patterns used here are themselves created from character images whose character line width has been normalized by the same method.
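
The text does not give the formula of the composite similarity method. The following sketch assumes it refers to the subspace-based measure often rendered in English as the multiple similarity method, in which each character class is represented by a few orthonormal reference vectors with associated eigenvalues and the similarity is a weighted sum of squared projections. The function names and the layout of the `references` dictionary are hypothetical.

```python
import numpy as np


def multiple_similarity(x, eigvecs, eigvals):
    """Weighted squared projection of x onto a class subspace.

    eigvecs: array of shape (k, d) with orthonormal rows (reference vectors).
    eigvals: array of shape (k,), positive, sorted in descending order.
    Returns sum_i (eigvals[i] / eigvals[0]) * (eigvecs[i] . x)^2 / |x|^2.
    """
    x = np.asarray(x, dtype=float)
    eigvecs = np.asarray(eigvecs, dtype=float)
    eigvals = np.asarray(eigvals, dtype=float)
    norm_sq = float(x @ x)
    if norm_sq == 0.0:                      # empty feature vector
        return 0.0
    projections = eigvecs @ x
    weights = eigvals / eigvals[0]
    return float((weights * projections ** 2).sum() / norm_sq)


def classify(x, references):
    """Return the label whose reference pattern gives the highest similarity.

    references: dict mapping label -> (eigvecs, eigvals).
    """
    return max(references, key=lambda label: multiple_similarity(x, *references[label]))
```

In this formulation the reference vectors of each class would be derived from feature vectors of line-width-normalized training images, in line with the remark above that the reference patterns are built with the same normalization.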

[0023] Next, the recognition result obtained in step S10 is output and stored in the recognition result file 7 (S11). Thus, according to the above embodiment, the character line width is normalized by moving the contour line direction data at each position of the character image, in the direction orthogonal to the direction data, by an amount corresponding to the estimated character line width (the character line width obtained from the input character image). When characters are recognized using the distribution of contour line directions as the feature, the character line width can be normalized to a nearly constant value by a simple method, and accurate character recognition is possible even when the character line width varies widely, for example when a character image with thick lines such as that shown in FIG. 10 is input.

[0024] The present invention is not limited to the above embodiment and can be modified in various ways without departing from the gist of the invention. For example, the above embodiment uses the overall average character line width as the character line width of the input character image. However, the lines that make up a character are often not uniform in thickness. In such cases, normalizing the line width with the overall average character line width may distort the normalized character image.

[0025] In such cases, a local character line width may be obtained at each position of the character image, and the contour line movement amount may be calculated locally from it. As shown in FIG. 11, the local character line width is obtained by measuring the distance from a given contour position to the contour position lying opposite it. However, as shown in FIG. 12, when that distance differs greatly from the global average character line width, it is judged not to correspond to a true character line width, and the contour line direction data are not moved.
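
The local variant can be sketched as follows. The local width is measured by stepping from a contour pixel across the stroke until a white pixel is reached, and the measurement is ignored when it differs too much from the global average. The text gives no threshold for "differs greatly", so the `max_ratio` parameter (a factor of 2 in either direction) is an assumption, as are the function names.

```python
import numpy as np


def local_stroke_width(img, x, y, dx, dy):
    """Count consecutive black pixels from (x, y), stepping by (dx, dy).

    (x, y) should be a black pixel on the contour and (dx, dy) a step
    pointing into the stroke, orthogonal to the contour direction.
    """
    height, width = img.shape
    count = 0
    while 0 <= x < width and 0 <= y < height and img[y, x]:
        count += 1
        x += dx
        y += dy
    return count


def local_shift(local_width, global_width, normalized_width, max_ratio=2.0):
    """Movement amount from the local width, or 0.0 if the width is implausible."""
    if local_width <= 0 or global_width <= 0:
        return 0.0
    ratio = local_width / global_width
    if ratio > max_ratio or ratio < 1.0 / max_ratio:
        return 0.0                          # not a true line width: do not move
    return (local_width - normalized_width) / 2.0
```

A measurement taken along the length of a stroke rather than across it, for instance, yields a distance far larger than the global average and is rejected by the ratio check.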

[0026] Also, if the normalized character line width is set to "1", for example, the normalization process yields a thinned image, as shown in FIG. 13(a) and FIG. 13(b). Such a thinned image can be used for various purposes, such as detecting the end points and intersections of character lines.

[0027]

[Effects of the invention] As described in detail above, the present invention provides a character recognition device and a character recognition method that, when recognizing characters using the distribution of contour line directions as the feature, can normalize the character line width to a nearly constant value by a simple method and can recognize characters accurately even when the character line width varies widely.

[Brief description of the drawings]

FIG. 1 is a block diagram schematically showing the configuration of a character recognition device according to an embodiment of the present invention.

FIG. 2 is a flowchart explaining the character recognition process.

FIG. 3 is a diagram showing an example of an input character image.

FIG. 4 is a diagram showing the logical filters used to detect the contour line direction of a character.

FIG. 5 is a diagram showing the distribution of the contour line directions of a character.

FIG. 6 is a diagram showing the direction in which each item of contour line direction data is moved.

FIG. 7 is a diagram showing that the character line width is normalized by moving the contour line direction data.

FIG. 8 is a diagram showing an image in which the contour line direction data of a character image are divided into a plurality of blocks.

FIG. 9 is a diagram explaining an example of the feature vector calculation.

FIG. 10 is a diagram explaining the problem that arises when a character image with thick lines is input.

FIG. 11 is a diagram explaining the method of calculating the local character line width.

FIG. 12 is a diagram explaining a correspondence that is not regarded as a local character line width.

FIG. 13 is a diagram explaining that a thinned image can also be obtained by the normalization process.

[Explanation of symbols]

1 ... Image input unit (image input means)
2 ... Image file (storage means)
3 ... Line extraction unit (line extracting means)
4 ... Character extraction unit (character extracting means)
5 ... Character recognition unit
6 ... Reference pattern file (storage means)
7 ... Recognition result file (storage means)

Claims

[Claims]

1. A character recognition device for recognizing characters using the distribution of the contour line directions of a character as the feature, comprising: contour line direction detecting means for detecting the contour line direction at each position of an input character image to obtain contour line direction data; character line width calculating means for obtaining a character line width from the input character image; character line width normalizing means for normalizing the character line width to a constant value by moving the contour line direction data obtained from the contour line direction detecting means in a predetermined direction by an amount corresponding to the character line width obtained by the character line width calculating means; feature vector calculating means for obtaining a feature vector from the contour line direction data whose character line width has been normalized by the character line width normalizing means; and recognition means for performing character recognition using the feature vector obtained by the feature vector calculating means.

2. The character recognition device wherein the contour line direction detecting means detects the contour line direction at each position of the character image by applying a 2 × 2 logical filter to the input character image.

3. The character recognition device according to claim 1, wherein the character line width calculating means measures the total number of black pixels and the total contour length of the character image, and obtains the character line width of the character image from the measured total number of black pixels and total contour length.

4. The character recognition device according to claim 1 or 3, wherein the character line width calculating means obtains an average character line width for the character image as a whole.

5. The character recognition device according to claim 1, wherein the character line width calculating means obtains a local character line width at each position of the character image.

6. The character recognition device according to claim 3 or another preceding claim, wherein the character line width calculating means obtains both a global character line width of the character image and a local character line width at each position of the character image.

7. The character recognition device according to claim 1, wherein the character line width normalizing means comprises: contour line movement amount calculating means for obtaining a contour line movement amount from the character line width obtained by the character line width calculating means and a predetermined normalized character line width; and contour line moving means for moving the contour line direction data obtained from the contour line direction detecting means, by the contour line movement amount obtained by the contour line movement amount calculating means, in the direction orthogonal to the direction data.

8. The character recognition device according to claim 1, wherein the feature vector calculating means divides the contour line direction data whose character line width has been normalized by the character line width normalizing means into a plurality of blocks, and obtains the feature vector by counting the direction data in each of the divided blocks.

9. The character recognition device according to claim 1, wherein the recognition means compares the feature vector obtained by the feature vector calculating means with reference patterns created in advance and performs character recognition by the composite similarity method.

10. A character recognition device comprising: input means for inputting an image of an object to be read; line extracting means for extracting images of the character lines on the object to be read from the image input by the input means; character extracting means for extracting individual character images from the character line images extracted by the line extracting means; contour line direction detecting means for applying a predetermined logical filter to each character image extracted by the character extracting means, thereby detecting the contour line direction at each position of the character image and obtaining contour line direction data; character line width calculating means for measuring the total number of black pixels and the total contour length of the character image extracted by the character extracting means and obtaining a character line width from these two measurements; character line width normalizing means for obtaining a contour line movement amount from the character line width obtained by the character line width calculating means and a predetermined normalized character line width, and normalizing the character line width to a constant value by moving the contour line direction data obtained from the contour line direction detecting means, by that movement amount, in the direction orthogonal to the direction data; feature vector calculating means for dividing the contour line direction data whose character line width has been normalized by the character line width normalizing means into a plurality of blocks and obtaining a feature vector by counting the direction data in each of the divided blocks; and recognition means for performing character recognition using the feature vector obtained by the feature vector calculating means.

11. A character recognition method for recognizing characters using the distribution of the contour line directions of a character as the feature, characterized by: detecting the contour line direction at each position of an input character image to obtain contour line direction data; measuring the total number of black pixels and the total contour length of the input character image, and obtaining the character line width of the character image from the measured total number of black pixels and total contour length; normalizing the character line width to a constant value by moving the obtained contour line direction data, by an amount corresponding to the obtained character line width, in the direction orthogonal to the direction data; dividing the contour line direction data whose character line width has been normalized into a plurality of blocks; obtaining a feature vector by counting the direction data in each of the divided blocks; and performing character recognition using the obtained feature vector.