JPH01201786A

JPH01201786A - Character reader

Info

Publication number: JPH01201786A
Application number: JP63026781A
Authority: JP
Inventors: Hiroshi Sasaki; 宏佐々木; Masato Suda; 正人須田
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1988-02-08
Filing date: 1988-02-08
Publication date: 1989-08-14

Abstract

PURPOSE: To detect and cut out individual characters precisely and submit them to recognition processing, by obtaining the character break positions and break directions from projections in a plurality of directions. CONSTITUTION: A character detection and extraction unit 2 detects and cuts out individual character images from a character string image captured by a photoelectric conversion unit 1, in accordance with detection and extraction information, and submits them to character recognition processing by a character recognition unit 3. The detection and extraction information is produced by a line detection and extraction unit 4, a projection generation unit 5, and a minimum projection detection unit 6, which determine the break positions and break directions of the characters from projections in a plurality of directions and thereby detect the boundaries between the characters.

Description

DETAILED DESCRIPTION OF THE INVENTION [Object of the Invention] (Industrial Field of Application) The present invention relates to a character reading device that can accurately recognize each character in a character string image read by a reading means.

(Prior Art) Characters in a character string image captured by a photoelectric conversion unit serving as the reading means are basically read and recognized as follows. Each character image is detected and cut out from the character string image. The character pattern in that character image is matched against the standard patterns of the target characters registered in advance in a dictionary, for example by computing their similarity. Recognition candidates are obtained according to the matching result, and the character represented by the character image is then identified.

However, in a character reading device whose input is handwritten characters, the characters in a handwritten string are uneven in size and the spacing between them varies. For these reasons, the so-called reading detection processing that cuts individual character images out of the character string image is very difficult. Various studies have therefore been made on how to detect and cut out character images with high accuracy.

The detection and extraction processing described above is normally performed on the premise that each character fits within its own rectangular bounding box. Specifically, a projection of the character string image is taken in the direction perpendicular to the character string direction, and the character break positions along the character string direction are detected from this projection pattern. However, in handwritten character strings, particularly horizontally written Japanese text, the characters themselves are often written with a slant, as in so-called idiosyncratic handwriting (kuse-ji).
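The conventional approach described above can be sketched as follows. This is a minimal illustration, assuming a binary image stored as a list of rows with 1 for ink and 0 for background; the function names `column_projection` and `gap_columns` are hypothetical and do not come from the patent.

```python
def column_projection(image):
    """Count ink pixels (1s) in each column, i.e. project the image
    perpendicular to a horizontal character string."""
    width = len(image[0])
    return [sum(row[x] for row in image) for x in range(width)]


def gap_columns(projection, threshold=0):
    """Return the columns whose projection is at or below the threshold.
    These are the candidate character break positions."""
    return [x for x, value in enumerate(projection) if value <= threshold]


# Two upright strokes separated by one empty column.
image = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 0, 1, 1],
]
projection = column_projection(image)
print(projection)               # [3, 2, 0, 2, 3]
print(gap_columns(projection))  # [2]
```

With upright characters the empty column shows up as a zero in the projection. When the characters lean, neighbouring strokes overlap in this perpendicular projection and the zero disappears, which is the difficulty the text goes on to describe.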

Moreover, the direction of the slant is often not constant within a single character string. Combined with the uneven character spacing mentioned above, this makes the processing for detecting and cutting out the character images of a string needlessly complicated and requires a large amount of processing time. Furthermore, because the individual character images cannot be detected and cut out with high accuracy, it is difficult to expect any improvement in reading accuracy.

(Problem to be Solved by the Invention) As described above, in the conventional art, when for example the characters of a handwritten string are not uniformly oriented, it is difficult to detect and cut out each character image from the character string image captured by the photoelectric conversion unit. As a result, not only can no improvement in detection and extraction accuracy be expected, but no improvement in character reading accuracy can be expected either.

The present invention has been made in view of these circumstances. Its object is to provide a character reading device that can detect and cut out the character image of each character in a string with high accuracy, even when the characters in the string are not uniformly oriented (for example, when italic characters are included), and can thereby improve character reading accuracy.

[Structure of the Invention] (Means for Solving the Problems) The character reading device according to the present invention obtains, for a character string image forming a character string extracted from an image read by a reading means, projections in a plurality of directions intersecting the character string direction. It compares these projections with one another to find, at each position along the character string direction, the minimum projection value among them. From the minimum projection pattern, which consists of these minimum projection values and the projection directions that yielded them, it determines the break position of each character in the character string image, for example by binarizing the minimum projection pattern with a predetermined threshold. At the same time it determines the direction of each character break from the direction of the projection at that position, and it performs detection and extraction according to this information.

(Operation) According to the present invention, even when reading a character string made up of idiosyncratic handwritten characters with irregular slants, the position and the direction of each character break are obtained from projections in a plurality of directions, and the boundaries between characters are detected. The individual character images can therefore be detected and cut out with high accuracy according to this boundary information and submitted to recognition processing. As a result, characters can be read with high accuracy.

(Embodiment) An embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a schematic block diagram of the main part of a character reading device according to an embodiment of the present invention, and FIG. 8 shows the flow of its overall processing procedure.

The photoelectric conversion unit 1, which is the reading means, consists of a TV camera, a semiconductor image sensor, or the like. It captures the character string information written on a form as an image and digitizes the character string image (process A in FIG. 8). The digital image is temporarily stored in a frame memory and is then used for reading the character information as described below.

The character detection and extraction unit 2 detects and cuts out individual character images from the captured character string image in accordance with the detection and extraction information described later, and submits them to character recognition processing by the character recognition unit 3.

The detection and extraction information used in this character detection and extraction processing is obtained as follows.

The line detection and extraction unit 4 takes projections of the input image in the vertical and horizontal directions and detects the regions of the character string images (processes B and C in FIG. 8). The image of each character string image region found by the line detection and extraction unit 4 is then submitted to character detection and extraction processing.
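One way to picture this line detection step is the sketch below. It assumes a binary page image (list of rows, 1 for ink) and treats every run of non-empty rows as one text line; the patent does not spell out the exact rule, so the helper names and the "non-zero run" criterion are illustrative assumptions.

```python
def row_projection(image):
    """Count ink pixels in each row."""
    return [sum(row) for row in image]


def nonzero_runs(profile):
    """Return half-open (start, end) index ranges where the profile is > 0."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i
        elif value == 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def text_line_regions(image):
    """Return (top, bottom, left, right) for each detected text line,
    using the row projection first and the column projection second."""
    regions = []
    width = len(image[0])
    for top, bottom in nonzero_runs(row_projection(image)):
        band = image[top:bottom]
        columns = [sum(row[x] for row in band) for x in range(width)]
        runs = nonzero_runs(columns)
        regions.append((top, bottom, runs[0][0], runs[-1][1]))
    return regions


page = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
print(text_line_regions(page))  # [(1, 3, 1, 3), (4, 5, 2, 4)]
```

Each tuple bounds one character string image region, which would then be handed on to the per-character processing.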

The projection generation unit 5 obtains, for a character string image found by the line detection and extraction unit 4, projections in a plurality of directions intersecting the character string direction. Specifically, for a character string image cut out with a certain line width as shown in FIG. 2, projections in a plurality of preset directions I, II, ..., V as shown in FIG. 3 are obtained as shown in FIG. 4 (processes D1, D2, ..., Dn in FIG. 8). Here, one of the projection directions, III, is set orthogonal to the character string direction. Directions I and II are set as directions for detecting the breaks between so-called forward-slanted characters (left-italic characters) in a horizontally written string, and directions IV and V as directions for detecting the breaks between so-called backward-slanted characters (right-italic characters).
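A projection along a slanted direction can be sketched by summing pixels along a sheared line instead of a vertical column. In the sketch below the direction is expressed as a hypothetical `slope` parameter (horizontal shift in pixels per row, measured upward from the bottom row, positive meaning the line leans right as it rises). The patent defines its directions I to V only through FIG. 3, so this parameterization and the rounding rule are assumptions.

```python
import math


def slanted_projection(image, slope):
    """Sum ink pixels along lines that cross the bottom row at x and
    shift by `slope` pixels per row going upward. slope = 0 gives the
    ordinary projection orthogonal to a horizontal string."""
    height, width = len(image), len(image[0])
    profile = []
    for x in range(width):
        total = 0
        for y in range(height):
            # Rows are numbered from the top, so height - 1 - y is the
            # distance of row y above the bottom row.
            col = x + math.floor(slope * (height - 1 - y) + 0.5)
            if 0 <= col < width:
                total += image[y][col]
        profile.append(total)
    return profile


# Two parallel strokes that both lean to the right.
image = [
    [0, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [1, 0, 1, 0, 0],
]
print(slanted_projection(image, 0))   # [1, 1, 2, 1, 1]
print(slanted_projection(image, 1))   # [3, 0, 3, 0, 0]
print(slanted_projection(image, -1))  # [1, 0, 2, 0, 2]
```

The orthogonal projection (slope 0) has no empty position between the two strokes, while the projection that follows their lean (slope 1) separates them cleanly at position 1.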

The minimum projection detection unit 6 compares the projections obtained for the respective directions with one another at each position along the character string direction (process E in FIG. 8). It obtains the smallest of the projection values, together with information on the projection direction that yielded it, as the minimum projection pattern shown in FIG. 5 (processes E1 and E2 in FIG. 8). That is, as shown in FIG. 5, it obtains the projection pattern α, formed by extracting the minimum projection value at each position along the character string direction, and the projection direction β that yielded the minimum projection value at each of those positions. The minimum projection pattern of FIG. 5, obtained from the multi-directional projections of FIG. 4, shows that the projection component in region a was obtained from direction [illegible] and the projection component in region b from direction [illegible]. This processing is repeated for every bit along the character line direction (process F in FIG. 8).
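The construction of the minimum projection pattern can be sketched as a position-by-position minimum over the per-direction projections. The sketch assumes the projections are already available as equal-length lists keyed by a direction label; the labels and the tie-breaking rule (the first direction listed wins) are assumptions, since the text does not say how ties are resolved.

```python
def minimum_projection(profiles):
    """Return (alpha, beta): alpha[x] is the smallest projection value at
    position x over all directions, beta[x] is the direction giving it."""
    names = list(profiles)
    length = len(profiles[names[0]])
    alpha, beta = [], []
    for x in range(length):
        # min() keeps the first direction on ties (assumed rule).
        best = min(names, key=lambda name: profiles[name][x])
        alpha.append(profiles[best][x])
        beta.append(best)
    return alpha, beta


# Hypothetical projections of the same string image in three directions.
profiles = {
    "upright":    [1, 1, 2, 1, 1],
    "lean_right": [3, 0, 3, 0, 0],
    "lean_left":  [1, 0, 2, 0, 2],
}
alpha, beta = minimum_projection(profiles)
print(alpha)  # [1, 0, 2, 0, 0]
print(beta)   # ['upright', 'lean_right', 'upright', 'lean_right', 'lean_right']
```

Here `alpha` plays the role of the pattern α and `beta` the role of the direction record β described in the text.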

The minimum projection detection unit 6 binarizes the minimum projection pattern α obtained in this way with a predetermined threshold (process G in FIG. 8) and estimates the character pitch (process H in FIG. 8).
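The binarization and pitch estimation steps might look like the sketch below. The text does not say how the pitch is estimated, so taking the median distance between the centres of neighbouring gaps is purely an assumption made for illustration, as are the function names and the threshold value.

```python
from statistics import median


def binarize(alpha, threshold):
    """1 where the minimum projection exceeds the threshold, else 0."""
    return [1 if value > threshold else 0 for value in alpha]


def gap_runs(binary):
    """Return half-open (start, end) ranges of consecutive zeros."""
    runs, start = [], None
    for x, value in enumerate(binary):
        if value == 0 and start is None:
            start = x
        elif value == 1 and start is not None:
            runs.append((start, x))
            start = None
    if start is not None:
        runs.append((start, len(binary)))
    return runs


def estimate_pitch(binary):
    """Median spacing between gap centres (assumed estimator)."""
    centres = [(start + end - 1) / 2 for start, end in gap_runs(binary)]
    spacings = [b - a for a, b in zip(centres, centres[1:])]
    return median(spacings) if spacings else None


alpha = [0, 4, 5, 3, 0, 0, 6, 4, 5, 1, 0, 3, 4, 6, 0]
binary = binarize(alpha, threshold=1)
print(binary)                  # [0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0]
print(gap_runs(binary))        # [(0, 1), (4, 6), (9, 11), (14, 15)]
print(estimate_pitch(binary))  # 4.5
```

An estimated pitch of this kind could be used to reject spurious gaps inside a character or to split a merged pair, although the source does not describe how the pitch is applied.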

Then, as shown in FIG. 6, it obtains the character break positions (process I in FIG. 8) and, at the same time, obtains the direction of the character break at each break position from the projection direction β that yielded the minimum projection value of the minimum projection pattern (process J in FIG. 8). This character break detection processing is repeated for all characters (process K in FIG. 8).
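The pairing of break positions with break directions can be sketched as follows. The sketch assumes α and β are given as parallel lists and takes the centre of each below-threshold run as the break position; choosing the centre, and reading β at that single position, are assumptions rather than details stated in the text.

```python
def find_breaks(alpha, beta, threshold=0):
    """Return a list of (position, direction) pairs, one per run of
    positions where alpha is at or below the threshold."""
    breaks, start = [], None
    # A sentinel above the threshold closes a gap that reaches the end.
    for x, value in enumerate(list(alpha) + [threshold + 1]):
        if value <= threshold and start is None:
            start = x
        elif value > threshold and start is not None:
            centre = (start + x - 1) // 2
            breaks.append((centre, beta[centre]))
            start = None
    return breaks


alpha = [3, 2, 0, 0, 0, 4, 5, 0, 3]
beta = ["III", "III", "III", "III", "IV", "IV", "II", "II", "III"]
print(find_breaks(alpha, beta))  # [(3, 'III'), (7, 'II')]
```

Each pair corresponds to one entry of the detection and extraction information: where to cut, and along which direction.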

The information on the character break positions detected as shown in FIG. 6 and the information on the break directions are supplied to the character detection and extraction unit 2 as the detection and extraction information mentioned above, together with the detection information for the character string region.

In accordance with this detection and extraction information, when an input character string image such as that of FIG. 2 is given, the character detection and extraction unit 2 performs detection and extraction at each character break position along the break direction found there, as shown in FIG. 7, and thereby detects and cuts out each character image.
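Cutting along slanted break lines can be sketched by assigning every ink pixel to the segment between the break lines on either side of it. The sketch assumes each break is given as a hypothetical `(x, slope)` pair, where `x` is where the line crosses the bottom row and `slope` is its horizontal shift per row going upward; this representation is an assumption, since the patent only shows the cut lines in FIG. 7.

```python
import math


def cut_characters(image, breaks):
    """Split a binary string image into len(breaks) + 1 character images
    of the same size, cutting along the slanted break lines."""
    height, width = len(image), len(image[0])
    pieces = [[[0] * width for _ in range(height)]
              for _ in range(len(breaks) + 1)]
    for y in range(height):
        # Column at which each break line crosses this row.
        cuts = sorted(x + math.floor(slope * (height - 1 - y) + 0.5)
                      for x, slope in breaks)
        for col in range(width):
            if image[y][col]:
                # Number of break lines strictly to the left of the pixel.
                index = sum(1 for cut in cuts if cut < col)
                pieces[index][y][col] = 1
    return pieces


# Two right-leaning strokes; one break line with the same lean between them.
image = [
    [0, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [1, 0, 1, 0, 0],
]
first, second = cut_characters(image, [(1, 1)])
print(first)   # [[0, 0, 1, 0, 0], [0, 1, 0, 0, 0], [1, 0, 0, 0, 0]]
print(second)  # [[0, 0, 0, 0, 1], [0, 0, 0, 1, 0], [0, 0, 1, 0, 0]]
```

A vertical cut at any single column would either clip one stroke or include part of the other, whereas the slanted cut keeps each stroke whole.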

Each detected and extracted character image is subjected, as appropriate, to normalization of the character orientation according to information such as the direction of extraction, and is then submitted to character recognition processing (processes L and M in FIG. 8). The character recognition processing computes the similarity between the extracted character image and the dictionary patterns as described above and, for example, takes the dictionary pattern with the maximum similarity as the recognition result (process N in FIG. 8).
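The maximum-similarity matching step can be illustrated with a small sketch. The text does not define the similarity measure, so cosine similarity between flattened binary images is used here as a stand-in; the dictionary contents and function names are likewise hypothetical.

```python
import math


def similarity(a, b):
    """Cosine similarity between two equally sized images (assumed measure)."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    dot = sum(x * y for x, y in zip(flat_a, flat_b))
    norm = (math.sqrt(sum(x * x for x in flat_a))
            * math.sqrt(sum(y * y for y in flat_b)))
    return dot / norm if norm else 0.0


def recognize(char_image, dictionary):
    """Return the label of the dictionary pattern with maximum similarity."""
    return max(dictionary,
               key=lambda label: similarity(char_image, dictionary[label]))


# Hypothetical 3x3 dictionary patterns.
dictionary = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}
sample = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]
print(recognize(sample, dictionary))  # I
```

In practice the extracted image would first be normalized in size and orientation so that it is comparable with the dictionary patterns, as the text notes.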

The recognition result obtained in this way is output as the character reading result for the input image (process O in FIG. 8).

Thus, with the present device provided with this character detection and extraction function, even when the character string to be read includes idiosyncratic handwritten characters with irregular orientations, each character can be detected and cut out with high accuracy without defects such as losing part of a character pattern or picking up part of an adjacent character's pattern. Each character can therefore be read, recognized, and input with high accuracy. As a result, erroneous detection and extraction of character images is greatly reduced and reading accuracy is improved.

The present invention is not limited to the embodiment described above. Although projections were obtained here for five directions, the directions and their number may be chosen according to the nature of the characters to be read. Likewise, although the detection of character images was described here for a horizontally written character string, the invention applies equally to vertically written character strings. The present invention can also be carried out with various other modifications without departing from its gist.

[Brief explanation of the drawing]

The drawings show an embodiment of the present invention. FIG. 1 is a schematic block diagram of the main part of the character reading device according to the embodiment. FIG. 2 shows an example of a character string image obtained by line detection. FIG. 3 shows an example of the directions in which projections of the character string image are taken. FIG. 4 shows the projections of the character string image in the plurality of directions. FIG. 5 shows an example of the minimum projection pattern obtained from the projections of FIG. 4. FIG. 6 shows the character break positions and break directions obtained from the minimum projection pattern. FIG. 7 schematically shows the character detection and extraction processing performed according to the detection and extraction information of FIG. 6. FIG. 8 shows an example of the processing procedure in the device of the embodiment.

1: photoelectric conversion unit; 2: character detection and extraction unit; 3: character recognition unit; 4: line detection and extraction unit; 5: projection generation unit; 6: minimum projection detection unit.

Applicant's agent: Patent attorney Takehiko Suzue

Claims

[Claims]

A character reading device comprising: a character string detection unit that detects a character string image region forming a character string from an image read by a reading means; means for obtaining projections of the character string image found by the character string detection unit in a plurality of directions intersecting the character string direction; means for obtaining, at each position along the character string direction, the minimum projection value among the projections obtained for the plurality of directions; and means for detecting and cutting out individual character images from the character string image by determining the positions and directions of the character breaks from a minimum projection pattern represented by these minimum projection values and the directions of the projections that yielded them.