JPH01201786A - Character reader - Google Patents

Character reader

Info

Publication number
JPH01201786A
JPH01201786A JP63026781A JP2678188A JPH01201786A JP H01201786 A JPH01201786 A JP H01201786A JP 63026781 A JP63026781 A JP 63026781A JP 2678188 A JP2678188 A JP 2678188A JP H01201786 A JPH01201786 A JP H01201786A
Authority
JP
Japan
Prior art keywords
character
character string
projection
detection
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP63026781A
Other languages
Japanese (ja)
Inventor
Hiroshi Sasaki
宏 佐々木
Masato Suda
正人 須田
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Toshiba Corp
Original Assignee
Toshiba Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Toshiba Corp
Priority to JP63026781A
Publication of JPH01201786A


Abstract

PURPOSE: To detect and segment each character accurately for recognition processing by obtaining the positions and directions of the character breaks from projections in a plurality of directions.
CONSTITUTION: A character detection and segmentation unit 2 detects and segments individual character images from the character string image captured and input by a photoelectric conversion unit 1, in accordance with detection and segmentation information, and supplies them to a character recognition unit 3 for character recognition processing. The detection and segmentation information is obtained by a line detection and segmentation unit 4, a projection creation unit 5, and a minimum projection detection unit 6, which determine the positions and directions of the character breaks from projections in a plurality of directions and thereby detect the break of each character.

Description

DETAILED DESCRIPTION OF THE INVENTION [Object of the Invention] (Industrial Field of Application) The present invention relates to a character reading device capable of accurately recognizing each character in a character string image read by a reading means.

(Prior Art) Reading and recognizing the characters in a character string image captured and input by a photoelectric conversion unit, which serves as the reading means, basically proceeds as follows. Each character image is detected and segmented from the character string image; the character pattern in that image is collated with the standard patterns of the target characters registered in advance in a dictionary, for example by calculating their degree of similarity; recognition candidates are obtained according to the collation result; and the character represented by the character image is then identified.

However, in a character reading device that takes handwritten characters as its input, the characters in a handwritten character string vary in size and the spacing between them is irregular. For these reasons, the so-called reading detection processing that segments the individual character images from the character string image is very difficult. Various studies have therefore been made on how to detect and segment the character images with high accuracy.

The detection and segmentation processing described above is normally performed on the premise that each character fits within a rectangular circumscribing frame. Specifically, a projection of the character string image is taken in the direction perpendicular to the character string direction, and the character break positions along the character string direction are detected from this projection pattern. In handwritten character strings, however, and especially in horizontally written Japanese text, the characters themselves are often written at a slant, as in so-called idiosyncratic handwriting (kuse-ji).
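The conventional approach described above can be sketched as follows. This is a minimal illustration, not code from the patent: the binary images, the function names `column_projection` and `blank_columns`, and the use of empty columns as break candidates are all assumptions made for the example. It shows that a perpendicular projection finds the break between upright strokes but finds none between two slanted strokes, even though each row has a blank pixel between them.

```python
def column_projection(image):
    """Count ink pixels (value 1) in each column of a binary image."""
    return [sum(column) for column in zip(*image)]


def blank_columns(image):
    """Columns with no ink: candidate character breaks (an assumption)."""
    return [x for x, count in enumerate(column_projection(image)) if count == 0]


# Two upright strokes separated by one empty column.
upright = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
]

# Two parallel slanted strokes; every row has a blank pixel between them,
# but no single column is empty.
slanted = [
    [0, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0],
]

print(blank_columns(upright))  # break found between the upright strokes
print(blank_columns(slanted))  # no break found between the slanted strokes
```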

Moreover, the direction of the slant is often not constant even within a single character string. Combined with the irregular character spacing mentioned above, this makes the detection and segmentation of the character images forming a character string needlessly complicated and requires a great deal of processing time. Furthermore, because the individual character images cannot be detected and segmented with high accuracy, it is difficult to improve the reading accuracy.

(Problem to Be Solved by the Invention) As described above, when, for example, the characters of a handwritten character string are not uniformly oriented, it has conventionally been difficult to detect and segment each character image from the character string image captured and input by the photoelectric conversion unit. Not only could the detection and segmentation accuracy not be improved, but as a consequence the character reading accuracy could not be improved either.

The present invention has been made in view of these circumstances. Its object is to provide a character reading device that can detect and segment, with high accuracy, the character image of each character forming a character string even when the characters are not uniformly oriented, for example when the string contains italic characters, and that can thereby improve character reading accuracy.

[Structure of the Invention] (Means for Solving the Problems) The character reading device according to the present invention obtains, for a character string image forming a character string extracted from an image read and input by a reading means, projections in each of a plurality of directions intersecting the character string direction. It compares these projections with one another and obtains, for each position along the character string direction, the minimum projection value among them. From the minimum projection pattern represented by these minimum projection values and the directions of the projections that yielded them, it obtains the break position of each character in the character string image, for example by binarizing the minimum projection pattern with a predetermined threshold, and at the same time obtains the direction of each character break from the direction of the projection at that position. Detection and segmentation are then performed in accordance with this information.

(Operation) According to the present invention, even when a character string consisting of handwritten characters with irregular slants is read and input, the positions of the character breaks and the directions of those breaks are obtained from projections in a plurality of directions, and the break of each character is detected. The individual character images can therefore be detected and segmented with high accuracy according to this break information and supplied to recognition processing. As a result, characters can be read and input with high accuracy.

(Embodiment) An embodiment of the present invention will now be described with reference to the drawings.

FIG. 1 is a schematic block diagram of the main part of a character reading device according to an embodiment of the present invention, and FIG. 8 shows the flow of its overall processing procedure.

The photoelectric conversion unit 1, which is the reading means, consists of a TV camera, a semiconductor image sensor, or the like. It inputs the character string information written on a form as an image and digitizes the character string image (process A in FIG. 8). The digital image is temporarily stored in a frame memory and is then used for reading and inputting character information as described below.

The character detection and segmentation unit 2 detects and segments individual character images from the character string image captured and input as described above, in accordance with the detection and segmentation information described later, and supplies them to the character recognition unit 3 for character recognition processing.

The detection and segmentation information used for this character detection and segmentation is obtained as follows.

The line detection and segmentation unit 4 obtains projections of the input image in its vertical and horizontal directions and detects the regions of the character string images (processes B and C in FIG. 8). The image of each character string image region obtained by the line detection and segmentation unit 4 is then subjected to character detection and segmentation processing.
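The line detection step can be sketched as below. This is only an illustration under stated assumptions: the image is a list of rows of 0/1 pixels, a text line is taken to be any run of rows containing ink, and the names `ink_runs` and `find_text_lines` are hypothetical. The patent does not specify how the two projections are combined.

```python
def ink_runs(profile):
    """Return (start, end) index pairs, end exclusive, where profile > 0."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i
        elif value == 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def find_text_lines(image):
    """Locate text-line regions as (top, bottom, left, right) boxes.

    Rows are grouped by the projection onto the vertical axis; the horizontal
    extent of each band comes from the projection onto the horizontal axis.
    """
    regions = []
    row_profile = [sum(row) for row in image]
    for top, bottom in ink_runs(row_profile):
        band = image[top:bottom]
        column_profile = [sum(column) for column in zip(*band)]
        columns = ink_runs(column_profile)
        regions.append((top, bottom, columns[0][0], columns[-1][1]))
    return regions


page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
lines = find_text_lines(page)
```

Each returned box can then be cropped and passed on as one character string image.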

The projection creation unit 5 obtains, for a character string image obtained by the line detection and segmentation unit 4, projections in each of a plurality of directions intersecting the character string direction. Specifically, for a character string image segmented with a certain line width as shown in FIG. 2, projections in a plurality of preset directions I, II, ..., V as shown in FIG. 3 are obtained as shown in FIG. 4 (processes D1, D2, ..., Dn in FIG. 8). Here one of the projection directions, III, is set perpendicular to the character string direction. Directions I and II are set to detect the breaks between so-called forward-slanted characters (left italic characters) in a horizontally written character string, and directions IV and V are set to detect the breaks between so-called backward-slanted characters (right italic characters).

The minimum projection detection unit 6 compares the projections obtained for the respective directions with one another at each position along the character string direction (process E in FIG. 8), and obtains the minimum of those projection values together with the direction of the projection that yielded it, as the minimum projection pattern shown in FIG. 5 (processes E1 and E2 in FIG. 8). That is, as shown in FIG. 5, it obtains a projection pattern α consisting of the minimum projection value at each position along the character string direction, and the direction β of the projection that yielded the minimum projection value at each of those positions. In the minimum projection pattern of FIG. 5, which is obtained from the projections in the plurality of directions shown in FIG. 4, the projection component of region a is obtained from direction ■ and that of region b from direction ■. This processing is repeated for all bits along the character line direction (process F in FIG. 8).
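The construction of the minimum projection pattern can be sketched as a position-wise minimum over the per-direction projections. The function name, the direction labels, and the tie-breaking rule (the first listed direction wins) are assumptions for this example; the patent does not say how ties are resolved.

```python
def minimum_projection(projections):
    """Return (alpha, beta) for a dict mapping direction label -> projection.

    alpha[x] is the smallest projection value at position x.
    beta[x] is the label of the direction that produced it. Ties go to the
    direction listed first, which is an assumption of this sketch.
    """
    labels = list(projections)
    width = len(projections[labels[0]])
    alpha, beta = [], []
    for x in range(width):
        best = min(labels, key=lambda label: projections[label][x])
        alpha.append(projections[best][x])
        beta.append(best)
    return alpha, beta


# Illustrative projections for two directions (hypothetical labels).
example = {
    "upright": [1, 1, 2, 2, 1, 1],
    "slanted": [4, 0, 4, 0, 0, 0],
}
alpha, beta = minimum_projection(example)
```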

The minimum projection detection unit 6 then binarizes the minimum projection pattern α obtained in this way with a predetermined threshold (process G in FIG. 8) and estimates the character pitch (process H in FIG. 8).

Then, as shown in FIG. 6, the character break positions are obtained (process I in FIG. 8), and at the same time the direction of the character break at each break position is obtained from the direction β of the projection that yielded the minimum projection value in the minimum projection pattern (process J in FIG. 8). This break detection processing is repeated for all characters (process K in FIG. 8).

The information on the character break positions detected as shown in FIG. 6 and on the directions of those breaks is supplied to the character detection and segmentation unit 2 as the detection and segmentation information mentioned above, together with the detection information on the character string region.

In accordance with this detection and segmentation information, when an input character string image such as that shown in FIG. 2 is given, for example, the character detection and segmentation unit 2 performs segmentation at each character break position along the direction of the break at that position, as shown in FIG. 7, and thereby detects and segments each character image.
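Segmenting along slanted breaks can be sketched as below. The representation is an assumption made for this example: each break is a pair `(base_x, shear)`, where `base_x` is the break position on the bottom row and `shear` is the horizontal offset per row going upward. An ink pixel is assigned to a character according to how many break lines lie to its left on its own row.

```python
import math


def split_characters(image, boundaries):
    """Split a binary image into len(boundaries) + 1 same-sized pieces."""
    height, width = len(image), len(image[0])
    pieces = [[[0] * width for _ in range(height)]
              for _ in range(len(boundaries) + 1)]
    for r, row in enumerate(image):
        # Column where each slanted break line crosses this row.
        cuts = [x + math.floor(s * (height - 1 - r) + 0.5)
                for x, s in boundaries]
        for c, value in enumerate(row):
            if value:
                index = sum(1 for cut in cuts if cut < c)
                pieces[index][r][c] = 1
    return pieces


# Two parallel slanted strokes with one blank pixel between them on each row.
slanted = [
    [0, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0],
]
# One break anchored at x = 1 on the bottom row, leaning with shear 1.
left, right = split_characters(slanted, [(1, 1.0)])
```

Because the cut follows the slant, each stroke ends up whole in its own piece, which a vertical cut could not achieve for this image.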

Each detected and segmented character image is supplied to character recognition processing after, where appropriate, normalization of the character orientation in accordance with the segmentation direction information and the like (processes L and M in FIG. 8). As described above, the character recognition processing calculates the degree of similarity between the segmented character image and the dictionary patterns and obtains, for example, the dictionary pattern with the maximum similarity as the recognition result (process N in FIG. 8).

The recognition result obtained in this way is output as the character reading result for the input image (process O in FIG. 8).

Thus, with this device having the character detection and segmentation function described above, even when the character string to be read contains handwritten characters of irregular orientation, each character can be detected and segmented with high accuracy without problems such as part of a character pattern being lost or part of an adjacent character's pattern intruding. Each character can therefore be read, recognized, and input with high accuracy. As a result, erroneous detection and segmentation of character images is greatly reduced and the reading accuracy is improved.

The present invention is not limited to the embodiment described above. Although projections are obtained in five directions here, the directions and their number may be chosen according to the nature of the characters to be read. In addition, although the detection of character images has been described here for horizontally written character strings, the invention is equally applicable to vertically written character strings. The present invention can also be implemented with various other modifications without departing from its gist.

[Brief Description of the Drawings]

The drawings show an embodiment of the present invention. FIG. 1 is a schematic block diagram of the main part of the character reading device according to the embodiment; FIG. 2 shows an example of a character string image obtained by line detection; FIG. 3 shows an example of the directions in which projections of the character string image are obtained; FIG. 4 shows the projections of the character string image in a plurality of directions; FIG. 5 shows an example of the minimum projection pattern obtained from the projections of FIG. 4; FIG. 6 shows the character break positions and break directions obtained from the minimum projection pattern; FIG. 7 schematically shows the character detection and segmentation processing according to the detection and segmentation information of FIG. 6; and FIG. 8 shows an example of the processing procedure in the device of the embodiment.
1: photoelectric conversion unit, 2: character detection and segmentation unit, 3: character recognition unit, 4: line detection and segmentation unit, 5: projection creation unit, 6: minimum projection detection unit.
Applicant's agent: Patent attorney Takehiko Suzue

Claims (1)

[Claims] A character reading device comprising: a character string detection unit that detects, from an image read by a reading means, a character string image region forming a character string; means for obtaining projections of the character string image obtained by the character string detection unit in each of a plurality of directions intersecting the character string direction; means for obtaining, at each position along the character string direction, the minimum projection value among the projections obtained for the plurality of directions, and for creating a minimum projection pattern represented by these minimum projection values and the directions of the projections that yielded them; and means for obtaining character break positions and break directions from the minimum projection pattern and detecting and segmenting individual character images from the character string image.
JP63026781A 1988-02-08 1988-02-08 Character reader Pending JPH01201786A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP63026781A JPH01201786A (en) 1988-02-08 1988-02-08 Character reader

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP63026781A JPH01201786A (en) 1988-02-08 1988-02-08 Character reader

Publications (1)

Publication Number Publication Date
JPH01201786A true JPH01201786A (en) 1989-08-14

Family

ID=12202851

Family Applications (1)

Application Number Title Priority Date Filing Date
JP63026781A Pending JPH01201786A (en) 1988-02-08 1988-02-08 Character reader

Country Status (1)

Country Link
JP (1) JPH01201786A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013171309A (en) * 2012-02-17 2013-09-02 Omron Corp Character segmentation method, and character recognition device and program using the same

Similar Documents

Publication Publication Date Title
CN106960208B (en) Method and system for automatically segmenting and identifying instrument liquid crystal number
JPH05242292A (en) Separating method
JPH0519753B2 (en)
CN109712162A (en) A kind of cable character defect inspection method and device based on projection histogram difference
Ali et al. A novel approach to correction of a skew at document level using an Arabic script
US4596038A (en) Method and apparatus for character recognition
JPH01201786A (en) Character reader
JP2005250786A (en) Image recognition method
JPH07220081A (en) Segmenting method for graphic of image recognizing device
JP2590099B2 (en) Character reading method
JPH07160810A (en) Character recognizing device
JPH10187886A (en) Character recognizing device and method
US5272765A (en) System for processing character images
JPH0452975A (en) Fingerprint pattern sorting device
Fadeel An efficient segmentation algorithm for Arabic handwritten characters recognition system
JPH01201790A (en) Character reader
JPH11238135A (en) Method and device for image recognition
JPH01201789A (en) Character reader
JPS6027082A (en) Sloping angle detector for character string
JPH0632077B2 (en) Figure recognition device
JPH087041A (en) Character recognition method and device
Caprioli et al. Detecting and grouping words in topographic maps by means of perceptual concepts
KR20220168787A (en) Method to extract units of Manchu characters and system
JPH08147414A (en) Character string reader
JPS61161583A (en) Pattern recognizing device