JP2690483B2 - Character line extraction method - Google Patents

Character line extraction method

Info

Publication number
JP2690483B2
JP2690483B2 JP61261465A JP26146586A JP2690483B2 JP 2690483 B2 JP2690483 B2 JP 2690483B2 JP 61261465 A JP61261465 A JP 61261465A JP 26146586 A JP26146586 A JP 26146586A JP 2690483 B2 JP2690483 B2 JP 2690483B2
Authority
JP
Japan
Prior art keywords
character
width
character string
character line
specific block
Legal status
Expired - Lifetime
Application number
JP61261465A
Other languages
Japanese (ja)
Other versions
JPS63115286A (en)
Inventor
慎治 佐瀬
壽夫 石川
Current Assignee
NEC Corp
Original Assignee
NEC Corp
Priority date
1986-10-31
Filing date
1986-10-31
Publication date
1997-12-10
Application filed by NEC Corp
Priority to JP61261465A
Publication of JPS63115286A
Application granted
Publication of JP2690483B2
Anticipated expiration
Expired - Lifetime

Description

DETAILED DESCRIPTION OF THE INVENTION

[Industrial Field of Application]
The present invention relates to an optical character reader (OCR) that reads character lines written without any restriction on format. In particular, it relates to a character line extraction method for an address reading and sorting machine, which directly reads the address written on a mail item and sorts the item accordingly.

[Prior Art]
Conventionally, an OCR of this type extracts character lines on the assumption that character lines never intersect. It therefore treats an entire long run of connected black blocks as a single character line and takes the direction of that run as the direction of the line.

[Problems to be Solved by the Invention]
Many mail addresses are written with intersecting character lines, as shown in Figs. 2(a) and 2(b). Unless the vertical and horizontal lines are separated from such intersecting character lines, the subsequent character segmentation and character recognition cannot be performed correctly.
Consequently, when the conventional technique extracts the character lines of the address and addressee in Fig. 2, it extracts them as two vertical lines, as shown in Fig. 3, and the horizontal line cannot be read. Subsequent recognition processing is therefore difficult for mail items such as those in Fig. 2.

[Means for Solving the Problems]
To solve the above problems, the character line extraction method of the present invention comprises:
a first step of dividing, based on a scanned input image of a paper sheet, the character pattern on the sheet into a plurality of rectangular blocks each containing a given character string, and detecting the width of each rectangular block;
a second step of comparing the detected widths of the rectangular blocks with one another;
a third step of detecting, as a specific block, a rectangular block whose width exceeds the widths of the other rectangular blocks by a preset value or more;
a fourth step of detecting, based on an image obtained by rescanning the specific block, the width of the characters contained in the specific block and the distribution density of the characters within the specific block, the density being obtained by projecting the image in the width direction of the characters;
a fifth step of detecting, based on the width and distribution density detected in the fourth step, an intersecting character string that crosses another character string among the character strings contained in the specific block; and
a sixth step of separating the intersecting character string from the other character strings.
By detecting the intersecting character line in this way and extracting and reading it separately from the original character line, the character lines can be read accurately and recognized correctly.

[Embodiment]
An embodiment of the present invention is described below with reference to the drawings.
Fig. 1 is a flowchart of an embodiment of the character line extraction method of the present invention.
First, character lines are extracted in the same way as in the conventional technique (step 1), and the character width of each character line is detected (step 2). The detected character line widths are compared with one another to check whether any width is abnormally large (step 3). If a character line is not abnormally wide, it is treated as a vertical line and processing proceeds to character line image processing. If a character line is abnormally wide, that line is rescanned to detect its character width and density features (step 4).
From the rescan result, it is determined whether the line contains a portion whose character width is markedly larger than that of the other character lines and whose minimum density is larger than elsewhere in the same line (step 5). If the character width is large and the minimum density is larger than elsewhere, that portion is extracted as an intersecting character line (step 6), and processing proceeds to character line image processing.
Fig. 4 shows the scanning result, character width, and density for the address on the mail item shown in Fig. 2(a).
The horizontal character string "1-22-1" is markedly wider than the other characters, and its minimum density is larger than that of the other character portions. It is therefore judged to be an intersecting line in steps 3 to 5 and, using the spaces before and after it as cut positions, it is extracted by the blocking method separately from the original vertical character line.

[Effects of the Invention]
As described above, the present invention detects intersecting character lines and extracts them separately from the original character line, so that character lines containing an intersecting line can be read accurately and recognized correctly.
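
Steps 4 to 6 of the embodiment can be sketched roughly as follows. This is only an illustration under assumptions the patent does not state: the rescanned block is a binary image given as a list of pixel rows (1 = black), the original line is vertical so the characters are stacked from top to bottom, and a portion counts as "markedly" wider or denser when it reaches a fixed multiple of the median over all portions. The names `row_features`, `split_segments`, `find_crossing_segments`, `separate`, `width_ratio`, and `density_ratio` are hypothetical.

```python
from statistics import median


def row_features(block):
    """For each pixel row, return (black pixel count, first black column, last black column)."""
    features = []
    for row in block:
        cols = [x for x, pixel in enumerate(row) if pixel]
        features.append((len(cols), cols[0], cols[-1]) if cols else (0, None, None))
    return features


def split_segments(densities):
    """Group consecutive non-empty rows into (top, bottom) pairs, bottom exclusive."""
    segments, start = [], None
    for y, density in enumerate(densities):
        if density and start is None:
            start = y
        elif not density and start is not None:
            segments.append((start, y))
            start = None
    if start is not None:
        segments.append((start, len(densities)))
    return segments


def find_crossing_segments(block, width_ratio=1.5, density_ratio=1.5):
    """Step 4 and step 5: measure each portion, then flag wide portions with a high minimum density."""
    features = row_features(block)
    # Projecting along the width direction gives one density value per row.
    segments = split_segments([f[0] for f in features])
    if not segments:
        return []
    measured = []
    for top, bottom in segments:
        rows = features[top:bottom]
        width = max(r[2] for r in rows) - min(r[1] for r in rows) + 1
        min_density = min(r[0] for r in rows)
        measured.append((width, min_density))
    # Assumption: the median over all portions stands in for "the other characters".
    ref_width = median(w for w, _ in measured)
    ref_density = median(d for _, d in measured)
    return [
        segment
        for segment, (width, min_density) in zip(segments, measured)
        if width >= width_ratio * ref_width and min_density >= density_ratio * ref_density
    ]


def separate(block, segments):
    """Step 6: cut the flagged portions out at the blank rows that bound them."""
    crossing = [block[top:bottom] for top, bottom in segments]
    cut_rows = {y for top, bottom in segments for y in range(top, bottom)}
    remainder = [
        [0] * len(row) if y in cut_rows else list(row)
        for y, row in enumerate(block)
    ]
    return crossing, remainder
```

Because the segments are bounded by blank rows, the cut positions fall in the spaces before and after the intersecting string, as in the "1-22-1" example. A real reader would likely need thresholds tuned to its scan resolution rather than the fixed ratios assumed here.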

BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 is a flowchart showing an embodiment of the character line extraction method of the present invention. Figs. 2(a) and 2(b) show examples of mail items bearing addresses in which character lines intersect. Figs. 3(a) and 3(b) show the character lines extracted from Fig. 2 by the conventional technique. Figs. 4(a), 4(b), and 4(c) show, respectively, the scanning result, the character line width, and the density for the mail item shown in Fig. 2(a).
1 to 6: steps.

Claims (1)

(57) [Claims]
1. A character line extraction method comprising:
a first step of dividing, based on a scanned input image of a paper sheet, the character pattern on the sheet into a plurality of rectangular blocks each containing a given character string, and detecting the width of each rectangular block;
a second step of comparing the detected widths of the rectangular blocks with one another;
a third step of detecting, as a specific block, a rectangular block whose width exceeds the widths of the other rectangular blocks by a preset value or more as a result of the comparison;
a fourth step of detecting, based on an image obtained by rescanning the specific block, the width of the characters contained in the specific block and the distribution density of the characters within the specific block, the density being obtained by projecting the image in the width direction of the characters;
a fifth step of detecting, based on the width and distribution density detected in the fourth step, an intersecting character string that crosses another character string among the character strings contained in the specific block; and
a sixth step of separating the intersecting character string from the other character strings.
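
The width comparison of the second and third steps can be illustrated with a short Python sketch. The claim says only that a specific block exceeds the widths of the other blocks by a preset value or more. Using the median of the other widths as the reference is an assumption made here for illustration, and the names `find_specific_blocks` and `preset_diff` are hypothetical.

```python
from statistics import median


def find_specific_blocks(widths, preset_diff):
    """Return the indices of blocks whose width exceeds the others by at least preset_diff."""
    specific = []
    for i, width in enumerate(widths):
        others = widths[:i] + widths[i + 1:]
        # Assumption: the median of the other blocks' widths is the comparison reference.
        if others and width - median(others) >= preset_diff:
            specific.append(i)
    return specific


# Block widths in pixels (made-up values): the third block is far wider than the rest.
specific = find_specific_blocks([30, 32, 95, 31], preset_diff=20)
```

Only the blocks returned here would be rescanned in the fourth step. All other blocks go straight to character line image processing as ordinary lines.
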
JP61261465A 1986-10-31 1986-10-31 Character line extraction method Expired - Lifetime JP2690483B2 (en)

Priority Applications (1)

Application Number    Publication         Priority Date    Filing Date    Title
JP61261465A           JP2690483B2 (en)    1986-10-31       1986-10-31     Character line extraction method

Applications Claiming Priority (1)

Application Number    Publication         Priority Date    Filing Date    Title
JP61261465A           JP2690483B2 (en)    1986-10-31       1986-10-31     Character line extraction method

Publications (2)

Publication Number    Publication Date
JPS63115286A (en)     1988-05-19
JP2690483B2 (en)      1997-12-10

Family

ID=17362275

Family Applications (1)

Application Number    Status                Publication         Priority Date    Filing Date    Title
JP61261465A           Expired - Lifetime    JP2690483B2 (en)    1986-10-31       1986-10-31     Character line extraction method

Country Status (1)

Country Link
JP (1) JP2690483B2 (en)

Also Published As

Publication number Publication date
JPS63115286A (en) 1988-05-19

Legal Events

Date Code Title Description
EXPY Cancellation because of completion of term