JP3586949B2 - Form recognition device

Info

Publication number: JP3586949B2
Application number: JP29803995A
Authority: JP
Inventors: 淳晴山本; 豊樹川原; 祐二丸山; 秀彦川上; 龍次山崎; 幹男藤田
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 1995-11-16
Filing date: 1995-11-16
Publication date: 2004-11-10
Anticipated expiration: 2015-11-16
Also published as: JPH09138837A

Description

【０００１】
【発明の属する技術分野】
本発明は、帳票のような枠罫線と文字を含む文書画像において枠罫線の構造を認識し、帳票を識別し帳票内に記入されている特定の文字領域を切り出し、文字を自動認識するための帳票認識装置に関する。
【０００２】
【従来の技術】
近年、文書情報の電子化に伴い、ＯＣＲ（ＯｐｔｉｃａｌＣｈａｒａｃｔｅｒＲｅａｄｅｒ）を始めとする文字認識技術や文書画像処理に対する要望が高まっており、帳票など表形式文書の自動読み取り技術もそのひとつである。とりわけ複数の書式が混在する帳票を処理する場合は、予め人手で帳票を種類毎に分類した後ＯＣＲ装置に読み取らせる必要があり、時間と労力の削減のため帳票の識別と文字読み取りをともに自動化する事が要望されている。
【０００３】
帳票の識別技術としては、例えば特開平２−２１７９７７号公報のように局部的な罫線構造の相違から帳票の種類を識別し、予め指示された領域を読み取る方法があり、その従来例を図１９、図２１及び図２２を用いて説明する。例えば図１９（ａ）に示すように帳票イメージ３１０に対し点線で囲む線分検出領域において水平線分を抽出する事により同図（ｂ）に示す如くｋ１〜ｋ６の６本の線分が検出される。これに対し図２０（ａ）に示す帳票３２０において同様に線分抽出を行うと、同図（ｂ）に示す如くｍ１〜ｍ７の７本の線分が検出される。したがって予め複数の帳票が識別可能な線分検出領域を設定し、領域内の線分の本数、互いの位置関係、長さを検出し比較することにより帳票３１０と３２０の識別が可能となり、予め指定された文字読み取り領域、すなわち図１９（ａ）の帳票３１０においては文字領域３１１及び３１２を、図２０（ａ）の帳票３２０においては文字領域３２１及び３２２を切り出し、文字認識を行うことにより自動読み取りを達成するものである。
【０００４】
切り出された文字の認識については、様々な方式が提案されており例えば、ニューラルネットを用いた文字認識法（森：”ＰＤＰモデルによる手書き漢字認識”、電子情報通信学会論文誌、Ｖｏｌ．Ｊ７３−Ｄ−ＩＩ，Ｎｏ．８ｐｐ．１２６８−１２７４１９９０）があり、認識率も実用的なところまできているが、今回は帳票の枠線認識装置ということで文字認識の前処理に限定しているので文字認識に関しては省略するものとする。
【０００５】
【発明が解決しようとする課題】
しかしながら前記の従来の構成では、帳票の枠罫線構造の局所的な相違を検出するため図２２（ａ）に示すような帳票３３０が混在する場合、線分検出を行うと同図（ｂ）に示すようにｎ１〜ｎ７の７本の線分が検出され、図２０（ｂ）に示した場合と同一となり図２１（ａ）の文字読み取り領域３３１及び３３２を特定することが難しく、また線分検出領域の範囲を予め目視で確認して設定する必要があり、作業が複雑化するという課題があった。
【０００６】
本願発明は、前記従来技術の課題を解決するもので、多種多様な書式の帳票が混在する場合でも帳票を識別し、帳票毎の文字読み取り領域を正確に検出する信頼性の高い帳票認識装置を提供することを目的とする。
【０００７】
【課題を解決するための手段】
この課題を解決するために本発明は、２値画像から枠罫線のコーナーを検出するコーナー検出手段と、前記検出された枠罫線のコーナーの形状の組み合わせから枠罫線の構成要素を検出する構成要素検出手段と、前記構成要素どうしを連結し、枠構造情報として出力する矩形検出手段と、前記枠構造情報をもとに最外郭の外形枠を検出し、外形枠の構造の特徴を外形枠構造特徴として抽出する外形枠検出手段と、抽出された外形枠毎に内部の構造特徴を内部構造特徴として抽出する特徴抽出手段と、１つ以上の帳票の書式として外形枠及び内部構造特徴を予め登録した枠構造参照テーブルと、前記検出された外形枠構造特徴と前記枠構造参照テーブルを検索し帳票の候補を決め、決められた候補に対して前記検出された内部構造特徴と照合することにより帳票の種別を特定する枠構造照合手段とを設けたものである。
さらに、枠罫線を含む２値画像を記憶する画像メモリと、前記検出された外形枠構造特徴及び内部構造特徴から文字読み取り対象領域の座標に基づき前記画像メモリから文字領域を切り出す文字切り出し手段と、切り出された文字を認識する文字認識手段とを有するようにしたものである。
【０００８】
これにより、多種多様な書式の帳票が混在する場合でも帳票を識別し、帳票毎の文字読み取り領域を正確に検出でき、信頼性の高い帳票認識装置が実現できる。
【０００９】
【発明の実施の形態】
本発明の請求項１記載の発明は、２値画像から枠罫線のコーナーを検出するコーナー検出手段と、前記検出された枠罫線のコーナーの形状の組み合わせから枠罫線の構成要素を検出する構成要素検出手段と、前記構成要素どうしを連結し、枠構造情報として出力する矩形検出手段と、前記枠構造情報をもとに最外郭の外形枠を検出し、外形枠の構造の特徴を外形枠構造特徴として抽出する外形枠検出手段と、抽出された外形枠毎に内部の構造特徴を内部構造特徴として抽出する特徴抽出手段と、１つ以上の帳票の書式として外形枠及び内部構造特徴を予め登録した枠構造参照テーブルと、前記検出された外形枠構造特徴と前記枠構造参照テーブルを検索し帳票の候補を決め、決められた候補に対して前記検出された内部構造特徴と照合することにより帳票の種別を特定する枠構造照合手段とを具備する帳票認識装置としたものであり、帳票内の外形枠の構造に基づき帳票の書式カテゴリを判定し、さらに個々の外形枠毎に予め参照テーブルに登録した見本となる枠罫線構造と照合するため、正確に帳票の種別を特定できるという作用を有する。
【００１０】
請求項２に記載の発明は、更に、枠罫線と文字を含む２値画像から水平及び垂直方向の所定長以上のランを検出することにより枠罫線を抽出する罫線抽出手段を有し、前記抽出した枠罫線からなるパターンのコーナーを検出することを特徴とするもので、文字が消去され枠罫線のみからコーナー検出が可能となり高精度に構成要素が検出できるという作用を有する。
【００１１】
請求項３に記載の発明は、構成要素検出手段は、前記コーナーの形状の組み合わせから罫線の屈曲や交差による構成要素としてＬ字要素、Ｔ字要素及び十字要素の構成要素を検出するもので、コーナーの組合せから罫線の構成要素を検出することにより傾いて読み取られた帳票からでも安定して構成要素が検出できるために正確に帳票の種別を特定でき帳票毎の文字読み取り領域を正確に検出できるという作用を有する。
請求項４に記載の発明は、外形枠検出手段は、頂点座標を連結させて、独立した外形枠毎に外形枠構造特徴として抽出することを特徴とするもので、外形枠の頂点座標を連結することで容易に外形枠構造を抽出することができるという作用を有する。
請求項５に記載の発明は、外形枠構造特徴は、対象帳票内の外形枠の総数、位置及び形状を抽出することを特徴とするもので、外形枠の頂点座標を連結することで容易に外形枠構造を抽出することができるという作用を有する。
請求項６に記載の発明は、特徴抽出手段は、各外形枠毎に内部構造特徴として内部の枠の数、位置、幅及び高さを抽出することを特徴とするもので、抽出された各外形枠毎の内部構造を用いて帳票の種別を正確に特定することができるという作用を有する。
請求項７に記載の発明は、さらに、枠罫線を含む２値画像を記憶する画像メモリと、前記検出された外形枠構造特徴及び内部構造特徴から文字読み取り対象領域の座標に基づき前記画像メモリから文字領域を切り出す文字切り出し手段と、切り出された文字を認識する文字認識手段とを有することを特徴とするもので、正確に帳票の種別を特定でき帳票毎の文字読み取り領域を正確に検出できるという作用を有する。
請求項８に記載の発明は、２値画像から枠罫線のコーナーを検出するステップと、前記検出された枠罫線のコーナーの形状の組み合わせから枠罫線の構成要素を検出するステップと、前記構成要素どうしを連結し、枠構造情報として出力するステップと、枠構造情報をもとに最外郭の外形枠を検出し、外形枠の構造の特徴を外形枠構造特徴として抽出するステップと、個々の外形枠に囲まれた内部の構造特徴を内部構造特徴として抽出するステップと、１つ以上の帳票の書式として外形枠及び内部構造特徴を予め登録した枠構造参照テーブルと、前記検出された外形枠構造特徴と前記枠構造参照テーブルを検索し帳票の候補を決め、決められた候補に対して前記検出された内部構造特徴と照合することにより帳票の種別を特定することにより帳票の種別を認識するステップとを具備するもので、帳票内の外形枠の構造に基づき帳票の書式カテゴリを判定し、さらに個々の外形枠毎に予め参照テーブルに登録した見本となる枠罫線構造と照合するため、正確に帳票の種別を特定できるという作用を有する。
【００１２】
以下、本発明の実施の形態について、図１から図２１を用いて説明する。
（実施の形態１）
図１は帳票認識装置のブロック構成図を示し、１は帳票を読み取り２値画像を得る画像入力手段、２は前記２値画像を記憶する画像メモリ、３は２値画像の水平及び垂直方向の罫線を抽出する罫線抽出手段、４は２値図形のコーナーを検出するコーナー検出手段、５はコーナー形状の組み合わせから罫線の屈曲や交差によるＬ字要素、Ｔ字要素、十字要素を検出する構成要素検出手段、６は構成要素どうしを連結し、連結形態を枠構造情報として出力する矩形検出手段と、７は帳票内の矩形構造を含む最外郭の外形枠を検出し、帳票の大局的な枠構造特徴を外形枠構造特徴として抽出する外形枠検出手段、８は個々の外形枠に囲まれた内部の枠の数、位置、幅、高さ等の特徴を内部構造特徴として抽出する特徴抽出手段、９は外形枠の大局的な特徴である外形枠構造特徴に基づき帳票を分類し、個々の外形枠の構造特徴として前記外形枠構造特徴及び前記内部構造特徴を登録する枠構造参照テーブル、１０は枠構造参照テーブル上の枠構造特徴との照合により帳票の種別を認識し文字読み取り対象枠を検出する枠構造照合手段、１１は読み取り対象枠の文字領域を切り出す文字切り出し手段、１２は切り出された文字を認識する文字認識手段である。
【００１３】
以上のように構成された帳票認識装置について、その動作を説明する。
画像入力手段１により帳票を読み取り、文字部が１、背景が０の値をもつ２値画像に変換し画像メモリ２に記憶し、罫線抽出手段３により２値画像を水平及び垂直方向に走査し所定長以上に値”１”が連続する線を抽出し、コーナー検出手段４により水平及び垂直の罫線からなるパターンのコーナーを検出し、構成要素検出手段５によりコーナー形状の組み合わせから罫線の屈曲部や交差部のＬ字、Ｔ字、及び十字の構成要素を検出し、矩形検出手段６により構成要素どうしを連結しその連結形態を検出し、外形枠検出手段７により帳票内の矩形構造を含む最外郭の外形枠を検出し、外形枠の総数、位置、形状等の大局的な特徴を抽出し、特徴抽出手段８により個々の外形枠に囲まれた内部の枠の数、位置、幅、高さ等の特徴を抽出し、枠構造参照テーブル９に予め見本となる帳票の外形枠の構造及び個々の外形枠の内部の特徴を登録しておき、枠構造照合手段１０により対象とする帳票の枠構造情報と枠構造参照テーブル９上の見本となる枠構造と照合し帳票を特定し、その帳票における読み取り対象枠を認識し頂点の座標を検出し、文字切り出し手段１１により読み取り対象枠の４頂点の座標に基づき文字領域を切り出し、文字認識手段１２により切り出された文字を認識するものである。
【００１４】
次に図１における各構成要素の動作を詳細に説明する。帳票を読み取り２値画像を出力する画像入力手段１は、読み取り線密度を約４００ｄｐｉ程度とし、原稿である帳票にＬＥＤ（発光ダイオード）等で照明しその反射光を一次元のＣＣＤカメラで読み取り、任意の閾値で２値化して文字部が値１、背景が値０の２値画像を出力するものである。また、照明は原稿である帳票の枠線や記入された文字の色によって異なるが、例えば青・黒および赤等の枠線に対して、黒や青等で数字や記号および文字が記入された場合、緑あるいは黄緑の波長（５５０〜５７０ｎｍ付近）のＬＥＤを用いることが多い。２値化処理においては、固定閾値法や浮動閾値法（森、大津：”認識問題としての二値化と各種方法の検討”、情報処理学会、イメージプロセッシング１５−１，Ｎｏｖ．１９７７）が良く知られている２値化処理法であり、原稿に合わせて任意の２値化処理法を選択すればよい。
【００１５】
このように２値化された画像データは画像メモリ２に格納され、メモリ上の画像データに対し以降に説明する罫線抽出と文字切り出しを行う。
【００１６】
次に罫線抽出手段３について図２を用いて説明する。図２は罫線抽出手段３における画像処理のブロック構成図を示し、２０は画像メモリ２からの２値画像、２１は水平方向にパターンを縮める水平方向収縮手段、２２は水平方向にパターンを延長する水平方向延長手段、２３は垂直方向にパターンを縮める垂直方向収縮手段、２４は垂直方向にパターンを延長する垂直方向延長手段、２５は水平方向延長手段２２と垂直方向延長手段２４の出力のＮＯＲ演算を行うＮＯＲ回路である。画像メモリ２の２値画像に対し、水平方向収縮手段２１において水平方向にｈ画素縮めることにより、水平方向にｈ画素以下の幅の線や文字が消滅し、続く水平方向延長手段２２において水平方向にｈ画素延長することによりｈ画素より長い水平線分が抽出される。同様に垂直方向収縮手段２３において垂直方向にｖ画素縮めることにより、垂直方向にｖ画素以下の幅の線や文字が消滅し、続く垂直方向延長手段２４において垂直方向にｖ画素延長することによりｖ画素より長い垂直線分が抽出される。ＮＯＲ回路２５により水平方向延長手段２２と垂直方向延長手段２４の出力のＮＯＲ演算を行うと、文字が消去され枠罫線のみが残る２値画像が得られ、枠罫線及び背景がそれぞれ”０”及び”１”の値をもつ。
【００１７】
図２における水平及び垂直方向の収縮及び延長処理について図３及び図４を用いてさらに詳細に説明する。
【００１８】
図３は水平及び垂直方向収縮手段２１及び２３の動作を示すフロー図を示しており、２値画像を水平方向または垂直方向に１ラインずつ順次走査し終了ラインまで処理し、各ライン毎にｎ画素の収縮処理をおこなうとき、ランレングスのカウント値をＣとすると、ステップ３１において各ラインの走査開始時にカウント値Ｃに０を設定し、ステップ３２において１画素データを読み込み、ステップ３３において画素の値が０（白）か１（黒）かを判定し、０のときステップ３４へ進みカウント値Ｃに０を設定し、さらにステップ３５へ進み黒ランではないので値０を出力する。ステップ３３の判定で値が１のときはステップ３６へ進みカウント値Ｃがｎ以上かどうかの判定を行い、ｎ未満のときステップ３７へ進みカウント値Ｃをインクリメントし、さらにステップ３５へ進みその走査位置までに所定の黒ランが存在しないので値０を出力する。ステップ３６の判定でカウント値Ｃがｎ以上のときステップ３８へ進みその走査位置までにｎ画素以上のランレングスをもつ黒ランが存在するので値１を出力する。以上の処理を１ラインの終了まで行うことにより、そのライン上の黒ランがｎ画素縮められる。次のラインを処理するときは再びステップ３１から同様の処理を繰り返す。このようにして全画面の走査が終了すると、水平または垂直方向にｎ画素以上のランレングスをもつ線分が抽出される。
【００１９】
同様に図４は水平及び垂直方向延長手段２２及び２４の動作を示すフロー図を示しており、２値画像を水平方向または垂直方向に１ラインずつ順次走査し終了ラインまで処理し、各ライン毎にｎ画素の延長処理をおこなうとき、ランレングスのカウント値をＣとすると、ステップ４１において各ラインの走査開始時にカウント値Ｃにｎを設定し、ステップ４２において１画素データを読み込み、ステップ４３において画素の値が０（白）か１（黒）かを判定し、１のときステップ４４へ進みカウント値Ｃにｎを設定し、さらにステップ４５へ進み黒ラン上にあるので値１を出力する。ステップ４３の判定で値が０のときはステップ４６へ進みカウント値Ｃが０以下かどうかの判定を行い、０より大きい場合ステップ４７へ進みカウント値Ｃをデクリメントし、さらにステップ４５へ進みその走査位置は黒ランからｎ画素以内の距離にあるので値１を出力する。ステップ４６の判定でカウント値Ｃが０以下のときステップ４８へ進みその走査位置は黒ランからｎ画素より大きく離れているので値０を出力する。以上の処理を１ラインの終了まで行うことにより、そのライン上の黒ランがｎ画素延長される。次のラインを処理するときは再びステップ４１から同様の処理を繰り返す。このようにして全画面の走査が終了すると、水平または垂直方向にランレングスがｎ画素延長される。
【００２０】
次にコーナー検出手段４、構成要素検出手段５、及び矩形検出手段６における一連の処理について説明するが、これらの内容は同一出願人による特願平７−０１６８６２号公報に記載されており詳細な説明は省略し、その動作を簡単に説明する。
【００２１】
まずコーナー検出手段４について図５から図７を用いて説明する。図５はコーナーを検出するための前処理として、入力画像を方向コード化画像に変換した結果を示す図、図６は方向コード１〜８と実際の方向の対応関係を示す図、図７が検出するコーナーの具体例を示す図である。図５において５１は枠罫線の画素、５２は背景の画素、数字は輪郭点に付与された方向コードをそれぞれ示しており、この場合背景パターンに対し時計回りに輪郭を追跡するときの追跡の方向を、方向コード１〜８として割り当てる。このように方向コード化された画像から方向コードの変化点、すなわちコーナーを検出する。このために３×３近傍において注目位置（中央画素）コードが指示する方向に、注目画素と同一方向コードでない位置を検出する。図５において丸で囲まれた位置は方向コードの変化点を示しているが、例えば５３の位置では図７（ａ）に示す画素配置となっており、注目画素の指示する方向”３”の位置の方向コードは”１”となっており、輪郭の方向が”３”から”１”へ変化することを意味するので”３１”というコードで表記し、座標と変化する方向コード（以下方向変化コードとよぶ）を１組として検出する。同様に画素位置５４、５５、５６は図７の（ｂ）、（ｃ）、（ｄ）に対応しており、それぞれ”１７”、”７５”、”５３”という方向変化コードが与えられ、これらのコーナー点は、ｘ座標、ｙ座標、方向変化コードを１組の特徴情報として構成要素検出手段５へ通知する。
【００２２】
次に構成要素検出手段５について図８と図９を用いて説明する。図８はコーナー点の組み合わせから構成要素を検出するための判定条件を示す図、図９は構成要素の記述形式を示す図である。図８において（ａ）（ｂ）（ｃ）（ｄ）はＬ字要素の検出例、（ｅ）（ｆ）（ｇ）（ｈ）はＴ字要素の検出例、（ｈ）は十字要素の検出例を示しており、コーナー検出手段４からのコーナー点の特徴情報を用いて、ｘ，ｙ座標が所定の距離以内にある複数のコーナー点をグループ化し、グループのメンバーであるコーナー点の方向変化コードの組み合わせから、構成要素の種類が対応付けられる。このようにして検出された構成要素の形状は、図９（ａ）に示すように４ビットのコード（以下形状コードとよぶ）で記述され、各ビットは上位から図９（ｂ）のＳ、Ｗ、Ｎ、Ｅのいずれの方向に腕が存在するかを示している。例えば図８（ａ）に示すＬ字要素はＳ方向とＥ方向に腕を有しているので、”１００１”のビットパターンで記述される。構成要素のｘ座標及びｙ座標は、グループのメンバーであるコーナー点のｘ座標及びｙ座標の平均値を与えるものとし、構成要素検出手段５はこのｘ座標、ｙ座標、及び前記構成要素の形状コードを特徴情報として、矩形検出手段６に通知する。
【００２３】
次に矩形検出手段６について図１０を用いて説明する。図１０は構成要素どうし連結関係を示す図である。図１０（ａ）は構成要素検出手段５において検出されたＬ字要素、Ｔ字要素、十字要素の一例を示すもので、矩形検出手段６は矩形情報としてこれら構成要素の連結関係を記述した連結テーブルを生成出力する。まず各構成要素に対し識別ラベルｅ１からｅ２０を付与し、次に前記構成要素検出手段５からの特徴情報（ｘ座標、ｙ座標、形状コード）に基づき、形状コードの示す腕についてｘ方向とｙ方向を探索し、連結可能な腕をもつ構成要素のうち最短距離にあるものを検出する。同図（ｂ）は構成要素の連結関係を示す連結テーブルを示すもので、各構成要素についてどの要素と連結するかをＮ、Ｓ、Ｅ、Ｗの各方向について記述する。例えばＬ字要素ｅ１の場合は、腕Ｓ及びＥに対応する構成要素としてｅ１５及びｅ２が存在し、Ｔ字要素ｅ２の場合は腕Ｓ、Ｗ、及びＥに対応する構成要素としてｅ８、ｅ１、及びｅ３が存在することになる。同図（ｂ）の属性フラグは特徴抽出手段８において使用するパラメータであり具体的内容は後述する。
【００２４】
次に外形枠検出手段７について図１１及び１２を参照しながら説明する。図１１（ａ）は外形枠の一例を示す図、図１１（ｂ）は時計回りに外形枠を追跡し再び始点に戻るまでの経路を示す図、図１２は外形枠を検出する処理のフロー図である。図１１（ａ）において太線で示される矩形６２、６３、及び６４が外形枠であり、帳票６１において閉じた枠構造の最も外側の罫線で構成される。例えば外形枠６２について、始点要素ｆ１から時計回りに外形枠を追跡し再び始点に戻るまでの経路である追跡経路は図６（ｂ）に示すように、ｆ１、ｆ２、ｆ６、ｆ１０、ｆ１２、ｆ１１、ｆ９、ｆ５、ｆ１となり、外形枠の頂点は丸印で示すように追跡の方向が変化する構成要素となりｆ１、ｆ２、ｆ１２、ｆ１１、ｆ１であり、これらを頂点経路と呼ぶ。以下追跡経路は［・・・］、頂点経路は｛・・・｝というように表記すると、外形枠６３については追跡経路は［ｆ３ｆ４ｆ８ｆ１４ｆ１３ｆ７ｆ３］となり、頂点経路は｛ｆ３ｆ４ｆ１４ｆ１３ｆ３｝であり、外形枠６４については追跡経路は［ｆ１５ｆ１６ｆ１７ｆ２０ｆ２４ｆ２９ｆ３３ｆ３２ｆ３１ｆ３０ｆ２５ｆ２１ｆ１８ｆ１５］であり、頂点は｛ｆ１５ｆ１７ｆ３３ｆ３２ｆ３１ｆ３０ｆ１５｝である。したがって図１１（ａ）に示す帳票の例では外形枠の総数Ｎは３となる。次にこれらの外形枠を抽出するための処理について図１２を参照しながら説明する。まずフロー図で用いる変数について説明する。入力変数は、図１０（ｂ）に示した構成要素連結テーブルで、テーブル上のｉ番目の構成要素に関して、識別ラベルをｅｉ、ｘ座標をｘ（ｅｉ）、ｙ座標をｙ（ｅｉ）、形状コードをｃｏｄｅ（ｅｉ）、方向ポインタをｄｋ（ｅｉ）とし、Ｓ、Ｗ、Ｎ、Ｅの各方向に対しｋ＝０、１、２、３が対応する。ｉはテーブル上の構成要素の総数をＭとすると、０から（Ｍ−１）の値をとるものとする。次に出力変数は、外形枠の総数をＮ、ｊ番目の外形枠に関して、追跡経路をＢｊ＝［ｂｊ（０）ｂｊ（１）・・・］、頂点経路をＣｊ＝｛ｃｊ（０）ｃｊ（１）・・・｝とする。ただしｊは０から（Ｎ−１）の値をとるものとする。また制御変数として、外形枠の追跡を開始した構成要素ポインタｓｐ、次を追跡するための構成要素ポインタｎｐ、現在の追跡している方向を示す変数ｄｉｒ、外形枠の追跡フラグｆとする。まず７１において外形枠総数Ｎに０を設定し、７２において外形枠を追跡中かどうかを示すフラグｆに０を設定し、更に７３において制御変数ｉ及びｊに０を設定する。以下、構成要素連結テーブルを参照しながら、個々の外形枠を時計回りに追跡する。７４において制御変数ｉが構成要素の総数Ｍ以上の場合処理は終了し、そうでない場合には７５に進む。７５においてフラグの値から現在外形枠を追跡中かどうかを判定し、フラグの値が０、すなわち外形枠の始点を探索中であるとき７６の判定に進み、フラグの値が１、すなわち外形枠を追跡中であるとき８４の判定に進む。７６の判定は現在注目している構成要素が外形枠追跡の始点であるかどうかを判定するもので、外形枠の左肩の要素、すなわちＳ方向及びＥ方向に腕をもつ構成要素かどうかを判定し、該当する場合は７８へ進み追跡フラグｆをセットし、該当しない場合は７７へ進み制御変数ｉを更新し、次の構成要素をチェックする。７９及び８０は制御変数に初期値を与えるステップであり、７９において始点ポインタを設定し、８０において制御変数ｋｋと現在の追跡方向を示す変数ｄｉｒに、方向Ｅを示す値３を設定する。次に８１及び８２において、現在注目している構成要素の識別コードｅｉを追跡経路Ｂｊ及びＣｊに登録する。そして８３において次点ポインタｎｐに現在着目している構成要素のＥ方向ポインタの値をセットする。次に再び判定７５へ戻り、今回は追跡フラグｆがセットされているので判定８４へ進む。８４は次点ポインタｎｐと始点ポインタｓｐの一致判定を行うもので、一致した場合は外形枠としてクローズしたことを意味し、９４へ進み外形枠の総数Ｎをインクリメントし、次に９５において制御変数ｊをインクリメントし、更に９６において追跡フラグｆをリセットし、７７において制御変数ｉを更新し再び７４の判定を行う。８４の判定において次点ポインタと始点ポインタが一致しないとき、次点ポインタｎｐの指す構成要素においてＳ、Ｗ、Ｎ、Ｅの各方向のポインタを検索し、次のｎｐを決める。８５は上記Ｓ、Ｗ、Ｎ、Ｅの方向を検索する際、どの方向から検索するかを決める処理で、式中の％はモジュロ演算を示す。すなわち現在の方向ｋｋに３を加えモジュロ演算を行うことにより、現在の方向に対し時計回りにみてすぐ隣の方向から、時計回り順にチェックしていく。このとき８６の判定において、該方向に腕を有するかどうかの判
定を行い、腕を有さない場合は、８７でｋｋの値を更新し再び８５へ進み、腕を有する場合は判定８８へ進む。８８は腕の方向が現在の追跡方向と一致しているかどうかの判定を行うもので、一致しない場合は追跡方向が変わるため、９０において変数ｄｉｒに新たな追跡方向を設定し、９１において注目している構成要素が方向変化点、すなわち頂点であるとして、頂点経路Ｃｊに登録し、さらに８９において追跡経路Ｂｊにも登録する。８８の判定において追跡方向が変化しない場合は追跡経路Ｂｊにのみ登録する。続いて次点ポインタｎｐに求められた方向ｋの構成要素の識別コードをセットし、現在の方向を示す制御変数ｋｋにｋの値をセットし、再び７５以下の判定を繰り返す。以上の処理を終了まで実行することにより、帳票内の全ての外形枠が検出され、総数Ｎ、追跡経路Ｂｊ、及び頂点経路Ｃｊが求められる。
【００２５】
次に特徴抽出手段８について図１３から図１５を用いて説明する。図１３及び図１４は構成要素の外形枠への属性を検出するフロー図、図１５は前記構成要素連結テーブルから各外形枠の内部枠を検出するフロー図である。まず図１３のフロー図で用いられる変数について説明する。入力となるデータは図１０（ｂ）で示した構成要素連結テーブルと図１２の処理で得られた追跡経路Ｂｊ（ｊ＝０〜Ｎ−１）であり、識別ラベル、形状コード、方向ポインタは図１２と同様の表記とする。
また追跡経路Ｂｊに属する要素ｂｊ（ｈ）の総数をｐｊと表記する。これに対し出力データとしては、各構成要素がどの外形枠に所属するかを示す属性フラグｇ（ｅｉ）である。すなわち図１０（ｂ）に示した構成要素がどの外形枠に所属するかを判定するものである。判定に先立ち、あらかじめ全ての構成要素の属性フラグｇ（ｅｉ）に−１がセットされているものとする。図１３において、まず１０１から１０４において制御変数ｊ、ｈ、ｉ、ｋに０を設定し、１０９の判定において注目している構成要素の腕の方向ポインタの位置に、追跡経路上の構成要素と一致するかどうかを判定する。一致する場合は１１４へ進み属性フラグｇ（ｅｉ）にｊの値すなわち所属する外形枠の番号を設定する。これに対し一致しない場合は１１１以下の判定へ進み、条件を満たすときは方向の制御変数ｋを更新し再び１０９の判定を行う。１１１では制御変数ｉがＭ−１まで進んだかどうかを判定するもので、条件を満たすときは１０４からの処理を繰り返し、満たさない場合は１１２以下の判定に進む。１１２では制御変数ｈが追跡経路Ｂｊの要素数ｐｊまで進んだかどうかを判定するもので、条件を満たすときは１０３からの処理を繰り返し、満たさない場合は１１３の判定に進む。１１３では制御変数ｊがＮ−１まで進んだかどうかを判定するもので、条件を満たすときは１０２からの処理を繰り返し、満たさない場合は図１４に示す処理へ進む。以上の処理により各外形枠上の構成要素の属性フラグｇ（ｅｉ）が設定されたことになる。図１４で示すプロセスは、構成要素連結テーブル自身を参照し、まだ属性の決まっていない要素に対し、既に属性のきまっている構成要素の方向ポインタとの一致判定を行い、一致する場合は同一の属性フラグを与える処理である。まず１２１で制御変数ｉに０、フラグｆに０をそれぞれ設定し、次に１２８において構成要素ｅｉの属性が決まっているかどうかを判定し、決まっている場合は１３３へ進み制御変数ｉがＭ−１まで進んでいなければｉを更新し再び１２８の判定を行う。属性が決まっていない場合は１２２へ進みフラグに１をセットし、１２３へ進み制御変数ｓに０を設定し、続く１２９で既に属性の決まっている構成要素を探索する。未だ属性が決まってない場合は１３１の判定へ進み制御変数ｓがＭ−１まで進んでいなければｓを更新し再び１２９の判定を行う。属性の決まっている構成要素があった場合、１２４で制御変数ｒに０を設定し、１３０において方向ポインタの中に、注目している構成要素ｅｉが存在するかどうかを判定し、存在する場合は１３５へ進み属性フラグｇ（ｅｉ）にｇ（ｅｓ）を設定する。１３０の条件を満たさない場合は１２５においてｒをインクリメントし他の方向ポインタを見る。１３３において制御変数ｉがＭ−１まで進んだときは１３４へ進み、フラグｆの値を検査し値が０でない場合は再び１２１から処理を繰り返す。１３４の判定でフラグの値が０のままであれば、全ての構成要素の属性が与えられたことになり、終了する。次に構成要素連結テーブルから各外形枠の内部の枠を検出する処理フローについて図１５を参照しながら説明する。まず図１５で用いられる変数について説明する。入力となるデータは図１０（ｂ）で示した構成要素連結テーブルであり、識別ラベル、方向ポインタ、所属フラグは図１３及び図１４と同様の表記とする。これに対し出力データとしては、各外形枠ごとの内部枠構造テーブルＲｊと各外形枠毎の内部枠の総数ｔｊである。ただしｊは０から（Ｎ−１）の値をとるもの
とする。
【００２６】
枠の４頂点のうち左上の点を始点と定めると、内部枠構造テーブルＲｊは、個々の枠の識別番号、始点のｘ座標及びｙ座標、枠の幅ｗと高さｈを要素としてもつものとする。まず１４１から１４３において制御変数ｉ及びｊと内部枠の総数ｔｊに０を設定する。１４５の判定で対象とする構成要素が注目している外形枠に所属するかどうかを判定し、所属しない場合は判定１５９へ進み、所属する場合は判定１４６へ進む。１４６は対象としている構成要素が枠の始点であるかどうかを判定するもので、Ｅ方向及びＳ方向のポインタに連結の相手となる構成要素が存在するかどうかが判定条件である。始点である場合１４７以下の処理に進むが、始点から枠として認識する方法は、基本的に始点からＥ→Ｓ→Ｗ→Ｎと右回りに連結線をたどり、行き止まりになれば前の要素に戻り新しい構成要素を選択して探索を繰り返し、最後に始点に戻ることのできる構成要素の集まりを１つの枠とするものである。まず１４７において対象点のＥ方向に構成要素が連結しているがどうか調べ、ｎｏならば１５９へ進み、ｙｅｓならば１４８へ進み連結しているＥ方向の構成要素を対象点にする。次に１４９において対象点のＳ方向に構成要素が連結しているがどうか調べ、ｎｏならば１４７に戻り、ｙｅｓならば１５０へ進み連結しているＳ方向の構成要素を対象点にする。続いて１５１において対象点のＷ方向に構成要素が連結しているがどうか調べ、ｎｏならば１４９に戻り、ｙｅｓならば１５２へ進み連結しているＷ方向の構成要素を対象点にする。続いて１５３において対象点のＮ方向に構成要素が連結しているがどうか調べ、ｎｏならば１５１へ戻り、ｙｅｓならば１５４へ進み連結しているＮ方向の構成要素を対象点にする。続いて１５５において１５４の対象点が始点と一致するかどうか調べ、ｎｏならば１５３に戻り、ｙｅｓならば１つの枠を検出したので１５６へ進む。１５６では始点のｘ座標とＥ方向の構成要素のｘ座標の差から枠の幅ｗを算出し、また始点のｙ座標とＳ方向の構成要素のｙ座標の差から枠の高さｈを算出する。次に１５７において内部枠構造テーブルＲｊに該枠の始点のｘ座標、ｙ座標、幅ｗ、高さｈを登録し、１５８において内部枠の総数ｔｊをインクリメントする。その後１５９へ進み制御変数ｉの判定を行い、ｉが構成要素の総数Ｍ−１まで進んでいないとき１６０へ進みｉをインクリメントし再び１４５の判定に進み、ｉがＭ以上の場合は１６１へ進む。１６１では制御変数ｊの判定を行い、ｊが外形枠の総数Ｎ−１まで進んでいないとき１６２へ進みｊをインクリメントし再び１４２に戻り、ｊがＮ以上の場合は終了となる。以上図１２から図１５を用いて説明した外形枠検出手段７及び特徴抽出手段８により、帳票の外形枠構造特徴として外形枠の総数、頂点位置及び個々の外形枠の内部構造特徴として内部枠の総数、枠の始点位置、幅、高さが検出され、次の枠構造照合手段１０において枠構造参照テーブル９に登録されている帳票の枠構造と照合し、帳票を特定し文字読み取り領域を決定する。
【００２７】
以下枠構造参照テーブル９と枠構造照合手段１０について図１６と図１７を用いて説明する。図１６は枠構造参照テーブル９に登録してある帳票の具体例、図１７は枠構造参照テーブル９の構造を示す図である。図１６（ａ）に示す帳票は３個の外形枠２１１、２１２、２１３が存在し、個々の外形枠には１３個の内部枠が存在し、それぞれ（１）から（１３）の識別番号が与えられている。また同図（ｂ）に示す帳票は２個の外形枠２２１、２２２が存在し、個々の外形枠には１１個の内部枠が存在し、それぞれ（１）から（１１）の識別番号が与えられている。また同図（ｃ）に示す帳票は１個の外形枠２３１が存在し、１１個の内部枠が存在しそれぞれ（１）から（１１）の識別番号が与えられている。さらに同図（ｄ）に示す帳票は１個の外形枠２４１が存在し、１３個の内部枠が存在しそれぞれ（１）から（１３）の識別番号が与えられている。これらの帳票の枠構造の特徴は図１７（ａ）（ｂ）に示すテーブルに登録されている。図１７において（ａ）は外形枠構造を登録するテーブル、（ｂ）は内部枠構造を登録するテーブルを示す。同図（ａ）においてテーブルに登録する要素としては、外形枠の総数、頂点数、個々の外形枠における内部枠の総数、外形枠の識別番号、外形枠の頂点座標があり、入力された帳票におけるこれらの特徴をパラメータとしてテーブルを検索することにより、帳票の候補が決まる。次に候補として挙げられた帳票に対し、今度は同図（ｂ）に示す内部の枠の詳細構造を順次照合する。すなわち個々の枠について始点座標、幅、及び高さの一致判定を行い一致したとき帳票が特定できたとみなし、読み取り対象枠の識別番号に対応する入力帳票の枠の始点座標、幅、高さを文字切り出し手段１１に通知する。例えば図１６（ｄ）に示す帳票の場合、読み取り対象枠は（６）及び（７）となりそれぞれの枠の始点、幅、高さが文字切り出し手段１１に通知される。
【００２８】
次に、文字切り出し手段１１について説明する。文字切り出し手段１１は、画像メモリ２から実際の文字の２値イメージを切り出して文字認識手段１２に送るもので、その処理について図１８を用いて説明する。図１８は、帳票の枠の中に文字が描かれている２値イメージを示している。ここで、２５１は読み取り対象枠の始点（ｘｓ，ｙｓ）、２５２は文字読み取り領域、２５３は文字切り出し領域を示す。枠構造照合手段１０より通知された始点（ｘｓ，ｙｓ）、幅ｗ、高さｈに基づき、マージンｄｘとｄｙを見込んで、始点２５４（ｘｓ＋ｄｘ，ｙｓ＋ｄｙ）から幅（ｗ−２ｄｘ）、高さ（ｈ−２・ｄｙ）の領域を読み取り領域２５４として決定する。個々の文字の切り出し領域は、この読み取り領域において水平方向及び垂直方向のプロジェクションをとることにより決定され、水平及び垂直方向において所定の度数以上の領域を文字切り出し領域２５３とする。プロジェクションを用いた文字切り出しは一般的な技術であるので詳細な説明は省略する。このように切り出された個々の文字を後段の文字認識手段１２に転送し、指定された枠内の文字列を認識する事ができる。
【００２９】
【発明の効果】
以上のように本発明によれば、画像入力手段により読み取られたの帳票の２値画像に対して、罫線を抽出し罫線の交点である構成要素を検出し、各構成要素どうしを連結し枠構造を生成し、枠構造の外形枠を検出しその特徴から帳票を分類し、さらに各外形枠の内部の枠構造を見本と照合して帳票の種類を特定し、読み取り対象枠を検出することにより、多種多様な書式の帳票が混在する場合でも帳票を識別し、帳票毎の個別の文字読み取り領域を正確に検出でき、信頼性の高い文字認識が行えるという有利な効果が得られる。
【図面の簡単な説明】
【図１】本発明の一実施の形態における帳票認識装置のブロック結線を示す図
【図２】同実施の形態の帳票認識装置における罫線抽出手段でのブロック結線を示す図
【図３】同実施の形態の帳票認識装置における線分収縮手段での処理の手順を示すフローチャート
【図４】同実施の形態の帳票認識装置における線分延長手段での処理の手順を示すフローチャート
【図５】同実施の形態の帳票認識装置におけるコーナー検出手段での方向コード化と方向コード変化点検出について説明する図
【図６】同実施の形態の帳票認識装置におけるコーナー検出手段での方向コードと実際の方向との対応関係を示す図
【図７】同実施の形態の帳票認識装置におけるコーナー検出手段での方向変化コードの具体例を示す図
【図８】同実施の形態の帳票認識装置における構成要素検出手段でのコーナー点の組み合わせによる構成要素の具体例を示す図
【図９】同実施の形態の帳票認識装置における構成要素抽出手段での構成要素の形態の記述について説明する図
【図１０】同実施の形態の帳票認識装置における矩形検出手段での構成要素どうしの連結関係を示す概念図
【図１１】同実施の形態の帳票認識装置における外形枠検出手段での外形枠検出の一例を示す図
【図１２】同実施の形態の帳票認識装置における外形枠検出手段での外形枠検出の処理の手順を示すフローチャート
【図１３】同実施の形態の帳票認識装置における特徴抽出手段での構成要素の属性を検出する処理の手順を示すフローチャート
【図１４】同実施の形態の帳票認識装置における特徴抽出手段での構成要素の属性を検出する処理の手順を示すフローチャート
【図１５】同実施の形態の帳票認識装置における特徴抽出手段での内部枠構造を検出する処理の手順を示すフローチャート
【図１６】同実施の形態の帳票認識装置における枠構造参照テーブルに登録されている帳票の具体例を示す図
【図１７】同実施の形態の帳票認識装置における枠構造参照テーブルでのテーブルの構造を示す図
【図１８】同実施の形態の帳票認識装置における文字切り出し手段の処理を示す図
【図１９】従来の帳票認識装置の処理動作を示す図
【図２０】従来の帳票認識装置の処理動作を示す図
【図２１】従来の帳票認識装置の処理動作を示す図
【符号の説明】
１画像入力手段
２画像メモリ
３罫線抽出手段
４コーナー検出手段
５構成要素検出手段
６矩形検出手段
７外形枠検出手段
８特徴抽出手段
９枠構造参照テーブル
１０枠構造照合手段
１１文字切り出し手段
１２文字認識手段
２０２値画像
２１水平方向収縮手段
２２水平方向延長手段
２３垂直方向収縮手段
２４垂直方向延長手段
２５ＮＯＲ回路
５１枠罫線の画素
５２背景の画素
５３〜５６コーナー点
６１帳票
６２〜６４外形枠
２１０帳票Ａ
２１１〜２１３外形枠
２２０帳票Ｂ
２２１〜２２２外形枠
２３０帳票Ｃ
２３１外形枠
２４０帳票Ｄ
２４１外形枠
２５１枠の始点
２５２文字読み取り領域
２５３文字切り出し領域
２５４文字読み取り領域の始点

[0001]
TECHNICAL FIELD OF THE INVENTION
The present invention relates to a form recognition device that recognizes the structure of the frame ruled lines in a document image containing frame ruled lines and characters, such as a form, identifies the form, cuts out specific character areas written in the form, and automatically recognizes the characters.
[0002]
[Prior art]
In recent years, as document information has been digitized, demand has grown for character recognition technology such as OCR (Optical Character Reader) and for document image processing, and technology for automatically reading tabular documents such as forms is one example. In particular, when forms of several formats are mixed, the forms must first be sorted by type manually and then read by an OCR device. To save time and labor, there is a demand to automate both form identification and character reading.
[0003]
As a form identification technique, there is, for example, the method of Japanese Patent Application Laid-Open No. 2-217977, which identifies the type of a form from local differences in its ruled line structure and reads areas designated in advance. This conventional example is described with reference to FIGS. 19, 20 and 21. For example, as shown in FIG. 19A, extracting horizontal line segments in the line segment detection area enclosed by the dotted line in the form image 310 detects six line segments k1 to k6, as shown in FIG. 19B. In contrast, when line segments are extracted in the same way from the form 320 shown in FIG. 20A, seven line segments m1 to m7 are detected, as shown in FIG. 20B. Therefore, by setting in advance a line segment detection area in which the forms can be told apart, and by detecting and comparing the number, mutual positional relationship, and length of the line segments in that area, the forms 310 and 320 can be distinguished. Automatic reading is then achieved by cutting out the character reading areas designated in advance, namely the character areas 311 and 312 in the form 310 of FIG. 19A and the character areas 321 and 322 in the form 320 of FIG. 20A, and performing character recognition on them.
[0004]
Various methods have been proposed for recognizing the cut-out characters, for example a character recognition method using a neural network (Mori: "Handwritten Kanji Recognition by PDP Model", IEICE Transactions, Vol. J73-D-II, No. 8, pp. 1268-1274, 1990), and recognition rates have reached a practical level. Because the present device is a frame line recognition device for forms and is limited to the preprocessing for character recognition, character recognition itself is not described further.
[0005]
[Problems to be solved by the invention]
However, because the conventional configuration described above detects local differences in the frame ruled line structure of forms, when a form 330 such as that shown in FIG. 21A is also present, line segment detection yields seven line segments n1 to n7 as shown in FIG. 21B. This is the same result as in FIG. 20B, which makes it difficult to specify the character reading areas 331 and 332 of FIG. 21A. In addition, the range of the line segment detection area must be confirmed visually and set in advance, which complicates the work.
[0006]
The present invention solves the above problems of the prior art. Its object is to provide a highly reliable form recognition device that identifies forms even when forms of many different formats are mixed, and that accurately detects the character reading area of each form.
[0007]
[Means for Solving the Problems]
To solve this problem, the present invention provides: corner detecting means for detecting corners of frame ruled lines from a binary image; component detecting means for detecting components of the frame ruled lines from combinations of the shapes of the detected corners; rectangle detecting means for connecting the components to one another and outputting the result as frame structure information; outer frame detecting means for detecting the outermost outer frames based on the frame structure information and extracting features of the outer frame structure as outer frame structure features; feature extracting means for extracting, for each extracted outer frame, the features of its internal structure as internal structure features; a frame structure reference table in which outer frame and internal structure features are registered in advance as the formats of one or more forms; and frame structure matching means for searching the frame structure reference table with the detected outer frame structure features to determine form candidates, and for identifying the type of the form by matching the determined candidates against the detected internal structure features.
The invention further provides an image memory for storing the binary image including frame ruled lines, character cutout means for cutting out character areas from the image memory based on the coordinates of the character reading target areas obtained from the detected outer frame structure features and internal structure features, and character recognition means for recognizing the cut-out characters.
[0008]
As a result, even when forms in various formats are mixed, the forms can be identified, the character reading area of each form can be accurately detected, and a highly reliable form recognition device can be realized.
[0009]
BEST MODE FOR CARRYING OUT THE INVENTION
The invention according to claim 1 is a form recognition device comprising: corner detecting means for detecting corners of frame ruled lines from a binary image; component detecting means for detecting components of the frame ruled lines from combinations of the shapes of the detected corners; rectangle detecting means for connecting the components to one another and outputting the result as frame structure information; outer frame detecting means for detecting the outermost outer frames based on the frame structure information and extracting features of the outer frame structure as outer frame structure features; feature extracting means for extracting, for each extracted outer frame, the features of its internal structure as internal structure features; a frame structure reference table in which outer frame and internal structure features are registered in advance as the formats of one or more forms; and frame structure matching means for searching the frame structure reference table with the detected outer frame structure features to determine form candidates, and for identifying the type of the form by matching the determined candidates against the detected internal structure features. Because the format category of the form is determined from the structure of the outer frames in the form, and each outer frame is then matched against the sample frame ruled line structure registered in advance in the reference table, the type of the form can be specified accurately.
[0010]
The invention according to claim 2 further comprises ruled line extracting means for extracting frame ruled lines by detecting horizontal and vertical runs of at least a predetermined length in a binary image containing frame ruled lines and characters, and detects the corners of the pattern formed by the extracted frame ruled lines. Because the characters are erased and corners are detected from the frame ruled lines alone, the components can be detected with high accuracy.
[0011]
In the invention according to claim 3, the component detecting means detects, from the combinations of corner shapes, L-shaped elements, T-shaped elements, and cross elements as the components formed by bends and intersections of ruled lines. Because ruled line components are detected from combinations of corners, they can be detected stably even from a form that was read at a tilt, so the type of the form can be specified accurately and the character reading area of each form can be detected accurately.
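As an illustration of the component types named in claim 3, the following is a minimal sketch that classifies a component from the 4-bit shape code described in paragraph [0022] of the original-language text, whose bits mark, from the most significant, an arm in the S, W, N and E directions. The function name `classify` and the rule that two opposite arms are treated as a plain straight line are assumptions made for this sketch, not something the specification states.

```python
# Bit positions of the 4-bit shape code, most significant first: S, W, N, E.
S, W, N, E = 0b1000, 0b0100, 0b0010, 0b0001


def classify(code):
    """Return "L", "T", "cross", or None for a 4-bit shape code.

    Hypothetical helper: the element type is derived from how many
    arms are present and whether two arms are perpendicular.
    """
    code &= 0b1111
    arms = bin(code).count("1")
    if arms == 4:
        return "cross"
    if arms == 3:
        return "T"
    if arms == 2:
        # Two opposite arms (N+S or W+E) form a straight line, which is
        # assumed here not to be a component.
        if code in (S | N, W | E):
            return None
        return "L"
    return None
```

For example, the L-shaped element with arms in the S and E directions, written "1001" in the original text, is passed as `0b1001`.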
In the invention according to claim 4, the outer frame detecting means connects vertex coordinates and extracts outer frame structure features for each independent outer frame. Connecting the vertex coordinates of the outer frames makes it easy to extract the outer frame structure.
In the invention according to claim 5, the outer frame structure features comprise the total number, positions, and shapes of the outer frames in the target form. Connecting the vertex coordinates of the outer frames makes it easy to extract the outer frame structure.
In the invention according to claim 6, the feature extracting means extracts, for each outer frame, the number, positions, widths, and heights of the inner frames as internal structure features. The internal structure extracted for each outer frame allows the type of the form to be specified accurately.
The invention according to claim 7 further comprises an image memory for storing the binary image including frame ruled lines, character cutout means for cutting out character areas from the image memory based on the coordinates of the character reading target areas obtained from the detected outer frame structure features and internal structure features, and character recognition means for recognizing the cut-out characters. The type of the form can thus be specified accurately, and the character reading area of each form can be detected accurately.
The invention according to claim 8 comprises: a step of detecting corners of frame ruled lines from a binary image; a step of detecting components of the frame ruled lines from combinations of the shapes of the detected corners; a step of connecting the components to one another and outputting the result as frame structure information; a step of detecting the outermost outer frames based on the frame structure information and extracting features of the outer frame structure as outer frame structure features; a step of extracting the structural features of the interior enclosed by each outer frame as internal structure features; a frame structure reference table in which outer frame and internal structure features are registered in advance as the formats of one or more forms; and a step of recognizing the type of the form by searching the frame structure reference table with the detected outer frame structure features to determine form candidates and matching the determined candidates against the detected internal structure features. Because the format category of the form is determined from the structure of the outer frames in the form, and each outer frame is then matched against the sample frame ruled line structure registered in advance in the reference table, the type of the form can be specified accurately.
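The two-stage matching recited above (a coarse search on outer frame features, followed by a detailed comparison of inner frames) can be sketched as follows. This is only an illustration under stated assumptions: the class names `OuterFrame` and `FormTemplate`, the function `identify`, the use of vertex count and inner frame count as the coarse key, the pixel tolerance `tol`, and the assumption that frames are listed in the same order in the input and in the table are all hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OuterFrame:
    vertex_count: int
    inner_frames: tuple  # tuple of (x, y, width, height) per inner frame


@dataclass(frozen=True)
class FormTemplate:
    name: str
    outer_frames: tuple  # tuple of OuterFrame


def coarse_key(outer_frames):
    # Stage 1 key: per outer frame, its vertex count and inner frame count.
    # The tuple length also encodes the total number of outer frames.
    return tuple((f.vertex_count, len(f.inner_frames)) for f in outer_frames)


def frames_match(a, b, tol):
    # Compare start point, width, and height within an assumed tolerance.
    return all(abs(p - q) <= tol for p, q in zip(a, b))


def identify(outer_frames, table, tol=3):
    """Return the name of the matching template, or None."""
    key = coarse_key(outer_frames)
    # Stage 1: keep only templates whose outer frame structure agrees.
    candidates = [t for t in table if coarse_key(t.outer_frames) == key]
    # Stage 2: compare every inner frame of every outer frame in detail.
    for t in candidates:
        if all(frames_match(a, b, tol)
               for fo, ft in zip(outer_frames, t.outer_frames)
               for a, b in zip(fo.inner_frames, ft.inner_frames)):
            return t.name
    return None
```

Two templates that share the same coarse key remain distinguishable in the second stage, because the start points, widths, and heights of their inner frames differ.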
[0012]
Hereinafter, embodiments of the present invention will be described with reference to FIGS. 1 to 21.
(Embodiment 1)
FIG. 1 is a block diagram of the form recognition device. Reference numeral 1 denotes image input means for reading a form and obtaining a binary image; 2, an image memory for storing the binary image; 3, ruled line extracting means for extracting the horizontal and vertical ruled lines of the binary image; 4, corner detecting means for detecting the corners of a binary figure; 5, component detecting means for detecting, from combinations of corner shapes, the L-shaped, T-shaped, and cross elements formed by bends and intersections of ruled lines; 6, rectangle detecting means for connecting the components to one another and outputting the form of connection as frame structure information; 7, outer frame detecting means for detecting the outermost outer frames containing the rectangular structures in the form and extracting the global frame structure features of the form as outer frame structure features; 8, feature extracting means for extracting features such as the number, positions, widths, and heights of the inner frames enclosed by each outer frame as internal structure features; 9, a frame structure reference table in which forms are classified by their outer frame structure features, that is, the global features of the outer frames, and in which the outer frame structure features and internal structure features are registered as the structural features of each outer frame; 10, frame structure matching means for recognizing the type of the form and detecting the character reading target frames by matching against the frame structure features in the frame structure reference table; 11, character cutout means for cutting out the character areas of the reading target frames; and 12, character recognition means for recognizing the cut-out characters.
[0013]
The operation of the form recognition device configured as described above will be described.
The image input means 1 reads the form, converts it into a binary image in which character portions have the value 1 and the background the value 0, and stores it in the image memory 2. The ruled line extracting means 3 scans the binary image horizontally and vertically and extracts lines in which the value "1" continues for at least a predetermined length. The corner detecting means 4 detects the corners of the pattern formed by the horizontal and vertical ruled lines. The component detecting means 5 detects, from combinations of corner shapes, the L-shaped, T-shaped, and cross components at the bends and intersections of ruled lines. The rectangle detecting means 6 connects the components to one another and detects their form of connection. The outer frame detecting means 7 detects the outermost outer frames containing the rectangular structures in the form and extracts global features such as the total number, positions, and shapes of the outer frames. The feature extracting means 8 extracts features such as the number, positions, widths, and heights of the inner frames enclosed by each outer frame. The outer frame structure of each sample form and the internal features of each of its outer frames are registered in advance in the frame structure reference table 9. The frame structure matching means 10 matches the frame structure information of the target form against the sample frame structures in the frame structure reference table 9 to identify the form, recognizes the reading target frames of that form, and detects the coordinates of their vertices. The character cutout means 11 cuts out the character areas based on the coordinates of the four vertices of each reading target frame, and the character recognition means 12 recognizes the cut-out characters.
[0014]
Next, the operation of each component in FIG. 1 will be described in detail. The image input means 1, which reads a form and outputs a binary image, has a reading resolution of about 400 dpi. It illuminates the form serving as the original with an LED (light emitting diode) or the like, reads the reflected light with a one-dimensional CCD camera, and binarizes the result with an arbitrary threshold to output a binary image in which character portions have the value 1 and the background has the value 0. The appropriate illumination depends on the colors of the frame lines of the form and of the entered characters. For example, when the frame lines are blue, black, red, or the like and the numerals, symbols, and characters are written in black, blue, or the like, an LED with a green or yellow-green wavelength (around 550 to 570 nm) is often used. For the binarization, a fixed threshold method or a floating threshold method (Mori, Otsu: "Binarization as a recognition problem and various methods", Information Processing Society of Japan, Image Processing 15-1, Nov. 1977) may be used. These are known binarization methods, and any of them may be selected according to the document.
[0015]
The binarized image data is stored in the image memory 2, and the image data on the memory is subjected to ruled line extraction and character cutout described below.
[0016]
Next, the ruled line extracting means 3 will be described with reference to FIG. 2. FIG. 2 is a block diagram of the image processing in the ruled line extracting means 3, in which 20 is the binary image from the image memory 2, 21 is a horizontal contraction means for contracting the pattern in the horizontal direction, 22 is a horizontal extension means for extending the pattern in the horizontal direction, 23 is a vertical contraction means for contracting the pattern in the vertical direction, 24 is a vertical extension means for extending the pattern in the vertical direction, and 25 is a NOR circuit that performs a NOR operation on the outputs of the horizontal extension means 22 and the vertical extension means 24. When the horizontal contraction means 21 contracts the binary image in the image memory 2 by h pixels in the horizontal direction, lines and characters whose horizontal width is h pixels or less disappear, and when the subsequent horizontal extension means 22 extends the result by h pixels in the horizontal direction, horizontal line segments longer than h pixels are extracted. Similarly, when the vertical contraction means 23 contracts the image by v pixels in the vertical direction, lines and characters whose vertical width is v pixels or less disappear, and when the subsequent vertical extension means 24 extends the result by v pixels in the vertical direction, vertical line segments longer than v pixels are extracted. The NOR circuit 25 then combines the outputs of the horizontal extension means 22 and the vertical extension means 24, giving a binary image from which the characters have been deleted and in which only the frame ruled lines remain with the value 1.
[0017]
The contraction and extension processing in the horizontal and vertical directions in FIG. 2 will be described in more detail with reference to FIGS. 3 and 4.
[0018]
FIG. 3 is a flowchart showing the operation of the horizontal and vertical contraction means 21 and 23. The binary image is scanned one line at a time in the horizontal or vertical direction and processed through to the last line. For contraction by n pixels, let C be the run-length count. In step 31, C is set to 0 at the start of each scan line. In step 32, one pixel is read, and in step 33 it is determined whether its value is 0 (white) or 1 (black). If the value is 0, the process proceeds to step 34, where C is reset to 0, and then to step 35, where the value 0 is output because the pixel is not part of a black run. If the value in step 33 is 1, the process proceeds to step 36 to determine whether C is equal to or greater than n. If C is less than n, the process proceeds to step 37 to increment C, and the value 0 is output because a black run of the predetermined length does not yet exist up to that position. If C is n or more in step 36, the process proceeds to step 38, and the value 1 is output because a black run of n pixels or more exists up to the scanning position. Repeating this to the end of the line shortens each black run on that line by n pixels. The next line is processed in the same way from step 31. When the entire image has been scanned, the line segments whose run length in the horizontal or vertical direction is longer than n pixels have been extracted.
[0019]
Similarly, FIG. 4 is a flowchart showing the operation of the horizontal and vertical extension means 22 and 24. The binary image is scanned one line at a time in the horizontal or vertical direction and processed through to the last line. For extension by n pixels, let C be the run-length count. In step 41, C is set to n at the start of each scan line. In step 42, one pixel is read, and in step 43 it is determined whether its value is 0 (white) or 1 (black). If the value is 1, the process proceeds to step 44, where C is set to n, and the value 1 is output. If the value in step 43 is 0, the process proceeds to step 46 to determine whether C is 0 or less. If C is greater than 0, the process proceeds to step 47 to decrement C, and the value 1 is output because the position is within n pixels of a black run. If C is 0 or less in step 46, the process proceeds to step 48, and the value 0 is output because the scanning position is more than n pixels away from a black run. Repeating this to the end of the line extends each black run on that line by n pixels. The next line is processed in the same way from step 41. When the entire image has been scanned, the run lengths have been extended by n pixels in the horizontal or vertical direction.
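The contraction and extension procedures of FIGS. 3 and 4 can be illustrated for a single scan line with the following minimal Python sketch. The function names and the list-of-integers representation of a scan line are assumptions made for illustration. Step 41 as described starts each line with C = n; the sketch exposes this as a hypothetical `start_count` parameter and defaults it to 0 so that background at the start of a line is not filled, which is an assumption of this sketch rather than something stated in the text.

```python
def contract_runs(line, n):
    """Shorten every black run on one scan line by n pixels (after FIG. 3)."""
    out = []
    count = 0                      # run-length counter C, reset per line (step 31)
    for pixel in line:
        if pixel == 0:             # white pixel: the run is broken (steps 34, 35)
            count = 0
            out.append(0)
        elif count >= n:           # black run is already n pixels long (step 38)
            out.append(1)
        else:                      # black, but the run is still too short (step 37)
            count += 1
            out.append(0)
    return out


def extend_runs(line, n, start_count=0):
    """Lengthen every black run on one scan line by n pixels (after FIG. 4).

    start_count is a hypothetical parameter: the text sets C to n at the
    start of each line, while this sketch assumes 0 by default.
    """
    out = []
    count = start_count            # remaining extension budget C
    for pixel in line:
        if pixel == 1:             # black pixel: refill the budget (step 44)
            count = n
            out.append(1)
        elif count > 0:            # white, but within n pixels of a run (step 47)
            count -= 1
            out.append(1)
        else:                      # white and too far from any run (step 48)
            out.append(0)
    return out


def extract_long_runs(line, n):
    """Contract and then extend, keeping only runs longer than n pixels."""
    return extend_runs(contract_runs(line, n), n)
```

Applied to a line containing a 2-pixel run and a 5-pixel run with n = 3, the short run disappears and the long run survives with its original length. Because both passes of this sketch scan in the same direction, the surviving run ends up displaced by n pixels along the scan direction.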
[0020]
Next, the series of processes in the corner detecting means 4, the component detecting means 5, and the rectangle detecting means 6 will be described. The details of these processes are given in Japanese Patent Application No. 7-016682 filed by the same applicant, so only a brief description of the operation is given here.
[0021]
First, the corner detecting means 4 will be described with reference to FIGS. 5 to 7. FIG. 5 shows the result of converting an input image into a direction-coded image as preprocessing for corner detection, FIG. 6 shows the correspondence between the direction codes 1 to 8 and the actual directions, and FIG. 7 shows examples of the corners to be detected. In FIG. 5, 51 denotes a pixel of a frame ruled line, 52 denotes a background pixel, and the numerals are the direction codes assigned to contour points. Here, the tracking directions obtained when the contour is traced clockwise with respect to the background pattern are assigned as direction codes 1 to 8. Change points of the direction code, that is, corners, are detected from the image coded in this way. To do so, within the 3 × 3 neighborhood of the pixel of interest (the center pixel), the position in the direction indicated by the code of the pixel of interest is examined, and a position whose direction code differs from that of the pixel of interest is detected. In FIG. 5, the circled positions are change points of the direction code. For example, at position 53 the pixel arrangement is as shown in FIG. 7(a): the direction code changes from "3" to "1", meaning that the direction of the contour changes from "3" to "1", and the x-coordinate, y-coordinate, and this change of code (hereinafter referred to as the direction change code) are detected as one set. Similarly, the pixel positions 54, 55, and 56 correspond to (b), (c), and (d) of FIG. 7 and are given the direction change codes "17", "75", and "53", respectively. For these corner points, the x-coordinate, y-coordinate, and direction change code are notified to the component detecting means 5 as a set of feature information.
[0022]
Next, the component detecting means 5 will be described with reference to FIGS. 8 and 9. FIG. 8 shows the determination conditions for detecting a component from a combination of corner points, and FIG. 9 shows the description format of a component. In FIG. 8, (a), (b), (c), and (d) are examples of detecting an L-shaped element, (e), (f), (g), and (h) are examples of detecting a T-shaped element, and (i) is an example of detecting a cross-shaped element. Using the feature information of the corner points from the corner detecting means 4, corner points whose x and y coordinates lie within a predetermined distance of one another are grouped, and the type of component is determined from the combination of the direction change codes of the corner points in the group. The shape of a component detected in this way is described by a 4-bit code (hereinafter referred to as a shape code) as shown in FIG. 9(a), in which each bit indicates whether an arm is present in the corresponding direction S, W, N, or E shown in FIG. 9(b). For example, the L-shaped element shown in FIG. 8(a) has arms in the S and E directions and is described by the bit pattern "1001". The x and y coordinates of a component are the averages of the x and y coordinates of the corner points in its group, and the component detecting means 5 notifies the rectangle detecting means 6 of the x-coordinate, y-coordinate, and shape code of each component as feature information.
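As a small illustration of the shape code, the following Python sketch encodes a set of arm directions as a 4-bit string and classifies the result. The bit order S, W, N, E from the most significant bit is inferred from the "1001" example for an element with S and E arms; the function names, and the rule of classifying by the number of arms, are assumptions made for illustration.

```python
# Bit order assumed from the "1001" example: S, W, N, E, most significant bit first.
DIRECTIONS = ("S", "W", "N", "E")


def shape_code(arms):
    """Return the 4-bit shape code string for a set of arm directions."""
    return "".join("1" if d in arms else "0" for d in DIRECTIONS)


def component_type(code):
    """Classify a shape code as 'L', 'T', or 'cross'; None for anything else."""
    arms = {d for d, bit in zip(DIRECTIONS, code) if bit == "1"}
    if len(arms) == 4:
        return "cross"
    if len(arms) == 3:
        return "T"
    # Two arms form an L only when they are perpendicular; S+N or W+E is a
    # straight line, which this sketch does not treat as a component.
    if len(arms) == 2 and arms not in ({"S", "N"}, {"W", "E"}):
        return "L"
    return None
```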
[0023]
Next, the rectangle detecting means 6 will be described with reference to FIG. 10. FIG. 10 shows the connection relationship between components. FIG. 10(a) shows an example of the L-shaped, T-shaped, and cross-shaped elements detected by the component detecting means 5, and the rectangle detecting means 6 generates and outputs, as rectangle information, a table describing the connection relationship of these components. First, identification labels e1 to e20 are given to the components. Then, based on the feature information (x-coordinate, y-coordinate, shape code) from the component detecting means 5, a search is made in the x or y direction along each arm indicated by the shape code, and the component that has a connectable arm and lies at the shortest distance is detected. FIG. 10(b) shows the connection table representing the connection relationship, which records to which component each component is connected in each of the N, S, E, and W directions. For example, the L-shaped element e1 is connected to e15 and e2 through its arms S and E, and the T-shaped element e2 is connected to e8, e1, and e3 through its arms S, W, and E. The attribute flags shown in FIG. 10(b) are parameters used by the feature extracting means 8, and their specific contents will be described later.
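The nearest-neighbor search that builds the connection table might be sketched as follows. This is an illustrative Python sketch, not the procedure of the cited application: the dictionary layout, the alignment tolerance `tol`, and the convention that y increases toward S (image coordinates) are all assumptions.

```python
OPPOSITE = {"S": "N", "N": "S", "E": "W", "W": "E"}


def build_connection_table(components, tol=2):
    """Build label -> {direction: neighbour label}.

    components maps a label to (x, y, arms), where arms is a set of
    directions. tol (hypothetical) is how far a neighbour may deviate
    from the row or column of the component and still count as aligned.
    """
    table = {}
    for label, (x, y, arms) in components.items():
        links = {}
        for d in arms:
            best, best_dist = None, None
            for other, (ox, oy, oarms) in components.items():
                # A neighbour must have an arm pointing back toward us.
                if other == label or OPPOSITE[d] not in oarms:
                    continue
                if d in ("E", "W"):
                    if abs(oy - y) > tol:
                        continue
                    dist = ox - x if d == "E" else x - ox
                else:
                    if abs(ox - x) > tol:
                        continue
                    dist = oy - y if d == "S" else y - oy
                # Keep the closest component lying in the arm's direction.
                if dist > 0 and (best_dist is None or dist < best_dist):
                    best, best_dist = other, dist
            links[d] = best
        table[label] = links
    return table
```

For two frames placed side by side, the upper-left L-shaped element is linked through its S and E arms, and the adjacent T-shaped element through its S, W, and E arms, in the same way as e1 and e2 in the text.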
[0024]
Next, the outer frame detecting means 7 will be described with reference to FIGS. 11 and 12. FIG. 11(a) shows an example of outer frames, FIG. 11(b) shows the path obtained by tracing an outer frame clockwise until it returns to the start point, and FIG. 12 shows the flow of the outer frame detection process. In FIG. 11(a), the rectangles 62, 63, and 64 drawn with thick lines are outer frames, each formed by the outermost ruled lines of a closed frame structure in the form 61. For the outer frame 62, for example, as shown in FIG. 11(b), the tracking path that starts from the element f1, traces the outer frame clockwise, and returns to the start point is f1, f2, f6, f10, f12, f11, f9, f5, f1. The vertices of the outer frame are the components at which the tracking direction changes, marked with circles, namely f1, f2, f12, f11, f1; this sequence is called the vertex path. In the following, a tracking path is written [...] and a vertex path is written {...}. For the outer frame 63, the tracking path is [f3 f4 f8 f14 f13 f7 f3] and the vertex path is {f3 f4 f14 f13 f3}; for the outer frame 64, the tracking path is [f15 f16 f17 f20 f24 f29 f33 f32 f31 f30 f25 f21 f18 f15] and the vertex path is {f15 f17 f33 f32 f31 f30 f15}. In the example form of FIG. 11(a), the total number N of outer frames is therefore three.
Next, the process for extracting these outer frames will be described with reference to FIG. 12. First, the variables used in the flowchart are as follows. The input is the component connection table shown in FIG. 10(b). For the i-th component in the table, the identification label is ei, the x-coordinate is x(ei), the y-coordinate is y(ei), the shape code is code(ei), and the direction pointers are dk(ei), where k = 0, 1, 2, and 3 correspond to the directions S, W, N, and E. The index i takes values from 0 to (M-1), where M is the total number of components in the table. The outputs are the total number N of outer frames and, for the j-th outer frame, the tracking path Bj = [bj(0) bj(1) ...] and the vertex path Cj = {cj(0) cj(1) ...}, where j takes values from 0 to (N-1). The control variables are the pointer sp to the component at which tracking of the outer frame started, the pointer np to the component to be tracked next, the variable dir indicating the current tracking direction, and the tracking flag f of the outer frame.
First, at 71, the total number N of outer frames is set to 0; at 72, the flag f indicating whether an outer frame is being tracked is set to 0; and at 73, the control variables i and j are set to 0. The individual outer frames are then traced clockwise with reference to the component connection table. At 74, if the control variable i is equal to or greater than the total number M of components, the process ends; otherwise it proceeds to 75. At 75, the flag value determines whether an outer frame is currently being tracked. When the flag is 0, that is, while the start point of an outer frame is being searched for, the process proceeds to 76; when the flag is 1, that is, while an outer frame is being tracked, the process proceeds to the determination at 84. The determination at 76 checks whether the component of interest is a start point for outer frame tracking, namely whether it is the upper-left ("left shoulder") element of an outer frame, a component having arms in the S and E directions. If so, the process proceeds to 78 to set the tracking flag f; if not, it proceeds to 77 to update the control variable i and examine the next component. Steps 79 and 80 give initial values to the control variables: at 79 the start point pointer sp is set, and at 80 the value 3, indicating the direction E, is set in the control variable kk and in the variable dir indicating the current tracking direction. At 81 and 82, the identification label ei of the component of interest is registered in the tracking path Bj and the vertex path Cj. At 83, the value of the E-direction pointer of the component of interest is set in the next point pointer np. The process then returns to the determination at 75 and, because the tracking flag f is now set, proceeds to the determination at 84.
The determination at 84 checks whether the next point pointer np coincides with the start point pointer sp. If they coincide, the frame has been closed as an outer frame: the total number N of outer frames is incremented at 94, j is incremented, the tracking flag f is reset at 96, the control variable i is updated at 77, and the determination at 74 is performed again. If they do not coincide, the pointers in the S, W, N, and E directions of the component pointed to by np are searched to determine the next np. Step 85 determines the direction from which this search starts; in the expression, % denotes the modulo operation. Adding 3 to the current direction kk and taking the result modulo 4 causes the directions to be examined in clockwise order, starting from the direction adjacent to the current one (for example, from N when the current direction is E). At 86, it is determined whether the component has an arm in the direction being examined; if it does not, the value of kk is updated at 87 and the process returns to 85. At 88, it is determined whether the arm direction matches the current tracking direction. If it does not, the tracking direction changes: the new tracking direction is set in the variable dir at 90, the component of interest is registered in the vertex path Cj at 91 as a direction change point, that is, a vertex, and it is also registered in the tracking path Bj at 89. If the tracking direction does not change at 88, the component is registered only in the tracking path Bj. The identification label of the component in the direction k thus obtained is then set in the next point pointer np, the value of k is set in the control variable kk indicating the current direction, and the processing from the determination at 75 onward is repeated. Executing this processing to the end detects all the outer frames in the form and yields the total number N, the tracking paths Bj, and the vertex paths Cj.
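The clockwise tracing described for FIG. 12 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the flowchart itself: the connection table is represented as a dictionary of direction pointers, a start point is taken to be a component whose only arms are S and E, the table is assumed to be symmetric (every link has a link back), and the start label is appended again at the end of each path to match the notation of the examples above. The function name is hypothetical.

```python
DIRS = ("S", "W", "N", "E")            # k = 0, 1, 2, 3 as in the text


def trace_outer_frames(table):
    """Trace every outer frame clockwise.

    table maps a label to {direction: neighbour label}.
    Returns (tracking_paths, vertex_paths), one list entry per outer frame.
    """
    tracking_paths, vertex_paths = [], []
    for start, links in table.items():
        if set(links) != {"S", "E"}:   # assumed start point: upper-left L element
            continue
        path, vertices = [start], [start]
        kk = 3                          # current tracking direction: E
        node = links["E"]
        while node != start:
            # Examine the arms clockwise, beginning at (kk + 3) % 4.
            for step in range(4):
                k = (kk + 3 + step) % 4
                if DIRS[k] in table[node]:
                    break
            path.append(node)
            if k != kk:                 # direction changed: node is a vertex
                vertices.append(node)
            kk = k
            node = table[node][DIRS[k]]
        path.append(start)              # close the path at the start point
        vertices.append(start)
        tracking_paths.append(path)
        vertex_paths.append(vertices)
    return tracking_paths, vertex_paths
```

For two frames placed side by side, the trace follows the outer boundary only, passing through the T-shaped elements on the top and bottom edges without registering them as vertices.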
[0025]
Next, the feature extracting means 8 will be described with reference to FIGS. 13 to 15. FIGS. 13 and 14 are flowcharts for detecting the attribute of each component, that is, the outer frame to which it belongs, and FIG. 15 is a flowchart for detecting the inner frames of each outer frame from the component connection table. First, the variables used in the flowchart of FIG. 13 are as follows. The input data are the component connection table shown in FIG. 10(b) and the tracking paths Bj (j = 0 to N−1) obtained by the processing of FIG. 12; the notation for the identification label, shape code, and direction pointer is the same as that defined above for FIG. 12. The total number of elements bj(h) belonging to the tracking path Bj is denoted pj. The output data is the attribute flag g(ei), which indicates the outer frame to which each component shown in FIG. 10(b) belongs. Before the determination, the attribute flags g(ei) of all components are assumed to have been set to -1.
In FIG. 13, the control variables j, h, i, and k are first set to 0 at 101 to 104. The determination at 109 checks whether a component on the tracking path matches the component indicated by the direction pointer of an arm of the component of interest. If they match, the process proceeds to 114, where the value of j, that is, the number of the outer frame to which the component belongs, is set in the attribute flag g(ei). If they do not match, the process proceeds to the determinations from 111 onward. At 111, the control variable i is checked against M-1 to decide whether the processing from 104 is repeated or the process proceeds to 112. At 112, the control variable h is checked against the number of elements pj of the tracking path Bj to decide whether the processing from 103 is repeated or the process proceeds to 113. At 113, the control variable j is checked against N-1 to decide whether the processing from 102 is repeated or the process proceeds to the processing shown in FIG. 14. This processing sets the attribute flag g(ei) of the components on each outer frame.
The processing shown in FIG. 14 refers to the component connection table itself: for each element whose attribute has not yet been determined, it checks whether that element appears in a direction pointer of a component whose attribute has already been determined and, if so, gives it the same attribute flag. First, at 121, the control variable i is set to 0 and a flag f is set to 0. At 128, it is determined whether the attribute of the component ei has been determined. If it has, i is updated and the determination at 128 is performed again. If it has not, the process proceeds to 122, where the flag f is set to 1, and then to 123, where the control variable s is set to 0. At 129, a component es whose attribute has already been determined is searched for. If the attribute of es has not been determined, the process proceeds to the determination at 131, and if the control variable s has not advanced to M-1, s is updated and the determination at 129 is performed again. If the attribute of es has been determined, the control variable r is set to 0 at 124, and at 130 it is determined whether the component of interest ei is present in the direction pointer of es. If it is, the process proceeds to 135 and g(es) is set in the attribute flag g(ei). If the condition at 130 is not satisfied, r is incremented at 125 to examine another direction pointer. When the control variable i has advanced to M-1 at 133, the process proceeds to 134 and the value of the flag f is checked. If it is not 0, the process is repeated from 121. If the flag is still 0 at 134, attributes have been given to all components and the process ends.
Next, the processing flow for detecting the frames inside each outer frame from the component connection table will be described with reference to FIG. 15. First, the variables used in FIG. 15 are as follows. The input data is the component connection table shown in FIG. 10(b), and the notation for the identification label, direction pointer, and attribute flag is the same as in the preceding flowcharts. The output data are the inner frame structure table Rj for each outer frame and the total number tj of inner frames in each outer frame, where j takes values from 0 to (N-1).
[0026]
With the upper-left of the four vertices of a frame defined as its start point, the inner frame structure table Rj has as its elements the identification number of each frame, the x and y coordinates of the start point, and the width w and height h of the frame. First, at 141 to 143, the control variables i and j and the total number tj of inner frames are set to 0. The determination at 145 checks whether the component of interest belongs to the outer frame of interest. If it does not, the process proceeds to the determination at 159; if it does, the process proceeds to the determination at 146. The determination at 146 checks whether the component of interest is the start point of a frame, the condition being that its pointers in the E direction and the S direction both have a component to connect to. If it is a start point, the process proceeds to 147 onward. The basic method of recognizing a frame from its start point is to follow the connecting lines clockwise from the start point in the order E → S → W → N; if a dead end is reached, the search returns to the previous element, selects a new component, and is repeated, and finally a set of components from which the start point can be reached again is taken as one frame.
At 147, it is checked whether a component is connected in the E direction of the point of interest. If not, the process proceeds to 159; if so, it proceeds to 148, and the component connected in the E direction becomes the point of interest. At 149, it is checked whether a component is connected in the S direction of the point of interest. If not, the process returns to 147; if so, it proceeds to 150, and the component connected in the S direction becomes the point of interest. At 151, it is checked whether a component is connected in the W direction of the point of interest. If not, the process returns to 149; if so, it proceeds to 152, and the component connected in the W direction becomes the point of interest. At 153, it is checked whether a component is connected in the N direction of the point of interest. If not, the process returns to 151; if so, it proceeds to 154, and the component connected in the N direction becomes the point of interest. At 155, it is checked whether the point of interest set at 154 coincides with the start point. If not, the process returns to 153; if so, one frame has been detected and the process proceeds to 156. At 156, the width w of the frame is calculated from the difference between the x-coordinate of the start point and the x-coordinate of the component in the E direction, and the height h of the frame is calculated from the difference between the y-coordinate of the start point and the y-coordinate of the component in the S direction. At 157, the x-coordinate, y-coordinate, width w, and height h of the start point of the frame are registered in the inner frame structure table Rj, and at 158 the total number tj of inner frames is incremented. The process then proceeds to 159 to test the control variable i. If i has not reached M-1, the process proceeds to 160, where i is incremented, and returns to the determination at 145. If i is M or more, the process proceeds to 161. At 161, the control variable j is tested. If j has not reached N-1, the process proceeds to 162, where j is incremented, and returns to 142. If j is N or more, the process ends.
Through the outer frame detecting means 7 and the feature extracting means 8 described with reference to FIGS. 12 to 15, the total number of outer frames and their vertex positions are detected as the outer frame structural features of the form, and the total number, start point positions, widths, and heights of the inner frames are detected as the internal structural features of the individual outer frames. The subsequent frame structure matching means 10 collates these with the frame structures of the forms registered in the frame structure reference table 9 to identify the form and determine the character reading areas.
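The E → S → W → N walk used to find inner frames can be sketched in Python as follows. This is a simplified illustration: it handles a single outer frame and omits the attribute-flag test at 145, it treats a missing link as a failed walk rather than backtracking, and the dictionary layouts and function names are assumptions. The y-coordinate is assumed to increase toward S.

```python
def walk_frame(table, start):
    """Follow E, S, W, N from start.

    Returns [top_right, bottom_right, bottom_left] labels, or None if the
    walk does not come back to start.
    """
    node, corners = start, []
    for move, turn in (("E", "S"), ("S", "W"), ("W", "N")):
        node = table[node].get(move)
        # Keep moving until a component offering the next turn is reached.
        while node is not None and turn not in table[node]:
            node = table[node].get(move)
        if node is None:
            return None
        corners.append(node)
    node = table[node].get("N")
    while node is not None and node != start:
        node = table[node].get("N")
    return corners if node == start else None


def find_inner_frames(table, coords):
    """Return (start label, x, y, w, h) for every frame found.

    table maps a label to {direction: neighbour label};
    coords maps a label to its (x, y) position.
    """
    frames = []
    for start, links in table.items():
        if "E" not in links or "S" not in links:
            continue                     # cannot be an upper-left start point
        corners = walk_frame(table, start)
        if corners is None:
            continue
        x, y = coords[start]
        w = coords[corners[0]][0] - x    # x difference to the E-side corner
        h = coords[corners[1]][1] - y    # y difference to the S-side corner
        frames.append((start, x, y, w, h))
    return frames
```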
[0027]
Hereinafter, the frame structure reference table 9 and the frame structure matching means 10 will be described with reference to FIGS. 16 and 17. FIG. 16 shows specific examples of forms registered in the frame structure reference table 9, and FIG. 17 shows the tables in which their frame structures are registered. The form shown in FIG. 16(a) has three outer frames 211, 212, and 213 containing 13 inner frames, which are given the identification numbers (1) to (13). The form shown in FIG. 16(b) has two outer frames 221 and 222 containing 11 inner frames, which are given the identification numbers (1) to (11). The form shown in FIG. 16(c) has one outer frame 231 and 11 inner frames, which are given the identification numbers (1) to (11). The form shown in FIG. 16(d) has one outer frame 241 and 13 inner frames, which are given the identification numbers (1) to (13). The features of the frame structures of these forms are registered in the tables shown in FIG. 17. FIG. 17(a) shows the table in which the outer frame structures are registered, and FIG. 17(b) shows the table in which the inner frame structures are registered. In FIG. 17(a), the registered elements include the total number of outer frames, the number of vertices, the total number of inner frames in each outer frame, the identification number of each outer frame, and the vertex coordinates of each outer frame. Candidate forms are determined by searching this table using these features of the input form as parameters. Next, the detailed inner frame structure shown in FIG. 17(b) is collated for the candidates: the start point coordinates, width, and height of each frame are checked for agreement, and when they agree the form is regarded as identified, and the start point coordinates, width, and height of the frames of the input form that correspond to the identification numbers of the frames to be read are notified to the character extracting means 11.
For example, in the case of the form shown in FIG. 16(d), the frames to be read are (6) and (7), and the start point, width, and height of each of these frames are notified to the character extracting means 11.
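The two-stage collation (coarse screening on outer frame features, then detailed comparison of the inner frames) might look like the following Python sketch. The dictionary keys, the form names, the positional tolerance `tol`, and the assumption that inner frames are listed in the same order in the input and in the reference table are all hypothetical; the screening here uses only the number of outer frames and the number of inner frames, a subset of the features listed for FIG. 17(a).

```python
def match_form(features, reference_table, tol=3):
    """Return the name of the first registered form whose structure agrees.

    features and each reference entry are dicts with
      "outer_count":  number of outer frames
      "inner_frames": list of (x, y, w, h) tuples
    tol (hypothetical) is the allowed deviation per value, in pixels.
    Returns None when no registered form agrees.
    """
    for name, ref in reference_table.items():
        # Stage 1: coarse screening on outer frame features.
        if ref["outer_count"] != features["outer_count"]:
            continue
        if len(ref["inner_frames"]) != len(features["inner_frames"]):
            continue
        # Stage 2: compare start point, width, and height of each inner frame.
        if all(
            all(abs(a - b) <= tol for a, b in zip(r, f))
            for r, f in zip(ref["inner_frames"], features["inner_frames"])
        ):
            return name
    return None
```

Once a form name is returned, the frames to be read can be looked up by their identification numbers and their start point, width, and height passed on for character extraction.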
[0028]
Next, the character extracting means 11 will be described. The character extracting means 11 extracts the binary images of the actual characters from the image memory 2 and sends them to the character recognizing means 12. The processing will be described with reference to FIG. 18. FIG. 18 shows a binary image in which characters are written in a frame of the form. Here, 251 denotes the start point (xs, ys) of the frame to be read, 252 denotes the character reading area, and 253 denotes a character cutout area. Based on the start point (xs, ys), the width w, and the height h notified from the frame structure matching means 10, and allowing for the margins dx and dy, the area of width (w−2·dx) and height (h−2·dy) starting from the point 254 (xs+dx, ys+dy) is determined as the reading area 252. The cutout area for each character is determined by taking projections in the horizontal and vertical directions within this reading area, and an area whose projection frequency is at or above a predetermined value in both directions is taken as a character cutout area 253. Since character segmentation by projection is a general technique, a detailed description is omitted. The individual characters cut out in this way are passed to the character recognizing means 12 at the subsequent stage, so that the character string in the designated frame can be recognized.
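The reading-area calculation and a projection-based cutout can be sketched in Python as follows. The function names, the row-major list-of-lists image, and the threshold `min_count` are assumptions for illustration, and only the projection onto the x-axis (separating characters side by side) is shown; the projection in the other direction would be handled in the same way.

```python
def reading_area(xs, ys, w, h, dx, dy):
    """Shrink the frame by the margins dx, dy; returns (x, y, width, height)."""
    return xs + dx, ys + dy, w - 2 * dx, h - 2 * dy


def segment_columns(image, x0, y0, w, h, min_count=1):
    """Split the reading area into character spans along the x-axis.

    image is a list of rows of 0/1 pixels. A column belongs to a character
    when its count of black pixels is at least min_count (hypothetical
    threshold). Returns a list of (x_start, x_end) with x_end exclusive.
    """
    profile = [sum(image[y][x] for y in range(y0, y0 + h))
               for x in range(x0, x0 + w)]
    spans, start = [], None
    for i, value in enumerate(profile):
        if value >= min_count and start is None:
            start = i                            # a character begins
        elif value < min_count and start is not None:
            spans.append((x0 + start, x0 + i))   # a character ends
            start = None
    if start is not None:                        # character touches the right edge
        spans.append((x0 + start, x0 + w))
    return spans
```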
[0029]
[Effects of the Invention]
As described above, according to the present invention, ruled lines are extracted from the binary image of a form read by the image input means, the components at the intersections of the ruled lines are detected, the components are connected to one another to generate a frame structure, the outer frames of the frame structure are detected and the form is classified by their features, and the frame structure inside each outer frame is then collated with the samples to specify the type of the form and detect the frames to be read. Accordingly, even when forms of various formats are mixed, the form can be identified, the character reading areas of each form can be accurately detected, and highly reliable character recognition can be performed.
[Brief Description of the Drawings]
FIG. 1 is a block diagram of a form recognition apparatus according to an embodiment of the present invention.
FIG. 2 is a block diagram of the ruled line extracting means in the form recognition apparatus of the embodiment.
FIG. 3 is a flowchart showing a processing procedure in a line segment contracting means in the form recognition apparatus of the embodiment.
FIG. 4 is a flowchart showing a processing procedure in a line segment extending means in the form recognition apparatus of the embodiment.
FIG. 5 is a diagram explaining direction coding and direction code change point detection by the corner detecting means in the form recognition apparatus of the embodiment.
FIG. 6 is a diagram showing the correspondence between direction codes and actual directions in the corner detecting means in the form recognition apparatus of the embodiment.
FIG. 7 is a diagram showing a specific example of direction change codes in the corner detecting means in the form recognition apparatus of the embodiment.
FIG. 8 is a diagram showing specific examples of components based on combinations of corner points in the component detecting means in the form recognition apparatus of the embodiment.
FIG. 9 is a diagram explaining how the shape of a component is described in the component detecting means in the form recognition apparatus of the embodiment.
FIG. 10 is a conceptual diagram showing the connection relationship between components in the rectangle detecting means in the form recognition apparatus of the embodiment.
FIG. 11 is a diagram showing an example of outer frame detection by the outer frame detecting means in the form recognition apparatus of the embodiment.
FIG. 12 is a flowchart showing the procedure of the outer frame detection process by the outer frame detecting means in the form recognition apparatus of the embodiment.
FIG. 13 is a flowchart showing a procedure of a process of detecting an attribute of a component by a feature extracting unit in the form recognition apparatus of the embodiment.
FIG. 14 is a flowchart showing a procedure of a process of detecting an attribute of a component by a feature extracting unit in the form recognition apparatus of the embodiment.
FIG. 15 is a flowchart showing a procedure of a process of detecting an internal frame structure in the feature extracting means in the form recognition apparatus of the embodiment.
FIG. 16 is a diagram showing a specific example of a form registered in a frame structure reference table in the form recognition apparatus of the embodiment.
FIG. 17 is a diagram showing the table structure of the frame structure reference table in the form recognition apparatus of the embodiment.
FIG. 18 is a diagram showing processing of a character cutout unit in the form recognition apparatus of the embodiment.
FIG. 19 is a diagram showing a processing operation of a conventional form recognition device.
FIG. 20 is a diagram showing a processing operation of a conventional form recognition device.
FIG. 21 is a diagram showing a processing operation of a conventional form recognition device.
[Explanation of symbols]
1 Image input means
2 Image memory
3 Ruled line extraction means
4 Corner detection means
5 Component detection means
6 Rectangle detection means
7 Outer frame detection means
8 Feature extraction means
9 Frame structure reference table
10 Frame structure collation means
11 Character extraction means
12 Character recognition means
20 Binary image
21 Horizontal contraction means
22 Horizontal extension means
23 Vertical contraction means
24 Vertical extension means
25 NOR circuit
51 Frame ruled line pixels
52 Background pixels
53-56 Corner points
61 Form
62-64 Outer frames
210 Form A
211-213 Outer frame
220 Form B
221-222 Outer frame
230 Form C
231 Outer frame
240 Form D
241 Outer frame
251 Start point of frame
252 Character reading area
253 Character cutout area
254 Start point of character reading area

Claims

1. A form recognition apparatus comprising: corner detecting means for detecting corners of frame ruled lines from a binary image; component detecting means for detecting components of the frame ruled lines from combinations of the shapes of the detected corners; rectangle detecting means for connecting the components to one another and outputting the result as frame structure information; outer frame detecting means for detecting the outermost outer frames based on the frame structure information and extracting features of the outer frame structure as outer frame structure features; feature extracting means for extracting, for each extracted outer frame, features of its internal structure as internal structure features; a frame structure reference table in which outer frame and internal structure features are registered in advance as the formats of one or more forms; and frame structure collating means for searching the frame structure reference table with the detected outer frame structure features to determine form candidates, and specifying the type of the form by collating the determined candidates with the detected internal structure features.

2. The form recognition apparatus according to claim 1, further comprising ruled line extracting means for extracting the frame ruled lines by detecting, in a binary image including frame ruled lines and characters, runs having a predetermined length or more in the horizontal and vertical directions, wherein the form is detected.

3. The form recognition apparatus according to claim 1, wherein the component detecting means detects, from the combinations of the corner shapes, L-shaped components, T-shaped components, and cross-shaped components as components formed where ruled lines bend or intersect.

4. The form recognition apparatus according to claim 1, wherein the outer frame detecting means connects vertex coordinates and extracts each outer frame as an outer frame structure feature.

5. The form recognition apparatus according to claim 4, wherein the total number, positions, and shapes of the outer frames in the target form are extracted as the outer frame structure features.

6. The form recognition apparatus according to claim 1, wherein the feature extracting means extracts, for each outer frame, the number, positions, widths, and heights of the internal frames as the internal structure features.

7. The form recognition apparatus according to claim 1, further comprising: an image memory for storing the binary image including the frame ruled lines; character extraction means for cutting a character area out of the image memory based on the coordinates of a character reading target area obtained from the detected outer frame structure features and internal structure features; and character recognition means for recognizing the cut-out characters.

8. A form recognition method comprising: a step of detecting corners of frame ruled lines from a binary image; a step of detecting components of the frame ruled lines from combinations of the shapes of the detected corners; a step of connecting the components to one another and outputting the result as frame structure information; a step of detecting the outermost outer frames based on the frame structure information and extracting features of the outer frame structure as outer frame structure features; a step of extracting, for each extracted outer frame, features of its internal structure as internal structure features; and a step of searching a frame structure reference table, in which outer frame and internal structure features are registered in advance as the formats of one or more forms, with the detected outer frame structure features to determine form candidates, and specifying the type of the form by collating the determined candidates with the detected internal structure features.