JP3622347B2 - Form recognition device

Info

Publication number: JP3622347B2
Application number: JP19871696A
Authority: JP
Inventors: 豊樹川原; 淳晴山本; 千尋植木; 幹男藤田; 好幸松山
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 1996-07-29
Filing date: 1996-07-29
Publication date: 2005-02-23
Anticipated expiration: 2016-07-29
Also published as: JPH1040333A

Description

【０００１】
【発明の属する技術分野】
本発明は、帳票のような枠罫線と文字を含む文書画像において枠罫線の構造を認識し、帳票内の特定の文字領域を切り出し文字を認識するための帳票認識装置に関する。
【０００２】
【従来の技術】
近年、文書情報の電子化に伴い、ＯＣＲ（ＯｐｔｉｃａｌＣｈａｒａｃｔｅｒＲｅａｄｅｒ）を初めとする文字認識技術や文書画像処理に対する要望が高まっており、帳票など表形式文書の表構造認識技術もそのひとつである。
【０００３】
従来、帳票の枠罫線認識として、枠線のラン長で線分を検出する方法が良く知られており例えば特開平０１−２１７５８３号公報があり、その罫線認識装置のブロック結線図を図３１に示し説明する。
図３１において、１００１は画像入力部、１００２は画像メモリ、１００３は縦方向ランを抽出する縦方向ラン抽出部、１００４は縦方向線分を抽出する縦方向線分抽出部、１００５は横方向ランを抽出する横方向ラン抽出部、１００６は横方向線分を抽出する横方向線分抽出部、１００７は抽出された縦方向線分と横方向線分を用いて文字領域を抽出する文字領域抽出部であり、その動作を以下に説明する。
【０００４】
画像入力部１００１は、認識対象罫線を含む画像を走査し２値信号で画像メモリ１００２に格納する。縦方向ラン抽出部１００３は、画像メモリ１００２に格納されている画像を縦方向に走査して縦方向ランを抽出する。縦方向線分抽出部１００４は、抽出された縦方向のランの連結性を調べ縦方向線分を抽出する。同様の処理により、横方向ラン抽出部１００５で横方向ランを抽出し、横方向線分抽出部１００６で横方向線分を抽出する。
【０００５】
文字領域抽出部１００７は、縦方向線分抽出部１００４で抽出された縦方向線分と横方向線分抽出部１００６で抽出された横方向線分を用いて文字領域および文字記入領域を抽出するものである。
【０００６】
また、切り出された文字の認識方法については、各種方式が提案されており、例えばニューラルネットを用いた文字認識法（”ＰＤＰモデルによる手書き漢字認識”、電子情報通信学会論文誌、Ｖｏｌ．Ｊ７３−Ｄ−ＩＩ，Ｎｏ．８ｐｐ．１２６８−１２７４１９９０）があり、認識率も実用的なところまできているが、今回は帳票認識装置ということで帳票の指定枠の認識処理を中心としているので文字認識に関しては上記文献を提示し説明は省略するものとする。
【０００７】
【課題を解決するための手段】
この目的を達成するために本発明は、例えば帳票から自動的に枠線を認識し文字領域を切り出す際に、帳票を読み取り２値画像を出力するイメージスキャナーと、前記イメージスキャナーからの２値画像（文字、枠線を”０”、その他を”１”）から帳票の枠線のエッジに方向コードを付与し、その方向コードの変化する点を画像のコーナーとし、その特徴コードと座標をコーナー特徴点として抽出するコーナー検出手段と、近傍の前記コーナー特徴点の組合せから（Ｔ字、十字およびＬ字等の）枠線の構成要素を抽出する構成要素抽出手段と、前記構成要素抽出手段からの構成要素をもとに、注目構成要素を基準に右回り（あるいは左）に周辺の構成要素を検索し相手となる構成要素を連結し元に戻るまで繰り返し最小矩形を認識し矩形情報を出力する最小矩形認識手段と、前記最小矩形認識手段からの矩形情報の任意の角を基準とした連結する最小矩形を認識する構造認識手段と、前記構造認識手段からの枠線の構造から文字領域を切り出す文字切り出し手段とを設けたものである。
【０００８】
さらに、第２の課題として、枠罫線が汚れ等により断線したりあるいはコーナが直角ではなく丸みを持ったコーナの場合、ラン長による枠罫線の認識ができないという問題があった。
【０００９】
本発明は、前記従来技術の課題を解決するもので、帳票画像が傾いて入力された場合や、帳票に破線の枠罫線や断線あるいは丸みを持ったコーナが存在しても、文字枠を正確に検出することのできる信頼性の高い帳票認識装置を提供することを目的とする。
【００１０】
【課題を解決するための手段】
この第１の課題を解決するために本発明は、帳票を読み取り２値画像を得る画像入力手段と、２値画像を記憶する画像メモリと、２値画像の水平、垂直方向の実線の罫線を抽出する第１の罫線抽出手段と、２値画像の水平、垂直方向の実線及び破線の罫線を抽出する第２の罫線抽出手段と、２値図形のコーナーを検出する第１及び第２のコーナー検出手段と、コーナー形状の組み合わせから罫線の屈曲や交差によるＬ字要素、Ｔ字要素、十字要素を検出する構成要素検出手段と、構成要素同士を連結し連結形態を枠構造情報として出力する矩形検出手段と、複数の帳票のフォーマット情報を記憶するフォーマット情報記憶手段と、帳票内の矩形構造と前記フォーマット情報を照合し帳票の種別を判別し、前記第２のコーナー検出手段からのコーナー点情報をもとに実線と破線からなる罫線同士の交点を検出し、帳票内の文字読み取り対象枠を検出する枠構造照合手段と、前記文字読み取り対象枠の文字領域を切り出す文字切り出し手段と、切り出された文字を認識する文字認識手段とを設けたものである。
【００１１】
また、第２の課題を解決するために本発明は、構成要素検出手段として、第１のコーナー検出手段からのコーナーの組み合わせから罫線の屈曲や交差に罫線の断線を加えＩ字要素を検出し、さらに断線同士の構成要素あるいは断線と他の構成要素（Ｌ字要素、Ｔ字要素）とのグルーピングにより、Ｌ字要素、Ｔ字要素及び十字要素の構成要素に更新するようにしたものである。
【００１２】
これにより、帳票が画像として傾いて入力されたり、帳票内に破線の罫線や断線あるいは丸みを持ったコーナが存在しても、文字枠を正確に検出でき、信頼性の高い帳票認識装置が実現できる。
【００１３】
【発明の実施の形態】
本発明の請求項１記載の発明は、枠罫線と文字を含む帳票文書を読み取り２値画像を出力する画像入力手段と、前記２値画像を記憶する画像メモリと、前記２値画像において水平、垂直方向の実線の罫線を抽出する第１の罫線抽出手段と、実線の罫線からなるパターンのコーナーを検出する第１のコーナー検出手段と、前記２値画像において水平、垂直方向の実線及び破線の罫線を抽出する第２の罫線抽出手段と、枠罫線からなるパターンのコーナーを検出する第２のコーナー検出手段と、コーナー形状の組み合わせから罫線の屈曲や交差によるＬ字要素、Ｔ字要素、及び十字要素を検出する構成要素検出手段と、構成要素同士を連結し連結形態を枠構造情報として出力する矩形検出手段と、予め読み取り対象となる複数の帳票のフォーマット情報を記憶するフォーマット情報記憶手段と、前記枠構造情報と前記フォーマット情報とを順次照合し実線枠を検出すると共に、照合結果から帳票の種別を判別し、前記第２のコーナー検出手段からのコーナーと前記フォーマット情報とから実線枠で指定された対象枠と破線との交点である破線交点を検出し、対象枠の座標を出力する枠構造照合手段と、前記対象枠の座標に基づき前記画像メモリから文字領域を切り出す文字切り出し手段と、切り出された文字を認識する文字認識手段とを具備する帳票認識装置としたものであり、帳票の２値画像から実線及び破線の罫線を抽出し、その罫線からなるパターンのコーナー点を抽出し、コーナー点から枠罫線交点である構成要素を抽出し、相互に連結された構成要素から矩形構造を検出し、フォーマット情報と比較照合することにより、帳票が傾いて読み取られた場合や、帳票に破線の罫線が存在していても、読み取り対象の文字枠を正確に検出できるという作用を有する。
【００１４】
本発明の請求項３記載の発明は、構成要素検出手段として、第１のコーナ検出手段からのコーナーの組み合わせから罫線の断線としてＩ字要素を新たに検出し、Ｉ字要素同士のグルーピング、あるいはＩ字要素とＬ字要素、及びＴ字要素とのグルーピングにより、Ｌ字要素、Ｔ字要素あるいは十字要素の構成要素を更新することを特徴とし、枠罫線の断線や帳票枠の角部分が緩やかな曲率を持っている場合にも、構成要素を正確に検出できるという作用を有する。
【００１５】
以下、本発明の実施の形態について、図１から図３０を用いて説明する。
（実施の形態１）
図１は、本発明の実施の形態１の帳票認識装置のブロック構成図を示し、１は帳票文書を読み取り２値画像を得る画像入力手段、２は前記２値画像を記憶する画像メモリ、３は２値画像の水平、垂直方向の実線の罫線を抽出する第１の罫線抽出手段、４は実線の罫線のコーナーを検出する第１のコーナー検出手段、５は前記コーナーの組み合わせから罫線の屈曲や交差によるＬ字要素、Ｔ字要素、及び十字要素を検出する構成要素検出手段、６は構成要素同士を連結し枠構造情報として出力する矩形検出手段と、７は前記２値画像において水平、垂直方向の実線及び破線の罫線を抽出する第２の罫線抽出手段、８は前記実線及び破線の枠罫線のコーナーを検出する第２のコーナー検出手段、９は予め読み取り対象となる複数の帳票のフォーマット情報を記憶するフォーマット情報記憶手段、１０は前記枠構造情報と前記フォーマット情報とを順次照合し実線枠を検出すると共に、照合結果から帳票の種別を判別し、前記第２のコーナー検出手段からのコーナーと前記フォーマット情報とから実線枠で指定された対象枠と破線との交点である破線交点を検出し、対象枠の座標を出力する枠構造照合手段、１１は文字読み取り対象枠の文字領域を切り出す文字切り出し手段、１２は切り出された文字を認識する文字認識手段である。
【００１６】
以上のように構成された帳票認識装置について、その動作の概要を説明する。画像入力手段１により帳票を読み取り、文字部及び枠罫線部が値１、背景が値０をもつ２値画像に変換し画像メモリ２に記憶する。第１の罫線抽出手段３は、画像メモリ２からの２値画像を水平及び垂直方向に走査し所定長以上の値”１”が連続する線を抽出し、第１のコーナー検出手段４により水平及び垂直の罫線からのコーナーを検出する。構成要素検出手段５は、第１のコーナー検出手段４からのコーナーの組み合わせから罫線の屈曲部や交差部のＬ字要素、Ｔ字要素、及び十字要素の構成要素を検出する。矩形検出手段６は、構成要素検出手段５からの構成要素同士を相互に連結し枠構造情報を得る。
【００１７】
第２の罫線抽出手段７は、前記２値画像の水平、垂直方向の実線及び破線の罫線を抽出し、第２のコーナー検出手段８により実線及び破線の枠罫線からコーナーを検出する。
【００１８】
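The format information storage means 9 stores, in advance, format information for the plurality of forms to be read. The frame structure matching means 10 matches the frame structure information from the rectangle detection means 6 against the format information to determine the form type, and detects the solid-line frames from the frame structure information. It also detects the intersections of solid and dashed lines from the corners supplied by the second corner detection means 8, the solid-line frames, and the format information, and detects the target frames in the form from each solid-line frame and its intersections with dashed lines. The character cutout means 11 cuts out the character region based on the coordinates of the four vertices of each target frame, and the character recognition means 12 recognizes the characters in the cut-out region.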
【００１９】
次に図１に基づいて、各構成要素の動作を詳細に説明する。
画像入力手段１は、帳票を読み取り２値画像を出力するもので、本発明の実施の形態１では読み取り線密度を約４００ｄｐｉ程度とし、原稿である帳票にＬＥＤ（発光ダイオード）等で照明しその反射光を一次元のＣＣＤカメラで読み取り、任意の閾値で２値化して文字部を値１、背景を値０の２値画像を出力する。
【００２０】
また、照明は、原稿である帳票の枠線や記入された文字の色によって異なるが、例えば青・黒および赤等の枠線に対して、黒や青等で数字や記号および文字が記入された場合、緑あるいは黄緑の波長（５５０〜５７０ｎｍ付近）のＬＥＤを用いることが多い。２値化処理においては、固定閾値法や浮動閾値法（”認識問題としての二値化と各種方法の検討”、情報処理学会、イメージプロセッシング１５−１，Ｎｏｖ．１９７７）が良く知られており、本発明の実施の形態１では２値化処理法については特に言及するものではないので、原稿に合わせて任意の２値化処理法を選択すればよい。このように２値化された画像データは画像メモリ２に格納され、各処理で必要に応じて読み出される。
【００２１】
次に第１の罫線抽出手段３について図２を用いて説明する。図２は、第１の罫線抽出手段３における画像処理のブロック構成図を示し、２０は画像メモリ２からの２値画像、２１は水平方向にパターンを縮める水平方向収縮手段、２２は水平方向にパターンを延長する水平方向延長手段、２３は垂直方向にパターンを縮める垂直方向収縮手段、２４は垂直方向にパターンを延長する垂直方向延長手段、２５は水平方向延長手段２２と垂直方向延長手段２４の出力のＮＯＲ演算を行うＮＯＲ回路である。
【００２２】
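The horizontal shrink means 21 shrinks the binary image 20 from the image memory 2 by h pixels in the horizontal direction, which eliminates lines and characters whose horizontal width is h pixels or less. The subsequent horizontal extension means 22 extends the result by h pixels horizontally, so that only horizontal line segments longer than h pixels are extracted.
【００２３】
Similarly, the vertical shrink means 23 shrinks the image by v pixels in the vertical direction, which eliminates lines and characters whose vertical extent is v pixels or less. The subsequent vertical extension means 24 extends the result by v pixels vertically, extracting vertical line segments longer than v pixels. The NOR circuit 25 performs a NOR operation on the outputs of the horizontal extension means 22 and the vertical extension means 24. The characters are erased and only the frame ruled lines remain, yielding a binary image in which the frame ruled lines have the value "0" and the background has the value "1".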
【００２４】
次に、水平及び垂直方向の収縮及び延長処理について図３及び図４を用いてさらに詳細に説明する。
【００２５】
図３は、水平及び垂直方向収縮手段２１及び２３の処理手順を示すフロー図で、２値画像を水平方向または垂直方向に１ラインずつ順次走査し終了ラインまで処理を行い、各ライン毎にｎ画素の収縮処理を行うとき、ランレングスのカウント値をＣとし、ステップ毎に説明する。
【００２６】
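Step 31 sets the count value C to 0 at the start of scanning each line. Step 32 reads one pixel. Step 33 determines whether the pixel value is 0 (white) or 1 (black), and proceeds to step 34 if it is 0 or to step 36 if it is 1. Step 34 sets the count value C to 0. Step 35 outputs the value 0 because the position is not on a black run. Step 36 determines whether the count value C is n or more, and proceeds to step 37 if it is less than n or to step 38 if it is n or more. Step 37 increments the count value C and proceeds to step 35. Step 38 outputs the value 1 because a black run with a run length of n pixels or more exists.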
【００２７】
以上の処理を１ラインの終了まで行うことにより、そのライン上の黒ランがｎ画素縮められる。次のラインを処理するときは再びステップ３１から同様の処理を繰り返す。このようにして全画面の走査が終了すると、水平または垂直方向にｎ画素以上のランレングスを有する線分が抽出される。
【００２８】
同様に図４は、水平及び垂直方向延長手段２２及び２４の処理手順を示すフロー図で、２値画像を水平方向または垂直方向に１ラインずつ順次走査し終了ラインまで処理し、各ライン毎にｎ画素の延長処理をおこなうとき、ランレングスのカウント値をＣとし、各ステップ毎に説明する。
【００２９】
ステップ４１は、各ラインの走査開始時にカウント値Ｃに０を設定する。ステップ４２は、１画素データを読み込む。ステップ４３は、画素の値が０（白）か１（黒）かを判定し、１のときステップ４４へ、０のときはステップ４６に進む。ステップ４４は、カウント値Ｃにｎを設定する。ステップ４５は、黒ラン上にあるので値１を出力する。
【００３０】
ステップ４６は、カウント値Ｃが０以下かどうかの判定を行い、０より大きい場合ステップ４７へ、０以下のときステップ４８へ進む。ステップ４７は、カウント値Ｃをデクリメントし、さらにステップ４５へ進む。ステップ４８は、その走査位置は黒ランからｎ画素より大きく離れているので値０を出力する。
【００３１】
以上の処理を１ラインの終了まで行うことにより、そのライン上の黒ランがｎ画素延長される。次のラインを処理するときは再びステップ４１から同様の処理を繰り返す。このようにして全画面の走査が終了すると、水平または垂直方向にランレングスがｎ画素分だけ延長される。
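The shrink and extension procedures of FIGS. 3 and 4 can be sketched in Python as follows. This is a minimal sketch for a single scan line held as a list of 0/1 values; the function names `shrink_run` and `extend_run` and the sample row are illustrative assumptions, not part of the source. Implemented literally, shrinking removes pixels from the leading end of a run and extension adds them at the trailing end, so a surviving run reappears displaced by n pixels along the scan direction; the source does not discuss this offset.

```python
def shrink_run(line, n):
    """FIG. 3 (steps 31-38): suppress the first n pixels of every black run."""
    out, c = [], 0              # step 31: C = 0 at the start of the line
    for px in line:             # step 32: read one pixel
        if px == 0:             # step 33: white pixel
            c = 0               # step 34
            out.append(0)       # step 35
        elif c < n:             # step 36: run still shorter than n
            c += 1              # step 37
            out.append(0)       # step 35
        else:
            out.append(1)       # step 38
    return out


def extend_run(line, n):
    """FIG. 4 (steps 41-48): keep emitting 1 for n pixels after a black run."""
    out, c = [], 0              # step 41
    for px in line:             # step 42
        if px == 1:             # step 43: black pixel
            c = n               # step 44
            out.append(1)       # step 45
        elif c > 0:             # step 46
            c -= 1              # step 47
            out.append(1)       # step 45
        else:
            out.append(0)       # step 48
    return out


# A 2-pixel run (character-like) and a 6-pixel run (ruled-line-like).
row = [0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
opened = extend_run(shrink_run(row, 3), 3)
```

With n = 3 the 2-pixel run disappears and the 6-pixel run keeps its length, which is the behavior the first ruled line extraction means relies on to separate ruled lines from characters.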
【００３２】
次に第１のコーナー検出手段４、構成要素検出手段５、及び矩形検出手段６における一連の処理について説明するが、これらの内容は同一出願人により特願平７−０１６８６２号に記載されており詳細な説明は省略し、その動作を簡単に説明する。
【００３３】
まず第１のコーナー検出手段４について図５から図７を用いて説明する。図５は、コーナーを検出するための前処理として、第１の罫線抽出手段で抽出された２値画像の実線の罫線の輪郭に方向コードを付与した方向コード化画像に変換した結果を示す。図６は、方向コード１〜８と実際の方向の対応関係を示す図であり、図７は検出するコーナーの具体例を示す図である。
【００３４】
図５において、５１は枠罫線の画素、５２は背景の画素、数字は輪郭点に付与された方向コードをそれぞれ示しており、この場合背景パターンを右回りの方向に輪郭を追跡しながら図６に示す方向コード１〜８を割り当てている。
【００３５】
なお、背景画素に方向コードを付与したが枠罫線の輪郭画素に付与しても良く、また背景パターンを右回りの方向に輪郭を追跡しているが、左回りに追跡しても良い。
【００３６】
コーナーの検出は、このように方向コード化画像から方向コードの変化点、すなわちコーナーを検出する。このために３×３近傍において、注目位置（中央画素）コードが指示する方向に、中央画素と同一の方向コードでない方向コードを持つ画素を検出する。図５において、丸で囲まれた位置は方向コードの変化点を示している。例えば５３の位置では、図７（ａ）に示す画素配置となっており、注目画素の指示する方向”３”の示す位置にある画素の方向コードは”１”となっており、輪郭の方向が”３”から”１”へ変化することを意味するので方向コードの変化点であるコーナーとして検出する。
【００３７】
また、方向コードの変化点は、”３１”というコード（以下方向変化コードと呼ぶ）で表記し、ｘ座標、ｙ座標と方向変化コードを１組の特徴情報として検出する。同様に画素位置５４、５５、５６は図７の（ｂ）、（ｃ）、（ｄ）に対応しており、それぞれ”１７”、”７５”、”５３”という方向変化コードが与えられ、これらのコーナー点の持つｘ座標、ｙ座標、方向変化コードを１組の特徴情報として構成要素検出手段５へ通知する。
【００３８】
次に構成要素検出手段５について図８と図９を用いて説明する。図８は、コーナー点の組み合わせから構成要素を検出するための判定条件を示す図で、図９は構成要素の記述形式を示す図である。図８において、（ａ）（ｂ）（ｃ）（ｄ）はＬ字要素の検出例、（ｅ）（ｆ）（ｇ）（ｈ）はＴ字要素の検出例、（ｉ）は十字要素の検出例を示している。
【００３９】
構成要素検出手段５は、コーナー検出手段４からのコーナーの特徴情報を用いて、ｘ座標、ｙ座標が所定の距離以内にある複数のコーナー点を一つのグループにまとめる処理（以下グループ化と呼ぶ）を行い、グループのメンバーであるコーナー点の方向変化コードの組み合わせから、構成要素の種類が対応付けられる。
【００４０】
このようにして検出された構成要素は、図９に示すように４ビットのコード（以下形状コードとよぶ）で記述され、各ビットは上位からＳ、Ｗ、Ｎ、Ｅのいずれの方向に腕が存在するかを示している。例えば図８（ａ）に示すＬ字要素はＳ方向とＥ方向に腕を有しているので、”１００１”のビットパターンで記述される。構成要素のｘ座標及びｙ座標には、グループのメンバーであるコーナー点のｘ座標及びｙ座標の平均値を与えるものとし、構成要素検出手段５は前記構成要素のｘ座標、ｙ座標、及び形状コードを特徴情報として、矩形検出手段６に通知する。
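The 4-bit shape code can be illustrated with a short Python sketch. The bit order S, W, N, E from the most significant bit follows the text above; the helper names `shape_code` and `classify`, and the rule of classifying by the number of arms, are assumptions made for illustration only.

```python
# Bit assignment from the most significant bit: S, W, N, E.
S, W, N, E = 0b1000, 0b0100, 0b0010, 0b0001


def shape_code(*arms):
    """Combine arm directions into one 4-bit shape code."""
    code = 0
    for arm in arms:
        code |= arm
    return code


def classify(code):
    """Hypothetical classifier: name the element by how many arms it has."""
    arm_count = bin(code).count("1")
    if arm_count == 4:
        return "cross"
    if arm_count == 3:
        return "T"
    if arm_count == 2 and code not in (S | N, W | E):
        return "L"              # two perpendicular arms
    return "other"


l_element = shape_code(S, E)    # the L-shaped element of FIG. 8(a)
```

Here `format(l_element, "04b")` gives the bit pattern "1001" quoted in the text.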
【００４１】
次に矩形検出手段６について図１０及び図１１を用いて説明する。図１０は、構成要素同士の連結関係を示す図であり、図１１は前記の連結関係から生成された最小矩形の認識を示す図である。矩形検出手段６は、矩形情報としてこれら構成要素の連結関係を記述した連結テーブル（図１０（ｂ））、および構成要素の連結関係から構成される最小矩形の位置情報を記述した矩形情報テーブル（図１１（ｂ））を生成出力するものである。
【００４２】
図１０（ａ）は、構成要素検出手段５において検出されたＬ字要素、Ｔ字要素、十字要素とその位置関係の一例を示すものである。まず、各構成要素に対し識別ラベルｅ１からｅ２０を付与し、次に構成要素検出手段５からの特徴情報（ｘ座標、ｙ座標、形状コード）に基づき、形状コードの示す腕の各方向についてｘ方向とｙ方向とをそれぞれ探索し、連結可能な腕をもつ構成要素のうち最短距離にあるものを検出し、連結テーブル（図１０（ｂ））を生成する。図１０（ｂ）は、構成要素の連結関係を示す連結テーブルを示すもので、各構成要素がどの要素と連結するかをＮ、Ｓ、Ｅ、Ｗの各方向について記述している。例えば、Ｌ字要素ｅ１の場合は、腕Ｓ及びＥに対応する構成要素としてｅ１４及びｅ２が存在し、Ｔ字要素ｅ２の場合は腕Ｓ、Ｗ、及びＥに対応する構成要素としてｅ８、ｅ１、及びｅ３が存在することになる。
【００４３】
さらに生成した連結テーブル（図１０（ｂ））を用いて、最小矩形を認識して矩形情報テーブルを生成する。図１１（ａ）は、最小矩形の認識の概念図を示すもので、ある構成要素を始点としてＥ方向、Ｓ方向、Ｗ方向、Ｎ方向の順に連結をたどっていき、始点に戻ることができれば、その４点で構成される矩形を最小矩形と呼び、図１１（ｂ）に示す最小矩形の位置、サイズ等を記述した矩形情報テーブルに登録する。例えば、要素ｅ１を始点とし時計方向回りに探索した場合は、Ｅ方向に連結する要素として要素ｅ２が存在する。次に要素ｅ２が持つＳ方向に連結をたどって要素ｅ８を参照し、次に要素ｅ８からＷ方向に連結をたどろうとするが、要素ｅ８はＷ方向の腕を持っていないので、さらにＳ方向に連結をたどると、Ｗ方向の腕を持った要素ｅ１５が存在する。次に要素ｅ１５からＷ方向に要素ｅ１４をたどり、要素ｅ１４からＮ方向にたどると始点の要素ｅ１に戻り最小矩形として認識することができる。そして、Ｅ，Ｓ，Ｗ，Ｎと方向を変えながら連結している４つの要素ｅ１、ｅ２、ｅ１５、ｅ１４を最小矩形の角の４点として、図１１（ｂ）の矩形情報テーブルの矩形識別ラベルｒ１の項目に登録し、すべての最小矩形を認識し矩形情報テーブルを生成する。
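The clockwise tracing of a minimum rectangle can be sketched in Python as follows. This is a sketch under stated assumptions: the connection table is modeled as a dictionary, the function name `trace_min_rectangle` is hypothetical, and only the links named in the text (e1, e2, e8, e14, e15) come from the source, while the remaining entries (e3, e9, e16) are invented to complete the example.

```python
NEXT_DIR = {"E": "S", "S": "W", "W": "N", "N": "E"}


def trace_min_rectangle(links, start):
    """Follow links E, S, W, N from start; return the 4 corners or None."""
    corners, cur = [start], start
    for d in ("E", "S", "W", "N"):
        for _ in range(len(links)):                 # bounded walk in direction d
            cur = links.get(cur, {}).get(d)
            if cur is None:
                return None                         # chain ends: no rectangle
            if d == "N":
                if cur == start:                    # closed the loop
                    break
            elif NEXT_DIR[d] in links.get(cur, {}):
                break                               # element has the arm needed to turn
        else:
            return None
        if d != "N":
            corners.append(cur)
    return corners


# Partial connection table in the spirit of FIG. 10(b).
links = {
    "e1": {"E": "e2", "S": "e14"},
    "e2": {"W": "e1", "E": "e3", "S": "e8"},
    "e8": {"N": "e2", "S": "e15", "E": "e9"},       # no W arm, so tracing continues south
    "e14": {"N": "e1", "E": "e15"},
    "e15": {"N": "e8", "W": "e14", "E": "e16"},
}
rect = trace_min_rectangle(links, "e1")
```

Starting from e1, the walk passes through e8 without turning because e8 has no W arm, and the four registered corners are e1, e2, e15, and e14, matching the example for rectangle r1.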
【００４４】
なお、要素ｅ１を始点として設定したが、限定するものではなくどの位置を始点にしても良く、また右回りに連結したが左回りに連結しても良い。
【００４５】
このようにして生成された連結テーブル及び矩形情報テーブルを枠構造照合手段１０に通知する。
【００４６】
次に、第２の罫線抽出手段７について図１２を用いて説明する。図１２は、第２の罫線抽出手段７における画像処理のブロック構成図を示し、２０は画像メモリ２からの２値画像、２０１は水平方向にパターンを延長する第１の水平方向延長手段、２０２は水平方向にパターンを縮める水平方向収縮手段、２０３は水平方向にパターンを延長する第２の水平方向延長手段、２０４は垂直方向にパターンを延長する第１の垂直方向延長手段、２０５は垂直方向にパターンを縮める垂直方向収縮手段、２０６は垂直方向にパターンを延長する第２の垂直方向延長手段、２０７は第２の水平方向延長手段２０３と第２の垂直方向延長手段２０６の出力のＮＯＲ演算を行うＮＯＲ回路である。
【００４７】
第１の水平方向延長手段２０１は、画像メモリ２からの２値画像２０に対し、水平方向にｈｄ画素延長することによりｈｄ画素より間隔の短い破線部分を連結する。水平方向収縮手段２０２は、水平方向に（ｈ＋ｈｄ）画素縮めることにより、水平方向に（ｈ＋ｈｄ）画素以下の幅の線や文字を消滅させ、続く第２の水平方向延長手段２０３において水平方向にｈ画素延長することにより、ｈｄ画素以下の間隔で、かつｈ画素より長い水平線分を抽出する。
【００４８】
同様に第１の垂直方向延長手段２０４は、画像メモリ２からの２値画像２０に対し、垂直方向にｖｄ画素延長することによりｖｄ画素より間隔の短い破線部分を連結する。垂直方向収縮手段２０５は、垂直方向に（ｖ＋ｖｄ）画素縮めることにより、垂直方向に（ｖ＋ｖｄ）画素以下の幅の線や文字を消滅させ、続く第２の垂直方向延長手段２０６において垂直方向にｖ画素延長することにより、ｖｄ画素以下の間隔でかつｖ画素より長い垂直線分を抽出する。
【００４９】
ＮＯＲ回路２０７は、第２の水平方向延長手段２０３と第２の垂直方向延長手段２０６の出力のＮＯＲ演算を行い、文字が消去され破線部分が実線になった枠罫線のみが残り、枠罫線及び背景がそれぞれ”０”及び”１”の値をもつ２値画像２０８が得られる。
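The extend, shrink, extend sequence of the second ruled line extraction means can be sketched for one scan line in Python. This is a minimal sketch: the helper names, the sample rows, and the parameter values h = 4 and hd = 2 are assumptions for illustration, and the run-length helpers follow the literal step descriptions of FIGS. 3 and 4, so the extracted segment is displaced by h + hd pixels along the scan direction.

```python
def extend_run(line, n):
    """Keep emitting 1 for n pixels after each black run (FIG. 4)."""
    out, c = [], 0
    for px in line:
        if px:
            c = n
        elif c > 0:
            c -= 1
            px = 1
        out.append(1 if px else 0)
    return out


def shrink_run(line, n):
    """Suppress the first n pixels of each black run (FIG. 3)."""
    out, c = [], 0
    for px in line:
        if not px:
            c = 0
            out.append(0)
        elif c < n:
            c += 1
            out.append(0)
        else:
            out.append(1)
    return out


def extract_dashed(line, h, hd):
    bridged = extend_run(line, hd)          # 201/204: close gaps up to hd pixels
    pruned = shrink_run(bridged, h + hd)    # 202/205: drop runs of h + hd pixels or less
    return extend_run(pruned, h)            # 203/206: restore the run length


dashed = [1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1] + [0] * 9   # dashes of 3, gaps of 2
stroke = [0, 1, 1, 1] + [0] * 8                              # isolated short run
dashed_out = extract_dashed(dashed, h=4, hd=2)
stroke_out = extract_dashed(stroke, h=4, hd=2)
```

The dashed row comes out as one solid 13-pixel run, while the isolated 3-pixel run is erased.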
【００５０】
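The first and second horizontal and vertical extension means 201, 203, 204, and 206 perform the same processing as the horizontal and vertical extension means 22 and 24 of the first ruled line extraction means 3, and the horizontal and vertical shrink means 202 and 205 perform the same processing as the horizontal and vertical shrink means 21 and 23 of the first ruled line extraction means 3, so a detailed description is omitted.
【００５１】
The processing of the second corner detection means 8 is the same as that of the first corner detection means 4. It outputs corner point information for the solid-line and dashed-line portions supplied by the second ruled line extraction means 7, so its description is omitted.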
【００５２】
次にフォーマット情報記憶手段９について図１３及び図１４を用いて説明する。図１３（ａ）は入力された帳票５００の画像、図１３（ｂ）は第１の罫線抽出手段３の出力画像、図１３（ｃ）は第２の罫線抽出手段７の出力画像を各々示している。フォーマット情報記憶手段９には図１３（ｂ）に示す実線の枠構造情報、及び破線で区切られた枠構造情報が格納されており、例えば図１３に示す帳票に対しては図１４に示すフォーマットが対応する。
【００５３】
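In FIG. 14, the ID numbers correspond one-to-one to the solid-line frames: ID number 1 is frame 501, ID number 2 is frame 502, and so on. When a solid-line frame contains dashed-line frames, its dashed-line flag is "1" and the number of digits of that solid-line frame is set. When the target frame flag is "1", the solid-line frame and its dashed-line frames are reported to the character cutout means as target frames. In the form 500 of FIG. 13(b), frame 502 contains 5 digits, frame 504 contains 3 digits, and frame 505 contains 3 digits separated by dashed lines, so FIG. 14 has 5, 3, and 3 set as the number of digits at ID numbers 2, 4, and 5. The x and y coordinates in FIG. 14 are registered for each solid-line frame in the order upper left, upper right, lower right, lower left, together with the width and height of the frame. The tolerance in FIG. 14 indicates the permissible error in frame width and height when matching frame structures.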
【００５４】
また、フォーマット情報記憶手段９には、複数の帳票の枠構造情報が登録されており、それぞれ帳票はレコード番号で識別される。
【００５５】
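Next, the frame structure matching means 10 is described with reference to FIG. 15. The frame structure matching means 10 first refers to the format information shown in FIG. 14 to recognize the solid-line frame structure of the input form. When the target frame flag is "1", it determines the positions of the dashed-line intersections from the solid-line frame and the corner point information at the positions marked with circles in FIG. 13(c), and reports the coordinates of the four corner points of the solid-line frame and of the dashed-line frames to the character cutout means 11.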
【００５６】
図１５は、枠構造照合手段１０における処理手順を示すフロー図であり、各ステップ毎に説明する。
【００５７】
まずステップ５１１は、矩形検出手段６から枠構造情報である連結テーブル及び矩形情報テーブルを読み込む。ステップ５１２は、読み込んだ連結テーブルから連結している構成要素の傾きの平均値を画像の傾き値ｇｒａｄとして（数１）によって算出する。
【００５８】
【数１】

【００５９】
ここでは、水平方向に連結している構成要素の傾きの平均値を画像の傾き値として用いたが、画像の傾きがわかるのであれば、他の手法でもかまわない。
【００６０】
次に５１３〜５１８のステップは、フォーマット情報記憶手段９に登録されている複数の帳票の中から、矩形検出手段６からの枠構造情報と最も類似している帳票を判別する処理である。まず、ステップ５１３でフォーマット情報記憶手段９からレコード番号（ｉ）の枠構造情報を取り出す。
【００６１】
次に、ステップ５１４で枠構造情報（ｉ）と連結テーブル及び矩形情報テーブルとの実線枠照合及び“累積枠相違度”の算出を行う。ここで、“累積枠相違度”とは、枠毎に求めた“枠相違度”を累積加算し帳票全体の相違度を表したものである。“枠相違度”とは、枠構造情報の枠と読み取られた帳票の枠同士を対応づけた時の枠形状の“違い”を表すもので、互いの枠の形状の差が大きほど“枠相違度”が大きくなるように定義する。すなわち、枠構造情報と読み取られた帳票のフォーマットの差が大きいほど“累積枠相違度”も大きくなる。
【００６２】
なお、ここで、枠構造の照合に枠形状の“違い”を表す枠相違度および累積枠相違度を用いたが、枠形状の“一致”を表す枠一致度および累積枠一致度を用いても良い。
【００６３】
ステップ５１５は、ステップ５１４で算出された累積枠相違度を、今までに算出されている累積枠相違度の中で最小の値（以下、最小累積枠相違度と呼ぶ）と比較し、小さい場合はステップ５１６に進み、大きい場合は、ステップ５１３に戻る。ステップ５１６で、現在の累積枠相違度を最小累積枠相違度とし、基準フォーマットのレコード番号（ｉ）、認識枠テーブルを記録更新する。ステップ５１７は、レコード番号ｉとフォーマット情報に登録されている帳票の数ｎとを比較し、ｉ≧ｎならステップ５１８に、ｉ＜ｎならｉ＝ｉ＋１しステップ５１３に戻る。
【００６４】
ステップ５１８は、最小累積枠相違度が予め設定したしきい値よりも小さければ認識枠テーブルをそのまま出力し、大きければ基準フォーマットの中に対応する帳票がなかったとして出力する。ステップ５１９は、認識した実線枠の中で対応するフォーマット情報のテーブル対象枠フラグが”１”となっている実線枠の４点の座標を基に、フォーマット情報から推定した破線の交点の位置の近傍領域から第２のコーナー検出手段からの実在のコーナー点を探索し破線交点を検出する。
【００６５】
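Next, step 520 detects the target frames in the form. From the coordinates of the four corners of the solid-line frame and the positions of the dashed-line intersections, it calculates the coordinates of the four corners of each frame bounded by solid and dashed lines and reports them to the character cutout means 11 as target frames. For example, the solid-line frame 720 in FIG. 23(b) is divided by four dashed lines into five frames 721 to 725. Let the corners of the solid-line frame 720 be (x1, y1), (x2, y2), (x3, y3), (x4, y4) in the order upper left, upper right, lower right, lower left, let the upper dashed-line intersections be (XU0, YU0) to (XU3, YU3) from left to right, and let the lower ones be (XL0, YL0) to (XL3, YL3). The four corners of the five reported target frames, in the order upper left, upper right, lower right, lower left, are then (x1, y1), (XU0, YU0), (XL0, YL0), (x4, y4) for frame 721 and (XU0, YU0), (XU1, YU1), (XL1, YL1), (XL0, YL0) for frame 722, and the remaining frames are reported as listed in the table of FIG. 23(c).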
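The division of a solid-line frame into target frames in step 520 can be sketched in Python as follows. The function name `split_target_frames` and the sample coordinates are assumptions for illustration; only the corner ordering (upper left, upper right, lower right, lower left) is taken from the text.

```python
def split_target_frames(frame, upper, lower):
    """Split one solid-line frame into target frames.

    frame: corners in the order [upper left, upper right, lower right, lower left].
    upper, lower: dashed-line intersections on the top and bottom edges, left to right.
    """
    ul, ur, lr, ll = frame
    tops = [ul, *upper, ur]
    bottoms = [ll, *lower, lr]
    return [
        [tops[i], tops[i + 1], bottoms[i + 1], bottoms[i]]
        for i in range(len(tops) - 1)
    ]


# A frame like 720 in FIG. 23(b): four dashed lines give five target frames.
frame = [(0, 0), (100, 0), (100, 20), (0, 20)]
upper = [(20, 0), (40, 0), (60, 0), (80, 0)]
lower = [(20, 20), (40, 20), (60, 20), (80, 20)]
cells = split_target_frames(frame, upper, lower)
```

The first cell uses the frame's own left corners and the first pair of intersections, and each subsequent cell shares an edge with its neighbor.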
【００６６】
次に、ステップ５１４の実線枠照合及び“累積枠相違度”の算出及びステップ５１９の破線交点の検出について詳細に説明する。
【００６７】
ステップ５１４の実線枠照合及び“累積枠相違度”の算出について、図１６から図１９を用いて詳細に説明する。図１６は、フォーマット情報記憶手段９に登録されている枠構造情報と矩形検出手段６からの連結テーブル及び矩形情報テーブルとを照合し、“累積枠相違度”を算出する手順を示すフロー図である。図１７（ａ）は、基準になる帳票の実画像（以下、基準帳票と呼ぶ）を示し、図１７（ｂ）は、その枠構造情報を示す。図１８（ａ）は、読み取られた帳票の実画像（以下、検査帳票と呼ぶ）を示し、図１８（ｂ）は、検査帳票の連結テーブル、図１８（ｃ）は、矩形情報テーブルを示す。図１９は探索範囲を示す。また、図２０は実際の処理の結果を示す。
【００６８】
図１６に示す実線枠の照合および累積枠相違度を求める処理フローに従って、ステップ毎に説明する。
【００６９】
まず、ステップ５２１は、フォーマット情報の枠座標を（数２）により、傾き値ｇｒａｄだけ回転させ傾きを補正する。
【００７０】
【数２】

【００７１】
ステップ５２２は、フォーマット情報から位置あわせの始点とする“始点枠”を取り出し、取り出す始点枠がない場合は終了する。”始点枠”として、例えば原点に近い枠から順次選択する。
【００７２】
ステップ５２３は、フォーマット情報に記述された許容値の範囲内で高さと幅が“始点枠”と同じ枠を“始点候補枠”として矩形情報テーブルから探索する。ステップ５２４では、“始点候補枠”が存在するかどうかの判断を行い、存在しなければステップ５２５に進み、存在すればその矩形の４点座標を始点候補枠として記憶し、ステップ５２６に進む。ステップ５２５は、連結テーブルから、図１９（ａ）のように始点枠の４角の点を中心に所定の探索範囲を設定し、検査帳票の連結テーブルの全構成要素について探索し、４つの探索範囲すべてに構成要素が存在する場合に、その４点を始点候補枠として記憶する。
【００７３】
ステップ５２６は、“始点候補枠”が全くなければ、ステップ５２１に戻りフォーマット情報の次の枠を“始点枠”として選択してやり直す。
【００７４】
例えば、図１７と図１８の帳票の場合には、まず図１７（ｂ）をフォーマット情報としてＩＤ番号ｂ１の枠を始点枠として選択する。次に、ｂ１の枠と同じサイズの矩形を図１８（ｃ）の矩形情報テーブルから探索すると、ｒ１、ｒ２の矩形がそれぞれ始点候補枠として選ばれ、図２０（ａ）に示すように記憶される。仮にｒ１、ｒ２の矩形が存在しない場合には、ｅ１，ｅ２，ｅ３，ｅ４の構成要素の組み合わせと、ｅ５，ｅ６，ｅ１０，ｅ９の組み合わせが“始点候補枠”として選択されて記憶される。
【００７５】
次に、“始点候補枠”が存在すれば、次の５２７〜５３９のステップで、そのすべての“始点枠”と“始点候補枠”とが重なるように位置あわせを行い、フォーマット情報と矩形情報テーブルの各枠毎に照合し枠相違度を求め、累積加算したものを累積枠相違度として算出するもので、最終的には“累積枠相違度”が最小になる組み合わせを選ぶことになる。フォーマット情報と矩形情報テーブルとの照合について図１９（ｂ）を用いて説明する。
【００７６】
まず、ステップ５２７で、“始点枠”ｂ１の枠原点５５１と“始点候補枠”ｒ１の枠原点５５０の相対距離（ｒｘ，ｒｙ）を（数３）で算出する。
【００７７】
【数３】

【００７８】
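Next, step 528 sets the search range for the next target frame. For example, a search range 552 is set around the point 556 obtained by moving the frame origin 557 (the upper-left point of the frame) of the "reference frame" b2 by the relative distance (rx, ry).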
【００７９】
ステップ５２９は、矩形情報テーブルから次の対象枠を例えば対象枠ｒ２として、高さと幅が許容値内であり、探索範囲の中に枠原点５６２があるかどうか探索する。ステップ５３０は、矩形が存在するかどうかを確認するもので、存在すればその矩形を枠テーブルに登録しステップ５３２に進み、存在しなければステップ５３１に進む。ステップ５３１は、図１９（ｃ）に示すように基準枠５７２の４点座標をそれぞれ相対距離（ｒｘ，ｒｙ）分移動した枠の４点（５７４、５７５、５７６、５７７）を中心に、それぞれ所定の探索範囲５７３を設定して連結テーブルの４点を探索する。
【００８０】
ステップ５３２は、探索した構成要素の点を枠テーブルに登録するが、もし構成要素の点が存在しない場合には、基準枠の点の座標をそのまま登録する。
ステップ５３３で“枠相違度”ｄｆｒａｍｅを（数４）で算出する。ここで、“枠相違度”とは、基準枠と探索した枠との相違度を示す数であり、この値が大きいほど基準枠と探索した枠とが異なっていることを示す。
【００８１】
【数４】

【００８２】
なお、本発明では枠相違度を、探索範囲に存在しない構成要素の数としているが、他の評価式、例えば、基準枠の点と探索した枠の点との距離の差（あるいは差の絶対値）の総和を枠相違度としても構わない。
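The frame dissimilarity described above, counted as the number of frame corners for which no component is found in the search range, can be sketched in Python. This is a sketch under assumptions: the square search window of half-width `radius`, the function name `frame_dissimilarity`, and the sample coordinates are illustrative and not taken from the source, whose (Eq. 4) is not reproduced in the text.

```python
def frame_dissimilarity(ref_corners, elements, offset, radius):
    """Count reference-frame corners that have no component nearby.

    ref_corners: four (x, y) corners of the reference frame.
    elements: (x, y) positions of components from the connection table.
    offset: relative distance (rx, ry) from the start-frame alignment.
    radius: assumed half-width of the square search range.
    """
    rx, ry = offset
    missing = 0
    for x, y in ref_corners:
        cx, cy = x + rx, y + ry             # corner moved by the relative distance
        found = any(
            abs(ex - cx) <= radius and abs(ey - cy) <= radius
            for ex, ey in elements
        )
        if not found:
            missing += 1
    return missing


ref = [(0, 0), (40, 0), (40, 20), (0, 20)]
found_elements = [(5, 3), (45, 4), (44, 23)]    # three of the four corners exist
d_frame = frame_dissimilarity(ref, found_elements, offset=(5, 3), radius=2)
```

Summing this value over all frames of a format gives the cumulative frame dissimilarity used to compare candidate formats.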
【００８３】
ステップ５３４は、“枠相違度”を“累積枠相違度”に累積加算する。
ステップ５３５は、フォーマット情報のテーブルに次の枠があるかを判定し、あればステップ５３６に進み、なければ５３７に進む。
ステップ５３６は、フォーマット情報から次の基準枠を読み込み、ステップ５２８に戻る。次に、ステップ５３７で“累積枠相違度”が今までに算出されている“累積枠相違度”より小さいかどうかの判定をし、大きければステップ５３９に進み、小さければステップ５３８に進む。ステップ５３８は、枠テーブルを認識枠テーブルに登録更新する。ステップ５３９は、ほかの“始点候補枠”があるかどうか判定し、“始点候補枠”がなくなるまでステップ５２８からステップ５３９を繰り返す。
【００８４】
以上の処理をすべての“始点候補枠”について行えば、最終的に最も違いの少ない認識枠データおよび累積枠相違度が得られる。
【００８５】
実線枠の照合および累積枠相違度の処理結果を図２０に示す。始点枠ｂ１の始点候補枠が、図２０（ａ）に示すｒ１、ｒ２となっている。そこで、図２０（ｂ）に示すように、ｒ１を始点候補とした場合には、ｂ２の枠とｒ２の枠、ｂ３の枠とｅ７，ｅ８，ｅ１１の構成要素、ｂ４の枠とｒ３の枠がそれぞれ対応し、ｂ３の場合だけ枠相違度が１となるので、累積枠相違度は１となる。一方、ｒ２を始点候補とした場合には、対応する枠がほとんどなく、累積枠相違度は１０となり、累積枠相違度が最小になるｒ１を始点候補とした認識枠データが図２０（ｃ）の認識枠テーブルに登録される。
【００８６】
次に、ステップ５１９の破線交点の検出について、図２１及び図２２を用いて詳細に説明する。図２１、２２は、第２のコーナー検出手段からのコーナー点情報を基に破線交点を検出する手順を示すフロー図である。図２１において、Ｎは読み取り対象とする対象枠に存在する破線の総数である。
【００８７】
破線交点を検出する手順を、図２１、２２に示すフロー図に基づき、ステップ毎に説明する。
【００８８】
まず、６０１から６０４のステップは、実線交点間を桁数で等分し破線交点の候補位置を算出する。ステップ６０１は、制御変数ｊをリセットする。ステップ６０２は、実線枠の頂点間を等分し破線交点候補の座標を求める。ステップ６０３は、ｊをインクリメントし、ステップ６０４においてｊがＮ以上かどうかの判定をし、Ｎ以上であればステップ６０５へ進み、そうでなければステップ６０２に戻る。
【００８９】
ステップ６０２における演算内容を、図２３（ａ）を用いてさらに詳しく説明する。図２３（ａ）は、破線交点の候補の位置関係を示すもので、例えば枠７０１の頂点間を５等分することにより、上側の破線交点候補ＣＵ（０）からＣＵ（３）、及び下側の破線交点候補ＣＬ（０）からＣＬ（３）の座標が求まる。すなわち実線枠の上側の２頂点を（ｘ１，ｙ１）、（ｘ２，ｙ２）、破線で区切られた桁数をｐとすると、上側の破線交点の候補座標（ｘＵ（ｊ），ｙＵ（ｊ））は（数５）に示す内分演算で求められる。下側の破線交点の候補座標も実線枠の下側の２頂点（ｘ３，ｙ３）、（ｘ４，ｙ４）を用いて同様の計算で求められる。
【００９０】
【数５】

【００９１】
以上の手続きを枠７０２及び枠７０３に関しても行い、帳票内の全ての破線の候補交点の座標を決める。
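The internal division that produces the candidate intersections can be sketched in Python. Since (Eq. 5) itself is not reproduced in the text, the indexing used here (j = 0 for the first dashed line from the left, p - 1 candidates for p digits) is an assumption, as are the function name `dashed_candidates` and the sample coordinates.

```python
def dashed_candidates(p1, p2, digits):
    """Divide the edge p1-p2 into `digits` equal parts.

    Returns the digits - 1 interior division points, ordered from p1 to p2.
    """
    (x1, y1), (x2, y2) = p1, p2
    return [
        (x1 + (x2 - x1) * (j + 1) / digits, y1 + (y2 - y1) * (j + 1) / digits)
        for j in range(digits - 1)
    ]


# Upper edge of a slightly tilted 5-digit frame.
upper_candidates = dashed_candidates((0, 0), (100, 10), 5)
```

Because the division is applied to the detected vertices of the solid-line frame, the candidates follow the tilt of the scanned form.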
【００９２】
次の６０５から６１１のステップでは、第２のコーナー検出手段８からのコーナー情報を１点ずつ読み込み、破線交点の候補座標の近傍に存在するかどうかを判定し、近傍に存在するコーナー点を各破線交点毎に一つのグループにまとめる。
まず、ステップ６０５は、コーナー情報をｘ座標、ｙ座標、方向変化コードの形式で１点ずつ読み込む。次にステップ６０６は、制御変数ｊをリセットする。ステップ６０７は、破線交点の候補座標を中心として±ｄを近傍領域として設定し、コーナー点が近傍領域内に存在するかどうかを判定し、近傍領域内に存在する場合はステップ６０８に進み、存在しない場合はステップ６０９に進む。
【００９３】
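Step 608 assigns the corner point to the group of the corresponding candidate intersection and returns to step 605. A corner point near an upper candidate intersection of the frame is assigned to group GU(j), and one near a lower candidate intersection is assigned to group GL(j), so that the upper and lower groups are kept distinct. This is because, in the pair conditions described later, the direction change codes of the paired corner points differ between upper and lower intersections. Step 609 increments the control variable j. Step 610 tests j ≥ N and proceeds to step 607 if the result is No or to step 611 if it is Yes. Step 611 determines whether all corner points have been processed and thus whether to proceed to the next step.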
【００９４】
次に、図２２における６２１から６２７のステップは、各候補交点のコーナー点のグループ毎に、Ｔ字要素を構成し得るコーナー点のペアを生成する。
【００９５】
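First, step 621 resets the control variable j. Step 622 then generates corner point pairs for the corner point group GU(j) or GL(j) of each dashed-line intersection. Step 623 determines whether a pair exists, and proceeds to step 624 if one exists or to step 625 if none exists. Step 624 calculates the coordinates of the dashed-line intersection as the average of the coordinates of the paired corner points. Step 625 adopts, as the coordinates of the dashed-line intersection, the candidate coordinates obtained in step 602 by dividing the span between the solid-line frame vertices into equal parts. Step 626 then increments j, and step 627 tests j ≥ N to decide whether to proceed to the next step.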
【００９６】
以上、図２１及び図２２に示した手順により、帳票の破線交点の位置が確定する。具体的には、図２４に示すように破線交点の候補位置７０７、７０８を中心として±ｄの近傍領域７０５、７０６を設定する。枠の上側のグループＧＵ（ｊ）は、方向変化コード”１７”のコーナー点の右側に方向変化コード”３１”のコーナー点が存在することからペアとして認められ、その平均座標を破線交点の座標とする。一方、枠の下側のグループＧＬ（ｊ）は、方向変化コード”７５”のコーナー点の右側に方向変化コード”５３”のコーナー点が存在することからペアと認められ、その平均座標を破線交点の座標とする。
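The pairing of corner points around a candidate intersection can be sketched in Python. The direction change codes ("17" with "31" to its right for the upper side, "75" with "53" to its right for the lower side) and the fallback to the candidate position follow the text; the function name `locate_intersection`, the tuple layout, and the sample data are assumptions for illustration.

```python
def locate_intersection(candidate, corners, d, upper=True):
    """Refine one dashed-line intersection from nearby corner points.

    candidate: (x, y) from the equal division of the frame edge.
    corners: (x, y, direction_change_code) tuples from the second corner detector.
    d: half-width of the neighborhood around the candidate.
    """
    left_code, right_code = ("17", "31") if upper else ("75", "53")
    cx, cy = candidate
    near = [c for c in corners if abs(c[0] - cx) <= d and abs(c[1] - cy) <= d]
    lefts = [c for c in near if c[2] == left_code]
    rights = [c for c in near if c[2] == right_code]
    for lx, ly, _ in lefts:
        for rx, ry, _ in rights:
            if rx > lx:                                  # pair: right code lies to the right
                return ((lx + rx) / 2, (ly + ry) / 2)    # step 624: average of the pair
    return candidate                                     # step 625: keep the candidate


corners = [(48, 1, "17"), (52, 1, "31"), (90, 1, "17")]
refined = locate_intersection((50, 0), corners, d=5)
fallback = locate_intersection((70, 0), corners, d=5)
```

When no valid pair lies inside the neighborhood, the equally divided candidate position is used unchanged, which keeps the number of target frames consistent with the registered number of digits.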
【００９７】
次に、文字切り出し手段１１について説明する。文字切り出し手段１１は、画像メモリ２から実際の文字の２値イメージを切り出して文字認識手段１２に送るもので、その処理について図２５を用いて詳細に説明する。
【００９８】
図２５は、ある帳票の枠の中に文字が描かれている２値イメージの一部を示す。ここで、７１１〜７１４は、枠構造照合手段１０から通知された枠の４角の点、７１５は枠の４角の点７１１〜７１４で構成される領域ａ、７１６は領域ａ（７１５）よりも枠線の幅の分だけ小さくした領域ｂ、７１７は枠内に描かれている文字、７１８は文字枠である。
【００９９】
文字切り出しの処理の時には、領域ａ（７１５）で２値画像を切り出した場合、文字枠７１８の一部分まで切り出してしまい、余分な画像を含んでいるために文字認識率が低下する事がある。そこで、領域ａ（７１５）よりも枠線の幅の分小さくした領域ｂ（７１６）で画像メモリ２から切り出すことによって、文字枠７１８を除いた文字領域を文字認識手段１２に切り出し、文字７１７を認識させるために、文字認識率の低下を防ぐことができる。
【０１００】
文字認識手段１２は、例えば、ニューラルネットを用いた文字認識法（”ＰＤＰモデルによる手書き漢字認識”、電子情報通信学会論文誌、Ｖｏｌ．Ｊ７３−Ｄ−ＩＩ，Ｎｏ．８ｐｐ．１２６８−１２７４１９９０）により実現することができる。詳細な処理に関しては既存の技術であるので上記文献を提示し省略するものとする。
【０１０１】
このようにして、帳票を識別してその指定された対象枠内の文字を認識することができる。
（実施の形態２）
以下、本発明の実施の形態２について、図２６から図３０を参照しながら説明する。
【０１０２】
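FIG. 26 is a block diagram of the form recognition apparatus according to Embodiment 2 of the present invention, in which 1 is an image input means that reads a form document and obtains a binary image, 2 is an image memory that stores the binary image, 3 is a first ruled line extraction means that extracts horizontal and vertical solid ruled lines from the binary image, 4 is a first corner detection means that detects the corners of the pattern formed by the solid ruled lines, 801 is a first component detection means that detects, from combinations of the corner points, the L-shaped, T-shaped, cross, and I-shaped elements arising from bends and intersections of ruled lines, 802 is a second component detection means that groups components to generate new components, 6 is a rectangle detection means that connects the components and outputs the connection form as frame structure information, 9 is a format information storage means that stores, in advance, format information for the plurality of forms to be read, 10 is a frame structure matching means that, based on the frame structure information, matches the rectangular structure in the form against the format information to determine the form type and detects the character reading target frames in the form, 11 is a character cutout means that cuts out the character region of each character reading target frame, and 12 is a character recognition means that recognizes the cut-out characters.
【０１０３】
The operation of the form recognition apparatus configured as above is described below, focusing in detail on the first component detection means 801 and the second component detection means 802, which are the parts that differ from the block diagram of the form recognition apparatus of Embodiment 1.
【０１０４】
First, the first component detection means 801 is described. It is basically the same as the component detection means 5 of Embodiment 1, with added judgment conditions for detecting the I-shaped elements shown in FIGS. 27(a) to 27(d), which represent breaks in a ruled line. It therefore detects I-shaped elements in addition to L-shaped, T-shaped, and cross elements, and reports them to the second component detection means 802.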
【０１０５】
次に第２の構成要素検出手段８０２について図２８から図３０を用いて説明する。図２８は、構成要素同士のグルーピングの判定条件を示し、図２９は第１の構成要素検出手段からの構成要素をグルーピングする手順を示すフロー図を示している。また、図３０は、本発明の実施の形態２の処理結果を示す。
【０１０６】
図２８において、図２８（ａ）〜（ｃ）はＩ字要素同士をグルーピングしてそれぞれＬ字要素、Ｔ字要素、十字要素を検出する例を示している。また、図２８（ｄ）は、Ｉ字要素とＬ字要素をグルーピングしてＴ字要素を検出し、図２８（ｅ）はＩ字要素とＴ字要素をグルーピングして十字要素を検出する例を示している。
【０１０７】
次に、第２の構成要素検出手段８０２の処理手順について、図２９を用いて各ステップ毎に説明する。まず、ステップ８２０は、対象の構成要素がＩ字要素であるかどうかを判断し、Ｉ字要素であればステップ８２１に進み、Ｉ字要素でない場合にはステップ８２８に進む。ステップ８２１は、予め設定した探索範囲内で他の構成要素を探索する。
【０１０８】
ステップ８２２は、他の構成要素がＩ字要素でない場合にはステップ８２４に進み、Ｉ字要素の場合にはステップ８２３に進み、図２８（ａ）〜（ｃ）の配置になる構成要素を選択して、それぞれグルーピングし、グルーピングした構成要素の座標の平均値および形状コードの論理和をそれぞれグルーピング構成要素の座標および形状コードとして更新登録する。
【０１０９】
ステップ８２４は、他の構成要素がＬ字要素であるかの判定をして、Ｌ字要素でなければステップ８２６に進み、Ｌ字要素であれば、ステップ８２５で図２８（ｄ）のグルーピングを行う。前記と同じように、グルーピングした構成要素の座標の平均値および形状コードの論理和をとり、それぞれグルーピング構成要素の座標および形状コードとして更新登録する。
【０１１０】
ステップ８２６は、他の構成要素がＴ字要素の判定を行い、Ｔ字要素でなければステップ８２８に進み、Ｔ字要素であれば、ステップ８２７で図２８（ｅ）のグルーピングを行い、前記と同じように座標値と形状コードの更新登録を行う。ステップ８２８は、次の構成要素が存在すれば、８２０に戻り処理を繰り返し、構成要素が存在しなければ処理を終了する。
【０１１１】
また、図２８（ａ）〜（ｅ）のグルーピングの条件を満たさずに単独で存在するＩ字要素は登録を行わない。ただし、単独で存在するＬ字要素、Ｔ字要素および十字要素は、そのまま登録する。
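The update rule used when grouping, namely the average of the coordinates and the logical OR of the shape codes, can be sketched in Python. The bit order S, W, N, E follows the shape code definition of Embodiment 1. The assumption that an I-shaped element carries a single-arm shape code is an inference from the fact that OR-ing must produce L-shaped, T-shaped, and cross codes; the function name and coordinates are likewise illustrative.

```python
# Shape code bits from the most significant bit: S, W, N, E.
S, W, N, E = 0b1000, 0b0100, 0b0010, 0b0001


def group_components(a, b):
    """Merge two components given as (x, y, shape_code) tuples.

    The merged position is the mean of the two positions and the merged
    shape code is the bitwise OR of the two shape codes.
    """
    (ax, ay, acode), (bx, by, bcode) = a, b
    return ((ax + bx) / 2, (ay + by) / 2, acode | bcode)


# Two I-shaped elements at a rounded corner (assumed single-arm codes).
i_south = (10, 14, S)
i_east = (14, 10, E)
merged = group_components(i_south, i_east)

# An I-shaped element grouped with an L-shaped element gives three arms.
merged_t = group_components((10, 10, W), (12, 10, S | E))
```

In this sketch the two I-shaped elements merge into an element with the L-shaped code "1001", and the I-shaped plus L-shaped pair yields a three-arm code, corresponding to the cases of FIGS. 28(a) and 28(d).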
【０１１２】
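The processing result of the component detection means of Embodiment 2 is described with reference to FIG. 30. FIG. 30(a) is the original image of a form whose frame corners have a gentle curvature. FIG. 30(b) is the result of extracting line segments of at least a predetermined length with the first ruled line extraction means and detecting corner points; the frame lines are interrupted at the corners. In FIG. 30(b), 831, 832, and 833 are I-shaped elements detected by the first component detection means from the corner points at the broken parts of the frame, and 834 and 835 are L-shaped elements likewise detected by the first component detection means. FIG. 30(c) shows the components after grouping by the second component detection means. The L-shaped element 836 results from grouping the I-shaped elements 831 and 832 of (b), and the T-shaped element 837 results from grouping the I-shaped element 833 and the L-shaped element 834. The L-shaped element 838 had no component to be grouped with and is therefore reported unchanged. Only the left side of FIG. 30 is described and illustrated, but the right side is handled in the same way. In this way, the intersections of the frame can be recognized accurately even when the frame corners have a gentle curvature.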
【０１１３】
【発明の効果】
以上のように本発明の効果は、第１に、２値画像の実線および破線の罫線を抽出し、罫線のコーナー形状の組み合わせから構成要素を検出し、構成要素同士を連結し矩形を検出し、複数の帳票のフォーマット情報と照合することにより、帳票が画像として傾いて入力されたり、帳票内に破線の罫線が存在しても、文字枠を正確に検出し文字認識を行うことができ、信頼性の高い帳票認識装置が実現できる。
【０１１４】
第２に、枠罫線の抽出処理によって途中できれてしまった罫線部分をＩ字要素として検出し、さらにＩ字要素同士またはＩ字要素と他の要素とをグルーピングして構成要素を検出することによって、枠の断線部分や角部分が緩やかな曲率を持つ場合でも、枠の交点を正確に認識できる信頼性の高い帳票認識装置が実現できる。
【図面の簡単な説明】
【図１】本発明の実施の形態１における帳票認識装置のブロック結線図
【図２】同実施の形態１の帳票認識装置における第１の罫線抽出手段でのブロック結線図
【図３】同実施の形態１の帳票認識装置における線分収縮手段の処理手順を示すフロー図
【図４】同実施の形態１の帳票認識装置における線分延長手段の処理手順を示すフロー図
【図５】同実施の形態１の帳票認識装置における第１のコーナー検出手段の方向コード化と方向コード変化点検出について説明する図
【図６】同実施の形態１の帳票認識装置における第１のコーナー検出手段の方向コードと実際の方向との対応関係を示す図
【図７】同実施の形態１の帳票認識装置における第１のコーナー検出手段の方向変化コードの具体例を示す図
【図８】同実施の形態１の帳票認識装置における構成要素検出手段のコーナー点の組み合わせによる構成要素の具体例を示す図
【図９】同実施の形態１の帳票認識装置における構成要素抽出手段での構成要素の形態の記述についての説明図
【図１０】同実施の形態１の帳票認識装置における矩形検出手段の構成要素同士の連結関係を示す概念図
【図１１】同実施の形態１の帳票認識装置における矩形検出手段の矩形検出の具体例を示す図
【図１２】同実施の形態１の帳票認識装置における第２の罫線抽出手段のブロック結線図
【図１３】同実施の形態１の帳票認識装置におけるフォーマット情報記憶手段の枠構造情報を説明するための帳票画像を示す図
【図１４】同実施の形態１の帳票認識装置におけるフォーマット情報記憶手段の枠構造情報に登録されるフォーマットの具体例を示す図
【図１５】同実施の形態１の帳票認識装置における枠構造照合手段の処理手順を示すフロー図
【図１６】同実施の形態１の帳票認識装置における枠構造照合手段の実線枠照合と累積枠相違度の算出の処理手順を示すフロー図
【図１７】同実施の形態１の帳票認識装置における枠構造照合手段の実線枠照合処理を説明するための基準の帳票とその枠構造情報を示す図
【図１８】同実施の形態１の帳票認識装置における枠構造照合手段の実線枠照合処理を説明するための入力された帳票と要素連結情報と矩形情報を示す図
【図１９】同実施の形態の帳票認識装置における枠構造照合手段の実線枠照合の探索範囲を示す図
【図２０】同実施の形態１の帳票認識装置における枠構造照合手段の実線枠照合の処理結果の具体例を示す図
【図２１】同実施の形態１の帳票認識装置における枠構造照合手段の破線交点を検出する手順を示すフロー図
【図２２】同実施の形態１の帳票認識装置における枠構造照合手段の破線交点を検出する手順を示すフロー図
【図２３】同実施の形態１の帳票認識装置における枠構造照合手段の破線による候補交点の位置関係を示す図
【図２４】同実施の形態１の帳票認識装置における枠構造照合手段の破線の検出処理のペアになるコーナー点の具体例を示す図
【図２５】同実施の形態１の帳票認識装置における文字きり出し手段の処理を説明する図
【図２６】本発明の実施の形態２における帳票認識装置の構成要素検出手段のブロック結線図
【図２７】本発明の実施の形態２における第１の構成要素検出手段のコーナー点の組み合わせによるＩ字構成要素の具体例を示す図
【図２８】本発明の実施の形態２における第２の構成要素検出手段のグルーピングできる構成要素の組み合わせを示す図
【図２９】本発明の実施の形態２における第２の構成要素検出手段の処理手順を示すフロー図
【図３０】本発明の実施の形態２における構成要素検出手段の処理結果を示す図
【図３１】従来例の罫線認識装置のブロック結線図
【符号の説明】
１画像入力手段
２画像メモリ
３第１の罫線抽出手段
４第１のコーナー検出手段
５構成要素検出手段
６矩形検出手段
７第２の罫線抽出手段
８第２のコーナー検出手段
９フォーマット情報記憶手段
１０枠構造照合手段
１１文字切り出し手段
１２文字認識手段
２０２値画像
２１水平方向収縮手段
２２水平方向延長手段
２３垂直方向収縮手段
２４垂直方向延長手段
２５ＮＯＲ回路
２０１第１の水平方向延長手段
２０２水平方向収縮手段
２０３第２の水平方向延長手段
２０４第１の垂直方向延長手段
２０５垂直方向収縮手段
２０６第２の垂直方向延長手段
２０７ＮＯＲ回路
８０１第１の構成要素検出手段
８０２第２の構成要素検出手段

[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a form recognition apparatus that recognizes the structure of frame ruled lines in a document image containing frame ruled lines and characters, such as a form, cuts out specific character regions in the form, and recognizes the characters.
[0002]
[Prior art]
In recent years, with the digitization of document information, there has been a growing demand for character recognition technology such as OCR (Optical Character Reader) and document image processing, and table structure recognition technology for tabular documents such as forms is one of them.
[0003]
Conventionally, a method that detects line segments from the run lengths of frame lines is well known for recognizing the frame ruled lines of a form. One example is Japanese Patent Laid-Open No. 01-217583; a block diagram of that ruled line recognition apparatus is shown in FIG. 31 and described below.
In FIG. 31, 1001 is an image input unit, 1002 is an image memory, 1003 is a vertical run extraction unit that extracts vertical runs, 1004 is a vertical line segment extraction unit that extracts vertical line segments, 1005 is a horizontal run extraction unit that extracts horizontal runs, 1006 is a horizontal line segment extraction unit that extracts horizontal line segments, and 1007 is a character region extraction unit that extracts character regions using the extracted vertical and horizontal line segments. The operation is described below.
[0004]
The image input unit 1001 scans an image including a recognition target ruled line and stores it in the image memory 1002 as a binary signal. The vertical run extraction unit 1003 scans the image stored in the image memory 1002 in the vertical direction and extracts vertical runs. The vertical line segment extraction unit 1004 examines the connectivity of the extracted vertical runs and extracts vertical line segments. By a similar process, the horizontal run extraction unit 1005 extracts the horizontal run, and the horizontal line segment extraction unit 1006 extracts the horizontal line segment.
[0005]
The character region extraction unit 1007 extracts character regions and character entry regions using the vertical line segments extracted by the vertical line segment extraction unit 1004 and the horizontal line segments extracted by the horizontal line segment extraction unit 1006.
[0006]
Various methods have been proposed for recognizing the cut-out characters, for example a character recognition method using a neural network ("Handwritten Kanji Recognition by PDP Model", IEICE Transactions, Vol. J73-D-II, No. 8, pp. 1268-1274, 1990), and recognition rates have reached a practical level. Because the present form recognition apparatus centers on recognizing the designated frames of a form, the above document is cited for character recognition and its explanation is omitted.
[0007]
[Means for Solving the Problems]
To achieve this object, the present invention, for example when automatically recognizing frame lines in a form and cutting out character regions, provides: an image scanner that reads the form and outputs a binary image; a corner detection means that assigns direction codes to the edges of the frame lines of the form in the binary image from the image scanner (characters and frame lines are "0", everything else is "1"), treats each point where the direction code changes as a corner of the image, and extracts its feature code and coordinates as a corner feature point; a component extraction means that extracts the components of the frame lines (T-shaped, cross, L-shaped, and so on) from combinations of neighboring corner feature points; a minimum rectangle recognition means that, based on the components from the component extraction means, searches the surrounding components clockwise (or counterclockwise) from a component of interest, connects the matching components, repeats this until it returns to the starting component, and thereby recognizes a minimum rectangle and outputs rectangle information; a structure recognition means that recognizes the connected minimum rectangles with an arbitrary corner of the rectangle information from the minimum rectangle recognition means as a reference; and a character cutout means that cuts out character regions from the frame line structure supplied by the structure recognition means.
[0008]
Further, as a second problem, there is a problem that the frame ruled line cannot be recognized by the run length when the frame ruled line is broken due to dirt or the like, or the corner is not a right angle but a rounded corner.
[0009]
The present invention solves the above problems of the prior art, and its object is to provide a highly reliable form recognition apparatus that can detect character frames accurately even when the form image is input at a tilt, or when the form contains dashed frame ruled lines, broken lines, or rounded corners.
[0010]
[Means for Solving the Problems]
To solve the first problem, the present invention provides: an image input means that reads a form and obtains a binary image; an image memory that stores the binary image; a first ruled line extraction means that extracts horizontal and vertical solid ruled lines from the binary image; a second ruled line extraction means that extracts horizontal and vertical solid and dashed ruled lines from the binary image; first and second corner detection means that detect the corners of binary figures; a component detection means that detects, from combinations of corner shapes, the L-shaped, T-shaped, and cross elements formed by bends and intersections of ruled lines; a rectangle detection means that connects the components and outputs the connection form as frame structure information; a format information storage means that stores format information for a plurality of forms; a frame structure matching means that matches the rectangular structure in the form against the format information to determine the form type, detects the intersections between solid and dashed ruled lines based on the corner point information from the second corner detection means, and detects the character reading target frames in the form; a character cutout means that cuts out the character region of each character reading target frame; and a character recognition means that recognizes the cut-out characters.
[0011]
To solve the second problem, the component detection means of the present invention additionally detects, from combinations of corners supplied by the first corner detection means, I-shaped elements that represent breaks in ruled lines, alongside the bends and intersections of ruled lines. By grouping broken-line components with each other, or with other components (L-shaped or T-shaped elements), it updates them into L-shaped, T-shaped, and cross components.
[0012]
As a result, character frames can be detected accurately even when the form is input as a tilted image or contains dashed ruled lines, broken lines, or rounded corners, so a highly reliable form recognition apparatus can be realized.
[0013]
DETAILED DESCRIPTION OF THE INVENTION
The invention according to claim 1 of the present invention is a form recognition apparatus comprising: an image input means that reads a form document containing frame ruled lines and characters and outputs a binary image; an image memory that stores the binary image; a first ruled line extraction means that extracts horizontal and vertical solid ruled lines from the binary image; a first corner detection means that detects the corners of the pattern formed by the solid ruled lines; a second ruled line extraction means that extracts horizontal and vertical solid and dashed ruled lines from the binary image; a second corner detection means that detects the corners of the pattern formed by the frame ruled lines; a component detection means that detects, from combinations of corner shapes, the L-shaped, T-shaped, and cross elements formed by bends and intersections of ruled lines; a rectangle detection means that connects the components and outputs the connection form as frame structure information; a format information storage means that stores, in advance, format information for the plurality of forms to be read; a frame structure matching means that sequentially matches the frame structure information against the format information to detect the solid-line frames, determines the form type from the matching result, detects the dashed-line intersections, which are the intersections between dashed lines and the target frames designated among the solid-line frames, from the corners supplied by the second corner detection means and the format information, and outputs the coordinates of the target frames; a character cutout means that cuts out character regions from the image memory based on the coordinates of the target frames; and a character recognition means that recognizes the cut-out characters. The apparatus extracts the solid and dashed ruled lines from the binary image of the form, extracts the corner points of the pattern formed by those ruled lines, extracts the components, which are the intersections of frame ruled lines, from the corner points, detects the rectangular structure from the mutually connected components, and compares it with the format information. It thereby has the effect that the character frames to be read can be detected accurately even when the form is read at a tilt or contains dashed ruled lines.
[0014]
In the invention according to claim 3 of the present invention, the component detecting means newly detects an I-shaped element, representing a break in a ruled line, from combinations of corners supplied by the first corner detecting means, and updates the components to L-shaped elements, T-shaped elements, or cross elements by grouping I-shaped elements with one another or by grouping I-shaped elements with L-shaped and T-shaped elements. This has the effect that the components can be detected accurately even when frame ruled lines are broken or the corners of the form frame have a gentle curvature.
[0015]
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
(Embodiment 1)
FIG. 1 is a block diagram of a form recognition apparatus according to a first embodiment of the present invention. Reference numeral 1 denotes image input means for reading a form document and obtaining a binary image; 2 denotes an image memory for storing the binary image; 3 denotes first ruled line extracting means for extracting solid ruled lines in the horizontal and vertical directions of the binary image; 4 denotes first corner detecting means for detecting the corners of the solid ruled lines; 5 denotes component detecting means for detecting, from combinations of the corners, L-shaped elements, T-shaped elements, and cross elements formed by bends or intersections of ruled lines; 6 denotes rectangle detecting means for connecting the components and outputting the result as frame structure information; 7 denotes second ruled line extracting means for extracting horizontal and vertical solid and broken ruled lines from the binary image; 8 denotes second corner detecting means for detecting corners of the solid and broken frame ruled lines; 9 denotes format information storage means for storing in advance format information for a plurality of forms to be read; 10 denotes frame structure matching means for sequentially collating the frame structure information with the format information to detect solid line frames, determining the form type from the collation result, detecting, from the corners supplied by the second corner detecting means and the format information, the broken line intersections at which a target frame specified by a solid line frame meets a broken line, and outputting the coordinates of the target frame; 11 denotes character cutout means for cutting out the character region of the frame to be read; and 12 denotes character recognition means for recognizing the cut-out characters.
[0016]
An outline of the operation of the form recognition apparatus configured as described above will now be given. The form is read by the image input means 1, converted into a binary image in which character parts and frame ruled lines have the value 1 and the background has the value 0, and stored in the image memory 2. The first ruled line extracting means 3 scans the binary image from the image memory 2 in the horizontal and vertical directions and extracts lines in which the value "1" continues for a predetermined length or more, and the first corner detecting means 4 detects corners from the extracted horizontal and vertical ruled lines. The component detecting means 5 detects the components formed where ruled lines bend or intersect, namely L-shaped elements, T-shaped elements, and cross elements, from combinations of the corners supplied by the first corner detecting means 4. The rectangle detecting means 6 connects the components from the component detecting means 5 to one another to obtain frame structure information.
[0017]
The second ruled line extracting means 7 extracts solid and broken ruled lines in the horizontal and vertical directions of the binary image, and the second corner detecting means 8 detects corners from the solid and broken frame ruled lines.
[0018]
The format information storage means 9 stores in advance format information of a plurality of forms to be read. The frame structure matching means 10 collates the frame structure information from the component detecting means 5 with the format information to determine the type of the form, and detects the solid line frames from the frame structure information. It then detects the intersections of solid and broken lines from the corners supplied by the second corner detecting means 8, the solid line frames, and the format information, and detects the target frames in the form from the solid line frames and those intersections. The character cutout means 11 cuts out a character region based on the coordinates of the four vertices of the frame to be read, and the character recognition means 12 recognizes characters in the cut-out region.
[0019]
Next, the operation of each component will be described in detail with reference to the drawings.
The image input means 1 reads a form and outputs a binary image. In the first embodiment of the present invention, the reading line density is about 400 dpi, and the form, which is a document, is illuminated with an LED (light emitting diode) or the like. The reflected light is read with a one-dimensional CCD camera, and binarized with an arbitrary threshold value to output a binary image having a character part value of 1 and a background value of 0.
[0020]
The illumination differs depending on the colors of the frame lines of the form and of the written characters. For example, when the frame lines are printed in blue, black, red, or the like and numerals, symbols, and characters are entered in black, blue, or the like, an LED having a green or yellow-green wavelength (around 550 to 570 nm) is often used. For the binarization processing, the fixed threshold method and the floating threshold method ("binarization as a recognition problem and examination of various methods", Information Processing Society of Japan, Image Processing 15-1, Nov. 1977) are well known. The first embodiment of the present invention does not prescribe a particular binarization method, and any method suited to the document may be selected. The binarized image data is stored in the image memory 2 and read out as necessary in each process.
[0021]
Next, the first ruled line extraction means 3 will be described with reference to FIG. 2. FIG. 2 is a block diagram of the image processing in the first ruled line extracting means 3, in which 20 is a binary image from the image memory 2, 21 is horizontal contraction means for shrinking the pattern in the horizontal direction, 22 is horizontal extension means for extending the pattern in the horizontal direction, 23 is vertical contraction means for shrinking the pattern in the vertical direction, 24 is vertical extension means for extending the pattern in the vertical direction, and 25 is a NOR circuit that performs a NOR operation on the outputs of the horizontal extension means 22 and the vertical extension means 24.
[0022]
The horizontal contraction means 21 eliminates lines and characters having a width of h pixels or less in the horizontal direction by shrinking the binary image 20 from the image memory 2 by h pixels in the horizontal direction. The subsequent horizontal extension means 22 extracts only horizontal line segments longer than h pixels by extending h pixels in the horizontal direction.
[0023]
Similarly, the vertical direction contracting means 23 is to eliminate lines and characters having a width of v pixels or less in the vertical direction by shrinking v pixels in the vertical direction. The subsequent vertical extension means 24 extracts vertical line segments longer than v pixels by extending v pixels in the vertical direction. The NOR circuit 25 performs a NOR operation on the outputs from the horizontal extension means 22 and the vertical extension means 24, erases the character part, leaves only the frame ruled lines, and the frame ruled lines and the background are "0" and "1", respectively. A binary image having values is obtained.
[0024]
Next, the contraction and extension processes in the horizontal and vertical directions will be described in more detail with reference to FIGS. 3 and 4.
[0025]
FIG. 3 is a flowchart showing the processing procedure of the horizontal and vertical contraction means 21 and 23. The binary image is scanned sequentially, one line at a time, in the horizontal or vertical direction and processed through to the last line. Each step is described below for the case of an n-pixel contraction, with the run length count value denoted C.
[0026]
Step 31 sets the count value C to 0 at the start of scanning of each line. Step 32 reads one pixel of data. In step 33, it is determined whether the pixel value is 0 (white) or 1 (black). If it is 0, the process proceeds to step 34; if it is 1, the process proceeds to step 36. Step 34 sets the count value C to 0. Step 35 outputs the value 0 because the scan position is not on a black run of sufficient length. In step 36, it is determined whether or not the count value C is equal to or larger than n. If it is smaller than n, the process proceeds to step 37; if it is n or more, the process proceeds to step 38. In step 37, the count value C is incremented and the process proceeds to step 35. Step 38 outputs the value 1 because there is a black run having a run length of n pixels or more.
[0027]
By performing the above processing until the end of one line, the black run on that line is reduced by n pixels. When processing the next line, the same processing is repeated from step 31 again. When scanning of the entire screen is completed in this way, a line segment having a run length of n pixels or more in the horizontal or vertical direction is extracted.
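As an illustration only, the contraction flow of FIG. 3 can be sketched for a single scan line as follows. The function name `shrink_line` and the representation of a line as a list of 0/1 integers are assumptions made for this sketch and do not appear in the specification.

```python
def shrink_line(line, n):
    """Contract black runs on one scan line by n pixels (sketch of FIG. 3).

    line: list of 0 (white) / 1 (black) pixels, in scan order (assumed format).
    """
    out = []
    count = 0                      # step 31: reset C at the start of the line
    for pixel in line:             # step 32: read one pixel
        if pixel == 0:             # step 33: white pixel
            count = 0              # reset C
            out.append(0)          # step 35: not on a black run
        elif count >= n:           # step 36: run already longer than n
            out.append(1)          # step 38: keep this pixel
        else:
            count += 1             # step 37: still inside the first n pixels
            out.append(0)          # step 35
    return out


# A run of 2 pixels disappears; a run of 5 pixels keeps its last 2 pixels.
print(shrink_line([0, 1, 1, 0, 1, 1, 1, 1, 1, 0], 3))
```

In this reading, the first n pixels of every black run are suppressed, so runs of n pixels or fewer vanish entirely.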
[0028]
Similarly, FIG. 4 is a flowchart showing the processing procedure of the horizontal and vertical extension means 22 and 24. The binary image is sequentially scanned one line at a time in the horizontal direction or the vertical direction and processed to the end line. When the extension process of n pixels is performed, the run length count value is assumed to be C, and each step will be described.
[0029]
Step 41 sets the count value C to 0 at the start of scanning of each line. Step 42 reads one pixel data. In step 43, it is determined whether the pixel value is 0 (white) or 1 (black). If it is 1, the process proceeds to step 44. If it is 0, the process proceeds to step 46. Step 44 sets n to the count value C. Step 45 outputs the value 1 because it is on the black run.
[0030]
In step 46, it is determined whether or not the count value C is 0 or less. If it is greater than 0, the process proceeds to step 47, and if it is 0 or less, the process proceeds to step 48. In step 47, the count value C is decremented, and the process proceeds to step 45. Step 48 outputs a value of 0 because the scan position is more than n pixels away from the black run.
[0031]
By performing the above processing until the end of one line, the black run on that line is extended by n pixels. When processing the next line, the same processing is repeated from step 41 again. When scanning of the entire screen is completed in this way, the run length is extended by n pixels in the horizontal or vertical direction.
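The extension flow of FIG. 4 can be sketched in the same way. Again, `extend_line` and the list representation of a scan line are hypothetical names chosen for illustration, not part of the specification.

```python
def extend_line(line, n):
    """Extend black runs on one scan line by n pixels (sketch of FIG. 4).

    line: list of 0 (white) / 1 (black) pixels, in scan order (assumed format).
    """
    out = []
    count = 0                      # step 41: reset C at the start of the line
    for pixel in line:             # step 42: read one pixel
        if pixel == 1:             # step 43: black pixel
            count = n              # step 44: reload the counter
            out.append(1)          # step 45: on a black run
        elif count > 0:            # step 46: within n pixels after a run
            count -= 1             # step 47
            out.append(1)          # step 45
        else:
            out.append(0)          # step 48: more than n pixels from a run
    return out


# Each run grows by 2 pixels in the scan direction.
print(extend_line([0, 1, 0, 0, 0, 0, 1, 0], 2))
```

The counter is reloaded on every black pixel, so gaps of n pixels or fewer between two runs are filled completely.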
[0032]
Next, a series of processes in the first corner detection means 4, the component detection means 5, and the rectangle detection means 6 will be described. These contents are described in Japanese Patent Application No. 7-016862 by the same applicant. Detailed description will be omitted, and the operation will be briefly described.
[0033]
First, the first corner detection means 4 will be described with reference to FIGS. 5 to 7. FIG. 5 shows the result of conversion into a direction-coded image in which, as preprocessing for detecting corners, a direction code is added to the outline of the solid ruled lines of the binary image extracted by the first ruled line extracting means. FIG. 6 is a diagram illustrating the correspondence between the direction codes 1 to 8 and the actual directions, and FIG. 7 is a diagram illustrating specific examples of corners to be detected.
[0034]
In FIG. 5, reference numeral 51 denotes a frame ruled line pixel, 52 denotes a background pixel, and the numerals denote the direction codes assigned to the contour points. In this case, the direction codes 1 to 8 shown in FIG. 6 are assigned while tracing the background pattern in the clockwise direction.
[0035]
Although the direction codes are given to the background pixels here, they may instead be given to the outline pixels of the frame ruled lines. Likewise, the background pattern is traced clockwise here, but it may be traced counterclockwise.
[0036]
A corner is detected as a change point of the direction code in the direction-coded image obtained as described above. Specifically, within the 3 × 3 neighborhood of the target position (central pixel), the pixel lying in the direction indicated by the code of the central pixel is examined, and a change point is detected when its direction code differs from that of the central pixel. In FIG. 5, the positions surrounded by circles indicate the change points of the direction code. For example, at the position 53, the pixel arrangement shown in FIG. 7A is obtained: the direction code of the pixel lying in the direction "3" indicated by the target pixel is "1", which means that the direction of the contour changes from "3" to "1", so the position is detected as a corner, that is, a change point of the direction code.
[0037]
The change point of the direction code is represented by a code “31” (hereinafter referred to as a direction change code), and the x coordinate, the y coordinate, and the direction change code are detected as a set of feature information. Similarly, the pixel positions 54, 55, and 56 correspond to (b), (c), and (d) of FIG. 7, and are given direction change codes of “17”, “75”, and “53”, respectively. The x-coordinate, y-coordinate, and direction change code of these corner points are notified to the component detection unit 5 as a set of feature information.
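The formation of direction change codes can be illustrated with the following simplified sketch. It assumes that the contour has already been traced into an ordered, closed list of (x, y, direction code) points, which replaces the 3 × 3 neighborhood lookup of the specification; the function name `detect_corners` and the sample coordinates are placeholders, since the mapping of codes to directions in FIG. 6 is not reproduced in this text.

```python
def detect_corners(contour):
    """Return (x, y, direction_change_code) for each change point.

    contour: closed outline as a list of (x, y, direction_code) tuples in
    tracing order (an assumed input format for this sketch).
    """
    corners = []
    total = len(contour)
    for i, (x, y, code) in enumerate(contour):
        next_code = contour[(i + 1) % total][2]   # code of the pixel reached next
        if next_code != code:
            # A change from "3" to "1" is written as the change code "31".
            corners.append((x, y, f"{code}{next_code}"))
    return corners


# Placeholder contour whose codes run 3 -> 1 -> 7 -> 5 and back to 3.
contour = [(0, 0, 3), (1, 0, 3), (2, 0, 1), (2, 1, 1),
           (2, 2, 7), (1, 2, 7), (0, 2, 5), (0, 1, 5)]
print([code for _, _, code in detect_corners(contour)])
```

With these sample codes the sketch yields the four change codes named in the text, "31", "17", "75", and "53".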
[0038]
Next, the component detection means 5 will be described with reference to FIGS. 8 and 9. FIG. 8 is a diagram illustrating the determination conditions for detecting a component from a combination of corner points, and FIG. 9 is a diagram illustrating the description format of a component. In FIG. 8, (a), (b), (c), and (d) are detection examples of L-shaped elements, (e), (f), (g), and (h) are detection examples of T-shaped elements, and (i) is a detection example of a cross element.
[0039]
The component detection means 5 uses the corner feature information from the corner detection means 4 to collect a plurality of corner points whose x and y coordinates lie within a predetermined distance of one another into one group (hereinafter referred to as grouping), and associates a component type with the combination of direction change codes of the corner points that are members of the group.
[0040]
The components detected in this way are described by a 4-bit code (hereinafter referred to as a shape code) as shown in FIG. 9. Starting from the most significant bit, the bits indicate whether or not an arm exists in the S, W, N, and E directions, respectively. For example, since the L-shaped element shown in FIG. 8A has arms in the S and E directions, it is described with the bit pattern "1001". The x and y coordinates of a component are given by the average values of the x and y coordinates of the corner points that are members of the group, and the component detection means 5 notifies the rectangle detection means 6 of the x coordinate, y coordinate, and shape code of each component as feature information.
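The shape code and the averaged component position can be sketched as follows. The helper names `shape_code` and `component_position` and the sample coordinates are assumptions introduced for illustration; only the bit order S, W, N, E and the averaging rule come from the text.

```python
# Bit assigned to each arm direction, most significant bit first: S, W, N, E.
ARM_BITS = {"S": 0b1000, "W": 0b0100, "N": 0b0010, "E": 0b0001}


def shape_code(arms):
    """Return the 4-bit shape code as a string, e.g. arms S and E -> '1001'."""
    code = 0
    for arm in arms:
        code |= ARM_BITS[arm]
    return format(code, "04b")


def component_position(corner_points):
    """Average the (x, y) coordinates of the corner points in one group."""
    xs = [x for x, _ in corner_points]
    ys = [y for _, y in corner_points]
    return sum(xs) / len(xs), sum(ys) / len(ys)


print(shape_code("SE"))     # L-shaped element with arms S and E
print(shape_code("SWE"))    # T-shaped element with arms S, W and E
print(shape_code("SWNE"))   # cross element
print(component_position([(10, 10), (12, 10), (10, 12), (12, 12)]))
```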
[0041]
Next, the rectangle detection means 6 will be described with reference to FIGS. 10 and 11. FIG. 10 is a diagram illustrating the connection relationship between components, and FIG. 11 is a diagram illustrating the recognition of the minimum rectangles generated from the connection relationship. As rectangle information, the rectangle detection means 6 generates and outputs a connection table (FIG. 10B) describing the connection relationship of the components and a rectangle information table (FIG. 11B) holding the position information of the minimum rectangles formed from that connection relationship.
[0042]
FIG. 10A shows an example of the L-shaped elements, T-shaped elements, and cross elements detected by the component detecting means 5 and their positional relationship. First, identification labels e1 to e20 are assigned to the components. Then, for each arm direction indicated by the shape code, a search is made in the x direction or the y direction based on the feature information (x coordinate, y coordinate, shape code) from the component detection means 5, the component at the shortest distance among the components having an arm that can be connected is detected, and a connection table (FIG. 10B) is generated. FIG. 10B shows the connection table representing the connection relationship of the components, and describes which element each component is connected to in each of the N, S, E, and W directions. For example, in the case of the L-shaped element e1, e14 and e2 exist as the components corresponding to the arms S and E, and in the case of the T-shaped element e2, e8, e1, and e3 exist as the components corresponding to the arms S, W, and E.
[0043]
Further, using the generated connection table (FIG. 10B), the minimum rectangles are recognized and a rectangle information table is generated. FIG. 11A shows a conceptual diagram of the recognition of a minimum rectangle. Starting from a given component, the connections are traced in the order of the E direction, the S direction, the W direction, and the N direction. When the trace returns to the starting point, the rectangle formed by the four points is called a minimum rectangle and is registered in the rectangle information table of FIG. 11B, which describes the position, size, and so on of each minimum rectangle. For example, when searching clockwise from the element e1 as the starting point, the element e2 exists as the element connected in the E direction. Next, the connection is traced in the S direction from the element e2 to reach the element e8, and a trace from the element e8 in the W direction is attempted. The element e8 has no arm in the W direction, so the connection is traced further in the S direction, where the element e15, which has an arm in the W direction, is found. Next, the element e14 is reached by tracing from the element e15 in the W direction, and tracing from the element e14 in the N direction returns to the starting element e1, so a minimum rectangle is recognized. The four elements e1, e2, e15, and e14, which were connected while changing direction in the order E, S, W, and N, are taken as the four corners of the minimum rectangle and registered under the rectangle identification label r1 in the rectangle information table of FIG. 11B. All minimum rectangles are recognized in this way to generate the rectangle information table.
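The clockwise trace can be sketched as below, under the assumption that a connection table is held as a dictionary mapping each label to its neighbors by direction. The table shown is a toy subset consistent with the e1, e2, e8, e15, e14 example in the text and is not the actual content of FIG. 10B; the function name `trace_minimum_rectangle` is likewise hypothetical.

```python
ORDER = ["E", "S", "W", "N"]   # clockwise tracing order


def trace_minimum_rectangle(table, start):
    """Trace E, S, W, N from start; return the four corner labels or None."""
    corners = [start]
    current = start
    for leg, direction in enumerate(ORDER):
        turn = ORDER[(leg + 1) % 4]          # direction needed for the next leg
        while True:
            current = table[current].get(direction)
            if current is None:              # the connection ends: no rectangle
                return None
            if leg == 3:
                if current == start:         # back at the starting point
                    return corners
            elif turn in table[current]:     # element has an arm to turn onto
                corners.append(current)
                break
            # otherwise keep going straight past this element
    return None


# Toy connection table (assumed): e8 has no W arm, so the trace passes it.
table = {
    "e1": {"S": "e14", "E": "e2"},
    "e2": {"S": "e8", "W": "e1", "E": "e3"},
    "e3": {"W": "e2"},
    "e8": {"N": "e2", "S": "e15"},
    "e14": {"N": "e1", "E": "e15"},
    "e15": {"N": "e8", "W": "e14"},
}
print(trace_minimum_rectangle(table, "e1"))
```

Continuing straight past an element that lacks the required arm mirrors the handling of e8 in the example; a counterclockwise variant would only change `ORDER`.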
[0044]
Although the element e1 is used as the starting point here, the starting position is not limited and any element may be used as the starting point. Further, the elements are connected clockwise here, but they may be connected counterclockwise.
[0045]
The connection table and the rectangle information table generated in this way are notified to the frame structure matching means 10.
[0046]
Next, the second ruled line extraction means 7 will be described with reference to FIG. 12. FIG. 12 is a block diagram of the image processing in the second ruled line extracting means 7, in which 20 is a binary image from the image memory 2, 201 is first horizontal extension means for extending the pattern in the horizontal direction, 202 is horizontal contraction means for contracting the pattern in the horizontal direction, 203 is second horizontal extension means for extending the pattern in the horizontal direction, 204 is first vertical extension means for extending the pattern in the vertical direction, 205 is vertical contraction means for contracting the pattern in the vertical direction, 206 is second vertical extension means for extending the pattern in the vertical direction, and 207 is a NOR circuit that performs a NOR operation on the outputs of the second horizontal extension means 203 and the second vertical extension means 206.
[0047]
The first horizontal extension means 201 connects broken line portions whose gaps are shorter than hd pixels by extending the binary image 20 from the image memory 2 by hd pixels in the horizontal direction. The horizontal contraction means 202 contracts the result by (h + hd) pixels in the horizontal direction to eliminate lines and characters having a width of (h + hd) pixels or less in the horizontal direction, and the subsequent second horizontal extension means 203 extends the result by h pixels in the horizontal direction. In this way, horizontal line segments having gaps of hd pixels or less and a length greater than h pixels are extracted.
[0048]
Similarly, the first vertical extension means 204 connects broken line portions whose gaps are shorter than vd pixels by extending the binary image 20 from the image memory 2 by vd pixels in the vertical direction. The vertical contraction means 205 contracts the result by (v + vd) pixels in the vertical direction to eliminate lines and characters having a width of (v + vd) pixels or less in the vertical direction, and the subsequent second vertical extension means 206 extends the result by v pixels in the vertical direction. In this way, vertical line segments having gaps of vd pixels or less and a length greater than v pixels are extracted.
[0049]
The NOR circuit 207 performs a NOR operation on the outputs of the second horizontal extension means 203 and the second vertical extension means 206. As a result, the characters are erased and only the frame ruled lines remain, with the broken line portions converted into solid lines, and a binary image 208 is obtained in which the frame ruled lines and the background have the values "0" and "1", respectively.
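For one horizontal scan line, the extend, contract, extend sequence and the NOR combination can be sketched as follows. This is a literal reading of the run-length flows of FIGS. 3 and 4; the function names, the list representation, and the sample values h = 3 and hd = 1 are assumptions for illustration. In this literal reading the surviving run is displaced in the scan direction, and this text does not say how, or whether, that offset is compensated.

```python
def extend_line(line, n):
    """Extend each black run by n pixels in the scan direction."""
    out, count = [], 0
    for pixel in line:
        if pixel == 1:
            count = n
            out.append(1)
        elif count > 0:
            count -= 1
            out.append(1)
        else:
            out.append(0)
    return out


def shrink_line(line, n):
    """Remove the first n pixels of each black run."""
    out, count = [], 0
    for pixel in line:
        if pixel == 0:
            count = 0
            out.append(0)
        elif count >= n:
            out.append(1)
        else:
            count += 1
            out.append(0)
    return out


def extract_dashed_run(line, h, hd):
    """Means 201 -> 202 -> 203 applied to one scan line."""
    joined = extend_line(line, hd)          # close gaps of hd pixels or less
    long_only = shrink_line(joined, h + hd) # drop short runs such as characters
    return extend_line(long_only, h)        # restore part of the length


def nor_lines(a, b):
    """Pixelwise NOR: ruled-line pixels become 0, background becomes 1."""
    return [0 if (p or q) else 1 for p, q in zip(a, b)]


dashed = [0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0]
print(extract_dashed_run(dashed, h=3, hd=1))
print(extract_dashed_run([0, 1, 1, 0, 0, 0, 0, 0], h=3, hd=1))
```

The dashed run survives as one solid run, while the isolated two-pixel run is erased.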
[0050]
The first and second horizontal and vertical extension means 201, 203, 204, and 206 perform the same processing as the horizontal and vertical extension means 22 and 24 of the first ruled line extraction means 3. The horizontal and vertical direction contraction means 202 and 205 perform the same processing as the horizontal and vertical direction contraction means 21 and 23 of the first ruled line extraction means 3, and will not be described in detail.
[0051]
The second corner detecting means 8 performs the same processing as the first corner detecting means 4 and detects the corner point information of the solid line and broken line portions output by the second ruled line extracting means 7, so a detailed description is omitted.
[0052]
Next, the format information storage means 9 will be described with reference to FIGS. 13 and 14. FIG. 13A shows an input image of the form 500, FIG. 13B shows an output image of the first ruled line extraction means 3, and FIG. 13C shows an output image of the second ruled line extraction means 7. The format information storage means 9 stores the solid line frame structure information shown in FIG. 13B and the structure information of the frames separated by broken lines; for example, the format shown in FIG. 14 corresponds to this.
[0053]
In FIG. 14, the ID numbers correspond one-to-one to the solid line frames, for example ID number 1 to the frame 501 and ID number 2 to the frame 502. When a solid line frame contains broken-line frames, the broken line flag is set to "1" and the number of digits of the solid line frame is set. Further, when the target frame flag is "1", the solid line frame and its broken-line frames are target frames to be notified to the character cutout means. In the form 500 of FIG. 13B, there are five-digit, three-digit, and three-digit broken-line frames in the frames 502, 504, and 505, respectively, so 5, 3, and 3 are set as the number of digits at the positions corresponding to ID numbers 2, 4, and 5 in FIG. 14. In FIG. 14, the x and y coordinates are registered in the order of upper left, upper right, lower right, and lower left of each solid line frame, and the width and height of each solid line frame are also registered. Further, the allowable value in FIG. 14 indicates the allowable error range of the width and height of a frame when the frame structures are collated.
[0054]
The format information storage means 9 registers frame structure information of a plurality of forms, and each form is identified by a record number.
[0055]
Next, the frame structure matching means 10 will be described with reference to the drawings. The frame structure matching means 10 first recognizes the solid line frame structure of the input form with reference to the format information shown in FIG. 14. When the target frame flag is "1", it determines the positions of the broken line intersections on the basis of the corner point information at the positions marked ◯ in the drawing, and notifies the character cutout means 11 of the coordinates of the four corner points of the solid line frame and the broken-line frames.
[0056]
FIG. 15 is a flowchart showing a processing procedure in the frame structure matching means 10 and will be described for each step.
[0057]
First, in step 511, the connection table and the rectangle information table, which are frame structure information, are read from the rectangle detection unit 6. In step 512, the average value of the inclinations of the components connected from the read connection table is calculated as an image inclination value grad by (Equation 1).
[0058]
[Expression 1]

[0059]
Here, the average value of the inclinations of the components connected in the horizontal direction is used as the inclination value of the image, but other techniques may be used as long as the inclination of the image is known.
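Expression 1 is not reproduced in this text. As one possible reading of the description, the following sketch averages the slopes dy/dx of all pairs of components connected in the E direction. The function name `image_slope`, the dictionary layout of the connection table, and the `positions` mapping are assumptions for illustration, and the actual expression may differ.

```python
def image_slope(connection_table, positions):
    """Average slope of horizontally connected component pairs.

    connection_table: label -> {direction: neighbor label} (assumed layout).
    positions: label -> (x, y) component coordinates (assumed layout).
    """
    slopes = []
    for label, links in connection_table.items():
        east = links.get("E")
        if east is None:
            continue
        x0, y0 = positions[label]
        x1, y1 = positions[east]
        if x1 != x0:                       # skip degenerate pairs
            slopes.append((y1 - y0) / (x1 - x0))
    return sum(slopes) / len(slopes) if slopes else 0.0


table = {"e1": {"E": "e2"}, "e2": {"E": "e3", "W": "e1"}, "e3": {"W": "e2"}}
positions = {"e1": (0, 0), "e2": (100, 2), "e3": (200, 4)}
print(image_slope(table, positions))
```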
[0060]
Next, steps 513 to 518 are processes for discriminating a form most similar to the frame structure information from the rectangle detecting means 6 from among a plurality of forms registered in the format information storage means 9. First, in step 513, the frame structure information of the record number (i) is extracted from the format information storage means 9.
[0061]
Next, in step 514, the solid line frames of the frame structure information (i) are collated with the connection table and the rectangle information table, and the "accumulated frame difference degree" is calculated. Here, the "accumulated frame difference degree" represents the degree of difference of the entire form, obtained by accumulating the "frame difference degree" calculated for each frame. The "frame difference degree" indicates the "difference" in frame shape when a frame of the frame structure information is associated with a frame of the read form, and is defined so that its value increases as the "difference" increases. That is, the greater the difference between the frame structure information and the format of the read form, the greater the "accumulated frame difference degree".
[0062]
Here, the frame dissimilarity and the cumulative frame dissimilarity, which represent the "difference" in frame shape, are used for collating the frame structures, but a frame coincidence and a cumulative frame coincidence, which represent the "match" of frame shapes, may be used instead.
[0063]
Step 515 compares the cumulative frame dissimilarity calculated in step 514 with the minimum value (hereinafter referred to as the minimum cumulative frame dissimilarity) among the cumulative frame dissimilarities calculated so far. If it is smaller, the process proceeds to step 516, and if it is larger, the process returns to step 513. In step 516, the current cumulative frame dissimilarity is set as the minimum cumulative frame dissimilarity, and the record number (i) of the reference format and the recognition frame table are recorded and updated. In step 517, the record number i is compared with the number n of forms registered in the format information. If i ≧ n, the process proceeds to step 518, and if i < n, i is set to i + 1 and the process returns to step 513.
[0064]
Step 518 outputs the recognition frame table as it is if the minimum cumulative frame dissimilarity is smaller than a preset threshold value, and outputs a result indicating that no corresponding form exists in the reference formats if it is larger. In step 519, for each recognized solid line frame whose target frame flag in the table of the corresponding format information is "1", the positions of the broken line intersections are estimated from the format information based on the coordinates of the four points of the solid line frame, actual corner points from the second corner detection means are searched for in the neighborhood of each estimated position, and the broken line intersections are detected.
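The form discrimination loop of steps 513 to 518 amounts to a minimum search followed by a threshold test, which can be sketched as follows. The function `select_form`, the caller-supplied `accumulated_difference` function, and the sample records are hypothetical; the text does not say how the case of a difference exactly equal to the threshold is handled, and this sketch treats it as no match.

```python
def select_form(format_records, accumulated_difference, threshold):
    """Return the 1-based record number of the best matching form, or None.

    accumulated_difference: function mapping a format record to its
    cumulative frame dissimilarity (supplied by the caller in this sketch).
    """
    best_record, best_value = None, float("inf")
    for i, record in enumerate(format_records, start=1):   # steps 513, 517
        value = accumulated_difference(record)              # step 514
        if value < best_value:                              # step 515
            best_record, best_value = i, value              # step 516
    if best_record is None or best_value >= threshold:      # step 518
        return None                                         # no corresponding form
    return best_record


# Toy dissimilarity: difference between a registered width and a measured one.
records = [{"width": 200}, {"width": 118}, {"width": 90}]
difference = lambda record: abs(record["width"] - 120)
print(select_form(records, difference, threshold=10))
print(select_form(records, difference, threshold=1))
```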
[0065]
Next, step 520 detects the target frames in the form. It calculates the four corner coordinates of each frame formed by the solid lines and the broken lines from the coordinates of the four corners of the solid line frame and of the broken line intersections, and notifies them to the character cutout means 11 as follows. For example, the solid line frame 720 in FIG. 23B is divided into five frames 721 to 725 by four broken lines. Let the coordinates of the solid line frame 720 be (x1, y1), (x2, y2), (x3, y3), and (x4, y4) in the order of the upper left point, upper right point, lower right point, and lower left point, let the coordinates of the upper broken line intersections be (XU0, YU0) to (XU3, YU3) in order from the left, and let the coordinates of the lower broken line intersections be (XL0, YL0) to (XL3, YL3). Then the four corner coordinates of the five target frames to be notified are as follows: the frame 721 is (x1, y1), (XU0, YU0), (XL0, YL0), (x4, y4) in the order of the upper left point, upper right point, lower right point, and lower left point, the frame 722 is (XU0, YU0), (XU1, YU1), (XL1, YL1), (XL0, YL0), and the other frames are notified in the same manner, as shown in the table in the drawing.
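The division of a solid line frame into target frames can be sketched as below. The function name `split_frame` and the numeric coordinates are illustrative assumptions; the corner ordering (upper left, upper right, lower right, lower left) follows the text.

```python
def split_frame(corners, upper, lower):
    """Split a solid line frame at its broken line intersections.

    corners: (upper left, upper right, lower right, lower left) of the frame.
    upper, lower: intersection points on the top and bottom edges, left to right.
    Returns one 4-tuple of corner points per target frame.
    """
    upper_left, upper_right, lower_right, lower_left = corners
    tops = [upper_left, *upper, upper_right]
    bottoms = [lower_left, *lower, lower_right]
    return [(tops[i], tops[i + 1], bottoms[i + 1], bottoms[i])
            for i in range(len(tops) - 1)]


# Four broken lines divide the frame into five target frames.
corners = ((0, 0), (100, 0), (100, 20), (0, 20))
upper = [(20, 0), (40, 0), (60, 0), (80, 0)]
lower = [(20, 20), (40, 20), (60, 20), (80, 20)]
for frame in split_frame(corners, upper, lower):
    print(frame)
```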
[0066]
Next, the solid line frame matching in step 514, the calculation of “cumulative frame difference”, and the detection of the broken line intersection in step 519 will be described in detail.
[0067]
The solid line frame matching and the calculation of the "cumulative frame difference" in step 514 will be described in detail with reference to FIGS. 16 to 19. FIG. 16 is a flowchart showing the procedure for collating the frame structure information registered in the format information storage means 9 with the connection table and the rectangle information table from the rectangle detection means 6 and calculating the "accumulated frame difference". FIG. 17A shows an actual image of a form used as a reference (hereinafter referred to as a reference form), and FIG. 17B shows its frame structure information. FIG. 18A shows an actual image of the read form (hereinafter referred to as an inspection form), FIG. 18B shows the connection table of the inspection form, and FIG. 18C shows its rectangle information table. FIG. 19 shows the search range. FIG. 20 shows the result of actual processing.
[0068]
Each step will be described in accordance with the processing flow for obtaining the solid line frame collation and the accumulated frame dissimilarity shown in FIG.
[0069]
First, in step 521, the frame coordinates of the format information are rotated by the inclination value grad according to (Expression 2) to correct the inclination.
[0070]
[Expression 2]

[0071]
Step 522 extracts the “starting point frame” as the alignment starting point from the format information, and ends if there is no starting point frame to be extracted. As the “start point frame”, for example, a frame close to the origin is sequentially selected.
[0072]
In step 523, the rectangular information table is searched for a frame whose height and width equal those of the “start point frame” within the allowable range described in the format information; such a frame is a “start point candidate frame”. In step 524, it is determined whether or not a “start point candidate frame” exists. If not, the process proceeds to step 525; if one exists, the four-point coordinates of the rectangle are stored as the start point candidate frame, and the process proceeds to step 526. Step 525 sets a predetermined search range centered on each of the four corner points of the start point frame, as shown in FIG. 19A, searches all components in the connection table of the inspection form, and, if components are found in all four search ranges, stores those four points as a start point candidate frame.
[0073]
If there is no “start point candidate frame” at step 526, the process returns to step 521 to select the next frame of the format information as the “start point frame” and start over.
[0074]
For example, in the case of the forms shown in FIGS. 17 and 18, first, the frame with ID number b1 is selected as the start point frame, with FIG. 17B as the format information. Next, when the rectangle information table in FIG. 18C is searched for rectangles of the same size as the frame b1, the rectangles r1 and r2 are selected as start point candidate frames and stored as shown in FIG. If the rectangles r1 and r2 do not exist, the combination of the constituent elements e1, e2, e3, and e4 and the combination of e5, e6, e10, and e9 are selected and stored as “start point candidate frames”.
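The size-based search of step 523 can be illustrated with the sketch below. The `Rect` record, the single tolerance `tol` applied to both width and height, and the function name are assumptions made for illustration; the actual layout of the rectangular information table is not given in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Rect:
    rid: str        # identifier such as "r1" (hypothetical field)
    x: float        # frame origin (upper left point)
    y: float
    width: float
    height: float


def find_start_candidates(start: Rect, table: List[Rect], tol: float) -> List[Rect]:
    """Return rectangles whose width and height match the start point frame within tol."""
    return [
        r for r in table
        if abs(r.width - start.width) <= tol and abs(r.height - start.height) <= tol
    ]
```

Only size is compared here, which is why two rectangles at different positions, like r1 and r2 in the example, can both become candidates.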
[0075]
Next, if there is a “start point candidate frame”, steps 527 to 539 align the “start point frame” with each “start point candidate frame” in turn, collate each frame of the format information with the rectangle information table to obtain the frame difference, and accumulate it as the cumulative frame difference. Finally, the combination that minimizes the “cumulative frame difference” is selected. The collation between the format information and the rectangular information table will be described with reference to FIG.
[0076]
First, in step 527, the relative distance (rx, ry) between the frame origin 551 of the “start point frame” b1 and the frame origin 550 of the “start point candidate frame” r1 is calculated by (Equation 3).
[0077]
[Equation 3]

[0078]
Next, step 528 sets the search range of the next target frame. For example, a search range 552 is set around the point 556 obtained by moving the frame origin (the upper left point of the frame) 557 of the “reference frame” b2 by the relative distance (rx, ry).
[0079]
In step 529, the next target frame, for example r2, is taken from the rectangular information table, and it is checked whether its height and width are within the allowable values and its frame origin 562 is within the search range. Step 530 checks whether or not such a rectangle exists. If it does, the rectangle is registered in the frame table and the process proceeds to step 532; if not, the process proceeds to step 531. Step 531 sets a predetermined search range 573 centered on each of the four points (574, 575, 576, and 577) obtained by moving the four point coordinates of the reference frame 572 by the relative distance (rx, ry), and searches the concatenation table for the four points, as shown in FIG.
[0080]
Step 532 registers the searched component point in the frame table. If the component point does not exist, the coordinates of the reference frame point are registered as they are.
In step 533, the “frame dissimilarity” d frame is calculated by (Equation 4). Here, the “frame dissimilarity” is a number indicating the degree of dissimilarity between the reference frame and the searched frame, and the larger this value, the more different the reference frame and the searched frame are.
[0081]
[Expression 4]

[0082]
In the present invention, the frame dissimilarity is the number of components that do not exist in the search range, but other evaluation formulas may be used; for example, the difference in distance between the reference frame point and the searched frame point (or its absolute value) may be used as the frame dissimilarity.
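The distance-based alternative mentioned above could look like the following sketch. It assumes that every reference frame point has a corresponding searched point and that "distance" means Euclidean distance; both are assumptions, since the text does not give a formula for this variant.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def distance_dissimilarity(reference: List[Point], found: List[Point]) -> float:
    """Sum of Euclidean distances between corresponding frame points (assumed metric)."""
    if len(reference) != len(found):
        raise ValueError("each reference point needs a corresponding searched point")
    return sum(math.dist(p, q) for p, q in zip(reference, found))
```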
[0083]
Step 534 cumulatively adds the “frame difference degree” to the “cumulative frame difference degree”.
In step 535, it is determined whether the format information table has a next frame. If there is, the process proceeds to step 536, and if not, the process proceeds to step 537.
Step 536 reads the next reference frame from the format information and returns to step 528. Next, in step 537, it is determined whether or not the “accumulated frame difference” is smaller than the “accumulated frame difference” calculated so far. If it is larger, the process proceeds to step 539, and if smaller, the process proceeds to step 538. Step 538 registers and updates the frame table in the recognition frame table. Step 539 determines whether there is another “start point candidate frame”, and repeats steps 528 to 539 until there is no “start point candidate frame”.
[0084]
If the above processing is performed for all “starting point candidate frames”, the recognition frame data and the accumulated frame difference degree with the smallest difference are finally obtained.
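The loop over start point candidate frames can be sketched as below. This is a simplified illustration, not the procedure of FIG. 16 itself: it skips the rectangle table lookup of steps 529 and 530 and searches components directly, it assumes the relative distance is the candidate origin minus the start point origin (Equation 3 is not reproduced in this text), and all function names are hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Frame = List[Point]  # four corner points; frame[0] is the frame origin (upper left)


def find_near(target: Point, points: List[Point], search: float) -> Optional[Point]:
    """Return the first point within +/-search of target on both axes, else None."""
    for p in points:
        if abs(p[0] - target[0]) <= search and abs(p[1] - target[1]) <= search:
            return p
    return None


def cumulative_dissimilarity(reference: List[Frame], start: Frame, candidate: Frame,
                             components: List[Point], search: float):
    """Align start with candidate, then count reference corners with no component nearby."""
    rx = candidate[0][0] - start[0][0]   # relative distance; sign convention assumed
    ry = candidate[0][1] - start[0][1]
    total, recognized = 0, []
    for frame in reference:
        found = []
        for (x, y) in frame:
            hit = find_near((x + rx, y + ry), components, search)
            if hit is None:
                total += 1               # frame dissimilarity: one per missing component
                found.append((x, y))     # register the reference coordinates as they are
            else:
                found.append(hit)
        recognized.append(found)
    return total, recognized


def best_alignment(reference, start, candidates, components, search):
    """Pick the start point candidate frame with the smallest cumulative dissimilarity."""
    best = None
    for cand in candidates:
        total, recognized = cumulative_dissimilarity(reference, start, cand, components, search)
        if best is None or total < best[0]:
            best = (total, cand, recognized)
    return best
```

Keeping only the running minimum mirrors steps 537 and 538, where the recognition frame table is updated whenever a smaller cumulative value is found.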
[0085]
FIG. 20 shows the processing result of the solid line frame matching and the cumulative frame dissimilarity. The start point candidate frames of the start point frame b1 are r1 and r2 shown in FIG. As shown in FIG. 20B, when r1 is the start point candidate, the frame b2 corresponds to the frame r2, the frame b3 to the components e7, e8, and e11, and the frame b4 to the frame r3. Since the frame dissimilarity is 1 only for b3, the cumulative frame dissimilarity is 1. On the other hand, when r2 is the start point candidate, there is almost no corresponding frame, and the cumulative frame dissimilarity is 10. The recognition frame data with r1, which gives the minimum cumulative frame dissimilarity, as the start point candidate is therefore registered in the recognition frame table as shown in FIG.
[0086]
Next, the detection of the broken line intersection in step 519 will be described in detail with reference to the drawings. FIGS. 21 and 22 are flowcharts showing a procedure for detecting a broken line intersection based on corner point information from the second corner detection means. In FIG. 21, N is the total number of broken lines existing in the target frame to be read.
[0087]
The procedure for detecting the broken line intersection will be described step by step based on the flowcharts shown in FIGS.
[0088]
First, in steps 601 to 604, the span between the solid line intersections is equally divided by the number of digits to calculate the candidate positions of the broken line intersections. Step 601 resets the control variable j. Step 602 equally divides the span between the vertices of the solid line frame to obtain the coordinates of a broken line intersection candidate. In step 603, j is incremented. In step 604, it is determined whether j is N or more. If it is N or more, the process proceeds to step 605. Otherwise, the process returns to step 602.
[0089]
The calculation in step 602 will be described in more detail. FIG. 23A shows the positional relationship between broken line intersection candidates. For example, by dividing the upper and lower sides of the frame 701 into five equal parts, the coordinates of the upper broken line intersection candidates CU(0) to CU(3) and the lower broken line intersection candidates CL(0) to CL(3) are obtained. That is, if the upper two vertices of the solid line frame are (x1, y1) and (x2, y2) and the number of digits delimited by the broken lines is p, the candidate coordinates (xU(j), yU(j)) of the upper broken line intersections are obtained by the internal division calculation shown in (Equation 5). Candidate coordinates for the lower broken line intersections are obtained by the same calculation using the lower two vertices (x3, y3) and (x4, y4) of the solid line frame.
[0090]
[Equation 5]

[0091]
The above procedure is also performed for the frame 702 and the frame 703, and the coordinates of all broken line candidate intersections in the form are determined.
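Equation 5 is not reproduced in this text. The internal division it refers to can plausibly be sketched as follows, assuming the j-th candidate (counting from 0) lies at (j + 1)/p of the way from the first vertex to the second; the function name and this indexing are assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def broken_line_candidates(p1: Point, p2: Point, digits: int) -> List[Point]:
    """Divide the segment p1-p2 into `digits` equal parts and return the interior points."""
    if digits < 2:
        raise ValueError("at least two digits are needed to have a broken line")
    (x1, y1), (x2, y2) = p1, p2
    return [
        (x1 + (x2 - x1) * (j + 1) / digits, y1 + (y2 - y1) * (j + 1) / digits)
        for j in range(digits - 1)
    ]
```

For a five-digit frame this yields four candidates per side. To keep the lower candidates ordered from the left, the lower left vertex (x4, y4) would be passed first and the lower right vertex (x3, y3) second.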
[0092]
In the next steps 605 to 611, the corner information from the second corner detection means 8 is read one point at a time, it is determined whether each corner point lies in the vicinity of the candidate coordinates of a broken line intersection, and the corner points lying in the vicinity of each broken line intersection are collected into one group.
First, in step 605, the corner information is read point by point in the form of x coordinate, y coordinate, and direction change code. Next, step 606 resets the control variable j. Step 607 sets a neighboring area of ±d centered on the candidate coordinates of the broken line intersection and determines whether or not the corner point exists in that area; if it does, the process proceeds to step 608, and if not, to step 609.
[0093]
In step 608, the corner point is assigned to the corresponding candidate intersection group, and the process returns to step 605. At this time, a corner point in the vicinity of the candidate coordinates of a broken line intersection on the upper side of the frame is assigned to the group GU(j), and one in the vicinity of the candidate coordinates of a broken line intersection on the lower side of the frame is assigned to the group GL(j), so that the upper and lower groups are distinguished. This is because, under the condition for forming a pair of corner points described later, the direction change codes of the paired corner points differ between the upper and lower intersections. Step 609 increments the control variable j. In step 610, it is determined whether j ≧ N; if not, the process returns to step 607, and if so, it proceeds to step 611. In step 611, it is determined whether all corner points have been processed, which decides whether to proceed to the next step.
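The grouping of steps 605 to 611 can be sketched as follows. The sketch assumes the ±d neighboring area is a square applied to both axes and that a corner point joins only the first group it matches; the function and variable names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Corner = Tuple[float, float, int]  # x coordinate, y coordinate, direction change code


def _near(corner: Corner, cand: Point, d: float) -> bool:
    """True when the corner lies within +/-d of the candidate on both axes."""
    return abs(corner[0] - cand[0]) <= d and abs(corner[1] - cand[1]) <= d


def group_corner_points(corners: List[Corner], upper: List[Point],
                        lower: List[Point], d: float):
    """Assign each corner point to the upper group GU(j) or lower group GL(j)."""
    if len(upper) != len(lower):
        raise ValueError("upper and lower candidate counts must match")
    gu = [[] for _ in upper]
    gl = [[] for _ in lower]
    for corner in corners:
        for j in range(len(upper)):
            if _near(corner, upper[j], d):
                gu[j].append(corner)
                break                  # go on to the next corner point
            if _near(corner, lower[j], d):
                gl[j].append(corner)
                break
    return gu, gl
```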
[0094]
Next, steps 621 to 627 in FIG. 22 generate a pair of corner points that can constitute a T-shaped element for each group of corner points at each candidate intersection.
[0095]
First, step 621 resets the control variable j. Step 622 then generates a corner point pair for the group GU(j) or GL(j) of corner points for each broken line intersection. Step 623 determines whether a pair exists. If a pair exists, the process proceeds to step 624, and if not, the process proceeds to step 625. In step 624, the coordinates of the broken line intersection are calculated as the average of the coordinates of the paired corner points. Step 625 adopts, as the coordinates of the broken line intersection, the broken line intersection candidate coordinates obtained in step 602 by equally dividing the span between the vertices of the solid line frame. Next, in step 626, j is incremented, and in step 627, whether or not to proceed to the next step is determined by j ≧ N.
[0096]
As described above, the position of the broken line intersection of the form is determined by the procedure shown in FIGS. Specifically, as shown in FIG. 24, ±d neighboring areas 705 and 706 are set around the candidate positions 707 and 708 of the broken line intersection. The group GU(j) on the upper side of the frame is recognized as a pair because a corner point with the direction change code “31” lies to the right of a corner point with the direction change code “17”, and the average of their coordinates is taken as the coordinates of the broken line intersection. Likewise, the group GL(j) on the lower side of the frame is recognized as a pair because a corner point with the direction change code “53” lies to the right of a corner point with the direction change code “75”, and the average of their coordinates is used as the coordinates of the intersection.
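The pairing rule and the fallback of step 625 can be illustrated with the sketch below. The code pairs (17, 31) and (75, 53) are taken from the FIG. 24 example; whether other code pairs are also valid is not stated, and the function name is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Corner = Tuple[float, float, int]  # x coordinate, y coordinate, direction change code

UPPER_PAIR = (17, 31)  # (left code, right code) for the upper side, from the example
LOWER_PAIR = (75, 53)  # (left code, right code) for the lower side, from the example


def resolve_intersection(group: List[Corner], pair_codes: Tuple[int, int],
                         candidate: Point) -> Point:
    """Return the average of a valid corner pair, or the candidate if no pair exists."""
    left_code, right_code = pair_codes
    lefts = [c for c in group if c[2] == left_code]
    rights = [c for c in group if c[2] == right_code]
    for lx, ly, _ in lefts:
        for rx, ry, _ in rights:
            if rx > lx:  # the right-hand corner must lie to the right
                return ((lx + rx) / 2, (ly + ry) / 2)
    return candidate     # fall back to the equally divided position
```

Falling back to the candidate position means a faint or missing broken line still yields a usable frame boundary.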
[0097]
Next, the character cutout unit 11 will be described. The character cutout means 11 cuts out a binary image of an actual character from the image memory 2 and sends it to the character recognition means 12. The processing will be described in detail with reference to FIG.
[0098]
FIG. 25 shows a part of a binary image in which characters are drawn in a frame of a certain form. Here, 711 to 714 are the four corner points of the frame notified from the frame structure collating means 10, 715 is the area a defined by the four corner points 711 to 714, 716 is the area b obtained by reducing the area a (715) by the width of the frame line, 717 is a character drawn in the frame, and 718 is the character frame.
[0099]
At the time of character cut-out processing, if the binary image is cut out at the area a (715), a part of the character frame 718 is cut out with it, and this extra image may reduce the character recognition rate. Therefore, the image is cut out from the image memory 2 at the area b (716), which is smaller than the area a (715) by the width of the frame line, so that the character area excluding the character frame 718 is passed to the character recognition means 12 to recognize the character 717. This prevents a drop in the character recognition rate.
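The reduction from area a to area b can be sketched as follows. The sketch assumes integer pixel coordinates, an axis-aligned frame, corner points on the outer edge of the frame line, and an image stored as a list of rows; none of these details are specified in the text, and the function name is hypothetical.

```python
from typing import List, Tuple


def cut_character_area(image: List[List[int]], corners: List[Tuple[int, int]],
                       line_width: int) -> List[List[int]]:
    """Cut out area b: the bounding box of the corners shrunk by the frame line width."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    left, right = min(xs) + line_width, max(xs) - line_width
    top, bottom = min(ys) + line_width, max(ys) - line_width
    if left > right or top > bottom:
        raise ValueError("frame is too small for the given line width")
    return [row[left:right + 1] for row in image[top:bottom + 1]]
```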
[0100]
The character recognition means 12 uses, for example, a character recognition method using a neural network (“Handwritten Kanji recognition by PDP model”, Transactions of the Institute of Electronics, Information and Communication Engineers, Vol. J73-D-II, No. 8, pp. 1268-1274, 1990). Since the detailed processing is an existing technique, the above document is cited and the description is omitted.
[0101]
In this way, it is possible to identify a form and recognize characters in the designated target frame.
(Embodiment 2)
The second embodiment of the present invention will be described below with reference to FIGS.
[0102]
FIG. 26 is a block diagram of the form recognition apparatus according to the second embodiment of the present invention. 1 is image input means for reading a form document and obtaining a binary image, 2 is an image memory for storing the binary image, 3 is first ruled line extracting means for extracting horizontal and vertical solid ruled lines of the binary image, 4 is first corner detecting means for detecting corners of a pattern composed of solid ruled lines, 800 is first component detecting means for detecting, from combinations of the corner points, components formed by the bending or intersection of ruled lines, namely L-shaped elements, T-shaped elements, cross-shaped elements, and I-shaped elements, 801 is second component detection means for generating new components by grouping the components together, 6 is rectangle detection means for connecting the components and outputting the connection form as frame structure information, 9 is format information storage means for storing in advance format information of a plurality of forms to be read, 10 is frame structure collating means for collating the rectangular structure in the form with the format information based on the frame structure information to determine the type of the form and detecting a character reading target frame in the form, and 11 is character cutout means for cutting out the character area of the character reading target frame.
[0103]
The operation of the form recognition device configured as described above will be described. The first component detection unit 800 and the second component detection means 801, which differ from the block configuration of the form recognition device according to the first embodiment of the present invention, will be described in detail.
[0104]
First, the first component detection unit 801 will be described. The first component detection unit 801 is basically the same as the component detection unit 5 of the first embodiment, with an added determination condition, shown in the figure up to (d), for detecting an I-shaped element, which indicates a break in a ruled line. Therefore, an I-shaped element is detected in addition to the L-shaped element, the T-shaped element, and the cross-shaped element, and is notified to the second component element detecting unit 802.
[0105]
Next, the second component detection unit 802 will be described with reference to FIGS. FIG. 28 shows a determination condition for grouping of components, and FIG. 29 is a flowchart showing a procedure for grouping components from the first component detector. FIG. 30 shows the processing result of the second embodiment of the present invention.
[0106]
FIGS. 28A to 28C show examples in which I-shaped elements are grouped to detect L-shaped elements, T-shaped elements, and cross elements, respectively. FIG. 28D shows an example in which I-shaped elements and L-shaped elements are grouped to detect T-shaped elements, and FIG. 28E shows an example in which I-shaped elements and T-shaped elements are grouped to detect cross elements.
[0107]
Next, the processing procedure of the second component element detection unit 802 will be described for each step with reference to FIG. First, in step 820, it is determined whether or not the target component is an I-shaped element. If it is an I-shaped element, the process proceeds to step 821. If not, the process proceeds to step 828. Step 821 searches for other components within a preset search range.
[0108]
In step 822, if the other component is not an I-shaped element, the process proceeds to step 824. If the other component is an I-shaped element, the process proceeds to step 823, where components arranged as shown in the figures are grouped, and the average value of the coordinates of the grouped components and the logical sum of their shape codes are registered as the updated coordinates and shape code of the grouped component.
[0109]
In step 824, it is determined whether the other component is an L-shaped element. If it is not an L-shaped element, the process proceeds to step 826. If it is an L-shaped element, the grouping shown in the figure is performed. In the same manner as described above, the average value of the coordinates of the grouped components and the logical sum of their shape codes are registered as the updated coordinates and shape code of the grouped component.
[0110]
In step 826, it is determined whether the other component is a T-shaped element. If it is not a T-shaped element, the process proceeds to step 828. If it is a T-shaped element, the grouping shown in the figure is performed, and the coordinate values and shape code are updated and registered in the same way. In step 828, if a next component exists, the process returns to step 820 and is repeated; if not, the process ends.
[0111]
Also, an I-shaped element that does not satisfy the grouping conditions of FIGS. 28A to 28E and exists in isolation is not registered. However, L-shaped elements, T-shaped elements, and cross elements that exist independently are registered as they are.
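The update rule used when grouping (average of the coordinates, logical sum of the shape codes) can be sketched as below. The actual shape code format is not given in this text, so the sketch assumes a hypothetical 4-bit mask with one bit per ruled line direction leaving the point; the search range check and the arrangement conditions of FIG. 28 are omitted.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical bit assignment: one bit per direction in which a ruled line extends.
UP, RIGHT, DOWN, LEFT = 1, 2, 4, 8
_NAMES = {1: "I", 3: "T", 4: "cross"}


@dataclass
class Component:
    x: float
    y: float
    shape: int  # bitwise OR of the direction bits above


def group_components(parts: List[Component]) -> Component:
    """Merge components: average the coordinates and OR the shape codes."""
    if not parts:
        raise ValueError("nothing to group")
    shape = 0
    for p in parts:
        shape |= p.shape
    n = len(parts)
    return Component(sum(p.x for p in parts) / n, sum(p.y for p in parts) / n, shape)


def element_kind(shape: int) -> str:
    """Classify a shape code by the number and layout of its arms."""
    arms = bin(shape).count("1")
    if arms == 2:
        # Two opposite arms form a straight line rather than a bend.
        return "straight" if shape in (UP | DOWN, LEFT | RIGHT) else "L"
    return _NAMES.get(arms, "none")
```

Under this assumed encoding, two I-shaped elements with perpendicular arms merge into an L-shaped element, and an I-shaped element merged with an L-shaped element gives a T-shaped element.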
[0112]
The processing result of the component detection means of Embodiment 2 of the present invention will be described with reference to the drawings. FIG. 30A is an original image of a form having a gentle curvature at the corners of the frame. FIG. 30B shows the result of detecting corner points after the first ruled line extracting means has extracted line segments of a predetermined length or more; the corner portions of the frame line are interrupted. Reference numerals 831, 832, and 833 in FIG. 30B are I-shaped elements detected by the first component detection unit from the corner points of the broken parts of the frame. Similarly, reference numerals 834 and 835 in FIG. 30B denote L-shaped elements detected by the first component detecting means. FIG. 30C shows the components grouped by the second component detection means; the L-shaped element 836 is obtained by grouping the I-shaped elements 831 and 832. Although only the left side is described and illustrated in FIG. 30, the same applies to the right side. The T-shaped element 837 is obtained by grouping the I-shaped element 833 and the L-shaped element 834. The L-shaped element 838 is notified as it is because there is no component to be grouped with it. In this way, even when the corner portion of the frame has a gentle curvature, the intersection of the frames can be accurately recognized.
[0113]
[Effect of the invention]
As described above, the effects of the present invention are as follows. First, a solid line and a dashed ruled line of a binary image are extracted, components are detected from combinations of corner shapes of the ruled lines, components are connected to each other, and a rectangle is detected. By collating with the format information of multiple forms, even if the form is tilted and input as an image or there is a broken ruled line in the form, the character frame can be accurately detected and character recognition can be performed, A highly reliable form recognition device can be realized.
[0114]
Secondly, a break in a ruled line that arises during the frame ruled line extraction process is detected as an I-shaped element, and I-shaped elements, or I-shaped elements and other elements, are then grouped to detect constituent elements. Thus, a highly reliable form recognition apparatus can be realized that accurately recognizes the intersections of the frames even when the frame is broken or its corner portions have a gentle curvature.
[Brief description of the drawings]
FIG. 1 is a block connection diagram of a form recognition apparatus according to a first embodiment of the present invention.
FIG. 2 is a block connection diagram of first ruled line extraction means in the form recognition apparatus according to the first embodiment.
FIG. 3 is a flowchart showing a processing procedure of line segment shrinking means in the form recognition apparatus of the first embodiment.
FIG. 4 is a flowchart showing a processing procedure of line segment extension means in the form recognition apparatus according to the first embodiment.
FIG. 5 is a diagram for explaining direction coding and direction code change point detection of the first corner detection unit in the form recognition apparatus according to the first embodiment.
FIG. 6 is a diagram showing a correspondence relationship between a direction code of a first corner detection unit and an actual direction in the form recognition apparatus according to the first embodiment.
FIG. 7 is a diagram showing a specific example of the direction change code of the first corner detection means in the form recognition apparatus of the first embodiment.
FIG. 8 is a diagram showing a specific example of a component by a combination of corner points of the component detection means in the form recognition apparatus according to the first embodiment.
FIG. 9 is an explanatory diagram of the description form of a component in the component extraction unit in the form recognition apparatus according to the first embodiment.
FIG. 10 is a conceptual diagram showing a connection relationship between components of a rectangle detection unit in the form recognition apparatus according to the first embodiment.
FIG. 11 is a diagram showing a specific example of rectangle detection by rectangle detection means in the form recognition apparatus according to the first embodiment.
FIG. 12 is a block connection diagram of second ruled line extraction means in the form recognition apparatus according to the first embodiment.
FIG. 13 is a view showing a form image for explaining frame structure information of format information storage means in the form recognition apparatus of the first embodiment.
FIG. 14 is a diagram showing a specific example of a format registered in the frame structure information of the format information storage unit in the form recognition apparatus of the first embodiment.
FIG. 15 is a flowchart showing a processing procedure of a frame structure matching unit in the form recognition apparatus according to the first embodiment.
FIG. 16 is a flowchart showing a processing procedure of solid line frame collation and calculation of cumulative frame dissimilarity by a frame structure collating unit in the form recognition apparatus according to the first embodiment.
FIG. 17 is a diagram showing a reference form and its frame structure information for explaining the solid line frame matching process of the frame structure matching unit in the form recognition apparatus according to the first embodiment.
FIG. 18 is a diagram showing an input form, element connection information, and rectangular information for explaining the solid line frame matching process of the frame structure matching unit in the form recognition apparatus according to the first embodiment.
FIG. 19 is a diagram showing a search range for a solid line frame collation of a frame structure collating unit in the form recognition apparatus according to the embodiment.
FIG. 20 is a diagram showing a specific example of a solid line frame matching processing result of the frame structure matching unit in the form recognition apparatus according to the first embodiment.
FIG. 21 is a flowchart showing a procedure for detecting a broken line intersection of a frame structure matching unit in the form recognition apparatus according to the first embodiment.
FIG. 22 is a flowchart showing a procedure for detecting a broken line intersection of a frame structure matching unit in the form recognition apparatus according to the first embodiment.
FIG. 23 is a diagram showing the positional relationship between candidate intersections by broken lines of the frame structure matching unit in the form recognition apparatus according to the first embodiment.
FIG. 24 is a diagram showing a specific example of corner points that form a pair in the broken line detection process of the frame structure matching unit in the form recognition apparatus according to the first embodiment.
FIG. 25 is a diagram for explaining processing of a character extraction unit in the form recognition apparatus according to the first embodiment.
FIG. 26 is a block connection diagram of component detection means of the form recognition apparatus in Embodiment 2 of the present invention.
FIG. 27 is a diagram showing a specific example of an I-shaped component by a combination of corner points of the first component detecting means in the second embodiment of the present invention.
FIG. 28 is a diagram showing combinations of constituent elements that can be grouped by the second constituent element detection unit according to the second embodiment of the present invention.
FIG. 29 is a flowchart showing the processing procedure of the second component element detection means according to the second embodiment of the present invention.
FIG. 30 is a diagram showing a processing result of the component element detection unit according to the second embodiment of the present invention.
FIG. 31 is a block diagram of a conventional ruled line recognition apparatus.
[Explanation of symbols]
1 Image input means
2 Image memory
3 First ruled line extraction means
4 First corner detection means
5 Component detection means
6 Rectangle detection means
7 Second ruled line extraction means
8 Second corner detection means
9 Format information storage means
10 Frame structure verification means
11 Character cutting means
12 Character recognition means
20 Binary image
21 Horizontal contraction means
22 Horizontal extension means
23 Vertical contraction means
24 Vertical extension means
25 NOR circuit
201 First horizontal extension means
202 Horizontal contraction means
203 Second horizontal extension means
204 First vertical extension means
205 Vertical contraction means
206 Second vertical extension means
207 NOR circuit
801 First component detection means
802 Second component detection means

Claims

A form recognition apparatus comprising: image input means for reading a form document including frame ruled lines and outputting a binary image; an image memory for storing the binary image; first ruled line extracting means for extracting, from the binary image stored in the image memory, ruled lines from which broken lines have been deleted; first corner detecting means for detecting corners of the ruled lines, with broken lines deleted, from the first ruled line extracting means; second ruled line extracting means for extracting, from the binary image stored in the image memory, ruled lines in which broken lines have been connected; second corner detecting means for detecting corners of the ruled lines, with broken lines connected, from the second ruled line extracting means; component detection means for detecting components such as ruled line bends and crosses from combinations of the corners from the first corner detecting means; rectangle detection means for interconnecting the components from the component detection means and outputting frame structure information; format information storage means for storing in advance format information that holds the frame structure information of a plurality of forms and a broken line flag indicating intersections of broken lines; frame structure matching means for sequentially collating the frame structure information with the format information to collate the frame structure of the solid line frames, determining the type of the form document from the collation result, detecting, from the corners from the second corner detecting means and the broken line flag of the format information, broken line intersections at which the target frame specified by the frame structure information and the broken lines intersect, and outputting coordinates of the target frame; character cutout means for cutting out a character area from the image memory based on the coordinates of the target frame; and character recognition means for recognizing characters from the cut out character area.

The form recognition apparatus according to claim 1, wherein the frame structure collating means sequentially collates the frame structure information with the format information to obtain a frame dissimilarity, cumulatively adds the frame dissimilarity, and determines the form type from the cumulative frame dissimilarity taken as the dissimilarity of the entire form.

The form recognition apparatus according to claim 2, wherein the frame structure collating means sequentially collates the frame structure information with the format information using the number of components not present in the search range as the frame dissimilarity.

The form recognition apparatus according to claim 2, wherein the frame structure collating means sequentially collates the frame structure information with the format information using the total sum of differences in distances between corresponding components as the frame dissimilarity.