JP3727422B2 - Character recognition apparatus and method

Info

Publication number: JP3727422B2
Application number: JP23332496A
Authority: JP
Inventors: 磨理子竹之内; 好克井藤
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 1996-09-03
Filing date: 1996-09-03
Publication date: 2005-12-14
Anticipated expiration: 2016-09-03
Also published as: JPH1078997A

Description

【０００１】
【発明の属する技術分野】
本発明は、帳票や名刺等の予め記載項目が明らかな文書画像を認識する文字認識装置及びその方法に関する。
【０００２】
【従来の技術】
近年、文字認識装置が普及し、その文字認識の精度の向上が望まれている。
図１６は、特開昭６３ー３１１４９２号公報に記載の従来の文字認識装置の構成図である。この文字認識装置は、画像入力部１と、画像メモリ２と、サブ文字パターン切り出し部３と、文字パターン抽出部４と、認識部５と、認識辞書６と、単語処理部７と、単語辞書８と、表示部９とを備えている。
【０００３】
サブ文字パターン切り出し部３は、漢字であれば「偏」と「旁」とのように文字の構成要素であるサブ文字パターンを文書画像から切り出す。文字パターン抽出部４は切り出されたサブ文字パターンの組み合わせをし、認識部５はその文字パターンの組み合わされた文字パターンを認識して、候補文字列を複数作成する。単語処理部７は、単語辞書８を照合して作成された候補文字列から一の単語を認識結果として確定する。
【０００４】
【発明が解決しようとする課題】
ところで、従来の装置では、様々なサブ文字パターンの組み合わせの可能性を考慮して文字認識をしているので、文字を認識するまでの処理に時間を要する。更に、複数の作成された候補文字列を単語辞書と照合して一の単語を決定するけれども、単語辞書との照合ができれば、決定された単語を正しい認識結果であるとして出力していたので、その認識精度は必ずしも高くない。
【０００５】
本発明は上記課題に鑑み、複数項目が記載された文書の記載内容を短時間で精度よく文字認識する文字認識装置及びその方法を提供することを目的とする。
【０００７】
【課題を解決するための手段】
上記課題を解決するために本発明は、対象とする文書画像が複数の記載項目に相当する文字列画像からなり、そのような文書画像の文字認識をする文字認識装置であって、文書画像の入力を受け付けると文字列画像の高さに基づいて又は、指示を受けると指示に基づいて文字画像を切り出す文字切出手段と、字種別に文字特徴が登録された認識辞書と、前記文字切出手段で切り出された文字画像を前記認識辞書の全範囲又は指示された範囲で照合して複数の文字候補を選択する文字候補選択手段と、認識対象とする文書画像の全ての記載項目に関連する単語を属性ごとに分類して登録している単語辞書と、前記文字候補選択手段で選択された文字画像の連続した文字候補を組み合わせて及び単独で前記単語辞書の全範囲又は指示された範囲で照合し、一致する単語があれば単語候補としてその属性とともに抽出し、一致するものがなければ属性を未定義として文字候補をそのまま単語候補として抽出する単語抽出手段と、前記単語抽出手段で抽出された単語候補をキーワードとして文字列画像単位で記載項目に分類する記載項目分類手段と、記載項目ごとに記載内容である単語の関連する属性、記載条件の一覧を記録した関連ルールテーブルと、前記単語抽出手段で単語辞書の全範囲で照合して得られた単語候補が前記関連ルールテーブルの当該記載項目の内容を満たすか否かを判定する関連判定手段と、前記関連判定手段で満たさないと判定されたときに、前記文字切出手段、文字候補選択手段又は単語抽出手段に所定の指示を与える指示手段と、前記関連判定手段で満たすと判定された単語候補と指示手段の指示に従い抽出された単語候補とを出力する出力手段とを備えることとしているので、指示手段は、関連判定手段が記載項目ごとに記載内容の属性等を満たさないと判定した単語候補を誤認識とみなして、文字切出手段等に指示する。したがって、認識精度が向上する。
【０００８】
【発明の実施の形態】
以下、本発明に係る文字認識装置を図面を用いて説明する。
図１は、本発明に係る文字認識装置の一実施の形態の構成図である。この文字認識装置は、画像入力部１０１と、文字切り出し部１０２と、認識辞書１０３と、文字認識部１０４と、単語辞書１０５と、単語照合部１０６と、項目分類部１０７と、項目関連ルール記憶部１０８と、再処理判定部１０９と、結果出力部１１０とを備えている。
【０００９】
画像入力部１０１は、スキャナ等からなり、オペレータにより用意された記載項目が予め決められている原稿をＬ／Ｅ（Light/Electric）変換等して、２値データで構成される画素（黒画素と白画素）の集合である文書画像にして文字切り出し部１０２に通知する。図２は、文字切り出し部１０２に通知された名刺の文書画像２０１を示している。
【００１０】
文字切り出し部１０２は、画像入力部１０１から文書画像の通知を受けると、文書画像を走査して、黒画素の数を縦方向又は横方向に数え、走査線ごとの黒画素の分布から文字列画像を抽出する。この抽出した文字列画像に番号を付し、その位置例えば、文字列画像を外接する矩形の左上端点と右下端点との座標を求め、その文字列画像の高さを計算する。
【００１１】
次に、この文字列画像の存在方向と直角方向に文字列画像を走査して、その黒画素の分布からその文字列画像の高さとほぼ同一の幅をもつ文字画像を切り出す。即ち、各文字列画像から全角サイズの文字画像を切り出す。切り出した文字画像に番号を付して、その位置とともに文字認識部１０４に通知する。
図３は、図２に示した文書画像２０１から抽出された文字列画像Ｌ１〜Ｌ５と、各文字列画像ごとに切り出された文字画像Ｃ１，Ｃ２，…を示している。
【００１２】
また、文字切り出し部１０２は、再処理判定部１０９から後述する再処理条件テーブル８０１の内容に基づいて、文字画像切り出し条件とその文字画像の番号（処理範囲）との通知を受けると、その文字画像切り出し条件例えば等ピッチやプロポーショナルピッチに従い通知された番号の範囲の文字画像を処理範囲として新たな文字画像を切り出す。新たに切り出した文字画像は、上記の処理と同様であり、新たに各文字画像に番号を付し、その位置やラン情報とともに文字認識部１０４に通知される。
【００１３】
認識辞書１０３は、文字コードとその文字コードに対応する文字の標準特徴ベクトルとを登録している。ここで文字コードには、漢字、平仮名、カタカナ、数字、英字のコードの他に記号等のコードも含まれている。この認識辞書１０３は、漢字、…、記号等の字種別に文字コード等が登録されている。
文字認識部１０４は、文字切り出し部１０２から文字画像等の通知を受けると、その特徴ベクトルを抽出し、認識辞書１０３の全ての文字コードの標準特徴ベクトルと照合し、例えば、市街地距離を求めて、その値の小さい標準特徴ベクトルに対応する文字コードを上位から複数個抽出し、文字候補として記憶する。この文字画像の認識処理を全文字列画像の全ての文字画像について行うと、単語照合部１０６を起動する。
【００１４】
図４は、図３に示した文字画像を認識処理した結果を説明するための図である。各文字列画像Ｌ１〜Ｌ５の文字画像Ｃ１，Ｃ２，…を縦列に示し、各文字候補の上位、１位〜３位を横列に記載している。この文字候補は、実際は文字コードで記憶されている。
例えば、文字列画像Ｌ１の文字画像Ｃ１の１位文字候補は「認」であり、２位文字候補は「誌」であり、３位文字候補は「詰」である。
【００１５】
また、文字認識部１０４は、再処理判定部１０９から文字画像認識条件の通知を受けている場合には、認識辞書１０３のその条件例えば文字種が数字だけ、英字、数字、記号だけ等に限定した照合範囲で上記と同様、複数個の文字コードを抽出し文字候補として記憶する。再切り出し文字画像についての再認識処理が全て終了すると単語照合部１０６を起動する。
【００１６】
単語辞書１０５は、図５に示すように単語欄５０１と、単語欄５０１に登録されている単語の総括的な内容を表わす属性欄５０２とを有し、各属性に対応して複数の単語が登録されている。なお、実際には、単語は文字コードで示されている。
即ち、「株式」、「有限」、「会社」、…等の単語の総括的な内容を表わす属性は「会社」であり、属性「メイル」には、「メイル」、「ＭＡＩＬ」、「ｊｐ」、…等の単語が含まれていることを示している。なお、この単語辞書１０５の内容は、認識対象の文書画像によって変更される。
【００１７】
本実施の形態では、文書画像２０１として「名刺」が対象とされているので、このような単語辞書１０５の内容となっている。
また、この属性欄５０２の「会社」〜「メイル」迄に含まれている単語欄５０１の各単語は、後述する記載項目を分類するキーワードとして用いられる。
単語照合部１０６は、文字認識部１０４に起動されると、文字認識部１０４で抽出された各文字列画像の先頭の文字候補から連続する文字候補を単語候補として読み出し、単語辞書１０５の単語欄に登録されている単語に一致するものがあれば、それを単語候補として決定し、その単語の属性とともに記憶する。この際、単語候補の読み出しは、上位の文字候補を優先し、１位同士の文字候補に一致する単語が単語辞書１０５に登録されていないときに、１位と２位、１位と３位のように文字候補を組み合わせて単語候補を読み出すこととする。この１位、２位、３位の文字候補を次の文字候補と組み合わせても単語が見つからないときには、１位の文字候補を単語候補として決定し、属性は「未定義」として記憶する。全ての文字列画像の全ての文字候補について、単語照合処理が終了すると項目分類部１０７を起動する。
【００１８】
図６は、単語辞書１０５を照合して得られた単語候補を示している。図４に示した文字列画像Ｌ１の場合には、文字画像Ｃ１，Ｃ２の１位文字候補の組み合わせが単語辞書１０５の属性「一般」に単語「認識」として登録されているので、文字画像Ｃ１，Ｃ２を単語候補「認識」として決定する。続いて、文字候補Ｃ３，Ｃ４の１位文字候補の組み合わせが属性「氏名」に単語「珠代」として登録されているので文字画像Ｃ３，Ｃ４を単語候補「珠代」として決定する。同様に文字画像Ｃ５，Ｃ６を単語候補「会社」と決定する。
【００１９】
ここで、文字画像Ｃ３，Ｃ４の２位と３位の文字候補を組み合わせると、単語辞書１０５の属性「会社」の単語「株式」と一致するけれども、文字候補の上位のものを組み合わせた「珠代」を単語候補としている。
文字列画像Ｌ２の文字画像Ｃ３，Ｃ４の場合には、１位文字候補同士の組み合わせ「大郎」は、単語辞書１０５にはないので、文字画像Ｃ３の２位文字候補「太」と文字画像Ｃ４の１位文字候補の組み合わせた単語「太郎」が単語辞書１０５の属性「氏名」に登録されているので単語候補として決定される。
【００２０】
文字列画像Ｌ５の文字画像Ｃ６，Ｃ１１の１位から３位迄の文字候補は全て単語辞書１０５に単語登録されていないので、１位の文字候補を単語候補「旧」、「卯」として属性を未定義の「未」とそれぞれしている。
また、単語照合部１０６は、再処理判定部１０９から文字画像の番号と単語辞書照合条件との通知を受けた場合には、文字認識部１０４で認識された連続する又は単独の文字候補を組み合わせて単語候補を決定する際、単語辞書１０５の照合範囲を通知された属性に従うようにして再決定する。
【００２１】
即ち、単語辞書照合条件として属性「会社」、「一般」が通知されているときには、文字列画像Ｌ１の文字画像Ｃ３，Ｃ４のそれぞれ２位と３位の文字候補を組み合わせた「株式」（属性「会社」）を単語候補と決定する。このとき、属性「氏名」に含まれる単語辞書１０５の「珠代」は排除される。
このようにして、単語候補が再決定されると、再処理判定部１０９に決定した単語候補をその文字画像の番号とともに通知する。
【００２２】
項目分類部１０７は、単語照合部１０６から起動されると、単語照合部１０６に記憶されている文字列画像単位で単語候補をキーワードとして予定されたどの記載項目であるか分類する。文字列画像Ｌ１では、文字画像Ｃ５、Ｃ６の単語候補「会社」が単語辞書１０５の属性「会社」にあるので記載項目が「会社」と分類される。なお、この際、属性「氏名」の単語候補「珠代」によって「氏名」に分類しようとしても、残りの「認識」、「会社」が属性「氏名」にはないので分類されない。文字列画像Ｌ２では、単語候補「名刺」「太郎」の属性「氏名」より記載項目が「氏名」と分類される。文字列画像Ｌ３，Ｌ４，Ｌ５の記載項目は、それぞれ「住所」、「電話」、「メイル」に分類される。なお、この記載項目の分類には、文字列画像のレイアウト等を参照して判断されるけれども、本発明の本旨と異なるので説明を省略する。
【００２３】
全ての文字列画像について記載項目の分類が終わると再処理判定部１０９を起動する。
項目関連ルール記憶部１０８は、図７に示す関連ルールテーブル７０１と図８に示す再処理条件テーブル８０１とを記憶している。
関連ルールテーブル７０１は、記載項目７０２ごとの記載内容に関連する属性７０３と記載内容の特殊条件７０４とを含んでいる。この関連ルールテーブル７０１から記載項目「会社」の関連する属性は「会社」と「一般」とが含まれており、他の属性は含まれないことがわかる。即ち、他の属性の単語候補を含んでいれば文字画像を誤認識したことになる。
【００２４】
特殊条件７０４から記載項目「住所」において、「〒」の次に記載されている文字画像は「３又は５桁の数字」であることがわかる。また、記載項目「電話」の文字列画像の最後は「４桁の数字」であることがわかる。
再処理条件テーブル８０１は、記載項目８０２と再処理条件８０３とを含み、再処理条件８０３には、その処理範囲８０４と文字画像切り出し条件８０５と文字画像認識条件８０６と単語辞書照合条件８０７とを含んでいる。
【００２５】
例えば、文字列画像が記載項目「会社」と分類された場合には、その文字列画像の単語照合部１０６において決定された単語候補の属性が関連する属性７０３に含まれていないときには、再処理条件８０３の処理範囲８０４を「関連する属性を満たさない文字画像の単語候補」と規定し、単語辞書照合条件８０７を「属性会社又は一般と限定」するよう規定している。これによって、単語照合部１０６の単語候補決定のための単語辞書１０５の照合範囲は、属性「会社」、「一般」の単語に限定される。
【００２６】
また、文字列画像が記載項目「電話」と分類されている場合には、単語候補の属性が関連する属性７０３に含まれていないときには、再処理条件８０３の処理範囲８０４を関連する属性を満たさない単語候補及びその単語候補に連なる「メイル」以外の単語候補と規定し、文字画像切り出し条件８０５をプロポーショナルピッチと、文字認識条件８０６を文字種を「数字」、「記号」と限定するよう規定している。
【００２７】
再処理判定部１０９は、項目分類部１０７に起動されると、項目関連ルール記憶部１０８に記憶されている関連ルールテーブル７０１を読み出す。次に、項目分類部１０７で文字列画像単位ごとに分類された記載項目に含まれる単語候補に関連ルールテーブル７０１の関連する属性７０３に反する単語候補の属性があるか否か、特殊条件７０４があるときに、その条件を満たさない単語候補があるか否かを判定する。いずれの判定でも否定のときには、全文字列画像についての再処理条件の判定が終了したか否かを判定し、終了していれば結果出力部１１０を起動する。終了していなければ、次の文字列画像について、関連ルールテーブル７０１の内容に反するか否かの判定を繰り返す。
【００２８】
いずれかの判定で肯定のときには、その文字列画像の認識に誤りがあるので、項目分類部１０７で分類されたその文字列画像の記載項目が「メイル」であるか「住所」または「電話」であるか、「会社」または「氏名」であるかを判定する。記載項目の判定をすると、項目関連ルール記憶部１０８に記憶されている再処理条件テーブル８０１からその記載項目８０２の再処理条件８０３を読み出す。
【００２９】
「メイル」と判定したときには、その関連する属性を満たさない単語候補とその単語候補に連なる「メイル」以外の単語候補の文字画像の番号とプロポーショナルピッチでの文字画像切り出しをする旨とを文字切り出し部１０２に通知し、認識辞書１０３の英字、数字、記号を照合対象とする旨を文字認識部１０３に通知し、単語辞書１０６の属性がメイル、英字、数字、記号の単語を単語照合の範囲とする旨を単語照合部１０７に通知する。
【００３０】
記載項目を「住所」と判定したときには、その特殊条件を満たさない単語候補の文字画像の番号と等ピッチの「３」又は「５」文字で文字画像を切り出す旨を文字切り出し部１０２に通知し、認識辞書１０３の数字を照合対象とする旨を文字認識部１０４に通知する。
記載項目を「電話」と判定したときには、その関連する属性を満たさない単語候補の文字画像の番号とプロポーショナルピッチでの文字画像を切り出す旨を文字切り出し部１０２に通知し、認識辞書１０３の数字、記号を照合対象とする旨を文字認識部１０４に通知する。
【００３１】
記載項目を「会社」と判定したときには、関連する属性を満たさない単語候補の文字画像の番号と単語辞書１０５の属性が会社、一般の単語を照合対象とする旨とを単語照合部１０６に通知する。
記載項目を「氏名」と判定したときには、関連する属性を満たさない単語候補の文字画像の番号と単語辞書１０５の属性が氏名、肩書の単語を照合範囲とする旨とを単語照合部１０６に通知する。
【００３２】
再処理判定部１０９は、例えば、文字列画像Ｌ１の場合には、記載項目を「会社」と判定し、その文字画像Ｃ３，Ｃ４とを関連する属性を満たさない「珠代」と認識しているので、単語照合部１０６に文字画像Ｃ３，Ｃ４の文字候補の単語辞書１０５の照合範囲を会社、一般に限定するよう単語照合部１０６に通知する。これによって、単語照合部１０６は、文字認識部１０４で認識された文字画像Ｃ３の２位文字候補「株」と文字画像Ｃ４の３位文字候補「式」とを組み合わせた属性「会社」の「株式」を単語候補として再認識する。
【００３３】
文字画像Ｌ３の場合には、記載項目を「住所」と判定し、その文字画像Ｃ２を特殊条件に反する英字「Ｍ」と認識している（図６）。その文字画像Ｃ２（図９（ａ））は、文字切り出し部１０２によって、文字画像切り出し条件８０５に従い、図９（ｂ）に示すように３個の文字画像Ｃ２１、Ｃ２２、２３として切り出される。これによって、文字認識部１０４は、図９（ｃ）に示すように認識辞書１０３と照合して文字候補を抽出する。単語照合部１０６は、単語辞書１０５と文字候補を照合して、文字画像Ｃ２１、Ｃ２２、Ｃ２３の１位文字候補を単語候補「１」、「２」、「１」とそれぞれ決定する。
【００３４】
文字列画像Ｌ５の場合には、記載項目を「メイル」と判定し、その文字画像Ｃ１〜Ｃ３以外の文字画像Ｃ４〜Ｃ１１を文字認識の再処理範囲としている（図１０（ａ））。文字切り出し部１０２は、再処理判定部１０９から通知されたプロポーショナルピッチに従い図１０（ｂ）に示すように、文字画像Ｃ４〜Ｃ１４を切り出す。文字認識部１０４は、再処理判定部１０９から通知された英文、数字、記号の範囲で認識辞書１０３と照合し、図１０（ｃ）に示すような文字候補を抽出する。単語照合部１０６は、再処理判定部１０９から通知された属性がメイル、英字、数字、記号の範囲で文字候補を組み合わせて単語辞書１０５と照合し、その属性とともに図１０（ｄ）に示すように単語候補を決定する。
【００３５】
なお、文字画像Ｃ１３，Ｃ１４の１位の文字候補の組み合わせ「ｌｐ」は属性「メイル」に存在しないので文字画像Ｃ１３の２位文字候補「ｊ」と文字画像Ｃ１４の１位文字候補「ｐ」とから単語候補「ｊｐ」が決定されている。
また、再処理判定部１０９は、単語照合部１０６から再処理の結果の通知を受けると、再処理の処理範囲外とした先の認識結果とともに結果出力部１１０に文字列画像の番号と、その認識項目と、その認識結果である文字コードとを通知する。
【００３６】
結果出力部１１０は、文字コード等を表示画面に表示させるためのビットマップデータを保持し、再処理判定部１０９から認識結果である文字コード等の通知を受けると、表示画面に図１１に示すような、文書画像の認識結果を表示するとともに、この認識結果を記憶しておく。
次に、本実施の形態の動作を図１２、図１３のフローチャートを参照して説明する。
【００３７】
画像入力部１０１は、オペレータからの文書原稿例えば「名刺」の入力を受け付け、２値化された文書画像に変換する（Ｓ１２０２）。
文字切り出し部１０２は、文書画像から文字列画像を抽出する（Ｓ１２０４）。抽出した全文字列画像から文字画像を全角文字として切り出す（Ｓ１２０６，Ｓ１２０８）。
【００３８】
次に文字認識部１０４は、認識辞書１０３の全字種を照合範囲として切り出された文字画像を照合して文字候補を抽出し、全文字列画像の全文字画像についてのこの照合を繰り返す（Ｓ１２１０，Ｓ１２１２）。
単語照合部１０６は、得られた連続する文字候補を単語辞書１０５の全属性の単語を対象として単語照合を行い、上位の文字候補の組合せを優先して単語候補とその属性との抽出を全文字列画像の全文字画像について終わるまで繰り返す（Ｓ１２１４，Ｓ１２１６）。全ての文字候補の組合せをすることなく、上位の文字候補の組合せによって単語照合の結果、一致する単語が単語辞書１０５に見つかれば単語照合ができたとするので、単語照合の時間は短縮される。この単語候補の認識誤りは、後の属性等の判定により修正ができる。
【００３９】
次に、項目分類部１０７は、単語候補をキーワードとして、文字列画像の記載項目を全文字列画像について分類する（Ｓ１２１８，Ｓ１２２０）。
再認識処理部１０９は、全文字列画像について再認識が終了したか否かを判定し（Ｓ１２２２）、終了していないときは、項目関連ルール記憶部１０８に記憶されている関連ルールテーブル７０１の内容に反するか否かを記載項目ごとに判定し（Ｓ１２２４）、反しないときはＳ１２２２に戻り、関連ルールテーブル７０１の関連する属性７０３と単語候補の属性とが一致しない又は、特殊条件７０４に反するときはＳ１３０２に移る。
【００４０】
Ｓ１３０２において、再処理判定部１０９は、文字列画像の記載項目が「メイル」であるか否かを判定する（Ｓ１３０２）。「メイル」でないときはＳ１３１６に移り、「メイル」のときは、再処理条件テーブル８０１の記載項目８０２の「メイル」の処理範囲８０４に記載の文字画像の番号を文字切り出し部１０２に通知する（Ｓ１３０４）。
【００４１】
文字切り出し部１０２は、通知された文字画像と文字画像切り出し条件８０５のプロポーショナルピッチとに従い、文字画像を再切り出しする（Ｓ１３０６）。
文字認識部１０４は、文字画像認識条件８０６の認識辞書１０３の文字種の英字、数字、記号を照合範囲として文字候補を認識し、再切り出しされた文字画像がなくなるまで処理を繰り返す（Ｓ１３０８，Ｓ１３１０）。
【００４２】
次に、単語照合部１０６は、単語辞書照合条件８０７の単語辞書１０５の属性のメイル、英字、数字、記号を照合範囲として連続した文字候補を組み合わせて又は単独の文字候補を単語候補に決定する。この処理を再認識された単語候補がなくなるまで繰り返し（Ｓ１３１２，Ｓ１３１４）、Ｓ１２２２に戻る。
Ｓ１３１６において、再処理判定部１０９は、文字列画像の記載項目が「住所」又は「電話」であるか否かを判定し（Ｓ１３１６）、ないときはＳ１３２６に移る。「住所」又は「電話」のときは、文字切り出し部１０２に再処理条件テーブル８０１の処理範囲８０４の内容を通知するとともに、記載項目が「住所」のときは等ピッチの３又は５文字の文字画像切り出し条件８０５を、記載項目が「電話」のときはプロポーショナルピッチの文字画像切り出し条件８０５をそれぞれ通知する（Ｓ１３１８）。
【００４３】
文字切り出し部１０２は、再処理判定部１０９から通知された処理範囲と切り出し条件とに従い、文字画像を再切り出しする（Ｓ１３２０）。
文字認識部１０４は、「住所」のときは、認識辞書１０３の文字種の数字を照合範囲とし、「電話」のときは文字種を数字、記号を照合範囲と限定して、再切り出しされた文字画像を文字候補として認識し、再切り出しされた文字画像の認識が全て終わるまで繰り返し（Ｓ１３２２，Ｓ１３２４）、Ｓ１２２２に戻る。
【００４４】
Ｓ１３２６では、単語照合部１０６は、再認識の対象とした文字候補の単語照合が全て終了するまで記載項目が「会社」、「氏名」の場合に、再処理条件テーブル８０１の処理範囲８０４に規定する文字画像の文字候補の組み合わせによる単語辞書照合条件を「会社」のときには単語の属性を会社、一般とし、「氏名」のときには属性を氏名、肩書として照合条件を限定して単語候補を決定する（Ｓ１３２８）。この後Ｓ１２２２に戻る。
【００４５】
このようにして、最初に記載項目を考慮しないで粗い文字画像の切り出し条件によって文字画像を切り出して文字認識した後に、その記載項目を分類して記載内容に応じた文字画像を切り出して文字候補を認識して、記載内容に応じた属性を有する単語候補を決定するので、文字認識の精度と効率が飛躍的に向上する。
Ｓ１２２２において、再処理判定部１０９が全文字列画像の再認識を終了したと判定したときは、結果出力部１１０は、入力された文書画像から認識した単語候補をその文字列画像ごとにその記載項目とともに表示画面に表示して（Ｓ１２２６）、処理を終了する。
【００４６】
なお、本実施の形態では、図１に示したような構成で本発明に係る文字認識装置を実現したけれども、本発明はプログラムによって実現し、これをフロッピーディスク等の記録媒体に記録して移送することにより、独立した他のコンピュータ・システムで容易に実施することができる。図１４は、これをフロッピーディスクで実施する場合を説明する図である。
【００４７】
記録媒体本体であるフロッピーディスク１４０１の物理フォーマットは、同心円状に外周から内周に向かってトラック１、２、…、８０を作成し、角度方向に１６のセクタに分割している。このように割り当てられた領域に従って、プログラムを記録する。
このフロッピーディスク１４０１は、ケース１４０２に収納され、これによって、ディスクを埃や外部からの衝撃から守り、安全に移送することができる。
【００４８】
図１５は、フロッピーディスク１４０１にプログラムの記録再生を行うことを説明する図である。図示のようにコンピュータ・システム１５０１にフロッピーディスクドライブ１５０２を接続することにより、ディスク１４０１に対してプログラムを記録再生することが可能となる。ディスク１４０１はフロッピーディスクドライブ１５０２に、挿入口１５０３を介して組込み、および取り出しがなされる。記録する場合はコンピュータ・システム１５０１からプログラムをフロッピーディスクドライブ１５０２によってディスク１４０１に記録する。再生する場合は、フロッピーディスクドライブ１５０２がプログラムをディスク１４０１から読み出し、コンピュータ・システム１５０１に転送する。
【００５０】
【発明の効果】
以上述べたように、本発明によれば、対象とする文書画像が複数の記載項目に相当する文字列画像からなり、そのような文書画像の文字認識をする文字認識装置であって、文書画像の入力を受け付けると文字列画像の高さに基づいて又は、指示を受けると指示に基づいて文字画像を切り出す文字切出手段と、字種別に文字特徴が登録された認識辞書と、前記文字切出手段で切り出された文字画像を前記認識辞書の全範囲又は指示された範囲で照合して複数の文字候補を選択する文字候補選択手段と、認識対象とする文書画像の全ての記載項目に関連する単語を属性ごとに分類して登録している単語辞書と、前記文字候補選択手段で選択された連続した文字画像の文字候補を組み合わせて及び単独で前記単語辞書の全範囲又は指示された範囲で照合し、一致する単語があれば単語候補としてその属性とともに抽出し、一致するものがなければ属性を未定義として文字候補をそのまま単語候補として抽出する単語抽出手段と、前記単語抽出手段で抽出された単語候補をキーワードとして文字列画像単位で記載項目に分類する記載項目分類手段と、記載項目ごとに記載内容である単語の関連する属性、記載条件の一覧を記録した関連ルールテーブルと、前記単語抽出手段で単語辞書の全範囲で照合して得られた単語候補が前記関連ルールテーブルの当該記載項目の内容を満たすか否かを判定する関連判定手段と、前記関連判定手段で満たさないと判定されたときに、前記文字切出手段、文字候補選択手段又は単語抽出手段に所定の指示を与える指示手段と、前記関連判定手段で満たすと判定された単語候補と指示手段の指示に従い抽出された単語候補とを出力する出力手段とを備えることとしているので、記載項目の内容に反しない属性の単語候補は正しい認識であるとして再度の認識は行わないので認識に要する時間は短縮され、記載項目の内容に反するような属性の単語候補は誤認識であるとして再認識するので認識精度が向上する。
【００５１】
また、本発明によれば、前記指示手段は、前記関連判定手段で満たさないと判定されたときに、該単語候補の文字画像を含む文字列画像の前記文字切出手段での文字画像の切り出し条件と、前記文字候補選択手段での文字候補の照合条件と、前記単語抽出手段での単語候補の照合条件とを所定の条件に変更する指示を与える条件変更指示部を含むこととしてるので、誤認識であるとされた単語候補を正しく認識することができる。
【００５２】
また、本発明によれば、前記指示手段は、記載項目ごとに認識対象とする処理範囲と、文字画像の切り出し条件と、文字候補の字種と、単語候補の属性との処理条件を記録した処理条件テーブルと、前記関連判定手段で満たさないと判定されたとき、該単語候補を含む文字列画像の処理条件を前記処理条件テーブルからその記載項目ごとに読み出す読出部とを有し、前記条件変更指示部は、前記文字切出手段に前記読出部が読み出した処理範囲に含まれる文字画像と文字画像の切り出し条件とを指示する第１条件指示部と、前記文字候補選択手段に前記読出部が読み出した前記認識辞書の照合範囲を限定する字種を指示する第２条件指示部と、前記単語抽出手段に前記読出部が読み出した前記単語辞書の照合範囲を限定する属性を指示する第３条件指示部とを備えることとしているので、記載項目ごとに予め再処理条件が明確となり、文書画像の認識効率が高まる。
【００５３】
また、本発明によれば、前記単語抽出手段は、最初に単語候補を抽出するときには、前記文字候補選択手段で選択された複数の文字候補のうち上位の文字候補の組合せを優先して単語候補を抽出し、前記第３条件指示部からの指示を受けたときには、その照合範囲を優先して単語候補を抽出することとしているので、認識処理に要する時間を短縮することができる。
【００５４】
また、本発明によれば、対象とする文書画像が複数の記載項目に相当する文字列画像からなり、そのような文書画像の文字認識をする文字認識方法であって、文書画像の入力を受け付けると文字列画像の高さに基づいて文字画像を切り出す第１文字切出ステップと、前記第１文字切出ステップで切り出された文字画像を字種別に文字特徴が登録された認識辞書の全範囲で照合して複数の文字候補を選択する第１文字候補選択ステップと、前記第１文字候補選択ステップで選択された連続した文字画像の文字候補を組み合わせて及び単独で、認識対象とする文書画像の全ての記載項目に関連する単語を属性ごとに分類して登録している単語辞書の全範囲で照合し、一致する単語があれば単語候補としてその属性とともに抽出し、一致するものがなければ属性を未定義として文字候補をそのまま単語候補として抽出する第１単語抽出ステップと、前記第１単語抽出ステップで抽出された単語候補をキーワードとして文字列画像単位で記載項目に分類する記載項目分類ステップと、前記第１単語抽出ステップで抽出された単語候補が記載項目ごとに記載内容である単語の関連する属性、記載条件の一覧を記録した関連ルールテーブルの当該記載項目の内容を満たすか否かを判定する関連判定ステップと、前記関連判定ステップで満たさないと判定されたときに、文字画像の切り出し条件を指示する第１指示ステップと、前記第１指示ステップにおける切り出し条件に従い文字画像を切り出す第２文字切出ステップと、第２文字切出ステップで切り出された文字画像の認識辞書の照合範囲を指示する第２指示ステップと、前記第２指示ステップにおける指示に従い文字候補を選択する第２文字候補選択ステップと、前記第２文字候補選択ステップで選択された文字候補の単語辞書の照合範囲を指示する第３指示ステップと、前記第３指示ステップにおける指示に従い単語候補を抽出する第２単語候補抽出ステップと、前記関連判定ステップで満たすと判定された単語候補と前記第２単語候補抽出ステップで抽出された単語候補とを認識結果として出力する出力ステップとを有して実行することとしているので、上記文字認識装置と同様の効果を得ることができる。
【図面の簡単な説明】
【図１】本発明に係る文字認識装置の一実施の形態の構成図である。
【図２】上記実施の形態の画像入力部で変換された文書画像の一例を示す図である。
【図３】上記実施の形態の文字切り出し部で切り出された文字画像の一例を示す図である。
【図４】上記実施の形態の文字認識部で図３に示した文字画像から認識された文字候補を説明するための図である。
【図５】上記実施の形態の単語辞書の一例の説明図である。
【図６】上記実施の形態の単語照合部で図４の文字候補から照合処理された単語候補とその属性とを説明するための図である。
【図７】上記実施の形態の項目関連ルール記憶部に記憶されている関連ルールテーブルの内容を示す図である。
【図８】上記実施の形態の項目関連ルール記憶部に記憶されている再処理条件テーブルの内容を示す図である。
【図９】上記実施の形態の文字列画像Ｌ３の文字画像Ｃ２における再認識処理を説明するための図である。
【図１０】上記実施の形態の文字列画像Ｌ５の文字画像Ｃ４〜Ｃ１１における再認識処理を説明するための図である。
【図１１】上記実施の形態の結果出力部で出力表示された文書画像の認識結果の一例を示す図である。
【図１２】上記実施の形態の動作を説明するフローチャートである。
【図１３】上記実施の形態の動作を説明するフローチャートである。
【図１４】上記実施の形態で説明した文字認識方法を記録した記録媒体の説明図である。
【図１５】上記記録媒体のコンピュータシステムへの装着を説明する図である。
【図１６】従来の文字認識装置の構成図である。
【符号の説明】
１０１画像入力部
１０２文字切り出し部
１０３認識辞書
１０４文字認識部
１０５単語辞書
１０６単語照合部
１０７項目分類部
１０８項目関連ルール記憶部
１０９再処理判定部
１１０結果出力部
１４０１フロッピーディスク
１５０１コンピュータシステム
１５０２フロッピーディスクドライブ
[0001]
[Technical Field of the Invention]
The present invention relates to a character recognition apparatus and method for recognizing document images whose description items are known in advance, such as forms and business cards.
[0002]
[Prior art]
In recent years, character recognition devices have become widespread, and it is desired to improve the accuracy of character recognition.
FIG. 16 is a block diagram of a conventional character recognition apparatus described in Japanese Patent Laid-Open No. 63-311492. This character recognition device includes an image input unit 1, an image memory 2, a sub character pattern cutout unit 3, a character pattern extraction unit 4, a recognition unit 5, a recognition dictionary 6, a word processing unit 7, a word dictionary 8 and a display unit 9.
[0003]
The sub character pattern cutout unit 3 cuts out from the document image sub character patterns, which are constituent elements of a character; for a kanji these are, for example, the left-hand radical (偏, “hen”) and the right-hand component (旁, “tsukuri”). The character pattern extraction unit 4 combines the cut-out sub character patterns, and the recognition unit 5 recognizes the combined character patterns and creates a plurality of candidate character strings. The word processing unit 7 collates the created candidate character strings against the word dictionary 8 and settles on one word as the recognition result.
[0004]
[Problems to be solved by the invention]
In the conventional apparatus, however, character recognition considers the many possible combinations of sub character patterns, so the processing needed to recognize a character takes time. Furthermore, although the candidate character strings are collated against the word dictionary to decide on one word, any word that could be matched in the word dictionary was output as the correct recognition result, so the recognition accuracy is not necessarily high.
[0005]
In view of the above problems, an object of the present invention is to provide a character recognition apparatus and method that recognize, quickly and accurately, the contents of a document in which a plurality of items are described.
[0007]
[Means for Solving the Problems]
To solve the above problems, the present invention provides a character recognition device that recognizes characters in a target document image composed of character string images corresponding to a plurality of description items, the device comprising: character cutting means that, on receiving a document image, cuts out character images based on the height of each character string image or, on receiving an instruction, cuts them out according to that instruction; a recognition dictionary in which character features are registered by character type; character candidate selection means that collates each character image cut out by the character cutting means against the entire recognition dictionary, or against an instructed range of it, and selects a plurality of character candidates; a word dictionary in which words related to all description items of the document image to be recognized are registered, classified by attribute; word extraction means that collates the character candidates of consecutive character images selected by the character candidate selection means, both in combination and singly, against the entire word dictionary or an instructed range of it, extracts a matching word together with its attribute as a word candidate if one exists, and otherwise extracts the character candidate itself as a word candidate with the attribute set to undefined; description item classification means that classifies each character string image into a description item, using the word candidates extracted by the word extraction means as keywords; a related rule table that records, for each description item, the related attributes of the words that make up its content and a list of description conditions; relation determination means that determines whether a word candidate obtained by the word extraction means through collation against the entire word dictionary satisfies the content of the relevant description item in the related rule table; instruction means that, when the relation determination means determines that the content is not satisfied, gives a predetermined instruction to the character cutting means, the character candidate selection means, or the word extraction means; and output means that outputs the word candidates determined by the relation determination means to satisfy the content together with the word candidates extracted in accordance with the instruction from the instruction means. With this configuration, the instruction means treats a word candidate that the relation determination means has judged not to satisfy the attributes and other conditions of its description item as a misrecognition, and instructs the character cutting means and the other means accordingly. Recognition accuracy therefore improves.
[0008]
[Embodiments of the Invention]
Hereinafter, a character recognition device according to the present invention will be described with reference to the drawings.
FIG. 1 is a block diagram of an embodiment of a character recognition device according to the present invention. This character recognition device includes an image input unit 101, a character cutout unit 102, a recognition dictionary 103, a character recognition unit 104, a word dictionary 105, a word collation unit 106, an item classification unit 107, an item related rule storage unit 108, a reprocessing determination unit 109, and a result output unit 110.
[0009]
The image input unit 101 consists of a scanner or the like. It applies L/E (Light/Electric) conversion or similar processing to an original prepared by the operator, whose description items are determined in advance, converts it into a document image that is a set of binary pixels (black pixels and white pixels), and passes the document image to the character cutout unit 102. FIG. 2 shows a business card document image 201 passed to the character cutout unit 102.
[0010]
Upon receiving the document image from the image input unit 101, the character cutout unit 102 scans the document image, counts the number of black pixels in the vertical or horizontal direction, and extracts character string images from the distribution of black pixels per scanning line. It assigns a number to each extracted character string image, obtains its position, for example the coordinates of the upper-left and lower-right corners of the rectangle circumscribing the character string image, and calculates the height of the character string image.
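The line extraction described above can be illustrated with a minimal Python sketch. It assumes a horizontally written document held as a list of pixel rows in which 1 is a black pixel; the function names `extract_text_lines` and `line_height` are hypothetical and do not appear in the patent.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), all inclusive


def extract_text_lines(image: List[List[int]]) -> List[Box]:
    """Find character string images from the per-row black pixel counts."""
    row_counts = [sum(row) for row in image]
    boxes: List[Box] = []
    top = None
    for y, count in enumerate(row_counts + [0]):  # sentinel closes the last run
        if count > 0 and top is None:
            top = y                                # a run of inked rows starts
        elif count == 0 and top is not None:
            rows = image[top:y]
            cols = [x for x in range(len(image[0])) if any(r[x] for r in rows)]
            boxes.append((top, cols[0], y - 1, cols[-1]))  # circumscribing box
            top = None
    return boxes


def line_height(box: Box) -> int:
    """Height of a character string image, used later as the full-width size."""
    return box[2] - box[0] + 1
```

The position in the list can serve as the number assigned to each character string image.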
[0011]
Next, the character cutout unit 102 scans each character string image in the direction perpendicular to the direction in which the character string runs and, from the distribution of black pixels, cuts out character images whose width is approximately equal to the height of the character string image. That is, full-width character images are cut out from each character string image. Each cut-out character image is assigned a number and is passed to the character recognition unit 104 together with its position.
FIG. 3 shows the character string images L1 to L5 extracted from the document image 201 shown in FIG. 2, and the character images C1, C2, ... cut out from each character string image.
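One way to picture the full-width cutting is the Python sketch below. The text only states that the width of each character image is roughly the line height, so the greedy merging of inked column runs and the `tolerance` parameter are assumptions made for illustration, as are the function names.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) column range


def column_runs(line: List[List[int]]) -> List[Span]:
    """Return the column spans of a line image that contain black pixels."""
    width = len(line[0])
    has_ink = [any(row[x] for row in line) for x in range(width)]
    runs: List[Span] = []
    start = None
    for x, ink in enumerate(has_ink + [False]):  # sentinel closes the last run
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            runs.append((start, x - 1))
            start = None
    return runs


def cut_full_width(line: List[List[int]], tolerance: float = 1.2) -> List[Span]:
    """Merge neighbouring runs while the cell stays within height * tolerance."""
    max_width = len(line) * tolerance  # line height approximates the cell width
    cells: List[Span] = []
    for start, end in column_runs(line):
        if cells and end - cells[-1][0] + 1 <= max_width:
            cells[-1] = (cells[-1][0], end)  # still fits one full-width cell
        else:
            cells.append((start, end))
    return cells
```

Under this scheme several narrow glyphs, such as consecutive digits, can end up merged into a single full-width cell, which is the kind of coarse first-pass result that a later re-cut is meant to correct.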
[0012]
Also, when the character cutout unit 102 receives from the reprocessing determination unit 109 a character image cutout condition and the numbers of the character images concerned (the processing range), both based on the reprocessing condition table 801 described later, it cuts out new character images from the character images in the notified number range according to the cutout condition, for example equal pitch or proportional pitch. As in the processing above, the newly cut-out character images are each given a new number and are passed to the character recognition unit 104 together with their positions and run information.
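The two cutout conditions named above can be sketched in Python as follows. The patent does not give the arithmetic, so the integer division used for equal pitch and the "one cell per inked run" reading of proportional pitch are assumptions, and both function names are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) column range


def cut_fixed_pitch(start: int, end: int, count: int) -> List[Span]:
    """Split columns start..end into `count` cells of (nearly) equal width."""
    width = end - start + 1
    bounds = [start + i * width // count for i in range(count + 1)]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(count)]


def cut_proportional(has_ink: List[bool], offset: int = 0) -> List[Span]:
    """Make one cell per run of inked columns, so narrow glyphs get narrow cells."""
    cells: List[Span] = []
    start = None
    for x, ink in enumerate(has_ink + [False]):  # sentinel closes the last run
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            cells.append((offset + start, offset + x - 1))
            start = None
    return cells
```

For example, `cut_fixed_pitch(10, 21, 3)` divides a twelve-column cell into three four-column cells, which corresponds to re-cutting one full-width cell into three digits.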
[0013]
The recognition dictionary 103 registers character codes together with the standard feature vector of the character corresponding to each code. The character codes cover kanji, hiragana, katakana, digits, and alphabetic characters, as well as symbols and the like. In the recognition dictionary 103, the character codes and associated data are registered by character type (kanji, ..., symbols).
When the character recognition unit 104 receives character images and related data from the character cutout unit 102, it extracts a feature vector from each character image and collates it against the standard feature vectors of all character codes in the recognition dictionary 103. For example, it computes the city-block distance, extracts several character codes in ascending order of distance, and stores them as character candidates. When this recognition processing has been performed for every character image of every character string image, the character recognition unit 104 activates the word collation unit 106.
[0014]
FIG. 4 is a diagram for explaining the result of recognition processing on the character images shown in FIG. 3. The character images C1, C2, ... of the character string images L1 to L5 are listed vertically, and the top three character candidates of each (first to third place) are listed horizontally. The character candidates are actually stored as character codes.
For example, for the character image C1 of the character string image L1, the first character candidate is “認”, the second is “誌”, and the third is “詰”.
[0015]
When the character recognition unit 104 has received a character image recognition condition from the reprocessing determination unit 109, it restricts the collation range of the recognition dictionary 103 according to that condition, for example to digits only, or to alphabetic characters, digits, and symbols only, and, as above, extracts several character codes and stores them as character candidates. When re-recognition has finished for all re-cut character images, it activates the word collation unit 106.
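Candidate selection by city-block distance, over either the whole recognition dictionary or a restricted set of character types, might look like the following Python sketch. The two-dimensional feature vectors and the dictionary contents are toy values invented for illustration; the patent does not specify the feature extraction, and the names used here are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Sequence

Vector = Sequence[float]
RecognitionDict = Dict[str, Dict[str, Vector]]  # character type -> char -> vector


def city_block(a: Vector, b: Vector) -> float:
    """City-block (L1) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def character_candidates(
    feature: Vector,
    dictionary: RecognitionDict,
    k: int = 3,
    char_types: Optional[Iterable[str]] = None,
) -> List[str]:
    """Return the k closest characters, optionally limited to some character types."""
    types = dictionary.keys() if char_types is None else char_types
    scored = [
        (city_block(feature, vec), ch)
        for t in types
        for ch, vec in dictionary[t].items()
    ]
    return [ch for _, ch in sorted(scored)[:k]]  # ascending distance


# Toy dictionary with made-up vectors (not taken from the patent).
TOY_DICT: RecognitionDict = {
    "digit": {"1": (1.0, 0.0), "7": (0.9, 0.3)},
    "alpha": {"l": (1.0, 0.1), "M": (0.2, 0.9)},
}
```

With these toy values, a glyph at `(1.0, 0.08)` is ranked `l`, `1`, `7` over the full dictionary, while restricting `char_types` to `["digit"]` makes `1` the first candidate. This shows how narrowing the collation range can change the top-ranked result.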
[0016]
As shown in FIG. 5, the word dictionary 105 has a word column 501 and an attribute column 502 that gives the general category of the words registered in the word column 501; a plurality of words are registered for each attribute. In practice, the words are held as character codes.
For example, the attribute that gives the general category of the words “株式” (stock), “有限” (limited), “会社” (company), ... is “company”, and the attribute “mail” contains the words “メイル”, “MAIL”, “jp”, and so on. The contents of the word dictionary 105 are changed according to the document image to be recognized.
[0017]
In this embodiment, the document image 201 is a business card, so the word dictionary 105 has the contents described above.
Each word in the word column 501 that belongs to one of the attributes from “company” to “mail” in the attribute column 502 is also used as a keyword for classifying the description items, as described later.
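As a rough illustration, the word dictionary can be modelled in Python as a mapping from attribute to words, with an inverted lookup for collation. Only the words quoted in the text are included, the attribute labels are English glosses, and the rule that the first attribute wins for a duplicated word is an assumption.

```python
from typing import Dict, List

# Attribute column 502 -> words in word column 501 (excerpt from the text only).
WORD_DICTIONARY: Dict[str, List[str]] = {
    "company": ["株式", "有限", "会社"],
    "mail": ["メイル", "MAIL", "jp"],
}


def build_lookup(dictionary: Dict[str, List[str]]) -> Dict[str, str]:
    """Invert the dictionary so a word can be looked up to obtain its attribute."""
    lookup: Dict[str, str] = {}
    for attribute, words in dictionary.items():
        for word in words:
            lookup.setdefault(word, attribute)  # assumption: first attribute wins
    return lookup
```

A word that is absent from the lookup corresponds to the “undefined” attribute used during word collation.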
When activated by the character recognition unit 104, the word collation unit 106 reads out, starting from the first character candidate of each character string image extracted by the character recognition unit 104, consecutive character candidates as a word candidate. If it matches a word registered in the word column of the word dictionary 105, that word is determined to be a word candidate and is stored together with its attribute. Higher-ranked character candidates are given priority: only when no word matching the combination of first-place candidates is registered in the word dictionary 105 are other combinations tried, such as first place with second place, then first place with third place. If no word is found even after the first-, second-, and third-place candidates have been combined with those of the next character, the first-place character candidate is determined to be the word candidate and its attribute is stored as “undefined”. When word collation has finished for all character candidates of all character string images, the word collation unit 106 activates the item classification unit 107.
[0018]
FIG. 6 shows the word candidates obtained by collation against the word dictionary 105. For the character string image L1 shown in FIG. 4, the combination of the first-place candidates of the character images C1 and C2 is registered as the word “認識” (recognition) under the attribute “general” in the word dictionary 105, so the character images C1 and C2 are determined to be the word candidate “認識”. Next, the combination of the first-place candidates of the character images C3 and C4 is registered as the word “珠代” (a personal name) under the attribute “name”, so the character images C3 and C4 are determined to be the word candidate “珠代”. Similarly, the character images C5 and C6 are determined to be the word candidate “会社” (company).
[0019]
Here, combining the second-place candidate of the character image C3 with the third-place candidate of the character image C4 would match the word “株式” (stock) under the attribute “company” in the word dictionary 105, but “珠代”, which combines higher-ranked candidates, is taken as the word candidate.
For the character images C3 and C4 of the character string image L2, the combination of first-place candidates, “大郎”, is not in the word dictionary 105. The word “太郎” (Taro), formed from the second-place candidate “太” of the character image C3 and the first-place candidate of the character image C4, is registered under the attribute “name” in the word dictionary 105, so it is determined to be the word candidate.
[0020]
For the character images C6 and C11 of the character string image L5, none of the first- to third-place character candidates is registered as a word in the word dictionary 105, so the first-place candidates “旧” and “卯” are taken as word candidates, each with the attribute “undefined”.
When the word collation unit 106 receives character image numbers and a word dictionary collation condition from the reprocessing determination unit 109, it re-determines the word candidates, combining consecutive character candidates recognized by the character recognition unit 104 or using them singly, with the collation range of the word dictionary 105 restricted to the notified attributes.
[0021]
That is, when the attributes “company” and “general” have been notified as the word dictionary collation condition, “株式” (attribute “company”), which combines the second-place candidate of the character image C3 and the third-place candidate of the character image C4 of the character string image L1, is determined to be the word candidate. The word “珠代” in the word dictionary 105, which belongs to the attribute “name”, is excluded.
When a word candidate has been re-determined in this way, the word collation unit 106 notifies the reprocessing determination unit 109 of the determined word candidate together with the numbers of its character images.
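The rank-priority word collation and its attribute-restricted rerun can be sketched in Python as below. Several details are assumptions rather than statements of the patent: combinations are tried in order of increasing rank sum, longer spans are tried before shorter ones, words are at most `max_len` characters, and every candidate character not quoted in the text (for example “織” or “伐”) is a placeholder.

```python
from itertools import product
from typing import Dict, List, Optional, Set, Tuple

WordDict = Dict[str, str]  # word -> attribute


def collate_line(
    candidates: List[List[str]],
    words: WordDict,
    max_len: int = 2,
    allowed: Optional[Set[str]] = None,
) -> List[Tuple[str, str]]:
    """Turn ranked character candidates into (word candidate, attribute) pairs."""
    result: List[Tuple[str, str]] = []
    i = 0
    while i < len(candidates):
        hit = None
        for n in range(min(max_len, len(candidates) - i), 0, -1):
            span = candidates[i:i + n]
            # Rank tuples ordered so that higher-ranked combinations come first.
            combos = sorted(product(*(range(len(c)) for c in span)), key=sum)
            for ranks in combos:
                word = "".join(c[r] for c, r in zip(span, ranks))
                attr = words.get(word)
                if attr is not None and (allowed is None or attr in allowed):
                    hit = (word, attr, n)
                    break
            if hit:
                break
        if hit:
            result.append((hit[0], hit[1]))
            i += hit[2]
        else:
            result.append((candidates[i][0], "undefined"))  # keep first place
            i += 1
    return result


WORDS: WordDict = {"認識": "general", "珠代": "name", "株式": "company", "会社": "company"}
# Line L1; only 認/誌/詰, 識, 珠, 株, 代, 式, 会, 社 come from the text.
LINE_L1 = [
    ["認", "誌", "詰"], ["識", "織", "職"],
    ["珠", "株", "殊"], ["代", "伐", "式"],
    ["会", "合", "今"], ["社", "杜", "壮"],
]
```

Calling `collate_line(LINE_L1, WORDS)` reproduces the first pass, and `collate_line(LINE_L1[2:4], WORDS, allowed={"company", "general"})` reproduces the rerun for C3 and C4, in which “珠代” is skipped and “株式” is selected.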
[0022]
When activated by the word collation unit 106, the item classification unit 107 classifies each character string image into one of the expected description items, using the word candidates stored in the word collation unit 106 as keywords. In the character string image L1, the word candidate “会社” of the character images C5 and C6 belongs to the attribute “company” in the word dictionary 105, so the description item is classified as “company”. Classifying the line as “name” on the basis of the word candidate “珠代”, whose attribute is “name”, is rejected because the remaining candidates “認識” and “会社” do not belong to the attribute “name”. In the character string image L2, the word candidates “名刺” and “太郎” have the attribute “name”, so the description item is classified as “name”. The description items of the character string images L3, L4, and L5 are classified as “address”, “telephone”, and “mail”, respectively. The layout of the character string images and similar information are also consulted in this classification, but because that is outside the gist of the present invention, its description is omitted.
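The text does not spell out the classification rule, so the Python sketch below is only one scoring scheme that is consistent with the two worked examples. It assumes that a line needs at least one keyword of the item and that the item whose related attributes cover the most word candidates wins. The layout cues mentioned in the text are ignored, and the `RELATED` table is limited to the two items whose related attributes are stated.

```python
from typing import Dict, List, Optional, Set, Tuple

# Description item -> related attributes (only the two items stated in the text).
RELATED: Dict[str, Set[str]] = {
    "company": {"company", "general"},
    "name": {"name", "title"},
}


def classify_line(
    word_candidates: List[Tuple[str, str]],
    related: Dict[str, Set[str]] = RELATED,
) -> Optional[str]:
    """Pick the description item best supported by the line's word attributes."""
    attributes = [attr for _, attr in word_candidates]
    best_item, best_score = None, 0
    for item, attrs in related.items():
        if item not in attributes:  # no keyword of this item on the line
            continue
        score = sum(1 for a in attributes if a in attrs)
        if score > best_score:
            best_item, best_score = item, score
    return best_item
```

For line L1 this gives “company” (two supporting candidates against one for “name”), and for line L2 it gives “name”.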
[0023]
When the description items have been classified for all character string images, the item classification unit 107 activates the reprocessing determination unit 109.
The item related rule storage unit 108 stores the related rule table 701 shown in FIG. 7 and the reprocessing condition table 801 shown in FIG. 8.
The related rule table 701 contains, for each description item 702, the attributes 703 related to its content and a special condition 704 on its content. The related rule table 701 shows that the attributes related to the description item “company” are “company” and “general” and that no other attribute is included. That is, if a line classified as “company” contains a word candidate with any other attribute, a character image has been misrecognized.
[0024]
From the special condition 704, it can be seen that, in the description item “address”, the character image following “〒” is a three- or five-digit number. It can also be seen that the last character string image of the description item “telephone” is a four-digit number.
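As a rough illustration of how the related rule table 701 might be consulted, the following Python sketch encodes only the rules stated in the text: the related attributes of “company” and the two special conditions. The function names, the dictionary layout, and the attribute assigned to “recognition” in the sample data are assumptions made for illustration; the actual table is the one shown in FIG. 7.

```python
import re

# Related attributes 703 per description item 702. Only the "company" row is
# spelled out in the text; the remaining rows of FIG. 7 are not reproduced.
RELATED_ATTRIBUTES = {"company": {"company", "general"}}


def contrary_candidates(item, word_candidates):
    """Return the (word, attribute) pairs whose attribute is not related to item."""
    allowed = RELATED_ATTRIBUTES[item]
    return [(word, attr) for word, attr in word_candidates if attr not in allowed]


def postal_code_ok(text_after_mark):
    """Special condition for "address": a 3- or 5-digit number follows the postal mark."""
    return re.fullmatch(r"[0-9]{3}|[0-9]{5}", text_after_mark) is not None


def telephone_tail_ok(last_string):
    """Special condition for "telephone": the last string is a 4-digit number."""
    return re.fullmatch(r"[0-9]{4}", last_string) is not None


# Word candidates of character string image L1
# (the attribute of "recognition" is an assumption).
line_l1 = [("recognition", "general"), ("Tsuyo", "name"), ("company", "company")]
print(contrary_candidates("company", line_l1))  # candidates that trigger reprocessing
print(postal_code_ok("M"), postal_code_ok("121"))
```

A non-empty result from `contrary_candidates`, or a failed special-condition check, corresponds to the case in which the character image is treated as erroneously recognized.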
The reprocessing condition table 801 includes a description item 802 and a reprocessing condition 803. The reprocessing condition 803 includes a processing range 804, a character image cutout condition 805, a character image recognition condition 806, and a word dictionary matching condition 807.
[0025]
For example, for the case where a character string image is classified as the description item “company” and the attribute of a word candidate determined by the word collating unit 106 for that character string image is not included in the related attributes 703, the processing range 804 of the reprocessing condition 803 is defined as “word candidates of character images that do not satisfy the related attributes”, and the word dictionary matching condition 807 is defined as “limited to the attribute company or general”. As a result, the collation range of the word dictionary 105 used by the word collating unit 106 to determine word candidates is limited to words with the attributes “company” and “general”.
[0026]
For the case where a character string image is classified as the description item “telephone” and the attribute of a word candidate is not included in the related attributes 703, the processing range 804 of the reprocessing condition 803 is defined as “the word candidate that does not satisfy the related attributes and the word candidates other than “mail” connected to it”, the character image cutout condition 805 is defined as “proportional pitch”, and the character image recognition condition 806 is defined as “limited to numbers and symbols”.
[0027]
When activated by the item classification unit 107, the reprocessing determination unit 109 reads the related rule table 701 stored in the item-related rule storage unit 108. Next, for the description item into which the item classification unit 107 classified each character string image, it determines whether any word candidate has an attribute contrary to the related attributes 703 of the related rule table 701 and, when a special condition 704 exists, whether any word candidate fails to satisfy that condition. If both determinations are negative, it determines whether the reprocessing determination has been completed for all character string images. If it has, the result output unit 110 is activated; if not, the determination of whether the next character string image is contrary to the contents of the related rule table 701 is repeated.
[0028]
If either determination is affirmative, the character string image has been recognized incorrectly. The reprocessing determination unit 109 therefore determines whether the description item into which the item classification unit 107 classified the character string image is “mail”, “address”, “telephone”, “company”, or “name”. Once the description item is determined, the reprocessing condition 803 for that description item 802 is read from the reprocessing condition table 801 stored in the item-related rule storage unit 108.
[0029]
When the description item is determined to be “mail”, the character cutout unit 102 is notified of the character image numbers of the word candidate that does not satisfy the related attributes and of the word candidates other than “mail” connected to it, and is instructed to cut out the character images at proportional pitch. The character recognition unit 104 is notified that the alphabetic characters, numbers, and symbols of the recognition dictionary 103 are to be collated, and the word collating unit 106 is notified that the collation range of the word dictionary 105 is the attributes mail, alphabetic characters, numbers, and symbols.
[0030]
When the description item is determined to be “address”, the character cutout unit 102 is notified of the number of the character image of the word candidate that does not satisfy the special condition and is instructed to cut that character image into three or five characters at equal pitch. The character recognition unit 104 is notified that the numbers in the recognition dictionary 103 are to be collated.
When the description item is determined to be “telephone”, the character cutout unit 102 is notified of the character image number of the word candidate that does not satisfy the related attributes and is instructed to cut out the character images at proportional pitch. The character recognition unit 104 is notified that numbers and symbols are to be collated.
[0031]
When the description item is determined to be “company”, the word collating unit 106 is notified of the character image number of the word candidate that does not satisfy the related attributes and is instructed to collate against words in the word dictionary 105 whose attribute is company or general.
When the description item is determined to be “name”, the word collating unit 106 is notified of the character image number of the word candidate that does not satisfy the related attributes and is instructed to collate against words in the word dictionary 105 whose attribute is name or title.
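The per-item notifications described above can be summarized as a lookup table. The following Python sketch is one hypothetical way to hold the reprocessing conditions 803 and derive which units are notified; the class, field, and function names are assumptions, and the entries restate only what the text says about each description item.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class ReprocessingCondition:
    cutout: Optional[str]                      # cutout condition 805 (None: no re-cut)
    char_types: Optional[FrozenSet[str]]       # recognition condition 806
    word_attributes: Optional[FrozenSet[str]]  # word dictionary matching condition 807


REPROCESSING = {
    "mail": ReprocessingCondition(
        "proportional pitch",
        frozenset({"alphabet", "number", "symbol"}),
        frozenset({"mail", "alphabet", "number", "symbol"}),
    ),
    "address": ReprocessingCondition(
        "equal pitch, 3 or 5 characters", frozenset({"number"}), None
    ),
    "telephone": ReprocessingCondition(
        "proportional pitch", frozenset({"number", "symbol"}), None
    ),
    "company": ReprocessingCondition(None, None, frozenset({"company", "general"})),
    "name": ReprocessingCondition(None, None, frozenset({"name", "title"})),
}


def units_to_notify(item):
    """List the units that receive a notification for the given description item."""
    cond = REPROCESSING[item]
    units = []
    if cond.cutout is not None:
        units.append("character cutout unit 102")
    if cond.char_types is not None:
        units.append("character recognition unit 104")
    if cond.word_attributes is not None:
        units.append("word collating unit 106")
    return units


print(units_to_notify("company"))
print(units_to_notify("address"))
```

Keeping the conditions in a table keyed by description item mirrors the role of the reprocessing condition table 801: the determination logic stays the same and only the table row changes per item.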
[0032]
For example, in the case of the character string image L1, the reprocessing determination unit 109 determines that the description item is “company” and that the character images C3 and C4 have been recognized as “Tsuyo”, which does not satisfy the related attributes. It therefore notifies the word collating unit 106 to limit the collation range of the word dictionary 105 for the character candidates of the character images C3 and C4 to the attributes company and general. As a result, the word collating unit 106 combines the second character candidate “stock” of the character image C3, as recognized by the character recognition unit 104, with the third character candidate “formula” of the character image C4 and re-recognizes “stock” as the word candidate.
[0033]
In the case of the character string image L3, the description item is determined to be “address”, and the character image C2 has been recognized as the letter “M”, contrary to the special condition (FIG. 6). The character cutout unit 102 cuts the character image C2 (FIG. 9A) into three character images C21, C22, and C23, as shown in FIG. 9B, according to the character image cutout condition 805. The character recognition unit 104 then collates these with the recognition dictionary 103 and extracts character candidates as shown in FIG. 9. The word collating unit 106 collates the character candidates with the word dictionary 105 and determines the first character candidates of the character images C21, C22, and C23 as the word candidates “1”, “2”, and “1”, respectively.
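Cutting one character image into equal-pitch pieces, as is done for C2 here, amounts to dividing its horizontal extent evenly. A minimal Python sketch follows; the pixel coordinates and the function name are hypothetical, since the text does not give the image dimensions.

```python
def cut_equal_pitch(x_start, x_end, count):
    """Split the horizontal span [x_start, x_end) into `count` equal-pitch spans."""
    width = x_end - x_start
    # Integer boundaries; any remainder pixels end up in the later spans.
    bounds = [x_start + (width * i) // count for i in range(count + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


# A character image assumed to occupy columns 30..59, cut into three pieces
# corresponding to C21, C22, and C23.
print(cut_equal_pitch(30, 60, 3))
```

For the “address” item the count would be 3 or 5, matching the digits expected after the postal mark.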
[0034]
In the case of the character string image L5, the description item is determined to be “mail”, and the character images C4 to C11, that is, those other than the character images C1 to C3, are set as the reprocessing range for character recognition (FIG. 10A). The character cutout unit 102 cuts out character images C4 to C14 according to the proportional pitch notified from the reprocessing determination unit 109, as shown in FIG. 10. The character recognition unit 104 collates them with the recognition dictionary 103 within the range of alphabetic characters, numbers, and symbols notified from the reprocessing determination unit 109 and extracts character candidates, as shown in FIG. 10. As shown in FIG. 10D, the word collating unit 106 combines the character candidates, collates them with the word dictionary 105 within the attribute range of mail, alphabetic characters, numbers, and symbols notified from the reprocessing determination unit 109, and determines word candidates.
[0035]
Since the combination “lp” of the first character candidates of the character images C13 and C14 does not exist under the attribute “mail”, the word candidate “jp” is determined from the second character candidate “j” of the character image C13 and the first character candidate “p” of the character image C14.
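The selection of “jp” over “lp” can be sketched as a search over combinations of ranked character candidates, restricted to the notified attributes. In the Python sketch below, the function name, the rank-sum ordering used to approximate “priority to higher-ranked candidates”, and the one-entry dictionary are assumptions for illustration.

```python
from itertools import product


def first_matching_word(candidates_per_image, word_dictionary, allowed_attributes):
    """Return the first (word, attribute) found, trying higher-ranked combinations first."""
    ranked_lists = [list(enumerate(cands)) for cands in candidates_per_image]
    # A lower rank sum means the combination uses higher-ranked candidates.
    combos = sorted(product(*ranked_lists), key=lambda c: sum(rank for rank, _ in c))
    for combo in combos:
        word = "".join(ch for _, ch in combo)
        attribute = word_dictionary.get(word)
        if attribute in allowed_attributes:
            return word, attribute
    return None  # no registered word within the allowed attributes


# Character candidates of C13 and C14, best first; "jp" is registered under "mail".
candidates = [["l", "j"], ["p"]]
dictionary = {"jp": "mail"}
allowed = {"mail", "alphabet", "number", "symbol"}
print(first_matching_word(candidates, dictionary, allowed))
```

Because the search stops at the first match, combinations of lower-ranked candidates are examined only when the higher-ranked ones are not registered.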
When the reprocessing determination unit 109 is notified of the result of the reprocessing by the word collating unit 106, it notifies the result output unit 110 of the number of the character string image, the description item, and the character codes of the recognition result, together with the earlier recognition result for the portion outside the reprocessing range.
[0036]
The result output unit 110 holds bitmap data for displaying character codes and the like on the display screen. When the result output unit 110 is notified of the character codes and the like of the recognition result by the reprocessing determination unit 109, it displays the recognition result of the document image as shown in FIG. 11 and stores the recognition result.
Next, the operation of the present embodiment will be described with reference to the flowcharts of FIGS. 12 and 13.
[0037]
The image input unit 101 accepts a document original, for example a business card, from an operator and converts it into a binarized document image (S1202).
The character cutout unit 102 extracts character string images from the document image (S1204). From every extracted character string image, it cuts out character images as full-width characters (S1206, S1208).
[0038]
Next, the character recognition unit 104 collates each cut-out character image against the recognition dictionary 103, with all character types as the collation range, to extract character candidates, and repeats this collation for all character images of all character string images (S1210, S1212).
The word collating unit 106 collates the resulting consecutive character candidates against words of all attributes in the word dictionary 105, giving priority to combinations of higher-ranked character candidates, and extracts word candidates and their attributes. This is repeated until all character images of all character string images have been processed (S1214, S1216). Because a word candidate is settled as soon as a combination of higher-ranked character candidates matches a word in the word dictionary 105, without every combination of character candidates being tried, the time for word collation is shortened. A recognition error in a word candidate obtained this way can be corrected later by checking its attribute and the like.
[0039]
Next, the item classification unit 107 classifies the description items of the character string image with respect to all character string images using word candidates as keywords (S1218, S1220).
The reprocessing determination unit 109 determines whether re-recognition has been completed for all character string images (S1222). If it has not, the reprocessing determination unit 109 determines, for each description item, whether the word candidates are contrary to the contents of the related rule table 701 stored in the item-related rule storage unit 108 (S1224). If they are not, the process returns to S1222. If the attribute of a word candidate does not match the related attributes 703 of the related rule table 701, or a special condition 704 is violated, the process proceeds to S1302.
[0040]
The reprocessing determination unit 109 determines whether the description item of the character string image is “mail” (S1302). If it is not “mail”, the process advances to S1316. If it is “mail”, the character cutout unit 102 is notified of the numbers of the character images given in the processing range 804 for the description item 802 “mail” in the reprocessing condition table 801 (S1304).
[0041]
The character cutout unit 102 cuts out the character image again according to the notified character image and the proportional pitch of the character image cutout condition 805 (S1306).
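The text does not say how proportional-pitch cutout is carried out. One common technique, shown here purely as an assumed stand-in, is to split the character string image at blank columns of its vertical projection profile. In this Python sketch the function name and the sample ink counts are hypothetical.

```python
def cut_proportional(column_ink):
    """Return (start, end) column spans of consecutive columns that contain ink.

    column_ink[x] is the number of black pixels in column x of a binarized image.
    """
    spans = []
    start = None
    for x, ink in enumerate(column_ink):
        if ink > 0 and start is None:
            start = x                  # a character begins
        elif ink == 0 and start is not None:
            spans.append((start, x))   # a blank column ends the character
            start = None
    if start is not None:              # character touching the right edge
        spans.append((start, len(column_ink)))
    return spans


# Three characters of different widths separated by blank columns.
print(cut_proportional([0, 2, 3, 0, 0, 1, 0, 4, 4]))
```

Unlike the full-width cutout used in the first pass, the resulting spans follow the actual width of each character, which suits alphabetic characters and digits.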
The character recognition unit 104 recognizes character candidates using the character types alphabet, number, and symbol of the recognition dictionary 103, as specified by the character image recognition condition 806, as the collation range, and repeats this until no re-cut character image remains (S1308, S1310).
[0042]
Next, the word collating unit 106 determines word candidates from character candidates, either singly or by combining consecutive character candidates, using the attributes mail, alphabetic characters, numbers, and symbols of the word dictionary 105, as specified by the word dictionary matching condition 807, as the collation range. This is repeated until no word candidate to be re-recognized remains (S1312, S1314), and the process returns to S1222.
In S1316, the reprocessing determination unit 109 determines whether the description item of the character string image is “address” or “telephone”. If it is neither, the process proceeds to S1326. If it is “address” or “telephone”, the character cutout unit 102 is notified of the contents of the processing range 804 of the reprocessing condition table 801, together with the character image cutout condition 805: three or five characters at equal pitch when the description item is “address”, and proportional pitch when it is “telephone” (S1318).
[0043]
The character cutout unit 102 cuts out the character image again according to the processing range and the cutout conditions notified from the reprocessing determination unit 109 (S1320).
The character recognition unit 104 recognizes character candidates for the re-cut character images, limiting the collation range of the recognition dictionary 103 to the character type number when the description item is “address” and to the character types number and symbol when it is “telephone”. This is repeated until recognition of the re-cut character images is completed (S1322, S1324), and the process returns to S1222.
[0044]
In S1326, when the description item is “company” or “name”, the word collating unit 106 collates combinations of the character candidates of the character images defined in the processing range 804 of the reprocessing condition table 801 against the word dictionary 105 under the word dictionary matching condition 807, limiting the word attributes to “company” and “general” when the description item is “company” and to “name” and “title” when it is “name”, and determines word candidates. This is repeated until word collation of all character candidates to be re-recognized is completed (S1326, S1328). Thereafter, the process returns to S1222.
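The branching from S1224 through S1328 for a single character string image can be traced with a small Python sketch. The function name and the boolean `violates` flag are illustrative assumptions; the step labels are those used in the flowcharts of FIGS. 12 and 13.

```python
def trace_line(item, violates):
    """Return the flowchart steps visited for one character string image."""
    steps = ["S1224"]                 # check against the related rule table 701
    if not violates:
        return steps                  # nothing to redo; back to S1222
    steps.append("S1302")             # is the item "mail"?
    if item == "mail":
        # re-cut, re-recognize characters, re-collate words
        return steps + ["S1304", "S1306", "S1308", "S1310", "S1312", "S1314"]
    steps.append("S1316")             # is the item "address" or "telephone"?
    if item in ("address", "telephone"):
        # re-cut and re-recognize characters
        return steps + ["S1318", "S1320", "S1322", "S1324"]
    return steps + ["S1326", "S1328"]  # "company" or "name": re-collate words only


print(trace_line("company", True))
print(trace_line("telephone", True))
print(trace_line("name", False))
```

The trace makes the cost difference visible: “company” and “name” repeat only word collation, whereas “mail” repeats cutout, character recognition, and word collation.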
[0045]
In this manner, character images are first cut out under rough cutout conditions, without regard to the description items, and recognized; the description items are then classified, character images are cut out again in a manner suited to each item's content, character candidates are recognized, and word candidates with attributes appropriate to that content are determined. The accuracy and efficiency of character recognition are thereby dramatically improved.
When the reprocessing determination unit 109 determines in S1222 that re-recognition of all character string images has been completed, the result output unit 110 displays on the display screen, for each character string image, the word candidates recognized from the input document image together with the description item (S1226), and the process ends.
[0046]
In this embodiment, the character recognition apparatus according to the present invention is realized with the configuration shown in FIG. 1. However, the present invention can also be realized as a program; by recording the program on a recording medium such as a floppy disk and transferring it, the invention can easily be implemented on another, independent computer system. FIG. 14 is a diagram for explaining the case where this is done with a floppy disk.
[0047]
The physical format of the floppy disk 1401 which is a recording medium body is such that tracks 1, 2,..., 80 are created concentrically from the outer periphery to the inner periphery and divided into 16 sectors in the angular direction. The program is recorded according to the allocated area.
The floppy disk 1401 is housed in a case 1402, which protects the disk from dust and external impact and can be safely transported.
[0048]
FIG. 15 is a diagram for explaining the recording / reproduction of a program on the floppy disk 1401. By connecting a floppy disk drive 1502 to the computer system 1501 as shown, a program can be recorded on and reproduced from the disk 1401. The disk 1401 is inserted into and removed from the floppy disk drive 1502 via the insertion port 1503. In the case of recording, the program is recorded on the disk 1401 by the floppy disk drive 1502 from the computer system 1501. For reproduction, the floppy disk drive 1502 reads the program from the disk 1401 and transfers it to the computer system 1501.
[0050]
【The invention's effect】
As described above, the present invention is a character recognition device for recognizing characters in a document image composed of character string images corresponding to a plurality of description items, the device comprising: character cutout means for cutting out character images based on the height of the character string image when input of a document image is accepted, or based on an instruction when an instruction is received; a recognition dictionary in which character features are registered for each character type; character candidate selection means for selecting a plurality of character candidates by collating the character images cut out by the character cutout means against the entire range of the recognition dictionary or a designated range; a word dictionary in which words related to all the description items of the document image to be recognized are classified by attribute and registered; word extraction means for collating the character candidates of consecutive character images selected by the character candidate selection means, in combination and singly, against the entire range of the word dictionary or a designated range, extracting a matching word together with its attribute as a word candidate if one exists, and otherwise extracting the character candidate as it is as a word candidate with the attribute undefined; description item classification means for classifying each character string image into a description item using the word candidates extracted by the word extraction means as keywords; a related rule table recording, for each description item, a list of the related attributes and description conditions of the words that form its content; relation determination means for determining whether the word candidates obtained by the word extraction means through collation against the entire range of the word dictionary satisfy the contents of the related rule table for the description item; instruction means for giving a predetermined instruction to the character cutout means, the character candidate selection means, or the word extraction means when the relation determination means determines that the contents are not satisfied; and output means for outputting the word candidates determined by the relation determination means to satisfy the contents and the word candidates extracted in accordance with the instruction of the instruction means. Word candidates whose attributes are not contrary to the contents of the description item are treated as correctly recognized and are not recognized again, which shortens the time required for recognition, while word candidates whose attributes are contrary to the contents of the description item are treated as erroneously recognized and are recognized again, which improves recognition accuracy.
[0051]
Further, according to the present invention, the instruction means includes a condition change instruction unit that, when the relation determination means determines that the contents are not satisfied, gives an instruction to change to predetermined conditions the character image cutout condition in the character cutout means, the character candidate collation condition in the character candidate selection means, and the word candidate collation condition in the word extraction means for the character string image containing the character image of the word candidate. Word candidates that were erroneously recognized can therefore be recognized correctly.
[0052]
Further, according to the present invention, the instruction means includes a processing condition table that records, for each description item, the processing range to be re-recognized, the character image cutout condition, the character types of character candidates, and the attributes of word candidates, and a reading unit that, when the relation determination means determines that the contents are not satisfied, reads from the processing condition table the processing conditions for the description item of the character string image containing the word candidate. The condition change instruction unit includes a first condition instruction unit that instructs the character cutout means on the character images included in the processing range read by the reading unit and on the character image cutout condition, a second condition instruction unit that instructs the character candidate selection means on the character types that limit the collation range of the recognition dictionary, as read by the reading unit, and a third condition instruction unit that instructs the word extraction means on the attributes that limit the collation range of the word dictionary, as read by the reading unit. Because the reprocessing conditions for each description item are clear in advance, the recognition efficiency for the document image increases.
[0053]
According to the present invention, when extracting word candidates for the first time, the word extraction means extracts word candidates giving priority to combinations of higher-ranked character candidates among the plurality of character candidates selected by the character candidate selection means, and, when it receives an instruction from the third condition instruction unit, extracts word candidates giving priority to the instructed collation range. The time required for the recognition process can therefore be shortened.
[0054]
According to the present invention, there is also provided a character recognition method for recognizing characters in a document image composed of character string images corresponding to a plurality of description items, the method comprising: a first character cutout step of cutting out character images based on the height of the character string image when input of a document image is accepted; a first character candidate selection step of selecting a plurality of character candidates by collating the character images cut out in the first character cutout step against the entire range of a recognition dictionary in which character features are registered for each character type; a first word extraction step of collating the character candidates of consecutive character images selected in the first character candidate selection step, in combination and singly, against the entire range of a word dictionary in which words related to all the description items of the document image to be recognized are classified by attribute and registered, extracting a matching word together with its attribute as a word candidate if one exists, and otherwise extracting the character candidate as a word candidate with the attribute undefined; a description item classification step of classifying each character string image into a description item using the word candidates extracted in the first word extraction step as keywords; a relation determination step of determining whether the word candidates extracted in the first word extraction step satisfy the contents, for the description item, of a related rule table recording for each description item a list of the related attributes and description conditions of the words that form its content; a first instruction step of instructing a character image cutout condition when the relation determination step determines that the contents are not satisfied; a second character cutout step of cutting out character images in accordance with the cutout condition instructed in the first instruction step; a second instruction step of instructing the collation range of the recognition dictionary for the character images cut out in the second character cutout step; a second character candidate selection step of selecting character candidates in accordance with the instruction of the second instruction step; a third instruction step of instructing the collation range of the word dictionary for the character candidates selected in the second character candidate selection step; a second word candidate extraction step of extracting word candidates in accordance with the instruction of the third instruction step; and an output step of outputting the word candidates determined in the relation determination step to satisfy the contents and the word candidates extracted in the second word candidate extraction step. The same effects as those of the character recognition device described above can therefore be obtained.
[Brief description of the drawings]
FIG. 1 is a configuration diagram of an embodiment of a character recognition device according to the present invention.
FIG. 2 is a diagram illustrating an example of a document image converted by an image input unit according to the embodiment.
FIG. 3 is a diagram illustrating an example of a character image cut out by a character cutout unit according to the embodiment.
FIG. 4 is a diagram for explaining character candidates recognized from the character image shown in FIG. 3 by the character recognition unit of the embodiment.
FIG. 5 is an explanatory diagram of an example of the word dictionary of the embodiment.
FIG. 6 is a diagram for explaining word candidates and their attributes collated from the character candidates in FIG. 4 by the word collating unit of the embodiment.
FIG. 7 is a diagram showing the contents of a related rule table stored in the item related rule storage unit of the embodiment.
FIG. 8 is a diagram showing the contents of a reprocessing condition table stored in the item-related rule storage unit of the embodiment.
FIG. 9 is a diagram for explaining re-recognition processing in the character image C2 of the character string image L3 in the embodiment.
FIG. 10 is a diagram for explaining re-recognition processing in character images C4 to C11 of the character string image L5 in the embodiment.
FIG. 11 is a diagram illustrating an example of a recognition result of a document image output and displayed by the result output unit of the embodiment.
FIG. 12 is a flowchart illustrating the operation of the embodiment.
FIG. 13 is a flowchart illustrating the operation of the embodiment.
FIG. 14 is an explanatory diagram of a recording medium on which the character recognition method described in the above embodiment is recorded.
FIG. 15 is a diagram for explaining attachment of the recording medium to a computer system.
FIG. 16 is a configuration diagram of a conventional character recognition device.
[Explanation of symbols]
101 Image input unit
102 Character cutout unit
103 Recognition dictionary
104 Character recognition unit
105 Word dictionary
106 Word collating unit
107 Item classification unit
108 Item-related rule storage unit
109 Reprocessing determination unit
110 Result output unit
1401 Floppy disk
1501 Computer system
1502 Floppy disk drive

Claims

A character recognition device for recognizing characters in a target document image composed of character string images corresponding to a plurality of description items, comprising:
Character cutout means for cutting out character images based on the height of the character string image when input of a document image is accepted, or based on an instruction when an instruction is received;
A recognition dictionary in which character features are registered for each character type;
Character candidate selection means for selecting a plurality of character candidates by collating the character images cut out by the character cutout means against the entire range of the recognition dictionary or a designated range;
A word dictionary in which words related to all the description items of the document image to be recognized are classified by attribute and registered;
Word extraction means for collating the character candidates of consecutive character images selected by the character candidate selection means, in combination and singly, against the entire range of the word dictionary or a designated range, extracting a matching word together with its attribute as a word candidate if one exists, and otherwise extracting the character candidate as it is as a word candidate with the attribute undefined;
Description item classification means for classifying each character string image into a description item using the word candidates extracted by the word extraction means as keywords;
A related rule table recording, for each description item, a list of the related attributes and description conditions of the words that form its content;
Relation determination means for determining whether the word candidates obtained by the word extraction means through collation against the entire range of the word dictionary satisfy the contents of the related rule table for the description item;
Instruction means for giving a predetermined instruction to the character cutout means, the character candidate selection means, or the word extraction means when the relation determination means determines that the contents are not satisfied; and
Output means for outputting the word candidates determined by the relation determination means to satisfy the contents and the word candidates extracted in accordance with the instruction of the instruction means.

The character recognition device according to claim 1, wherein the instruction means includes
A condition change instruction unit that, when the relation determination means determines that the contents are not satisfied, gives an instruction to change to predetermined conditions the character image cutout condition in the character cutout means, the character candidate collation condition in the character candidate selection means, and the word candidate collation condition in the word extraction means for the character string image containing the character image of the word candidate.

The character recognition device according to claim 2, wherein the instruction means includes
A processing condition table that records, for each description item, the processing range to be re-recognized, the character image cutout condition, the character types of character candidates, and the attributes of word candidates; and
A reading unit that, when the relation determination means determines that the contents are not satisfied, reads from the processing condition table the processing conditions for the description item of the character string image containing the word candidate,
And wherein the condition change instruction unit includes
A first condition instruction unit that instructs the character cutout means on the character images included in the processing range read by the reading unit and on the character image cutout condition;
A second condition instruction unit that instructs the character candidate selection means on the character types that limit the collation range of the recognition dictionary, as read by the reading unit; and
A third condition instruction unit that instructs the word extraction means on the attributes that limit the collation range of the word dictionary, as read by the reading unit.

The character recognition apparatus according to claim 3, wherein the word extracting means
When extracting word candidates for the first time, extracts a word candidate by giving priority to combinations of higher-ranked character candidates among the plurality of character candidates selected by the character candidate selecting means, and
When receiving an instruction from the third condition instructing unit, extracts a word candidate by giving priority to the instructed collation range.
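One possible reading of the two extraction modes is sketched below in Python. It assumes that "priority to higher-ranked character candidates" means trying combinations in order of total rank, and that the instructed collation range is applied as an attribute filter on the word dictionary. The dictionary contents and the function name are hypothetical.

```python
from itertools import product

# Hypothetical word dictionary: word -> attribute.
WORD_DICTIONARY = {"CAT": "animal", "CAR": "vehicle"}


def extract_word(candidates, allowed_attributes=None):
    """candidates: one ranked list of character candidates per character image.

    Combinations are tried in order of increasing total rank, so combinations
    of higher-ranked character candidates are collated first. When
    allowed_attributes is given, only dictionary words with those attributes
    are accepted.
    """
    rank_combos = sorted(product(*[range(len(c)) for c in candidates]), key=sum)
    for ranks in rank_combos:
        text = "".join(c[r] for c, r in zip(candidates, ranks))
        attribute = WORD_DICTIONARY.get(text)
        if attribute is None:
            continue
        if allowed_attributes is None or attribute in allowed_attributes:
            return text, attribute
    # No matching word: keep the top character candidates, attribute undefined.
    return "".join(c[0] for c in candidates), "undefined"


candidates = [["C", "G"], ["A", "4"], ["T", "R"]]
first_pass = extract_word(candidates)                # full dictionary range
second_pass = extract_word(candidates, {"vehicle"})  # instructed range only
```

With these toy inputs the first pass selects "CAT" from the top-ranked characters, while the second pass, limited to the "vehicle" attribute, falls through to "CAR".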

A character recognition method in which a target document image consists of character string images corresponding to a plurality of description items, and character recognition of such a document image is performed,
A first character extraction step of extracting a character image based on the height of the character string image upon accepting an input of a document image;
A first character candidate selection step of selecting a plurality of character candidates by collating the character image extracted in the first character extraction step against the entire range of a recognition dictionary in which character features are registered for each character type;
A first word extraction step of collating the consecutive character candidates selected in the first character candidate selection step, both in combination and individually, against the entire range of a word dictionary in which words related to all description items of the document image to be recognized are classified and registered for each attribute, extracting a matching word together with its attribute as a word candidate if there is one, and extracting the character candidate as it is as a word candidate with its attribute undefined if there is none;
A description item classification step of classifying each character string image into a description item, using the word candidates extracted in the first word extraction step as keywords;
A relation determination step of determining whether or not the word candidate extracted in the first word extraction step satisfies the content for its description item in a related rule table, in which a list of the related attributes and description conditions of the words forming the description content is recorded for each description item;
A first instruction step of instructing a character image cut-out condition when it is determined in the relation determination step that the content is not satisfied;
A second character extraction step of extracting a character image in accordance with the cut-out condition instructed in the first instruction step;
A second instruction step of instructing the collation range of the recognition dictionary for the character image extracted in the second character extraction step;
A second character candidate selection step of selecting a character candidate according to the instruction in the second instruction step;
A third instruction step of instructing the collation range of the word dictionary for the character candidates selected in the second character candidate selection step;
A second word candidate extraction step for extracting word candidates in accordance with the instructions in the third instruction step;
A character recognition method comprising: an output step of outputting, as a recognition result, the word candidates determined to satisfy the content in the relation determination step and the word candidates extracted in the second word candidate extraction step.
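The two-pass flow of the method (recognize against the full range, check the result against the rule for its description item, and retry with a limited range on failure) can be sketched in Python as follows. This is a deliberately reduced toy: it omits character re-extraction and word dictionary collation, and the glyph keys, rule table, and condition table are all hypothetical stand-ins.

```python
# Toy recognition dictionary: glyph shape -> ranked (character, character type).
RECOGNITION_DICTIONARY = {
    "o": [("O", "letter"), ("0", "digit")],
    "l": [("l", "letter"), ("1", "digit")],
    "3": [("3", "digit")],
    "5": [("5", "digit")],
}
# Hypothetical related rule table: description condition per description item.
RELATION_RULES = {"telephone": str.isdigit}
# Hypothetical processing conditions used for the second pass.
PROCESSING_CONDITIONS = {"telephone": {"character_types": {"digit"}}}


def select_candidates(glyphs, character_types=None):
    """Collate each glyph, optionally limited to the given character types."""
    result = []
    for glyph in glyphs:
        ranked = RECOGNITION_DICTIONARY[glyph]
        if character_types is not None:
            # Keep the unrestricted ranking if the limit would leave nothing.
            ranked = [c for c in ranked if c[1] in character_types] or ranked
        result.append([char for char, _ in ranked])
    return result


def top_string(candidates):
    """Join the top-ranked candidate of every character image."""
    return "".join(c[0] for c in candidates)


def recognize(item, glyphs):
    # First pass: entire range of the recognition dictionary.
    first = top_string(select_candidates(glyphs))
    # Relation determination: does the result fit the item's rule?
    if RELATION_RULES[item](first):
        return first
    # Second pass: collation range limited as instructed for this item.
    cond = PROCESSING_CONDITIONS[item]
    return top_string(select_candidates(glyphs, cond["character_types"]))


result = recognize("telephone", "o3l5")
```

Here the first pass reads the ambiguous shapes as letters, the rule for the "telephone" item rejects that result, and the second pass, limited to digits, produces the corrected string.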