JP4176175B2

JP4176175B2 - Pattern recognition device

Info

Publication number: JP4176175B2
Application number: JP26129197A
Authority: JP
Inventors: 聡直井; 美佐子諏訪; 悦伸堀田
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1996-09-27
Filing date: 1997-09-26
Publication date: 2008-11-05
Anticipated expiration: 2017-09-26
Also published as: JPH10154204A

Description

【０００１】
【発明の属する技術分野】
本発明はパターン認識装置及びパターン認識方法に係わり、特に手書き用文字認識装置のみならず、印刷文字認識装置や図面認識装置における文字、図形及び記号の認識を入力画像の様々な状態に応じて正確に行うようにするものである。
【０００２】
【従来の技術】
ＯＣＲ（Optical Character Reader) 等の手書き文字認識装置は、会計帳票などに書かれている文字を自動的に読み取って、文字を自動入力することにより、会計帳票などから人手で文字を見つけ出し、文字をキー入力するような手間を省くようにしていた。
【０００３】
図７９は、従来の手書き文字認識装置の構成を示すブロック図である。
図７９において、帳票／文書３１１をスキャナで読み込み、その帳票／文書３１１の多値画像を得る。
【０００４】
次に、前処理部３１２において、その多値画像の２値化、雑音除去、帳票／文書３１１の傾き補正を行う。
次に、文字切り出し部３１３において、予め定義されている罫線情報や文字の位置情報を用いることにより、文字を１文字づつ切り出す。
【０００５】
次に、文字認識部３１４において、それぞれの文字ごとに文字認識を行い、文字コードを出力する。ここで、この文字認識は、文字切り出し部３１３により切り出された未知の文字パターンの特徴のそれぞれに対し、認識辞書３１５に予め登録されている個々の文字カテゴリの特徴と１つずつ照合することにより行われる。
【０００６】
例えば、２次元の文字パターンを文字の特徴を表す特徴空間上の特徴ベクトルに変換し、未知の文字パターンと認識辞書３１５に予め登録されている文字カテゴリとの類似度として、特徴空間上の特徴ベクトル間の距離を算出する。そして、未知の文字パターンの特徴ベクトルと認識辞書３１５に予め登録されている文字カテゴリの特徴ベクトルとの間の距離が最も近いものを、未知の文字パターンに対応する文字カテゴリとして認識する。
【０００７】
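To prevent non-characters such as strike-through lines, noise, and background patterns from being misrecognized as characters, so that a character code is output for a non-character, a threshold is set on the distance between two feature vectors. When the distance between the two feature vectors is equal to or greater than this threshold, the device either treats the unknown character pattern as not attributable to any character category registered in the recognition dictionary 315, or judges it to be a non-character and outputs a reject code.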
【０００８】
また、認識辞書３１５として、高品質文字、かすれ文字、つぶれ文字のそれぞれの文字カテゴリの特徴を登録したものを用意しておき、高品質文字に対しては、高品質文字についての認識辞書３１５を使用し、かすれ文字に対しては、かすれ文字についての認識辞書３１５を使用し、つぶれ文字に対しては、つぶれ文字についての認識辞書３１５を使用することにより、帳票／文書３１１の文字の品質の違いに対応できるようにしていた。
【０００９】
【発明が解決しようとする課題】
しかしながら、従来の手書き文字認識装置は、文字がかすれている場合であっても、文字がつぶれている場合であっても、文字が高品質文字であっても、切り出した１文字に対して、同一の認識辞書３１５を用いて画一的に処理を行っていた。
【００１０】
このため、認識辞書３１５に登録してあるかすれ文字についての情報が、高品質文字の認識処理を行う際に悪影響を及ぼし、かすれ文字が認識辞書３１５に登録してあるために、高品質文字が読めなくなってしまうという問題があった。
【００１１】
また、かすれやつぶれだけでなく、文字が罫線に接触しているなどの文字が書かれている環境は様々なものがあり、画一的な認識辞書３１５で様々な環境に対応しようとした場合、互いに相互作用を及ぼし合い、認識処理の精度の大幅な改善は望めないという問題があった。
【００１２】
そこで、本発明の目的は、文字の書かれている環境に応じた適切な認識処理を精度よく行うことが可能なパターン認識装置を提供することである。
【００１３】
【課題を解決するための手段】
上述した課題を解決するために、本発明のパターン認識装置は、入力画像の処理対象に、枠線に接触した文字である枠接触文字が含まれているかどうかを解析するレイアウト解析手段と、前記入力画像の処理対象にかすれ又はつぶれ文字が含まれているかどうかを解析する品質解析手段と、前記入力画像の処理対象に、消し線による訂正文字が含まれているかどうかを解析する訂正解析手段と、枠接触文字の認識手法についての知識を格納した第１の知識テーブルを有し、該第１の知識テーブルに格納された知識に基づいて、前記処理対象に含まれる枠接触文字のパターン認識処理を行なう第１のパターン認識手段と、かすれ又はつぶれ文字の認識手法についての知識を格納した第２の知識テーブルを有し、該第２の知識テーブルに格納された知識に基づいて、前記処理対象に含まれるかすれ又はつぶれ文字のパターン認識処理を行なう第２のパターン認識手段と、訂正文字の認識手法についての知識を格納した第３の知識テーブルを有し、該第３の知識テーブルに格納された知識に基づいて、前記処理対象に含まれる訂正文字のパターン認識を行なう第３のパターン認識手段と、基本的な文字又は文字列の認識手法についての知識を格納した第４の知識テーブルを有し、該第４の知識テーブルに格納された知識に基づいて、前記処理対象に含まれる基本的な文字又は文字列のパターン認識を行なう第４のパターン認識手段と、前記レイアウト解析手段により前記処理対象に枠接触文字が含まれていると解析された場合、前記第１のパターン認識手段に認識処理を行なわせ、前記品質解析手段により前記処理対象にかすれ文字又はつぶれ文字が含まれていると解析された場合、前記第２のパターン認識手段に認識処理を行なわせ、前記訂正解析手段により前記処理対象に訂正文字が含まれていると解析された場合、前記第３のパターン認識手段に認識処理を行なわせ、前記処理対象に文字又は文字列として基本的な文字又は文字列のみが含まれている場合、前記第４のパターン認識手段に認識処理を行なわせる認識処理制御手段とを備え、該認識処理制御手段は、同一の処理対象に基本的な文字又は文字列、枠接触文字、かすれ又はつぶれ文字、及び訂正文字のうちの複数の状態が含まれている場合に前記第１〜第４のパターン認識手段による認識処理をどの順序で実行するかを示す処理順序を格納する処理順序格納手段と、前記レイアウト解析手段、前記品質解析手段、及び前記訂正解析手段による解析の結果に対応させて、前記第１〜第４のパターン認識手段の中からどのパターン認識手段を呼び出すかを示す呼び出し手順を格納する処理順序制御ルール格納手段と、前記処理対象に基本的な文字又は文字列、枠接触文字、かすれ又はつぶれ文字、及び訂正文字のうちの複数の状態が含まれていると解析された場合、前記処理順序制御ルール格納手段に格納されている、前記複数の状態に対応する呼び出し手順と、前記処理順序格納手段に格納されている、前記複数の状態に対応する処理順序とに基づき、前記処理対象に対して行なわれる前記パターン認識手段による各認識処理の実行順序を記入した中間処理結果テーブルを作成する中間処理結果テーブル作成手段とを備え、該中間処理結果テーブル作成手段で作成された前記中間処理結果テーブルに記入された実行順序に基づいて前記パターン認識手段に認識処理を行なわせることを特徴とするものである。
なお、本発明の一態様によれば、処理対象の状態を入力画像から抽出し、その状態に適した認識処理を処理対象ごとに選択することにより、パターン認識を行うようにしている。
【００１４】
このことにより、様々な状態を有する入力画像に対し、それぞれの状態に適したパターン認識処理を行うことができ、認識処理を精度よく行うことが可能となる。
【００１５】
また、本発明の一態様によれば、処理対象の状態を入力画像から抽出し、第１の状態を有する処理対象に対しては、第１の状態専用のパターン認識処理を行い、第２の状態を有する処理対象に対しては、第２の状態専用のパターン認識処理を行うようにしている。
【００１６】
このことにより、第１の状態を有する処理対象の認識処理と第２の状態を有する処理対象の認識処理とが互いに相互作用を及ぼすことがなくなり、認識処理を精度よく行うことが可能となる。
【００１７】
また、本発明の一態様によれば、様々な状態を有する入力画像に対し、認識辞書を使い分けるようにしている。
このことにより、例えば、かすれ文字やつぶれ文字や高品質文字が入力画像の中に混在している場合においても、かすれ文字に対してはかすれ文字に適した認識辞書を使用し、つぶれ文字に対してはつぶれ文字に適した認識辞書を使用し、高品質文字に対しては高品質文字に適した認識辞書を使用して認識処理を行うことができ、認識処理を精度よく行うことが可能となる。
【００１８】
また、本発明の一態様によれば、様々な状態を有する入力画像に対し、識別関数を使い分けるようにしている。
このことにより、例えば、１文字枠に書かれている文字についてはシティブロック距離を用いて文字認識を行い、フリーピッチ枠に書かれている文字に対しては判別関数を用いて文字の切り出し信頼度を考慮しながら文字認識を行うことができ、認識処理を精度よく行うことが可能となる。
【００１９】
また、本発明の一態様によれば、様々な状態を有する入力画像に対し、知識を使い分けるようにしている。
このことにより、例えば、未知文字の変形が大きくて、認識辞書に格納されている文字カテゴリとの対応関係が取れない場合、文字セグメントに文字を分割することにより、未知文字と文字カテゴリとの対応関係をとるようにしたり、文字列から文字を切り出す場合、学習パターンに基づいて生成した判別関数を用いて切り出し信頼度を算出したり、枠接触文字についての文字認識を行う場合、学習パターンにより得られた信頼度を用いて、枠接触文字についての認識信頼度を評価したりすることができ、認識処理を精度よく行うことが可能となる。
【００２０】
また、本発明の一態様によれば、同一の処理対象に対して複数の認識処理が呼ばれた場合、認識処理による信頼度が所定の値以上となるまで、優先順位に従って認識処理を行わせるようにしている。
【００２１】
このことにより、認識処理の信頼度を上げることができ、認識処理の精度を向上させることができる。
また、本発明の一態様によれば、入力画像から非文字を抽出し、この非文字についての認識処理を文字についての認識処理と別々に行うようにしている。
【００２２】
このことにより、文字が非文字とみなされたり、非文字が文字とみなされたりして認識処理が行われることが減少し、認識処理を精度よく行うことが可能となる。
【００２３】
【発明の実施の形態】
以下、本発明の一実施例によるパターン認識装置について図面を参照しながら説明する。
【００２４】
図１は、本発明の一実施例によるパターン認識装置の機能的な構成を示すブロック図である。
図１において、環境認識手段１は、第１〜第Ｎの状態を入力画像から抽出する。ここで、入力画像から抽出される状態とは、例えば、１文字枠やフリーピッチ枠や表などのいずれの形式で文字が書かれているかの状態、文字と枠との接触状態、文字のかすれ状態、文字のつぶれ状態、文字が消し線で消されている状態などである。
【００２５】
第１のパターン認識手段２は、第１の状態を有する処理対象についてのパターン認識処理を専用に行い、第２のパターン認識手段４は、第２の状態を有する処理対象についてのパターン認識処理を専用に行い、第Ｎのパターン認識手段６は、第Ｎの状態を有する処理対象についてのパターン認識処理を専用に行う。
【００２６】
ここで、第１〜第Ｎのパターン認識手段２、４、６は、それぞれの認識結果についての信頼度を算出する信頼度算出手段３、５、７を備え、第１〜第Ｎのパターン認識手段２、４、６による認識結果についての信頼度を算出する。
【００２７】
そして、環境認識手段１は、第１〜第Ｎのパターン認識手段２、４、６の中から、第１〜第Ｎの状態に対応するものを呼び出して認識処理を実行させる。
例えば、環境認識手段１が、入力画像から第１の状態を抽出した場合、その第１の状態の処理対象に対して、第１のパターン認識手段２によるパターン認識処理を呼び出し、入力画像から第２の状態を抽出した場合、その第２の状態の処理対象に対して、第２のパターン認識手段４によるパターン認識処理を呼び出し、入力画像から第Ｎの状態を抽出した場合、その第Ｎの状態の処理対象に対して、第Ｎのパターン認識手段６によるパターン認識処理を呼び出す。
【００２８】
また、環境認識手段１が、同一の処理対象に対して、例えば、第１の状態及び第２の状態を抽出した場合、第１のパターン認識手段２によるパターン認識処理及び第２のパターン認識手段４によるパターン認識処理を、その同一の処理対象に対して呼び出す。
【００２９】
例えば、第１の状態が一文字枠に文字が書かれている状態であるとし、第２の状態がフリーピッチ枠に文字列が書かれている状態であるとし、第３の状態が文字と枠とが接触している状態であるとし、第４の状態が文字のかすれ状態であるとし、第５の状態が文字のつぶれ状態であるとし、第６の状態が文字が消し線で訂正された状態であるとすると、第１のパターン認識手段２は一文字枠に書かれている文字についての認識処理を行い、第２のパターン認識手段４はフリーピッチ枠に書かれている文字列についての認識処理を行い、第３のパターン認識手段は枠接触文字についての認識処理を行い、第４のパターン認識手段はかすれ文字についての認識処理を行い、第５のパターン認識手段はつぶれ文字についての認識処理を行い、第６のパターン認識手段は訂正文字についての認識処理を行う。
【００３０】
そして、環境認識手段１が、入力画像から一文字枠を抽出した場合、その一文字枠に書かれている文字に対し、第１のパターン認識手段２により認識処理を実行させ、環境認識手段１が、入力画像からフリーピッチ枠を抽出した場合、そのフリーピッチ枠に書かれている文字に対し、第２のパターン認識手段４により認識処理を実行させ、環境認識手段１が、入力画像から枠接触文字を抽出した場合、その枠接触文字に対し、第３のパターン認識手段により認識処理を実行させ、環境認識手段１が、入力画像からかすれ文字を抽出した場合、そのかすれ文字に対し、第４のパターン認識手段により認識処理を実行させ、環境認識手段１が、入力画像からつぶれ文字を抽出した場合、そのつぶれ文字に対し、第５のパターン認識手段により認識処理を実行させ、環境認識手段１が、入力画像から訂正文字の候補を抽出した場合、その訂正文字の候補に対し、第６のパターン認識手段により認識処理を実行させる。
【００３１】
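As another example, when the environment recognition means 1 extracts from the input image a frame-contact character touching a free-pitch frame, it has the second pattern recognition means 4 and the third pattern recognition means perform recognition on that character. When it extracts a frame-contact character with a strike-through line touching a free-pitch frame, it has the second pattern recognition means 4, the third pattern recognition means, and the sixth pattern recognition means perform recognition on that character.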
【００３２】
ここで、同一の処理対象についての複数の状態が入力画像から抽出され、それに対応して複数のパターン認識手段２、４、６が呼び出された場合、複数のパターン認識手段２、４、６をどの順序で呼び出すかを格納した処理順序テーブルに基づいて、複数のパターン認識手段２、４、６による認識処理の順序を決定する。そして、パターン認識手段２、４、６による認識処理により、所定のしきい値以上の信頼度が信頼度算出手段３、５、７により得られるまで、複数のパターン認識手段２、４、６による認識処理を呼び出し順序に従って順次に実行する。
【００３３】
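For example, when the environment recognition means 1 extracts from the input image a frame-contact character touching a free-pitch frame, recognition by the third pattern recognition means is performed on that character first, followed by recognition by the second pattern recognition means 4. When it extracts a frame-contact character with a strike-through line touching a free-pitch frame, recognition by the third pattern recognition means is performed first, then recognition by the sixth pattern recognition means, and finally recognition by the second pattern recognition means 4.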
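The ordered invocation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the recognizer names, the confidence values, and the fallback of returning the best result seen when no recognizer reaches the threshold are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A recognizer maps a pattern to (category, confidence in [0, 1]).
Recognizer = Callable[[str], Tuple[str, float]]


def recognize_in_order(
    pattern: str,
    recognizers: Dict[str, Recognizer],
    order: List[str],
    threshold: float,
) -> Optional[Tuple[str, float, str]]:
    """Call recognizers in `order` until one reaches `threshold`.

    Returns (category, confidence, recognizer name). If no recognizer
    reaches the threshold, the best result seen is returned (assumption).
    """
    best = None
    for name in order:
        category, confidence = recognizers[name](pattern)
        if best is None or confidence > best[1]:
            best = (category, confidence, name)
        if confidence >= threshold:
            break  # sufficient confidence: skip the remaining recognizers
    return best


calls: List[str] = []  # records which recognizers were actually invoked


def make_recognizer(name: str, category: str, confidence: float) -> Recognizer:
    """Build a stub recognizer with a fixed answer (hypothetical)."""
    def recognizer(pattern: str) -> Tuple[str, float]:
        calls.append(name)
        return category, confidence
    return recognizer


recognizers = {
    "contact": make_recognizer("contact", "4", 0.4),
    "strike_through": make_recognizer("strike_through", "deleted", 0.9),
    "free_pitch": make_recognizer("free_pitch", "4", 0.95),
}
# Order taken from a (hypothetical) processing order table.
result = recognize_in_order(
    "pattern", recognizers, ["contact", "strike_through", "free_pitch"], 0.8
)
print(result, calls)
```

In this run the free-pitch recognizer is never called, because the strike-through recognizer already reaches the threshold.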
【００３４】
図２は、図１の環境認識手段１の一実施例の構成を示すブロック図である。
図２において、状態抽出手段１ａは、第１〜第Ｎの状態を入力画像から抽出する。
【００３５】
認識処理制御手段１ｂは、状態抽出手段１ａにより抽出された第１〜第Ｎの状態に対応させて、図１の第１〜第Ｎのパターン認識手段２、４、６の中のいずれか１つ又は複数を呼び出して認識処理を行わせる。
【００３６】
処理順序テーブル１ｆは、第１〜第Ｎのパターン認識手段２、４、６の中から複数の認識手段が呼び出された際に、これらの第１〜第Ｎのパターン認識手段２、４、６をどのような順序で実行するかを示す処理順序を格納する。
【００３７】
処理順序制御ルール格納手段１ｄは、状態抽出手段１ａにより抽出された第１〜第Ｎの状態に基づいて、第１〜第Ｎのパターン認識手段２、４、６の中からどの認識手段を呼び出すかを示す呼び出し手順を格納する。
【００３８】
中間処理結果テーブル作成手段１ｃは、処理順序制御ルール格納手段１ｄに格納されている呼び出し手順及び処理順序テーブル１ｆに格納されている処理順序に基づいて、第１〜第Ｎのパターン認識手段２、４、６の実行順序を示す中間処理結果テーブルを作成する。
【００３９】
処理実行ルール格納手段１ｅは、中間処理結果テーブルに記入された認識処理の実行結果に基づいて、次の処理の実行を指示する手順を格納する。
図３は、本発明の一実施例によるパターン認識装置の具体的な構成を示すブロック図である。
【００４０】
図３において、環境認識系１１は、入力画像の状態を抽出し、この抽出された状態に基づいて、文字認識部１２の基本文字認識部１７、文字列認識部１５、接触文字認識部１３、かすれ文字認識部１９、つぶれ文字認識部２１又は非文字認識部２５の消し線認識部２６及び雑音認識部２８のいずれか１つ又は複数を呼び出す。ここで、入力画像の状態を抽出するために、入力画像のレイアウト解析、品質解析及び訂正解析を行う。
【００４１】
文字認識部１２は、入力画像の状態ごとに文字認識処理を行うもので、文字についての文字認識を行う基本文字認識部１７、文字列についての文字認識Ｂ及び文字切り出しＢを行う文字列認識部１５、枠に接触した文字についての文字認識Ａ及び文字切り出しＡを行う接触文字認識部１３、かすれ文字についての文字認識Ｃ及び文字切り出しＣを行うかすれ文字認識部１９、つぶれ文字についての文字認識Ｄ及び文字切り出しＤを行うつぶれ文字認識部２１及びくせ字についての文字認識Ｅ及び文字切り出しＥを行うくせ字認識部２３を備えている。
【００４２】
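The contact character recognition unit 13, character string recognition unit 15, basic character recognition unit 17, blurred character recognition unit 19, squashed character recognition unit 21, and idiosyncratic-handwriting recognition unit 23 are provided with knowledge tables 14, 16, 18, 20, 22, and 24, respectively, each storing knowledge about a character recognition technique. Knowledge table 14 stores, for example, knowledge about the relationship between the frame-contact state and recognition reliability and about the overlapping partial pattern method. Knowledge table 16 stores, for example, knowledge about segmentation reliability and about the method that fuses segmentation with recognition. Knowledge table 18 stores, for example, knowledge about the detailed identification method.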
【００４３】
非文字認識部２５は、入力画像の状態ごとに非文字認識処理を行うもので、消し線についての非文字認識Ｆ及び非文字切り出しＦを行う消し線認識部２６、雑音についての非文字認識Ｇ及び非文字切り出しＧを行う雑音認識部２８を備えている。
【００４４】
また、消し線認識部２６及び雑音認識部２８はそれぞれ、非文字認識の手法についての知識を格納した知識テーブル２７、２９を備えている。
図４は、環境認識系１１の全体的な処理の一例を示すフローチャートである。
【００４５】
図４において、まず、ステップＳ１に示すように、入力画像の前処理を行う。この入力画像の前処理は、ファクシミリやスキャナなどにより２値化された入力画像に対しラベリングを行い、入力画像とラベル画像とを格納するものである。なお、入力画像とラベル画像とは、これ以降の処理でいつでもアクセスできるようにしておく。
【００４６】
図５は、図４の入力画像の前処理を示すフローチャートである。
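In FIG. 5, as shown in step S11, labeling is applied to the binarized input image to extract connected patterns and assign labels to them, and the resulting label image is stored together with the input image. The labeled connected patterns are stored in a compressed form, as sums and differences of circumscribed rectangles, to reduce memory usage. With this compressed representation, a document or form of A4 size (about 3000 × 4000 pixels) scanned at a high resolution of 400 dpi, for example, can be represented within a few hundred kilobytes.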
【００４７】
次に、図４のステップＳ２に示すように、レイアウト解析を行う。このレイアウト解析は、ラベル付けされた連結パターンのサイズや配置状態などに基づいて、テキスト認識、罫線抽出、枠抽出、枠の種類及び表の判別、枠接触文字の有無の判断及び図認識を行う。
【００４８】
図６は、図４のレイアウト解析を示すフローチャートである。
図６において、まず、ステップＳ２１に示すように、テキスト認識を行う。このテキスト認識は、ラベル付けされた連結パターンのサイズを解析し、連結パターンのサイズが比較的小さいものを抽出し、これを文字の候補とみなす。そして、隣接する文字の候補を統合することにより、テキストを抽出する。
【００４９】
次に、ステップＳ２２に示すように、罫線抽出を行う。この罫線抽出は、ステップＳ２１でテキストと認識されなかった連結パターンを対象として、縦又は横方向のヒストグラム値が大きいものについての探索を行うことにより、罫線を抽出する。
【００５０】
次に、ステップＳ２３に示すように、枠抽出を行う。この枠抽出は、ステップＳ２２で抽出された罫線から４辺に相当する罫線を見つけて枠を抽出する。
次に、ステップＳ２４に示すように、枠の種類／表判別を行う。この枠の種類／表判別は、ステップＳ２３で抽出された枠に対し、その枠の種類を判別して枠の種類の属性を付与する。枠の種類の属性としては、一文字枠、ブロック枠、フリーピッチ枠、表などがある。
【００５１】
次に、ステップＳ２５に示すように、枠接触文字の有無の判断を行う。この枠接触文字の有無の判断は、枠内を枠線に沿って探索した際に、交差するパターンがあるかどうかを検出し、交差するパターンがある場合は、文字が枠に接触しているものと判断する。ここで、交差するパターンが存在していても、注目している枠の隣の枠から、文字がはみ出している場合があるので、交差するパターンが隣の枠からはみ出しているものについては、注目している枠に対し、接触文字でないとする。
【００５２】
次に、ステップＳ２６に示すように、図認識を行う。この図認識は、テキストや枠や表などの属性が付与されなかったサイズが比較的大きな連結パターンに対して、図の属性を付与する。
【００５３】
次に、図４のステップＳ３に示すように、品質解析を行う。この品質解析は、入力画像にかすれやつぶれがあるかどうかを検出するもので、大局的品質解析と局所的品質解析とがある。
【００５４】
この品質解析では、所定の領域について、（面積、縦／横の長さがそれぞれ所定のしきい値以下の連結領域の数）／（前記所定の領域の全ての連結領域の数）の値が、所定値よりも大きい時にかすれと判断する。
【００５５】
また、罫線抽出の際にかすれた罫線を部分的に統合した情報を用いることにより、所定の領域について、（かすれた罫線を補完した際の補完された部分の長さの合計）／（各罫線の長さの合計）の値が、所定値よりも大きい時にかすれと判断する。
【００５６】
さらに、所定の領域について、（黒画素密度が所定のしきい値より大きい連結領域の数）／（前記所定の領域の全ての連結領域の数）の値が、所定値よりも大きい時につぶれと判断する。
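The component-count ratio tests for blur and squashing described above can be sketched as follows. This is only an illustration under stated assumptions: the `Component` record, the use of the bounding-box area as the "area", and every threshold value are hypothetical. The ruled-line completion ratio is not covered here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Component:
    """A connected region, summarized by its bounding box and pixel count."""
    width: int
    height: int
    black_pixels: int

    @property
    def area(self) -> int:
        # Bounding-box area (assumption: the text does not define "area").
        return self.width * self.height

    @property
    def density(self) -> float:
        return self.black_pixels / self.area


def is_blurred(components: List[Component], max_area: int,
               max_side: int, ratio_threshold: float) -> bool:
    """Blur: the share of small components exceeds `ratio_threshold`."""
    if not components:
        return False
    small = sum(
        1 for c in components
        if c.area <= max_area and c.width <= max_side and c.height <= max_side
    )
    return small / len(components) > ratio_threshold


def is_squashed(components: List[Component], density_threshold: float,
                ratio_threshold: float) -> bool:
    """Squashing: the share of dense components exceeds `ratio_threshold`."""
    if not components:
        return False
    dense = sum(1 for c in components if c.density > density_threshold)
    return dense / len(components) > ratio_threshold


region = [
    Component(2, 2, 3),
    Component(3, 2, 4),
    Component(20, 30, 200),
    Component(2, 3, 5),
]
blurred = is_blurred(region, max_area=10, max_side=4, ratio_threshold=0.5)
squashed = is_squashed(region, density_threshold=0.7, ratio_threshold=0.5)
print(blurred, squashed)
```

Three of the four components are small, so the region is flagged as blurred. Only two of four are dense, which does not exceed the 0.5 ratio, so it is not flagged as squashed.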
【００５７】
図７は、図４の品質解析を示すフローチャートである。
図７において、まず、ステップＳ３１に示すように、大局的品質解析を行う。この大局的品質解析は、文書／帳票全体に対して品質解析を行うもので、入力画像を２値化する際のしきい値が適切であったかどうか、ファクシミリで送られてきた文書／帳票に対してノイズが全体にのったため品質が不正常になっていないかどうか、かすれやつぶれが発生していないかを解析する。
【００５８】
次に、ステップＳ３２に示すように、局所的品質解析を行う。この局所的品質解析は、レイアウト解析により一文字枠やテキストやフリーピッチ枠や表などの属性が付与された領域ごとにかすれやつぶれが発生していないかを調べたり、ノイズが発生していないかを調べたりして品質解析を行うものである。
【００５９】
次に、図４のステップＳ４に示すように、訂正解析を行う。この訂正解析は、入力画像から消し線を抽出して、消し線で訂正された文字については、文字の認識処理を省略できるようにするものである。
【００６０】
図８は、図４の訂正解析を示すフローチャートである。
図８において、まず、ステップＳ４１に示すように、訂正特徴抽出を行う。この訂正特徴抽出は、訂正文字に有効な特徴を抽出するもので、訂正文字には、つぶれた文字、２重線で消した文字、斜線で消した文字及びばつで消した文字の大きく分けて４種類あり、各訂正文字の特徴を黒画素線密度、線密度、オイラー数、ヒストグラム値などを算出して抽出する。
【００６１】
次に、ステップＳ４２に示すように、訂正文字候補抽出を行う。この訂正文字候補抽出は、訂正文字の特徴を表す特徴空間で、訂正文字と訂正されていない通常文字との分布の違いから訂正文字の候補を抽出する。
【００６２】
次に、図４のステップＳ５に示すように、文字認識／非文字認識の制御を行う。この文字認識／非文字認識の制御は、図４のステップＳ２〜Ｓ４で抽出された入力画像の状態に基づいて、文字認識部１２の基本文字認識部１７、文字列認識部１５、接触文字認識部１３、かすれ文字認識部１９、つぶれ文字認識部２１又は非文字認識部２５の消し線認識部２６及び雑音認識部２８のいずれを呼び出すかを決定するもので、中間処理結果テーブルの読み込み／処理順序制御ルールの実行、終了判定や処理実行ルールによる処理の実行を行う。
【００６３】
ここで、処理順序制御ルールは、環境認識系１１が抽出した状態に基づいて、文字認識部１２の基本文字認識部１７、文字列認識部１５、接触文字認識部１３、かすれ文字認識部１９、つぶれ文字認識部２１又は非文字認識部２５の消し線認識部２６及び雑音認識部２８のいずれを呼び出すかの手順を示すものである。
【００６４】
また、処理実行ルールは、処理順序制御ルールにより呼ばれた認識処理の結果に基づいて、次にどのような処理を行うのかの手順を示すものである。
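The intermediate processing result table records, for each region to which layout analysis has assigned an attribute such as one-character frame, text, free-pitch frame, or table, the state of the input image extracted in steps S2 to S4 of FIG. 4. It also records the processes called by the processing order control rules, entered in the processing order stored in the processing order table.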
【００６５】
例えば、環境認識系１１が、文字を抽出した場合、この文字に対しては、基本文字認識部１７を呼び出して認識処理を実行し、環境認識系１１が、図６のステップＳ２１でテキストを抽出した場合、このテキストに対しては、文字列認識部１５を呼び出して認識処理を実行し、環境認識系１１が、図６のステップＳ２５で枠接触文字を抽出した場合、この枠接触文字に対しては、接触文字認識部１３を呼び出して認識処理を実行し、環境認識系１１が、図７のステップＳ３２で、（面積、縦／横の長さがそれぞれ所定のしきい値以下の連結領域の数）／（前記所定の領域の全ての連結領域の数）の値が所定値よりも大きいと判断した場合、この領域の文字に対しては、かすれ文字認識部１９を呼び出して認識処理を実行し、環境認識系１１が、図７のステップＳ３２で、（黒画素密度が所定のしきい値より大きい連結領域の数）／（前記所定の領域の全ての連結領域の数）の値が所定値よりも大きいと判断した場合、この領域の文字に対しては、つぶれ文字認識部２１を呼び出して認識処理を実行し、環境認識系１１が、図８のステップＳ４２で、訂正文字候補を抽出した場合、この訂正文字候補に対しては、消し線認識部２６を呼び出して認識処理を実行し、環境認識系１１が、図７のステップＳ３２で雑音を検出した場合、この雑音に対しては、雑音認識部２８を呼び出して認識処理を実行する。
【００６６】
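FIG. 9 is a flowchart showing the control of character recognition and non-character recognition in FIG. 4. In FIG. 9, first, as shown in step S51, the intermediate processing result table is read and the processing order control rules are executed.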
【００６７】
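Next, as shown in step S52, an end determination is made. Based on the processing order control rules, processing is judged to be finished when all processes in the intermediate processing result table are complete and "finished" has been entered in every processing instruction column of the table. If processing is judged not to be finished, control proceeds to step S53, where processing according to the processing execution rules is carried out, and then returns to step S51. This is repeated until the end determination in step S52 judges processing to be finished.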
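The S51 to S53 loop can be sketched as below. The table layout (a list of rows with `pending`, `results`, and `instruction` fields) and the stub recognizers are hypothetical. The execution rule is also simplified: it always runs the next pending process and does not inspect earlier results, whereas the text says the next process depends on them.

```python
from typing import Callable, Dict, List


def run_control_loop(table: List[dict],
                     recognizers: Dict[str, Callable[[str], str]]) -> List[dict]:
    """Repeat S51 (read table), S52 (end check), S53 (execute) until done."""
    while True:
        # S51: read the table and mark rows with nothing left to run.
        for row in table:
            if not row["pending"]:
                row["instruction"] = "done"
        # S52: finished when every processing instruction column says "done".
        if all(row["instruction"] == "done" for row in table):
            return table
        # S53: run the next pending process of each unfinished row
        # (simplified execution rule, see the lead-in).
        for row in table:
            if row["instruction"] != "done":
                name = row["pending"].pop(0)
                row["results"].append((name, recognizers[name](row["pattern"])))


table = [
    {"pattern": "A", "pending": ["contact", "basic"], "results": [], "instruction": ""},
    {"pattern": "B", "pending": ["basic"], "results": [], "instruction": ""},
]
stub_recognizers = {
    "contact": lambda pattern: pattern.lower(),  # stand-in for unit 13
    "basic": lambda pattern: pattern * 2,        # stand-in for unit 17
}
run_control_loop(table, stub_recognizers)
print(table)
```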
【００６８】
図１０は、本発明の一実施例によるパターン認識装置のシステム構成を示すブロック図である。
図１０において、画像格納部４１は帳票画像を格納し、処理条件格納部４２は帳票のレイアウト構造や読み取り文字情報、例えば、枠の位置、種類、サイズ、文字種、文字数などの定義体を格納し、ラベル画像格納部４３はラベル付けされたラベル画像を圧縮表現により格納する。
【００６９】
環境認識系３０はレイアウト解析部３１及び訂正解析部３２を備え、環境認識系３８はくせ字解析部３９及び終了判定処理部４０を備え、文字認識系／非文字認識系３３は基本文字認識部３４、黒枠接触文字認識部３５、フリーピッチ文字列認識部３６及び消し線認識部３７を備えている。
【００７０】
レイアウト解析部３１は、ラベル画像格納部４３に格納されているラベル画像について、処理条件格納部４２に格納されている定義体を参照しながら、罫線抽出、枠抽出及び黒枠接触文字抽出を行う。ここで、枠の位置やサイズなどのフォーマット情報及び傾きに関する情報を予め帳票データとして格納しておき、この帳票データに基づいて、罫線抽出や枠抽出を行う方法は、例えば、特開昭６２−２１２８８号公報や特開平３−１２６１８６号公報に記載されている。
【００７１】
なお、例えば、特開平６−３０９４９８号公報や特開平７−２８９３７号公報に記載されているように、枠の位置やサイズなどのフォーマット情報の入力を必要とせずに、罫線抽出や枠抽出を行うようにしてもよい。
【００７２】
訂正解析部３２は消し線候補の抽出を行い、くせ字解析部３９は個人筆記特性によるくせ字の解析を行い、終了判定処理部４０は文字認識の終了判定を行い、終了判定で終了と判定された場合、文字認識結果の出力を行う。
【００７３】
基本文字認識部３４は、１文字ごとに切り出された文字の認識を行い、黒枠接触文字認識部３５は、黒枠接触文字から枠を除去し、その枠を除去することによりかすれた文字の補完を行ってから文字の認識を行い、フリーピッチ文字列認識部３６は、文字列から文字を切り出す際の切り出し信頼度を考慮しながら文字列についての文字認識を行い、消し線認識部３７は、訂正文字の黒画素線密度、線密度、オイラー数、ヒストグラムなどに基づいて、消し線の認識を行う。
【００７４】
中間処理結果テーブル４４は、環境認識系３０、３８により抽出された状態に基づいて、文字認識系／非文字認識系３３のいずれの処理を実行するかを示す処理順序やその処理結果を格納する。
【００７５】
図１１は、図１〜３のパターン認識装置が適用される文字認識システムの具体的な構成を示すブロック図である。
図１１において、５１は全体的な処理を行う中央演算処理ユニット（ＣＰＵ）、５２はＣＰＵ５１で実行されるプログラムを格納するプログラムメモリ、５３は画像データをビットマップ形式で格納する画像メモリ、５４は画像処理に使用するワークメモリ、５５は画像を光学的に読み取るスキャナ、５６はスキャナ５５により読み取られた情報を一時的に格納するメモリ、５７は各文字画像の特徴を格納した辞書ファイル、５８は認識結果を表示するディスプレイ、５９は認識結果を印刷するプリンタ、６０はディスプレイ５８及びプリンタ５９の入出力インターフェイス、６１はＣＰＵ５１、プログラムメモリ５２、画像メモリ５３、ワークメモリ５４、メモリ５６、辞書ファイル５７、入出力インターフェイス６０及びドライバ６４を接続しているバス、６２は通信ネットワーク６３を介してデータやプログラムの送受信を行う通信インターフェイス、６４はドライバ、６５はハードディスク、６６はＩＣメモリカード、６７は磁気テープ、６８はフロッピーディスク、６９はＣＤ−ＲＯＭやＤＶＤ−ＲＯＭなどの光ディスクである。
【００７６】
この文字認識システムは、スキャナ５５により読み取った画像データをメモリ５６に一時的に格納し、その画像データをビットマップ形式で画像メモリ５３に展開する。そして、画像メモリ５３からワークメモリ５４にコピーされた２値画像データに対してパターン抽出処理を行う。その結果に基づいて、スキャナ５５により読み取った画像データから文字画像の切り出しを行い、切り出された文字画像の特徴と辞書ファイル５７に格納された特徴データとの比較を行い、文字の認識を行う。その後、その認識結果を、ディスプレイ５８又はプリンタ５９に出力する。
【００７７】
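In this character recognition system, the pattern recognition device of FIGS. 1 to 3 is realized as a function of the CPU 51, which operates according to a program stored in the program memory 52. The program that performs the pattern extraction processing can be stored in advance in the ROM of the program memory 52. Alternatively, the program may be loaded into the RAM of the program memory 52 from a storage medium such as the hard disk 65, IC memory card 66, magnetic tape 67, floppy disk 68, or optical disk 69, and then executed by the CPU 51.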
【００７８】
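Furthermore, the program that performs the pattern extraction processing can also be obtained from the communication network 63 through the communication interface 62. The communication network 63 connected to the communication interface 62 may be, for example, a LAN (Local Area Network), a WAN (Wide Area Network), the Internet, an analog telephone network, a digital telephone network (ISDN: Integrated Services Digital Network), or a wireless network such as PHS (Personal Handy-phone System) or satellite communication.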
【００７９】
以下、図３の環境認識系１１、文字認識部１２及び非文字認識部２５の構成をより具体的に説明する。
図１２は、図５のステップＳ１１のラベリング処理を説明する図である。
【００８０】
図１２において、“０”と“１”とからなる２値画像がラベリング処理部７０に入力されると、ラベリング処理部７０は、連結した画素で構成される連結パターンを入力された２値画像から抽出し、各連結パターンごとにラベルを付したラベル画像を生成して、ラベル画像格納部７１に格納する。例えば、“０”と“１”とからなる２値画像７２が入力された場合、各連結パターンごとにラベル“１”、“２”、“３”を付してラベル画像７３を生成する。
【００８１】
ここで、例えば、２５５個の連結パターンが１画像内に存在する場合、２５５個のラベルが必要となるため、１画素当たり８ビットを必要とし、ラベル画像格納部７１に必要な記憶容量は、１画像全体の画素数の８倍となり、ラベル画像を格納するために多くの記憶容量が必要となる。
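A minimal sketch of the labeling step, assuming 8-connectivity and a breadth-first flood fill (the text fixes neither). The function name and the sample image are hypothetical. The function also returns each pattern's circumscribed rectangle as (x1, y1, x2, y2).

```python
from collections import deque
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2), inclusive


def label_components(image: List[List[int]]) -> Tuple[List[List[int]], Dict[int, Box]]:
    """Label 8-connected patterns of 1-pixels; return label image and boxes."""
    h = len(image)
    w = len(image[0]) if h else 0
    labels = [[0] * w for _ in range(h)]
    boxes: Dict[int, Box] = {}
    next_label = 0
    for y in range(h):
        for x in range(w):
            if image[y][x] != 1 or labels[y][x] != 0:
                continue
            next_label += 1
            labels[y][x] = next_label
            x1 = x2 = x
            y1 = y2 = y
            queue = deque([(x, y)])
            while queue:
                cx, cy = queue.popleft()
                x1, x2 = min(x1, cx), max(x2, cx)
                y1, y2 = min(y1, cy), max(y2, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < w and 0 <= ny < h
                                and image[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((nx, ny))
            boxes[next_label] = (x1, y1, x2, y2)
    return labels, boxes


binary = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]
labels, boxes = label_components(binary)
print(boxes)
```

Storing one label per pixel is what makes the raw label image expensive. The bounding boxes are the starting point for the compressed representation discussed next.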
【００８２】
図１３は、図１２のラベル画像７３を圧縮表現することにより、ラベル画像格納部７１に必要な記憶容量を削減する方法を説明する図である。
図１３において、例えば、図１３（ａ）の連結パターンＡ₁及び連結パターンＡ₂のそれぞれに対し、図１３（ｂ）に示すように、ラベル“１”及びラベル“２”が付され、図１３（ｃ）に示すように、連結パターンＡ₁に外接する外接矩形Ｂ₁及び連結パターンＡ₂に外接する外接矩形Ｂ₂が生成されている。外接矩形Ｂ₁及び外接矩形Ｂ₂は、図１３（ｄ）に示すように、その外接矩形Ｂ₁及び外接矩形Ｂ₂の左上頂点の座標（ｘ₁、ｙ₁）及び右下頂点の座標（ｘ₂、ｙ₂）によって特定することができる。
【００８３】
そして、連結パターンＡ₁に外接する外接矩形Ｂ₁と連結パターンＡ₂に外接する外接矩形Ｂ₂とが重なっているかどうかを判定し、連結パターンＡ₁に外接する外接矩形Ｂ₁と連結パターンＡ₂に外接する外接矩形Ｂ₂とが重なっていない場合、それぞれの外接矩形Ｂ₁及び外接矩形Ｂ₂の左上頂点の座標（ｘ₁、ｙ₁）及び右下頂点の座標（ｘ₂、ｙ₂）を記憶する。
【００８４】
一方、連結パターンＡ₁に外接する外接矩形Ｂ₁と連結パターンＡ₂に外接する外接矩形Ｂ₂とが重なっている場合、他の外接矩形と重ならないようにより小さな矩形領域に外接矩形Ｂ₁及び外接矩形Ｂ₂を細分化し、細分化された矩形領域が元の外接矩形Ｂ₁及び外接矩形Ｂ₂のどちらに属するかを判定し、連結パターンＡ₁及び連結パターンＡ₂を、細分化された矩形領域の和や差などの演算で表現する。
【００８５】
例えば、図１３（ｃ）において、連結パターンＡ₁は、連結パターンＡ₁に属する最大の矩形領域（１−１）及び矩形領域（１−１）に含まれる矩形領域（１−２）を用いて、
Ａ₁＝（１−１）−（１−２）
のように矩形領域（１−１）と矩形領域（１−２）との差で表現することができる。
【００８６】
また、連結パターンＡ₂は、連結パターンＡ₂に属する最大の矩形領域（２−１）、矩形領域（２−１）に含まれる矩形領域（２−２）及び矩形領域（２−２）に含まれる矩形領域（２−３）を用いて、
Ａ₂＝（２−１）−（２−２）＋（２−３）
のように矩形領域（２−１）と矩形領域（２−２）との差及び矩形領域（２−３）との和で表現することができる。
【００８７】
このように、連結パターンを連結する画素の外接矩形で表現することにより、連結パターンを表現する情報量を減らして、ラベル画像を格納するために必要な記憶容量を削減することができる。
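The rectangle arithmetic above (A1 = (1-1) − (1-2), A2 = (2-1) − (2-2) + (2-3)) can be illustrated with signed rectangle terms. This is a sketch under one assumption: a point belongs to the pattern when the signs of all rectangles containing it sum to a positive value, which holds for nested rectangles like those in the examples. The coordinates are made up.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x1, y1, x2, y2), inclusive corners
Term = Tuple[int, Rect]           # (+1 or -1, rectangle)


def contains(rect: Rect, x: int, y: int) -> bool:
    x1, y1, x2, y2 = rect
    return x1 <= x <= x2 and y1 <= y <= y2


def in_pattern(terms: List[Term], x: int, y: int) -> bool:
    """Evaluate a sum/difference of rectangles at one point."""
    return sum(sign for sign, rect in terms if contains(rect, x, y)) > 0


# A1 = (1-1) - (1-2): an outer rectangle with an inner one removed.
a1 = [(+1, (0, 0, 9, 9)), (-1, (3, 3, 6, 6))]
# A2 = (2-1) - (2-2) + (2-3): three nested rectangles.
a2 = [(+1, (0, 0, 9, 9)), (-1, (2, 2, 7, 7)), (+1, (4, 4, 5, 5))]

print(in_pattern(a1, 1, 1), in_pattern(a1, 4, 4))
print(in_pattern(a2, 0, 0), in_pattern(a2, 3, 3), in_pattern(a2, 4, 4))
```

Each term needs only four coordinates and a sign, regardless of how many pixels the rectangle covers, which is where the storage saving comes from.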
【００８８】
なお、このラベル画像の圧縮表現の方法については、例えば、特開平８−５５２１９号公報に記載されている。
図１４は、図６のステップＳ２１のテキスト認識処理の一実施例を示すフローチャートである。
【００８９】
図１４において、まず、ステップＳ６１に示すように、スキャナで文書を読み込み、読み込んだ文書の画像データをメモリに格納する。
次に、ステップＳ６２に示すように、ステップＳ６１で読み込んだ画像データのうち、横方向の特定の区間の短冊状の部分領域だけに注目し、その注目した部分領域の中でラベリングを行い、黒連結画素の外接矩形を求める。
【００９０】
例えば、処理対象として複数の文書Ａ、Ｂ、Ｃがあり、図１５（ａ）の文書Ａの文字列８１の領域が、図１５（ｄ）に示すように、区間Ａの範囲内にあり、図１５（ｂ）の文書Ｂの文字列８２の領域が、図１５（ｄ）に示すように、区間Ａの範囲内にあり、図１５（ｃ）の文書Ｃの文字列８３の領域が、図１５（ｄ）に示すように、区間Ｂの範囲内にある場合、この区間Ａ，Ｂの部分領域にのみ着目し、この部分領域の短冊状の中でのみラベリング処理を行って、黒連結画素の外接矩形を求める。
【００９１】
次に、ステップＳ６３に示すように、ステップＳ６２で求めた外接矩形の高さと、予め求めておいた矩形の高さｙｌｅｎとの差がしきい値ｔｈｙ以内で、かつステップＳ６２で求めた外接矩形の幅と、予め求めておいた矩形の幅ｘｌｅｎとの差がしきい値ｔｈｘ以内であるような外接矩形だけを抽出する。そして、その外接矩形が存在しているｙ方向（縦方向）の座標を求め、メモリに記憶する。
【００９２】
次に、ステップＳ６４に示すように、ステップＳ６３で求めたｙ方向の座標を中心として、ステップＳ６２で抽出した矩形を含む左右方向の長さが画像幅に等しい横長部分領域に注目する。
【００９３】
次に、ステップＳ６５に示すように、ステップＳ６４で求めた横長部分領域に対してラベリングを行うことにより、黒連結画素の外接矩形を求める。
次に、ステップＳ６６に示すように、ステップＳ６５で求めた外接矩形の高さと、予め求めておいた矩形の高さｙｌｅｎとの差がしきい値ｔｈｙ以内で、かつステップＳ６５で求めた外接矩形の幅と、予め求めておいた矩形の幅ｘｌｅｎとの差がしきい値ｔｈｘ以内であるような外接矩形だけを抽出し、メモリに記憶する。
【００９４】
次に、ステップＳ６７に示すように、ステップＳ６６で抽出した矩形を対象にｘ座標でソートし、抽出した矩形の中心線の間隔からピッチを計算し、この計算により求めたピッチと予め求めておいたピッチｐｉｔｃｈとの差がしきい値ｔｈｐｉｔｃｈ以内の矩形が横方向に所定の数ｔｈ個以上並んでいるものをテキストとして出力する。
【００９５】
なお、このテキスト抽出方法については、例えば、特開平８−１７１６０９号公報に記載されている。
次に、図６のステップＳ２２の罫線抽出処理の一実施例についてより具体的に説明する。
【００９６】
この罫線抽出処理は、ラベリングにより得られた連結パターンを横方向及び縦方向に複数に分割し、横方向及び縦方向に分割したそれぞれの範囲内で連結パターンの隣接投影値を算出し、ある一定の長さの線分又は直線の一部を矩形近似により検出することにより罫線を抽出するものである。
【００９７】
ここで、隣接投影とは、注目行又は注目列の投影値に周囲の行又は列の投影値を足し合わせたものである。また、注目行又は注目列の投影値は、その行又は列に存在する黒画素の総和をとったものである。
【００９８】
図１６は、この隣接投影処理を説明する図である。
図１６において、ｉ行の投影値をｐ（ｉ）とすると、隣接投影値Ｐ（ｉ）は、（１）式により算出することができる。
【００９９】
Ｐ（ｉ）＝ｐ（ｉ−ｊ）＋・・・＋ｐ（ｉ）＋・・・＋ｐ（ｉ＋ｊ）　・・・（１）
なお、図１６に示す例は、（１）式においてｊ＝１とおいたものである。
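Equation (1) can be sketched in a few lines. The helper names are hypothetical, and rows outside the image are treated as having projection value 0, which is an assumption. The sample image is a slightly skewed line that spans two rows.

```python
from typing import List


def row_projection(image: List[List[int]]) -> List[int]:
    """p(i): number of black pixels in row i."""
    return [sum(row) for row in image]


def adjacent_projection(p: List[int], j: int = 1) -> List[int]:
    """P(i) = p(i-j) + ... + p(i) + ... + p(i+j), clipped at the borders."""
    n = len(p)
    return [
        sum(p[k] for k in range(max(0, i - j), min(n, i + j + 1)))
        for i in range(n)
    ]


skewed_line = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
p = row_projection(skewed_line)
P = adjacent_projection(p, j=1)
print(p, P)
```

Neither row alone reaches the full line length of 6, but the adjacent projection does. This is why the method tolerates a small skew.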
図１７は、部分パターンの投影値の例を示す図である。
【０１００】
図１７において、縦方向の長さがＬ_Y、横方向の長さがＬ_Xの矩形８４の水平方向ｊの投影値Ｐｈ（ｉ）をＨＰ（ｉ）、矩形８４の垂直方向ｉの投影値Ｐｖ（ｊ）をＶＰ（ｊ）とすると、ＨＰ（１）＝ＨＰ（ｎ）＝ｍ、ＨＰ（２）〜ＨＰ（ｎ−１）＝２、ＶＰ（１）＝ＶＰ（ｍ）＝ｎ、ＶＰ（２）〜ＶＰ（ｍ−１）＝２である。
【０１０１】
このように、矩形８４を構成する直線が存在している部分は、その投影値が大きくなるので、この投影値を算出することにより、罫線を構成している直線を抽出することができる。
【０１０２】
例えば、隣接投影値と縦横それぞれの分割長との比が所定の閾値以上である部分パターンを検出することにより、罫線を構成している直線の候補を抽出することができる。
【０１０３】
図１８は、罫線抽出処理を示すフローチャートである。
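In FIG. 18, first, as shown in step S601, it is determined whether the ratio of the adjacent projection value to the division length in the vertical or horizontal direction is equal to or greater than a predetermined threshold. If the ratio is judged not to be equal to or greater than the threshold, control proceeds to step S602, and it is assumed that no line segment forming a ruled line exists.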
【０１０４】
一方、ステップＳ６０１で隣接投影値と縦横それぞれの分割長との比が所定のしきい値以上であると判断された場合、ステップＳ６０３に進み、罫線を構成している線分が存在するものとみなす。
【０１０５】
次に、ステップＳ６０４において、ステップＳ６０３で線分とみなされたパターンが、その上下に存在する線分と接しているかどうかを判断する。そして、上記パターンが上下に存在する線分と接していないと判断された場合、ステップＳ６０５に進み、そのパターンを矩形線分とする。
【０１０６】
一方、ステップＳ６０４において、ステップＳ６０３で線分とみなされたパターンがその上下に存在する線分と接していると判断された場合、ステップＳ６０６に進み、上記パターンとその上下に存在する線分とを統合する。そして、ステップＳ６０７で、ステップＳ６０６で統合した線分を矩形線分として検出する。例えば、図１９（ａ）に示すような３つの矩形線分８５を統合し、図１９（ｂ）に示す１つの矩形線分８６を得る。この後、ステップＳ６０５又はステップＳ６０７で求めた矩形線分を対象として探索を行うことにより、罫線を抽出する。
【０１０７】
なお、この罫線抽出処理については、例えば、特開平６−３０９４９８号公報に記載されている。
図２０は、図６のステップＳ２２の罫線抽出処理において、かすれ罫線の補完を行いながら、探索を行う方法を説明する図である。
【０１０８】
このかすれ罫線の補完方法は、直線を構成するパターンの探索を行う際、探索の進行方向にパターンのない空白領域が存在しても、一定の画素数以下の空白領域に対してはパターンがあるとみなして探索を行うようにするものである。
【０１０９】
例えば、図２０に示すように、直線９１に対して、この直線９１を構成する画素９２の検索を行う場合、一定の画素数以下の空白領域９３に対しては画素９２があるとみなして探索を行う。
【０１１０】
図２１は、罫線抽出処理におけるかすれ罫線の補完方法を示すフローチャートである。
図２１において、まず、ステップＳ７１に示すように、所定の矩形範囲内のパターンのうち、最も細い部分のＸ座標を算出する。
【０１１１】
次に、ステップＳ７２に示すように、ステップＳ７１で算出したＸ座標におけるパターンの中心点を算出する。そして、ステップＳ７３に示すように、ステップＳ７２で算出したパターンの中心点を探索の開始点とする。ここで、探索の開始点をパターンの最も細い部分とするのは、最も細い部分は文字である可能性が低いため、枠となる直線の探索をより確実に行うことができるからである。
【０１１２】
次に、ステップＳ７４で直線の探索方向を右に設定する。
次に、ステップＳ７５に示すように、空白領域の長さをカウントする変数Ｋの初期値を０に設定する。
【０１１３】
次に、ステップＳ７６に示すように、ステップＳ７３で求めた開始点をパターンの探索の現在地と設定する。
次に、ステップＳ７７に示すように、ステップＳ７６で設定した探索の現在地が、ステップＳ７１で注目した矩形範囲の内部であるかどうかの判定を行い、探索の現在地が、ステップＳ７１で注目した矩形範囲の内部でない場合、ステップＳ８６に進む。
【０１１４】
一方、ステップＳ７７で探索の現在地が、ステップＳ７１で注目した矩形範囲の内部であると判定された場合、ステップＳ７８に進み、探索の現在地からみて探索方向隣にパターンがあるかどうか判定する。ここで、探索の現在地からみて探索方向隣にパターンがあるとは、図２２に示すように、パターン１０１からみて右方向隣の位置にパターン１０２が存在していることを意味している。そして、探索の現在地からみて探索方向隣にパターン１０２があると判定された場合、ステップＳ８１に進み、探索方向隣にあるパターン１０２を探索の現在地とする。
【０１１５】
一方、ステップＳ７８で探索の現在地からみて探索方向隣にパターンがないと判定された場合、ステップＳ７９に進み、探索の現在地からみて探索方向斜め隣にパターンがあるかどうか判定する。
【０１１６】
ここで、探索の現在地からみて探索方向斜め隣にパターンがあるとは、図２２に示すように、パターン１０３からみて右方向斜め隣の位置にパターン１０４ａ又はパターン１０４ｂが存在していることを意味している。そして、探索の現在地からみて探索方向斜め隣にパターン１０４ａ、１０４ｂがあると判定された場合、ステップＳ８３に進み、探索方向斜め隣にあるパターン１０４ａ、１０４ｂを探索の現在地とする。なお、探索方向斜め隣にあるパターン１０４ａ、１０４ｂが２つある場合はパターン１０４ａ、１０４ｂのどちらか一方を探索の現在地とする。
一方、ステップＳ７９で探索の現在地からみて探索方向斜め隣にパターン１０４ａ、１０４ｂがないと判定された場合、ステップＳ８０に進み、空白領域の長さをカウントする変数Ｋがしきい値以下であるかどうかを判定する。そして、空白領域の長さをカウントする変数Ｋがしきい値以下である場合、ステップＳ８４に進み、探索の現在地からみて探索方向隣にありパターンを構成しない画素を現在地とする。例えば、図２０において、一定の画素数以下の空白領域９３に対してはパターンがあるとみなして探索を行う。
【０１１７】
次に、ステップＳ８５に示すように、空白領域の長さをカウントする変数Ｋの値を１ドットだけ増やし、ステップＳ７７に戻る。
一方、ステップＳ８０で空白領域の長さをカウントする変数Ｋがしきい値以下でないと判定された場合、ステップＳ８６に進み、探索方向は右に設定されているかどうかを判定する。そして、探索方向は右に設定されていない場合、処理を終了する。
【０１１８】
ステップＳ８６で探索方向は右に設定されている場合、ステップＳ８７に進み、探索方向を左に設定する。そして、探索方向を右に設定して行った処理と同様に、ステップＳ７５〜ステップＳ８５の処理を繰り返す。
【０１１９】
ここで、探索方向を左に設定して処理を行う場合、探索の現在地からみて探索方向隣にパターンがあるとは、図２２に示すように、パターン１０５からみて左方向隣の位置にパターン１０６が存在していることを意味している。また、探索の現在地からみて探索方向斜め隣にパターンがあるとは、図２２に示すように、パターン１０７からみて左方向斜め隣の位置にパターン１０８ａ又はパターン１０８ｂが存在していることを意味している。
【０１２０】
なお、このかすれ罫線の補完方法については、例えば、特願平８−１０７５６８号の明細書及び図面に記載されている。
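The gap-tolerant search of FIG. 21 can be sketched as follows. Several points are assumptions for illustration: the whole image is used as the rectangular range, the blank counter K is reset whenever a pattern pixel is found again (the text does not say so), and the function reports the outermost pattern pixel reached in each direction.

```python
from typing import List, Tuple


def search_line(image: List[List[int]], start: Tuple[int, int],
                max_gap: int) -> Tuple[int, int]:
    """Trace a roughly horizontal line left and right from `start` (x, y).

    Blank pixels are stepped over while the blank counter K is <= max_gap,
    so runs of up to max_gap + 1 blank pixels are bridged.
    Returns (leftmost_x, rightmost_x) of the pattern pixels reached.
    """
    h, w = len(image), len(image[0])

    def is_black(x: int, y: int) -> bool:
        return 0 <= x < w and 0 <= y < h and image[y][x] == 1

    def walk(dx: int) -> int:
        x, y = start
        last_black_x = x
        gap = 0  # K: length of the current blank run
        while 0 <= x + dx < w:  # stay inside the search range
            nx = x + dx
            if is_black(nx, y):            # neighbor in the search direction
                x, gap = nx, 0
            elif is_black(nx, y - 1):      # diagonal neighbor (up)
                x, y, gap = nx, y - 1, 0
            elif is_black(nx, y + 1):      # diagonal neighbor (down)
                x, y, gap = nx, y + 1, 0
            elif gap <= max_gap:           # treat the blank as pattern
                x, gap = nx, gap + 1
                continue
            else:
                break
            last_black_x = x
        return last_black_x

    return walk(-1), walk(+1)


faded_line = [
    [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]
bridged = search_line(faded_line, (2, 1), max_gap=1)
broken = search_line(faded_line, (2, 1), max_gap=0)
print(bridged, broken)
```

With `max_gap=1` the two-pixel break is bridged and the search follows the line up onto the next row. With `max_gap=0` it stops at the break.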
次に、図６のステップＳ２３の枠抽出処理について説明する。
【０１２１】
図２３は、一文字枠抽出処理の一実施例を示すフローチャートである。
図２３において、まず、ステップＳ９１に示すように、図１８の処理により矩形線分として検出されたパターンに対し探索を行う。この際、図２１のフローチャートに示すように、所定の長さの空白領域に対しては、パターンが存在するものとみなして探索を行い、かすれを補完する。
【０１２２】
次に、ステップＳ９２に示すように、ステップＳ９１で探索を行った結果、パターンが所定の長さで途切れているかどうかを判断し、パターンが所定の長さで途切れていない場合、図２４のブロック枠抽出処理に進む。一方、パターンが所定の長さで途切れている場合、ステップＳ９３に進み、探索された線分を統合して直線を検出する。
【０１２３】
次に、ステップＳ９４に示すように、ステップＳ９３で検出した直線のうち、４方を囲んでいる直線を抽出する。
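Next, as shown in step S95, it is determined whether the size of the region enclosed on four sides by straight lines falls within a predetermined range of the size of the one-character frames in the same image. If it does, control proceeds to step S96 and the enclosed region is regarded as a one-character frame. If it does not, control proceeds to step S97 and the enclosed region is regarded as not being a one-character frame.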
【０１２４】
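FIG. 24 is a flowchart showing an embodiment of the block frame extraction processing. In FIG. 24, first, as shown in step S101, it is determined whether a horizontal straight line detected by the search has a length equal to or greater than a predetermined value. If the length is less than the predetermined value, control proceeds to step S102 and the line is regarded as not being a horizontal frame line. If the length is equal to or greater than the predetermined value, control proceeds to step S103 and the line is regarded as a horizontal frame line.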
【０１２５】
次に、ステップＳ１０４に示すように、ステップＳ１０３で抽出された横枠から、互いに隣接する２本の横枠を取り出す。
次に、ステップＳ１０５に示すように、ステップＳ１０４で取り出した２本の横枠の間に挟まれた範囲を１行のブロック枠とみなす。
【０１２６】
次に、ステップＳ１０６に示すように、図１８の処理により検出された矩形線分のうち、縦方向の矩形線分を抽出して縦線を検出する。
次に、ステップＳ１０７に示すように、ステップＳ１０６で検出した縦線の探索を行い、ステップＳ１０８において、縦線がステップＳ１０４で取り出した上下の横枠に達したかどうかを判断する。そして、縦線が上下の横枠に達しない場合、ステップＳ１０９に進み、その縦線を縦枠の候補から除外する。一方、縦線が上下の横枠に達した場合、ステップＳ１１０に進み、その縦線を縦枠の候補とする。
【０１２７】
次に、ステップＳ１１１に示すように、処理の対象が規則的な表形式のブロック枠であるか、不規則な表形式のブロック枠であるかを判断する。そして、処理の対象が規則的な表形式のブロック枠である場合、ステップＳ１１２に進み、ステップＳ１１０で縦枠の候補とみなされた縦線同士の間隔を算出するとともに、算出された縦線同士の間隔とその出現頻度との関係を示すヒストグラムを算出する。
【０１２８】
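Next, as shown in step S113, among the vertical lines within the range between two mutually adjacent horizontal frame lines, those that form an interval different from the other vertical lines are excluded from the vertical frame candidates. The remaining vertical lines are taken as vertical frame lines, and processing ends.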
【０１２９】
一方、ステップＳ１１１で処理の対象が不規則的な表形式のブロック枠であると判断された場合、ステップＳ１１０で縦枠の候補とされたものを全て縦枠として処理を終了する。
【０１３０】
次に、図６のステップＳ２４の枠種類／表判別処理について説明する。
図２５は、図６のステップＳ２３の枠抽出処理により抽出された枠や表の一例を示す図である。
【０１３１】
図２５において、図２５（ａ）は一文字枠、図２５（ｂ）はフリーピッチ枠、図２５（ｃ）はブロック枠、図２５（ｄ）は規則的な表、図２５（ｅ）は不規則な表を示している。そして、一文字枠には一文字枠の属性を付与し、フリーピッチ枠にはフリーピッチ枠の属性を付与し、ブロック枠にはブロック枠の属性を付与し、表には表の属性を付与する。
【０１３２】
なお、枠抽出処理及び枠種類／表判別処理については、例えば、特開平７−２８９３７号公報に記載されている。
次に、図６のステップＳ２５の枠接触有無の判断処理について説明する。ここでは、元の入力画像をＯＲ処理により縮小率１／ｎで縮小してから、枠接触有無の判断処理を行う例について述べる。ここで、画像の各画素に対応して座標が設定され、画像の横方向にＸ座標、画像の縦方向にＹ座標を設定し、Ｘ座標は右向きに増加し、Ｙ座標は下向きに増加するものとしている。
【０１３３】
図２６は、入力画像の縮小処理の一実施例を示すフローチャートである。図２６において、まず、ステップＳ１２１に示すように、原画像を入力する。次に、ステップＳ１２２に示すように、原画像の左上から横ｎ画素×縦ｎ画素の範囲（左上座標（１，１）、右下座標（Ｘ，Ｙ））を設定する。
【０１３４】
次に、ステップＳ１２３に示すように、原画像の設定された範囲内に黒画素があるかどうかを判断し、原画像の設定された範囲内に黒画素がある場合、ステップＳ１２４に進み、縮小画像の座標（Ｘ／ｎ，Ｙ／ｎ）の画素を黒画素とし、原画像の設定された範囲内に黒画素がない場合、ステップＳ１２５に進み、縮小画像の座標（Ｘ／ｎ，Ｙ／ｎ）の画素を白画素とする。
【０１３５】
次に、ステップＳ１２６に示すように、原画像の右下まで処理が終了したかどうかを判断し、原画像の右下まで処理が終了していない場合、ステップＳ１２７に進み、原画像の右端に達したかどうかを判断する。
【０１３６】
そして、原画像の右端に達していない場合、処理した範囲の右隣に横ｎ画素×縦ｎ画素の範囲（左上座標（ｘ，ｙ）、右下座標（Ｘ，Ｙ））を設定し、原画像の右端に達した場合、処理した範囲の下側で、かつ、原画像の左端から横ｎ画素×縦ｎ画素の範囲（左上座標（ｘ，ｙ）、右下座標（Ｘ，Ｙ））を設定して、ステップＳ１２３に戻り、原画像の全ての範囲内について縮小処理が終了するまで以上の処理を繰り返す。
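The OR-based reduction by 1/n can be sketched as below. The function name is hypothetical, and partial blocks at the right and bottom edges are handled by clipping, which is an assumption (the text does not discuss image sizes that are not multiples of n).

```python
from typing import List


def reduce_or(image: List[List[int]], n: int) -> List[List[int]]:
    """Shrink a binary image by 1/n: an output pixel is black (1) if any
    pixel in the corresponding n x n block of the original is black."""
    h, w = len(image), len(image[0])
    reduced_h = (h + n - 1) // n
    reduced_w = (w + n - 1) // n
    reduced = []
    for by in range(reduced_h):
        row = []
        for bx in range(reduced_w):
            block_has_black = any(
                image[y][x] == 1
                for y in range(by * n, min((by + 1) * n, h))
                for x in range(bx * n, min((bx + 1) * n, w))
            )
            row.append(1 if block_has_black else 0)
        reduced.append(row)
    return reduced


original = [
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]
small = reduce_or(original, 2)
print(small)
```

Because any black pixel survives the reduction, thin frame lines and strokes touching them are not lost, which suits the contact check that follows.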
【０１３７】
次に、入力画像の縮小処理により縮小された圧縮画像データにおける枠線の内側を枠に沿って探索することにより、文字が枠に接触しているかどうかの判定を行い、文字の接触している辺に関して、矩形領域を所定の距離だけ外側に拡大し、この拡大した矩形領域の座標を原画像データにおける座標に変換する。
【０１３８】
例えば、図２７（ａ）に示すように、圧縮画像データの枠線の範囲１１０が抽出され、この枠線により囲まれた矩形領域内に「４」の文字１１２が存在し、この「４」の文字１１２が下側の枠線１１１に接触しているものとする。
【０１３９】
次に、図２７（ｂ）に示すように、枠線の内側に沿って真っ直ぐに探索を行い、探索の途中でパターンと交差した場合、枠線の近辺に文字が存在し、この文字は枠線に接触している可能性が高いとみなして、この枠線により囲まれた矩形領域内に存在する「４」の文字１１２は枠と接触しているものとする。この例の場合、「４」の文字１１２は下側の枠１１１と接触しているものとされる。
【０１４０】
次に、枠線１１１の内側に沿って探索を行い、文字１１２が枠線１１１に接触しているとみなされた結果、図２７（ｃ）に示すように、文字１１２が接触している枠線１１１から外側の方向へ枠線により囲まれた矩形領域を拡大し、この拡大した矩形領域１１３を文字１１２が存在する文字領域とする。なお、文字が枠線に接触していないとみなされた場合は、枠の内部をそのまま文字領域とする。
【０１４１】
次に、圧縮画像データにおける文字領域から原画像データにおける文字領域を求めるため、図２７（ｃ）の矩形領域１１３の座標を原画像データにおける座標に変換する。このことにより、図２７（ｄ）に示すように、原画像データにおける矩形領域１１６を求めることができる。
【０１４２】
次に、原画像データの矩形領域１１６における枠線１１４についての投影処理を行い、枠線１１４の枠座標を原画像データから算出する。この際、枠線１１４を所定の長さの短冊状の矩形によって表現する。そして、図２７（ｅ）に示すように、この矩形領域１１６に存在するパターンを文字補完処理に送り、原画像データから算出した枠線１１４の枠座標に基づいて、枠線１１４に接触している文字１１５の補完処理を行う。
【０１４３】
図２８は、枠接触有無の判断処理の一実施例を示すフローチャートである。図２８において、まず、ステップＳ１３１に示すように、圧縮画像データによる矩形表現を、例えば、図２６の処理により行う。
【０１４４】
次に、ステップＳ１３２に示すように、縦横４本の直線に囲まれた矩形部分を抽出する。
次に、ステップＳ１３３に示すように、直線の内側を示す矩形の左上及び右下を示す座標をそれぞれ算出する。
【０１４５】
次に、ステップＳ１３４に示すように、枠の内側を示す矩形の４辺（上側横枠、下側横枠、右側縦枠、左側縦枠）に沿って圧縮画像の探索を行う。
次に、ステップＳ１３５に示すように、探索の途中で画像パターンと交差した場合、探索を行っていた辺に文字が接触しているものとする。
【０１４６】
次に、ステップＳ１３６に示すように、枠の内側を示す矩形の座標値を原画像上の座標値に変換することにより、圧縮画像データにおける矩形領域から原画像データにおける矩形領域を算出する。
【０１４７】
次に、ステップＳ１３７に示すように、ステップＳ１３６で算出された矩形領域を原画像データにおける文字領域とする。
次に、ステップＳ１３８に示すように、ステップＳ１３５の処理により文字が枠に接触していたかどうかを判断し、文字が枠に接触している場合、ステップＳ１３９〜Ｓ１４３の接触文字範囲獲得処理を行う。
【０１４８】
接触文字範囲獲得処理では、まず、ステップＳ１３９において、文字の接触している辺から外側方向に文字領域を拡大し、ステップＳ１３７で算出された文字領域位置より一定の距離だけ外側の位置を文字領域の端とする。
【０１４９】
次に、ステップＳ１４０に示すように、ステップＳ１３９で算出された文字領域に含まれる枠線の位置座標を原画像上の座標値に変換することにより、圧縮画像データにおける枠線の位置座標から原画像データにおける枠線の位置座標を算出する。
【０１５０】
次に、ステップＳ１４１に示すように、ステップＳ１４０で算出された原画像データにおける枠線の位置座標に基づいて獲得した原画像データの枠線領域について、横枠は横方向、縦枠は縦方向に投影処理を行う。
【０１５１】
次に、ステップＳ１４２に示すように、投影値が一定値以上の領域を原画像上の枠座標とする。
次に、ステップＳ１４３に示すように、算出した原画像上の文字領域を示す座標値と文字領域内の枠線の位置を示す座標値とを文字補完処理へ渡す。
【０１５２】
次に、ステップＳ１４４に示すように、算出した原画像上の文字領域を示す座標値を文字領域とする。
なお、枠接触有無の判断処理については、例えば、特願平８−１０７５６８号の明細書及び図面に記載されている。
【０１５３】
次に、図８のステップＳ４１の訂正特徴抽出処理及びステップＳ４２の訂正文字候補抽出処理について説明する。
図２９は、訂正文字の一実施例を示す図である。
【０１５４】
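In FIG. 29, a corrected character is a character that has been deleted with a strike-through mark. Corrected characters take various forms: a character deleted with an "×" mark as in FIG. 29(a), with a horizontal double line as in FIG. 29(b), with a diagonal line as in FIG. 29(c), with a wavy line as in FIG. 29(d), or by blacking it out completely as in FIG. 29(e).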
【０１５５】
このような訂正文字に対し、訂正文字に特有な特徴を抽出する。この訂正文字に特有な特徴として、「所定方向の線密度」、「オイラー数」、「黒画素密度」などがある。
【０１５６】
「所定方向の線密度」は、矩形内の画像を所定の一定方向に沿って走査した際に、白画素から黒画素（又は黒画素から白画素）に変化する回数を計数した値である。また、所定方向は、消し線として想定された線分の方向と垂直方向に設定する。
【０１５７】
例えば、図３０（ａ）は、「６」の文字について、縦方向の最大線密度を計数した例を示すもので、この場合の縦方向の最大線密度は３となっている。
訂正文字の「所定方向の線密度」は、通常文字の「所定方向の線密度」に比べて大きくなる傾向があり、この「所定方向の線密度」を算出することにより、訂正文字の候補を抽出することができる。
【０１５８】
「オイラー数」Ｅは、画像中での互いに連結している連結成分の個数Ｃから、その画像が有する穴の個数Ｈを引いた値である。
例えば、図３０（ｂ）は、互いに連結している連結成分が画像中に２つだけ存在し、その画像中に穴が１つだけ存在する例を示すもので、この例の場合のオイラー数Ｅは、Ｅ＝Ｃ−Ｈ＝２−１＝１となる。
【０１５９】
訂正文字の「オイラー数」は絶対値が大きな負の値となる傾向があり、通常文字の「オイラー数」は絶対値が小さな値（２〜−１）となる傾向がある。したがって、この「オイラー数」を算出することにより、訂正文字の候補を抽出することができる。
【０１６０】
「黒画素密度」Ｄは、注目する画像自体の面積（黒画素数）Ｂと注目する画像の外接矩形の面積Ｓとの比である。
例えば、図３０（ｃ）は、「４」の文字について黒画素密度Ｄを算出した場合の例を示すもので、「４」の文字に外接している外接矩形の面積をＳ、「４」の文字の面積をＢとすると、Ｄ＝Ｂ／Ｓとなる。
【０１６１】
訂正文字の「黒画素密度」は、通常文字の「黒画素密度」に比べて大きくなる傾向があり、この「黒画素密度」を算出することにより、訂正文字の候補を抽出することができる。
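The three features can be computed as sketched below. The assumptions are: the line density is taken in the vertical direction only, connected components use 8-connectivity, and holes are counted as 4-connected background regions not reachable from the image border. All function names and the sample images are hypothetical.

```python
from typing import List, Tuple

N4 = [(1, 0), (-1, 0), (0, 1), (0, -1)]
N8 = N4 + [(1, 1), (1, -1), (-1, 1), (-1, -1)]


def max_vertical_line_density(image: List[List[int]]) -> int:
    """Largest number of white-to-black transitions along any column."""
    best = 0
    for x in range(len(image[0])):
        prev, count = 0, 0
        for y in range(len(image)):
            if image[y][x] == 1 and prev == 0:
                count += 1
            prev = image[y][x]
        best = max(best, count)
    return best


def black_pixel_density(image: List[List[int]]) -> float:
    """D = B / S: black pixel count over circumscribed rectangle area."""
    pts = [(x, y) for y, row in enumerate(image) for x, v in enumerate(row) if v == 1]
    if not pts:
        return 0.0
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    return len(pts) / ((max(xs) - min(xs) + 1) * (max(ys) - min(ys) + 1))


def count_regions(grid: List[List[int]], value: int,
                  offsets: List[Tuple[int, int]]) -> int:
    """Count connected regions of `value` using a flood fill."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    regions = 0
    for sy in range(h):
        for sx in range(w):
            if grid[sy][sx] != value or seen[sy][sx]:
                continue
            regions += 1
            seen[sy][sx] = True
            stack = [(sx, sy)]
            while stack:
                x, y = stack.pop()
                for dx, dy in offsets:
                    nx, ny = x + dx, y + dy
                    if (0 <= nx < w and 0 <= ny < h
                            and not seen[ny][nx] and grid[ny][nx] == value):
                        seen[ny][nx] = True
                        stack.append((nx, ny))
    return regions


def euler_number(image: List[List[int]]) -> int:
    """E = C - H, with a white border added so the outside is one region."""
    w = len(image[0])
    padded = [[0] * (w + 2)] + [[0] + row + [0] for row in image] + [[0] * (w + 2)]
    components = count_regions(padded, 1, N8)
    holes = count_regions(padded, 0, N4) - 1  # exclude the outer background
    return components - holes


ring = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
ring_and_bar = [[1, 1, 1, 0, 1], [1, 0, 1, 0, 1], [1, 1, 1, 0, 1]]
print(max_vertical_line_density(ring), black_pixel_density(ring),
      euler_number(ring), euler_number(ring_and_bar))
```

The second image has two components and one hole, matching the E = 2 − 1 = 1 example given for FIG. 30(b).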
【０１６２】
次に、図３の基本文字認識部１７について具体的に説明する。
図３１は、基本文字認識部１７の構成の一実施例を示すブロック図である。
図３１において、特徴抽出部１２１は、入力された未知の文字パターンから文字の特徴を抽出し、この抽出した特徴を特徴ベクトルにより表す。一方、基本辞書１２２には、各文字カテゴリの特徴ベクトルが格納されている。
【０１６３】
そして、照合部１２３は、特徴抽出部１２１により抽出した未知の文字パターンの特徴ベクトルを、基本辞書１２２に格納されている各文字カテゴリの特徴ベクトルと照合し、特徴空間上での特徴ベクトル間の距離Ｄ_ij（ｉは未知文字の特徴ベクトル、ｊは基本辞書１２２のカテゴリの特徴ベクトル）を算出する。その結果、特徴ベクトル間の距離Ｄ_ijを最小とするカテゴリｊを未知文字ｉとして認識する。
【０１６４】
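Here, the distance D_ij between feature vectors in the feature space can be calculated using an identification function such as the Euclidean distance (written here in its squared form, Σ(i−j)²), the city block distance Σ|i−j|, or a discriminant function.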
【０１６５】
なお、第１位のカテゴリとの距離をＤ_ij1、第２位のカテゴリとの距離をＤ_ij2とすると、第１位のカテゴリｊ１、第２位のカテゴリｊ２、カテゴリ間の距離（Ｄ_ij2−Ｄ_ij1）及び信頼度に関するテーブル１を予め作成しておく。また、第１位のカテゴリとの距離をＤ_ij1、第１位のカテゴリｊ１及び信頼度に関するテーブル２も予め作成しておく。そして、テーブル１とテーブル２とからそれぞれ得られる信頼度の小さい方を中間処理結果テーブルに格納する。
【０１６６】
図３２は、特徴ベクトルの算出例を示す図である。
この例は、図３２（ａ）の縦５×横４の合計２０個の枡目に「２」の文字が書かれており、黒く塗りつぶされた枡目を“１”、白抜きの枡目を“０”として、枡目の左上から右下の順に枡目を見ていき、その時現れる“１”又は“０”の数値を順に並べたものを、特徴ベクトルとしたものである。
【０１６７】
例えば、図３２（ｂ）の場合の特徴ベクトルｖｅｃｔｏｒＡは、ｖｅｃｔｏｒＡ＝（１、１、１、１、０、０、０、１、１、１、１、１、１、０、０、０、１、１、１、１、）となり、図３２（ｃ）の場合の特徴ベクトルｖｅｃｔｏｒＢは、ｖｅｃｔｏｒＢ＝（０、１、１、１、０、０、０、１、１、１、１、１、１、０、０、０、１、１、１、１、）となり、図３２（ｄ）の場合の特徴ベクトルｖｅｃｔｏｒＣは、ｖｅｃｔｏｒＣ＝（１、１、１、１、０、０、０、１、０、１、１、０、１、０、０、０、１、１、１、１、）となる。
【０１６８】
図３３は、シティブロック距離ｄ（ｉ，ｊ）により特徴ベクトル間の距離Ｄ_ijを算出する例を示す図である。
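Here, let N be the number of dimensions of a feature vector and i the index of a feature vector. The i-th feature vector x_i is written x_i = (x_i1, x_i2, x_i3, ..., x_iN), and the j-th feature vector x_j is written x_j = (x_j1, x_j2, x_j3, ..., x_jN). The city block distance d(i, j) between the i-th feature vector x_i and the j-th feature vector x_j is then defined as
d(i, j) = |x_i − x_j| = |x_i1 − x_j1| + |x_i2 − x_j2| + ... + |x_iN − x_jN|   ... (2)
that is, the sum of the absolute differences of the corresponding components.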
【０１６９】
例えば、図３３において、基本辞書１２２には、「１」、「２」、「３」、「４」の文字カテゴリの特徴ベクトルが登録されているものとする。ここで、「１」の文字カテゴリの特徴ベクトルｖｅｃｔｏｒ１は、ｖｅｃｔｏｒ１＝（０、１、１、０、０、１、１、０、０、１、１、０、０、１、１、０、０、１、１、０、）、「２」の文字カテゴリの特徴ベクトルｖｅｃｔｏｒ２は、ｖｅｃｔｏｒ２＝（１、１、１、１、０、０、０、１、１、１、１、１、１、０、０、０、１、１、１、１、）、「３」の文字カテゴリの特徴ベクトルｖｅｃｔｏｒ３は、ｖｅｃｔｏｒ３＝（１、１、１、１、０、０、０、１、１、１、１、１、０、０、０、１、１、１、１、１、）、「４」の文字カテゴリの特徴ベクトルｖｅｃｔｏｒ４は、ｖｅｃｔｏｒ４＝（１、０、１、０、１、０、１、０、１、１、１、１、０、０、１、０、０、０、１、０、）とする。
【０１７０】
そして、特徴ベクトルｖｅｃｔｏｒが、ｖｅｃｔｏｒ＝（０、１、１、１、０、０、０、１、１、１、１、１、１、０、０、０、１、１、１、１、）である未知文字が入力された場合、この特徴ベクトルｖｅｃｔｏｒと、基本辞書１２２に登録されている「１」の文字カテゴリの特徴ベクトルｖｅｃｔｏｒ１、「２」の文字カテゴリの特徴ベクトルｖｅｃｔｏｒ２、「３」の文字カテゴリの特徴ベクトルｖｅｃｔｏｒ３、「４」の文字カテゴリの特徴ベクトルｖｅｃｔｏｒ４のそれぞれとの間のシティブロック距離ｄ（ｉ，ｊ）を（２）式により算出する。
【０１７１】
すなわち、未知文字の特徴ベクトルｖｅｃｔｏｒと「１」の文字カテゴリの特徴ベクトルｖｅｃｔｏｒ１との間のシティブロック距離ｄ（ｉ，ｊ）は、ｄ（ｉ，ｊ）＝｜ｖｅｃｔｏｒ−ｖｅｃｔｏｒ１｜＝｜０−０｜＋｜１−１｜＋｜１−１｜＋｜１−０｜＋｜０−０｜＋｜０−１｜＋｜０−１｜＋｜１−０｜＋｜１−０｜＋｜１−１｜＋｜１−１｜＋｜１−０｜＋｜１−０｜＋｜０−１｜＋｜０−１｜＋｜０−０｜＋｜１−０｜＋｜１−１｜＋｜１−１｜＋｜１−０｜＝１１となる。
【０１７２】
同様に、未知文字の特徴ベクトルｖｅｃｔｏｒと「２」の文字カテゴリの特徴ベクトルｖｅｃｔｏｒ２との間のシティブロック距離ｄ（ｉ，ｊ）は、ｄ（ｉ，ｊ）＝｜ｖｅｃｔｏｒ−ｖｅｃｔｏｒ２｜＝１、未知文字の特徴ベクトルｖｅｃｔｏｒと「３」の文字カテゴリの特徴ベクトルｖｅｃｔｏｒ３との間のシティブロック距離ｄ（ｉ，ｊ）は、ｄ（ｉ，ｊ）＝｜ｖｅｃｔｏｒ−ｖｅｃｔｏｒ３｜＝３、未知文字の特徴ベクトルｖｅｃｔｏｒと「４」の文字カテゴリの特徴ベクトルｖｅｃｔｏｒ４との間のシティブロック距離ｄ（ｉ，ｊ）は、ｄ（ｉ，ｊ）＝｜ｖｅｃｔｏｒ−ｖｅｃｔｏｒ４｜＝１１となる。
【０１７３】
ここで、未知文字の特徴ベクトルｖｅｃｔｏｒと、「１」の文字カテゴリの特徴ベクトルｖｅｃｔｏｒ１、「２」の文字カテゴリの特徴ベクトルｖｅｃｔｏｒ２、「３」の文字カテゴリの特徴ベクトルｖｅｃｔｏｒ３、「４」の文字カテゴリの特徴ベクトルｖｅｃｔｏｒ４のそれぞれとの間のシティブロック距離ｄ（ｉ，ｊ）のうち、未知文字の特徴ベクトルｖｅｃｔｏｒと「２」の文字カテゴリの特徴ベクトルｖｅｃｔｏｒ２との間のシティブロック距離ｄ（ｉ，ｊ）が最小となっている。
【０１７４】
従って、特徴ベクトルｖｅｃｔｏｒが、ｖｅｃｔｏｒ＝（０、１、１、１、０、０、０、１、１、１、１、１、１、０、０、０、１、１、１、１、）である未知文字は、「２」の文字カテゴリに属すると判定される。
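The worked example above can be reproduced directly. The vectors are the ones given in the text for FIG. 33. The function and variable names are chosen for illustration, and no reject threshold is applied in this sketch.

```python
from typing import Dict, List


def city_block(a: List[int], b: List[int]) -> int:
    """Equation (2): sum of absolute component differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Category feature vectors registered in the basic dictionary (FIG. 33).
dictionary: Dict[str, List[int]] = {
    "1": [0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0],
    "2": [1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1],
    "3": [1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1],
    "4": [1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0],
}
unknown = [0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1]

distances = {category: city_block(unknown, vector)
             for category, vector in dictionary.items()}
best_category = min(distances, key=distances.get)
print(distances, best_category)
```

The distances come out as 11, 1, 3, and 11, so the unknown character is assigned to category "2", in agreement with the text.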
【０１７５】
次に、図３の基本文字認識部１７の知識テーブル１８に格納されている詳細識別法について説明する。この詳細識別法は、各文字カテゴリの局所的な部分パターンを文字セグメントとして取り出し、未知文字の文字セグメントの位置や角度変化量とセグメント辞書に予め格納してある文字セグメントの位置や角度変化量とを比較することにより、未知文字と文字カテゴリとの対応を取りながら文字を認識する。
【０１７６】
図３４は、文字セグメントの抽出方法を説明する図である。
図３４（ａ）は、「２」の文字についての２値画像パターンを示しており、斜線部分が黒画素で表された文字部分を示している。
【０１７７】
図３４（ｂ）は、図３４（ａ）の２値画像パターンから抽出された輪郭線を示しており、点線部分は元の２値画像パターンを示している。
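FIG. 34(c) shows the contour of FIG. 34(b) divided into character segments S1 and S2 and end-point portions T1 and T2. The end-point portions T1 and T2 correspond to the start and end of the stroke of the character "2" in FIG. 34(a).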
【０１７８】
図３５は、端点の検出方法を説明する図である。
図３５において、端点は輪郭線の傾きが急激に変化する場所として検出され、具体的には、一定間隔だけ離れた３点Ａ、Ｂ、Ｃを輪郭線Ｓ上にとり、その３点Ａ、Ｂ、Ｃを結んだ真ん中の点Ａを頂点としてなす角θが所定値以下となる輪郭線上の領域を、端点として検出する。
【０１７９】
文字の輪郭線を端点で分割することにより、文字セグメントを２値画像パターンから抽出すると、例えば、文字セグメント上に代表点Ｘ、Ｙ、Ｚを一定の距離ごとにとる。そして、連続する代表点Ｘ、Ｙ、Ｚのなす角度を求め、各代表点Ｘ、Ｙ、Ｚでの特徴量として、文字セグメント上の最初の代表点から各代表点までの角度変化量の累積値を求める。
【０１８０】
図３６は、角度変化の検出方法を説明する図である。
図３６において、任意の間隔だけ離れた代表点Ｘ、Ｙ、Ｚを輪郭線Ｓ上にとり、代表点Ｘから代表点Ｙに引いたベクトルＸＹと、代表点Ｙから代表点Ｚに引いたベクトルＹＺとを作り、ベクトルＸＹとベクトルＹＺとのなす角θ₂が代表点Ｙでの角度変化となる。
【０１８１】
角度変化の初期値である輪郭線Ｓ上の代表点Ｘでの角度変化は、文字の重心Ｇから代表点Ｘに引いたベクトルＧＸとベクトルＸＹとのなす角θ₁を代表点Ｘでの角度変化とする。
【０１８２】
各代表点Ｘ、Ｙ、Ｚでの特徴量は、角度変化の初期値を有する代表点Ｘから各代表点Ｙ、Ｚまでの角度変化を累積した値で表し、例えば、代表点Ｙでの特徴量は、θ₁＋θ₂の値となる。
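The cumulative angle-change feature can be sketched as follows. The text does not say whether angles are signed, so this example assumes signed angles in degrees, computed with `atan2` of the cross and dot products. The last representative point has no outgoing vector and therefore gets no feature here. The coordinates are made up.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def signed_angle(u: Point, v: Point) -> float:
    """Signed angle in degrees from vector u to vector v (assumption)."""
    cross = u[0] * v[1] - u[1] * v[0]
    dot = u[0] * v[0] + u[1] * v[1]
    return math.degrees(math.atan2(cross, dot))


def cumulative_angle_features(centroid: Point, points: List[Point]) -> List[float]:
    """Feature at each representative point: accumulated angle change.

    The initial change at the first point X is the angle between GX
    (centroid to X) and XY. Each later change is the angle between
    consecutive segment vectors, for example XY and YZ at point Y.
    """
    features = []
    total = 0.0
    prev_vec = (points[0][0] - centroid[0], points[0][1] - centroid[1])
    for p, q in zip(points, points[1:]):
        vec = (q[0] - p[0], q[1] - p[1])
        total += signed_angle(prev_vec, vec)
        features.append(total)
        prev_vec = vec
    return features


G = (0.0, 0.0)                                       # centroid of the character
representative = [(1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]  # X, Y, Z
features = cumulative_angle_features(G, representative)
print(features)
```

Here θ1 and θ2 are both 90 degrees, so the feature at X is 90 and the feature at Y is θ1 + θ2 = 180.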
【０１８３】
未知文字の文字セグメント上の代表点での角度変化量の累積値を求めた後、この未知文字の文字セグメントについての代表点とセグメント辞書に格納してある文字セグメントの代表点との対応をとる。すなわち、未知文字の文字セグメントについての代表点の角度変化量の累積値と、セグメント辞書に格納してある文字セグメントの代表点の角度変化量の累積値との距離を算出し、この距離が最も小さくなるセグメント辞書の文字セグメントの代表点を未知文字の文字セグメントの代表点に対応させる。
【０１８４】
図３７（ａ）は、未知文字の文字セグメントの代表点とセグメント辞書の文字セグメントの代表点との対応関係を示す図である。
図３７（ａ）において、代表点ａ₁〜ａ₈は、未知文字の文字セグメント上の代表点を表し、代表点ｂ₁〜ｂ₈は、セグメント辞書に格納されている文字セグメント上の代表点を表している。そして、未知文字の文字セグメントについての代表点ａ₁〜ａ₈はそれぞれ、セグメント辞書に格納されている文字セグメントの代表点ｂ₁〜ｂ₈に対応している。
【０１８５】
未知文字の文字セグメントの代表点とセグメント辞書の文字セグメントの代表点との対応関係を求めた後、セグメント辞書に格納されている文字セグメント上の基準点に対応する未知文字の文字セグメントについての代表点を検査点とする。
【０１８６】
図３７（ｂ）は、基準点と検査点との対応関係を示す図である。
図３７（ｂ）において、セグメント辞書に格納されている文字セグメントの基準点ｄ₁、ｄ₂はそれぞれ、未知文字の文字セグメントの検査点ｃ₁、ｃ₂に対応している。
【０１８７】
基準点と検査点との対応関係を求めた後、未知文字の文字セグメントの検査点ｃ₁、ｃ₂についての検査情報を算出する。
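This check information consists of, for example: for a single check point, absolute position information indicating where that check point lies within the whole character image; for two check points, relative position information such as the distance and direction between them; and for two or more check points, information such as the angle change and straightness between them.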
【０１８８】
そして、検査点についての検査情報を算出した結果、所定の判定条件を満たす場合、判定条件を満たしたセグメント辞書に格納されている文字セグメントの文字カテゴリを未知文字の認識結果として出力する。
【０１８９】
例えば、判定条件として、図３７（ｂ）の文字セグメント上の検査点ｃ₁から文字セグメントに沿って検査点ｃ₂までの角度変化を検査情報とした場合、この角度変化が６０度以上である文字セグメントの文字画像が、その文字セグメントに対応して格納されているセグメント辞書の「２」の文字カテゴリに属するとした場合、図３７（ｂ）の文字セグメント上の検査点ｃ₁から文字セグメントに沿って検査点ｃ₂までの角度変化を算出することにより、図３４（ａ）の文字パターンが「２」の文字カテゴリに属すると認識できる。
【０１９０】
図３８は、詳細識別法による文字認識処理を示すフローチャートである。
図３８において、まず、ステップＳ１５０に示すように、文字認識の対象となる帳票などをスキャナで走査し、読み込んだ文字画像を白黒２値の画像に２値化する。
【０１９１】
次に、ステップＳ１５１に示すように、ステップＳ１５０で得られた２値画像データから文字セグメントを抽出する。
次に、ステップＳ１５２に示すように、セグメント辞書に格納されている複数の文字セグメントから、未知文字の文字セグメントとの対応関係が付けられていない文字セグメントを取り出す。
【０１９２】
次に、ステップＳ１５３に示すように、セグメント辞書から取り出した文字セグメントと未知文字の文字セグメントとの対応関係を付ける。
次に、ステップＳ１５４に示すように、未知文字の文字セグメント上にとった代表点の中から検査点を決定し、この検査点についての検査情報を算出する。
【０１９３】
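Next, as shown in step S155, the character segment taken from the segment dictionary is compared with the character segment of the unknown character on the basis of the check information calculated in step S154. Character candidates for the unknown character are determined by judging whether the check information of the dictionary segment matches that of the unknown character's segment.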
次に、ステップＳ１５６に示すように、未知文字に対する文字候補の決定処理で、文字候補が決定した場合、ステップＳ１５３で取り出した文字セグメントに対応する文字カテゴリを認識結果として出力する。一方、文字候補が決定しない場合、ステップＳ１５７に進み、未知文字の文字セグメントとの対応関係が付けられていない未処理の文字セグメントがセグメント辞書にあるかどうかを判断し、未処理の文字セグメントがセグメント辞書にある場合、ステップＳ１５２に戻って、以上の処理を繰り返す。
【０１９４】
一方、未知文字の文字セグメントとの対応関係が付けられていない未処理の文字セグメントがセグメント辞書にない場合、入力された未知文字は認識不能であると判断して、認識不能という認識結果を出力する。
【０１９５】
なお、詳細識別法については、例えば、特開平６−３０９５０１号公報に記載されている。
次に、図３の接触文字認識部１３の一実施例について説明する。
【０１９６】
図３９は、接触文字認識部１３の文字補完処理を説明する図である。
この文字補完処理では、枠接触文字の２値画像から枠だけを抽出してこの枠を除去する。この際、枠接触文字の枠に接触している文字線分の枠接触部分がかすれてしまい、文字線分が複数の部分に途切れてしまうので、途切れた文字線分について、各ラベルが付与された文字線分間の距離や方向性等の幾何学的構造を評価して、それを補完する。
【０１９７】
例えば、図３９（ａ）に示すように、「３」を表している文字パターン１３１と枠１３２とが接触したために連結している２値画像にラベル“１”が付されている。そして、図３９（ａ）の２値画像から枠１３２を抽出し、この枠１３２を除去することにより、図３９（ｂ）に示すように、「３」を表している文字パターン１３１が３個に分割されて、ラベル“１”、ラベル“２”及びラベル“３”が付与された３個の文字線分が生成される。
【０１９８】
このラベル“１”、ラベル“２”及びラベル“３”が付与された３個の文字線分について、各ラベルが付与された文字線分間の距離や方向性等の幾何学的構造を評価して、それを補完する。これにより、ラベル“１”、ラベル“２”及びラベル“３”が付与された３個の文字線分が連結されて、図３９（ｃ）に示すように、ラベル“１”が付された「３」を表している文字補完パターン１３２が生成される。
【０１９９】
この文字補完処理により復元された文字は、認識文字の候補として認識処理が行われる。この認識処理では、文字カテゴリ辞書に登録されている標準パターンと照合して、相違度が最も小さい文字カテゴリのコードを出力する。
【０２００】
図４０は、接触文字認識部１３の再補完処理を説明する図である。
この再補完処理では、枠に平行な文字線分が枠に接触し、枠を除去したために枠に平行な文字線分が消滅した場合に、この文字線分を補完するもので、予め、枠接触文字をラベリングによる連結性を用いて抽出しておき、文字補完処理により補完された文字補完パターンと枠接触文字の連結性が一致することを検出することにより、枠に平行な文字線分を補完する。
【０２０１】
例えば、図４０（ａ）に示すように、「７」を表している文字パターン１４１と枠１４２とが接触したために連結している２値画像にラベル“１”が付されている。そして、図４０（ａ）の２値画像から枠１４２を抽出し、この枠１４２を除去することにより、図４０（ｂ）に示すように、「７」を表している文字パターン１４１が３個に分割されて、ラベル“１”、ラベル“２”及びラベル“３”が付与された３個の文字線分が生成される。
【０２０２】
このラベル“１”、ラベル“２”及びラベル“３”が付与された３個の文字線分について、各ラベルが付与された文字線分間の距離や方向性等の幾何学的構造を評価して、それを補完する。これにより、ラベル“１”及びラベル“２”が付与された２個の文字線分が連結されて、図４０（ｃ）に示すように、ラベル“１”及びラベル“２”が付与された２個の文字線分からなる文字補完パターン１４２が生成される。
【０２０３】
この場合、文字補完処理で補完されるのは、図４０（ｂ）のラベル”１”が付与されていた部分とラベル”２”が付与されていた部分との間のみで、図４０（ｂ）のラベル”１”が付与されていた部分とラベル”３”が付与されていた部分については、補完することができない。この図４０（ｂ）のラベル”１”が付与されていた部分とラベル”３”が付与されていた部分の補完は、再補完処理により行う。
【０２０４】
この再補完処理は、予め、枠接触文字をラベリングによる連結性を用いて抽出しておき、図４０（ｃ）のパターンと枠接触文字の連結性が一致することを検出することにより、枠に平行な文字線分を補完する。すなわち、図４０（ｃ）のラベル”１”が付与されたパターンとラベル”２”が付与されたパターンとは、図４０（ａ）に示すように、枠を除去する前は互いに連結していたので、図４０（ｃ）のラベル”１”が付与されたパターンとラベル”２”が付与されたパターンとを、枠に平行な線分を用いて互いに連結する。
【０２０５】
これにより、図４０（ｃ）のラベル”１”とラベル”２”の２つの文字線分に分かれていた「７」の２値画像が補完され、図４０（ｄ）に示すように、ラベル“１”が付された「７」を表している再補完パターン１４３が生成される。
【０２０６】
この再補完処理により復元された文字は、認識文字の候補として認識処理が行われる。この認識処理では、文字カテゴリ辞書に登録されている標準パターンと照合して、相違度が最も小さい文字カテゴリのコードを出力する。
【０２０７】
すなわち、図４０に示す例では、図４０（ｃ）に示す文字補完パターン１４２は、「リ」の文字カテゴリに属すものと認識される。また、図４０（ｄ）に示す再補完パターン１４３は、「７」の文字カテゴリに属すものと認識される。そして、「リ」よりも「７」のほうが相違度が小さいと判断されて、最終的に「７」と認識され、その文字コードが出力される。
【０２０８】
次に、図３の接触文字認識部１３が知識テーブル１４を参照しながら認識処理を行う場合について説明する。
図４１は、誤読文字対を学習し、知識テーブル１４に登録しておくことにより、枠接触文字を認識する例を説明する図である。
【０２０９】
この例の場合、図４１（ａ）に示すように、「２」を表している文字パターン１５１と枠１５２とが接触したために連結している２値画像にラベル“１”が付されている。そして、図４１（ａ）の２値画像から枠１５２を抽出し、この枠１５２を除去することにより、図４１（ｂ）に示すように、「２」を表す文字１５１がラベル“１”とラベル“２”の２つの部分パターンに分離される。
【０２１０】
次に、図４１（ｃ）に示すように、文字補完処理により、図４１（ｂ）のラベル“１”とラベル“２”の２つの部分パターンが連結され、文字補完パターン１５３が生成される。
【０２１１】
この場合、「２」を表している文字パターン１５１の下線部分が枠１５２に接触し、その接触部分がほぼ完全に枠１５２に重なっている。このため、再補完処理を用いても、「２」を表している文字パターン１５１の下線部分を補完することができず、「２」の文字を、誤って「７」と認識してしまう可能性が高くなる。
【０２１２】
このように、枠接触文字の一部が枠からはみ出すことなく、枠に完全に重なっているため、他の文字と誤って認識してしまう場合、誤読文字対を学習して登録しておくことにより、枠接触文字が正しく認識されるようにする。
【０２１３】
以下、誤読文字対を学習して登録しておくことにより、枠接触文字を認識する方法について説明する。
図４２は、図３の接触文字認識部１３において、誤読文字対を学習する構成を示すブロック図である。
【０２１４】
枠接触文字の自動生成部１６１は、入力された枠に未接触の学習文字を枠に重ね合わせて、枠接触文字を生成する。ここで、枠に対する学習文字の変動の方法により、同一の学習文字に対して複数の枠接触文字が生成される。図４２では、「２」を表している学習文字１６８が枠接触文字の自動生成部１６１に入力され、文字「２」の下辺と下枠とが重なった枠接触文字１６９を生成した例を示している。枠接触文字の自動生成部１６１により生成された情報は、知識テーブル１６７に登録される。
【０２１５】
学習文字に枠を重ね合わせる際の変動の種類は、例えば、「文字枠に対する文字の変動」と「文字枠の変動」の２種類があり、「文字枠に対する文字の変動」には、例えば、「位置ずれ」、「サイズ変動」及び「傾き変動」などがあり、「文字枠の変動」には、例えば、「傾き変動」、「枠幅変動」、「サイズ変動」及び「枠の凹凸」などがある。
【０２１６】
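The following parameters express the amount of each of these variations. The x axis is taken in the vertical direction and the y axis in the horizontal direction.
1. Variation of the character relative to the character frame
Displacement: dx, dy,
where dx (the position marked with a black circle in FIG. 43) and dy (the position marked with × in FIG. 43) are the x-direction and y-direction components of the difference between the position of the centroid of the character and that of the centroid of the character frame.
【０２１７】
Size variation: dsx, dsy,
where dsx and dsy are the sizes of the character in the x and y directions.
【０２１８】
Slope variation: dα,
where dα is the slope angle of the character with respect to the vertical.
2. Variation of the character frame
Slope variation: fα,
where fα is the slope angle of the character frame with respect to the vertical.
【０２１９】
Frame width variation: w,
where w is the width of the character frame.
Size variation: fsx, fsy,
where fsx and fsy are the sizes of the character frame in the x and y directions.
【０２２０】
Frame irregularity: fδ,
where fδ is a parameter that controls the irregularity of the character frame, taking into account, for example, quality degradation of a character frame printed by a facsimile machine or the like. For example, if the perimeter of the character frame is L, fδ is expressed as an array fδ[L] of size L, and each element fδ[i] (i = 1, 2, 3, ...) takes an integer value in the range −β to +β determined by random number generation.
【０２２１】
On the basis of these variation types and amounts, a frame-touching character is generated by applying an operation F(dx, dy, dsx, dsy, dα, w, fsx, fsy, fα, fδ) to the learning character.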
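As an illustration of how the arguments of the operation F might be held together, the sketch below bundles the ten variation amounts in one record and draws the frame irregularity array fδ[L] from random integers in −β to +β, as described above. The class and function names, the default values, and the use of a seeded generator are assumptions made for this example; the operation F itself is not implemented here.

```python
import random
from dataclasses import dataclass, field
from typing import List


def make_f_delta(perimeter: int, beta: int, rng: random.Random) -> List[int]:
    """One integer offset in [-beta, +beta] for each position along the frame perimeter."""
    return [rng.randint(-beta, beta) for _ in range(perimeter)]


@dataclass
class VariationParams:
    """Arguments of the operation F(dx, dy, dsx, dsy, d_alpha, w, fsx, fsy, f_alpha, f_delta)."""
    dx: int = 0           # displacement of character centroid from frame centroid, x direction
    dy: int = 0           # displacement, y direction
    dsx: int = 0          # character size, x direction
    dsy: int = 0          # character size, y direction
    d_alpha: float = 0.0  # character slope angle against the vertical
    w: int = 1            # frame line width
    fsx: int = 0          # frame size, x direction
    fsy: int = 0          # frame size, y direction
    f_alpha: float = 0.0  # frame slope angle against the vertical
    f_delta: List[int] = field(default_factory=list)  # frame irregularity array


rng = random.Random(0)  # fixed seed only so that the example is repeatable
params = VariationParams(dy=5, w=5, f_delta=make_f_delta(perimeter=40, beta=2, rng=rng))
print(len(params.f_delta), min(params.f_delta), max(params.f_delta))
```

Generating many such records with different amounts is one way to obtain several frame-touching variants from a single learning character.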
【０２２２】
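FIG. 43 shows an example in which a frame-touching character is generated by combining a frame 172 with a learning character 171 representing "7".
As shown in FIG. 43(a), applying the transformation F(dx, dy, dsx, dsy, dα, w, fsx, fsy, fα, fδ) to the learning character 171 representing "7" generates a frame-touching character "7" that touches the frame 172, as shown in FIG. 43(b).
【０２２３】
That is, the transformation F(dx, dy, dsx, dsy, dα, w, fsx, fsy, fα, fδ) is applied to the learning character 171 and the frame 172, and the two are superimposed to generate the frame-touching character. In this case the transformation is carried out, for example, with the position of the centroid of the frame 172 held fixed.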
【０２２４】
図４４は、ｘ方向のサイズ変動ｆｓｘ及びｙ方向のサイズ変動ｆｓｙを固定し、枠の大きさを固定した場合について、「３」の学習文字に対して生成した各種枠接触文字の例を示す図である。
【０２２５】
図４４（ａ）は、変動の種類が“位置ずれ”の場合の例であり、変動量がｄｘ＝０、ｄｙ＞０の場合である。この場合、「３」の文字が枠の下にはみ出すことになる（下位置変動）。
【０２２６】
図４４（ｂ）は、変動の種類が“サイズ変動”の場合の例であり、変動量がｄｓｘ＝ｆｓｘ，ｄｓｙ＝ｆｓｙの場合である。この場合、「３」の文字が枠の上下、左右に接触することになり、「３」の外接矩形が枠に等しくなる。
【０２２７】
図４４（ｃ）は、変動の種類が“文字の傾き変動”の例であり、変動量がｄα＝１０度の場合である。
図４４（ｄ）は、変動の種類が“文字枠の傾き変動”の例であり、変動量がｆα＝−１０度の場合である。
【０２２８】
図４４（ｅ）は、変動の種類が“枠幅変動”の例であり、変動量がｗ＝５の場合である。
図４４（ｆ）は、変動の種類が、“枠の凹凸”の例であり、変動量ｆδ〔Ｌ］の各要素ｆδ〔ｉ］を制御した場合である。
【０２２９】
次に、図４２の枠除去部１６２は、枠接触文字の自動生成部１６１により生成された枠接触文字から枠のみを抽出し、この枠を除去して得られたかすれ文字についての画像データを、文字補完部１６３に出力する。
【０２３０】
文字補完部１６３は、枠除去部１６２によって枠が除去された文字の画像データを、ラベルが付与された文字線分間の距離や方向性等の幾何学的構造を評価して補完する。図４２は、枠接触文字の自動生成部１６１により生成された枠接触文字１６９から枠を除去した後、文字補完部１６３により補完を行って文字補完パターン１７０を生成した例を示している。
【０２３１】
再補完部１６４は、文字補完部１６３によって補完しきれなかった領域について、予め、枠接触文字をラベリングによる連結性を用いて抽出しておき、文字補完部１６３により補完されたパターンと枠接触文字の連結性が一致することを検出することにより、枠に平行な文字線分を補完する。
【０２３２】
文字補完部１６３によって補完された文字補完パターンと再補完部１６４によって補完された再補完パターンとは、基本文字認識部１６５に入力される。
基本文字認識部１６５は、文字補完部１６３によって補完された文字補完パターンと再補完部１６４によって補完された再補完パターンについて、文字認識を実行する。そして、各学習文字についての認識結果を枠接触状態と認識の知識獲得部１６６に出力する。
【０２３３】
枠接触状態と認識の知識獲得部１６６は、基本文字認識部１６５から出力される認識結果を予め与えられている正解データと比較して、全サンプルデータに対する認識率を得る。そして、この認識率を信頼度として、また、誤読文字（誤って認識した文字）と正解の文字との組み合わせを誤読文字対として、知識テーブル１６７に登録する。なお、上記誤読文字対は、例えば、文字コードにより登録される。また、枠接触状態と認識の知識獲得部１６６は、枠と文字の接触状態の特徴を示すパラメータを抽出して、これも知識テーブル１６７に登録する。
【０２３４】
このようにして、知識テーブル１６７には、各文字カテゴリについて、枠と文字の様々な接触状態におけるその文字に対する認識率が、その誤読文字対とともに登録される。
【０２３５】
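FIG. 45 shows an example of the knowledge table 167 generated by learning. In FIG. 45, the knowledge table 167 registers, for example, the misread character pair (2, 7) and a reliability of 77% together with the variation amounts dy = 5, w = 5, and so on of the "lower displacement" variation. This indicates that, for a frame-touching character "2" with a "lower displacement" of dy = 5 and w = 5, the basic character recognition unit 165 misrecognizes "2" as "7" with a probability of 23%. In other words, even if the basic character recognition unit 165 recognizes the character as "7", the reliability of that result is 77%, and by referring to the knowledge table 167 it can be judged that there is a 23% possibility that the actual character is "2".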
【０２３６】
同様にして、他の誤読され易い文字対についても、“変動量”、“枠の線幅”、“誤読文字対”及び信頼度が、枠接触状態と認識の知識獲得部１６６によって知識テーブル１６７に登録される。
【０２３７】
なお、誤読文字対（Ｌ１、Ｌ２）は、実際は、文字「Ｌ１」が文字「Ｌ２」に誤って認識されてしまう場合を示すものである。また、上記文字「Ｌ１」、「Ｌ２」には、例えば、該当する文字「Ｌ１」、「Ｌ２」の文字コードが登録される。
【０２３８】
知識テーブル１６７には、図４５に示す変動量ｄｙ＝５，Ｗ＝５の“下位置ずれ変動”以外にも、図４６に示すように“文字枠に対する文字の傾き変動”（この場合、左枠接触）などの図４３に示す各種変動について、各文字カテゴリ毎に登録される。
【０２３９】
すなわち、例えば、図４６に示すように、“下位置ずれ”変動については、例えば、ｄｘ＝「−３」〜「＋３」、ｄｙ＝５、ｗ＝５、ｄｓｙ＝１、ｄα＝「−１０」〜「＋１０」、ｆα＝「−１０」〜「＋１０」が登録される。このように、同じ”下位置ずれ”変動であっても、知識テーブル１６７に登録される変動量は、ｘ方向の位置ずれｄｘ、ｙ方向の位置ずれｄｙのみでなく、その他の変動量が登録される場合がある。また、“左枠接触の文字枠に対する文字の傾き変動”については、例えば、ｄｘ＝「−３」〜「＋３」、ｄｙ＝「−３」〜「＋３」、ｗ＝５，ｄｓｙ＝１，ｄα＝「−２０」〜「＋２０」、ｆα＝「−１０」〜「＋１０」が登録される。
【０２４０】
また、信頼度が予め定められた所定のしきい値（例えば、９０％）以下の誤読文字対（Ｌ１，Ｌ２）について、信頼度がその所定のしきい値以上となるような文字認識方法を学習し、学習した文字認識方法を知識テーブル１６７に登録する。
【０２４１】
例えば、図４５に示すように、ｄｙ＝５，ｗ＝５の”下位置ずれ”の状態の「２」の枠接触文字の文字認識の信頼度は７７％であり、「７」と誤って認識される確率が高いので、文字補完部１６３により補完された文字補完パターンまたは再補完部１６４により補完された再補完パターンを、例えば、領域強調の手法により再認識すれば認識率が向上することを学習して知識テーブル１６７に登録しておく。
【０２４２】
この（２、７）の誤読文字対の場合における領域強調の手法を図４７を参照しながら説明する。
まず、図４７（ａ）に示すように、文字補完部１６３により補完された文字補完パターンまたは再補完部１６４により補完された再補完パターンの外接矩形１８０を、縦の行がｍ個、横の列がｎ個のｍ×ｎ個の分割領域に分割する。そして、図４７（ｂ）にハッチングで示すように、外接矩形１８０の上半分のｍ／２×ｎ個の領域を特に強調して、文字認識を再度行う。
【０２４３】
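That is, the feature parameters of these m/2 × n regions are extracted, and it is examined whether the character completion pattern produced by the character completion unit 163, or the re-completion pattern produced by the re-completion unit 164, is "2" or "7". With this region emphasis technique the recognition rate rises to 95%. In the knowledge table 167 of FIG. 45, the row for the misread character pair (2, 7) registers "region emphasis" as the re-recognition method, "m/2 × n" as the re-recognition region, and "95%" as the re-recognition reliability.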
【０２４４】
この領域強調の手法は、図４８（ａ）に示すような枠接触文字の場合にも有効である。図４８（ａ）は、「２」を表している文字パターンの下部が文字枠１８２に接触している例である。
【０２４５】
この場合、文字補完部１６３により、図４８（ｂ）に示すような「７」に類似する文字補完パターン１８３が得られる。この文字補完パターン１８３に対して図４８（ｃ）に示す外接矩形１８４を算出する。そして、この外接矩形１８４を、図４７に示すように、ｍ×ｎ個の領域に分割した後、上半分のｍ／２×ｎ個の部分領域１８５を特に強調して文字認識すれば、文字補完パターン１８３が「２」と認識される確率が高い、すなわち、正解率（信頼度）が高くなることを学習し、枠接触による誤読文字対（２、７）に対する再認識方法として、上記領域強調の手法を知識テーブル１６７に登録する。
【０２４６】
図４９は、領域強調による文字パターンの再認識方法を示すフローチャートである。
図４９において、まず、ステップＳ６０１に示すように、知識テーブル１６７から信頼度の低い誤読文字対のデータを取り出す。そして、この誤読文字対の左側に登録されている文字について、２値の学習データとしての文字パターンと、文字補完部１６３により補完された文字補完パターンまたは再補完部１６４により補完された再補完パターンとを入力する。
【０２４７】
この文字補完パターンまたは再補完パターンは、知識テーブル１６７に登録されている変動量パラメータによって規定されるパターンであり、同一カテゴリであっても複数の形状のパターンを取りうる。
【０２４８】
次に、ステップＳ６０２に示すように、ステップＳ６０１で入力された学習データとしての文字パターンと、文字補完部１６３により補完された文字補完パターンまたは再補完部１６４により補完された再補完パターンとを、ｍ×ｎの領域に分割する。
【０２４９】
そして、ステップＳ６０３に示すように、このｍ×ｎの領域内のＸ×Ｙの部分パターンについて文字認識を実行する。そして、この場合の認識率ｚを求める。上記Ｘ×Ｙの部分パターンは、再認識領域である。このとき、Ｘ，Ｙは、それぞれ、ｍ×ｎの領域のＸ方向、Ｙ方向の長さを表す変数であり、Ｘ≦ｍ，Ｙ≦ｎである。また、上記認識率ｚは、上記Ｘ×Ｙの部分パターンを用いて文字認識を行った際の、正解となる確率である。
【０２５０】
すなわち、学習データとしての文字パターンの部分パターンの文字認識結果を正解とみなす。そして、文字補完部１６３により補完された文字補完パターンまたは再補完部１６４により補完された再補完パターンについての複数の部分パターンに対する文字認識結果を、学習データとしての文字パターンの部分パターンの文字認識結果と比較していくことにより、文字補完部１６３により補完された文字補完パターンまたは再補完部１６４により補完された再補完パターンについての部分パターンの認識率ｚを求める。
【０２５１】
続いて、ステップＳ６０４に示すように、認識率ｚが最大認識率ｍａｘよりも大きいか否かを判別する。この最大認識率ｍａｘは、Ｘ×Ｙの部分パターンを変化させていった場合における認識率ｚの最大値を記憶する変数であり、最初はある初期値（例えば、「０」）が設定される。
【０２５２】
そして、認識率ｚが最大認識率ｍａｘよりも大きければ、ステップＳ６０５に進んで、この認識率ｚを最大認識率ｍａｘに代入し、続いて、ステップＳ６０６に進んで、長さＸ，Ｙを変更可能か否か調べる。一方、ステップＳ６０４で、認識率ｚが最大認識率ｍａｘ以下であれば、直ちに、このステップＳ６０６に移行する。
【０２５３】
この長さＸ，Ｙの変更操作は、例えば、長さＸ，Ｙの大きさの変更である。また、Ｘ×Ｙの部分パターンのｍ×ｎの領域内での位置変更操作を含んでいてもよい。
【０２５４】
ステップＳ６０６で、長さＸ，Ｙを変更可能であると判別すると、ステップＳ６０３に戻り、長さＸ，Ｙの変更操作を行い、新たなＸ×Ｙの部分パターンを決定し、この部分パターンに対して文字認識を行う。
【０２５５】
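The processing of steps S603 to S606 described above is repeated until it is judged in step S606 that the lengths X and Y can no longer be changed. When that judgment is made in step S606, the maximum recognition rate max and the X × Y partial pattern that gave it are registered in the knowledge table 167 as the re-recognition reliability and the re-recognition region, respectively. "Region emphasis" is also registered in the knowledge table 167 as the re-recognition method.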
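The search of steps S603 to S606 can be sketched as a loop over candidate region sizes that keeps the size with the highest recognition rate. The callback `recognition_rate` is hypothetical: it stands for recognizing the X × Y partial patterns and comparing the results with the correct answers, which this example does not implement. Only the size of the region is varied here; the text notes that its position may also be varied.

```python
def learn_region_emphasis(m, n, recognition_rate):
    """Try every X x Y partial-pattern size (X <= m, Y <= n) and keep the best one.

    `recognition_rate(x, y)` is a caller-supplied function returning the rate z
    obtained when only an x-by-y sub-region is used for recognition.
    """
    best_rate = 0.0      # plays the role of the variable "max", initialised to 0
    best_region = None
    for x in range(1, m + 1):
        for y in range(1, n + 1):
            z = recognition_rate(x, y)          # step S603
            if z > best_rate:                   # step S604
                best_rate, best_region = z, (x, y)  # step S605
    # Values that would be registered in the knowledge table.
    return {"method": "region emphasis", "region": best_region, "reliability": best_rate}


# Toy rate function: pretend that a 2 x 4 region recognises best.
toy_rate = lambda x, y: 0.95 if (x, y) == (2, 4) else 0.50
print(learn_region_emphasis(4, 4, toy_rate))
```

An exhaustive scan is used for clarity; the specification only requires that X and Y be changed until no further change is possible.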
【０２５６】
なお、図４９のフローチャートは、「領域強調」の手法を用いて再文字認識の方法を学習する例であるが、「領域強調」の手法以外についても、再文字認識の方法を学習するようにしてもよい。
【０２５７】
図５０は、学習により得られた知識テーブル１６７を用いて枠接触文字の文字認識を行う構成を示すブロック図である。
図５０において、枠接触状態の検出部１９１は、入力された未知の枠接触文字について、枠と文字との接触状態を検出する。ここでは、図５０（ａ）の下枠が「２」の下辺と部分的に重なっている枠接触文字パターン２０１と、図５０（ｂ）の下枠が「２」の下辺と完全に重なっている枠接触文字パターン２０３とが入力された例について示している。そして、枠接触状態の検出部１９１は、枠接触文字パターン２０１及び枠接触文字パターン２０３を検出する。
【０２５８】
枠除去部１９２は、枠接触状態の検出部１９１により検出された枠接触文字パターンから枠を除去する。
文字補完部１９３は、枠除去部１９２により枠が除去された文字パターンについて、ラベルが付与された文字線分間の距離や方向性等の幾何学的構造を評価して補完する。
【０２５９】
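The re-completion unit 194 handles the regions that the character completion unit 193 could not complete. The frame-touching character is extracted beforehand using connectivity obtained by labeling, and character line segments parallel to the frame are completed by detecting that the connectivity of the pattern completed by the character completion unit 193 matches that of the frame-touching character. Here, the re-completion pattern 202 is the pattern completed by the re-completion processing of the re-completion unit 194 for the frame-touching character pattern 201 of FIG. 50(a), and the re-completion pattern 204 is the pattern that the re-completion processing of the re-completion unit 194 could not complete for the frame-touching character pattern 203 of FIG. 50(b).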
【０２６０】
基本文字認識部１９５は、文字補完部１９３によって補完された文字補完パターンと再補完部１９４によって補完された再補完パターンとのそれぞれに対し、文字認識を実行する。この結果、例えば、図５０（ａ）の再補完パターン２０２に対しては、「２」の文字コードが出力され、図５０（ｂ）の再補完パターン２０４に対しては、「７」の文字コードが出力される。そして、その認識結果により得られた文字コードを、枠接触状態と認識の知識参照部１９６に出力する。
【０２６１】
枠接触状態と認識の知識参照部１９６は、文字補完部１９３によって補完された文字補完パターン又は再補完部１９４によって補完された再補完パターンの外接矩形の位置情報及び図５０（ａ）の枠接触文字パターン２０１又は図５０（ｂ）の枠接触文字パターン２０３から抽出された文字枠の位置情報や幅情報などを基に、変動の種類を求める。
【０２６２】
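That is, it obtains the variation of the character relative to the character frame, such as "displacement", "size variation", and "slope variation" as shown in FIG. 43, or the variation of the character frame, such as "slope variation", "frame width variation", and "frame irregularity". It then calculates the variation amounts dx, dy, dsx, dsy, dα, w, fsx, fsy, fα, and fδ for each variation type obtained.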
【０２６３】
次に、算出した変動種類情報及び変動量情報と、基本文字認識部１９５から入力される文字コードとをキー項目として、知識テーブル１６７を検索し、このキー項目に一致する変動種類情報、変動量情報及び誤読文字対を有する行が知識テーブル１６７に登録されているか否か調べる。
【０２６４】
そして、キー項目に一致する行が存在した場合には、この行に登録されている信頼度が所定のしきい値以上であるか否かを判別し、そのしきい値未満であれば、文字補完部１９３によって補完された文字補完パターン又は再補完部１９４によって補完された再補完パターンを再文字認識部１９７に出力し、その行に登録されている再認識方法に従って、文字認識を再度行う。
【０２６５】
すなわち、文字補完部１９３によって補完された文字補完パターン又は再補完部１９４によって補完された再補完パターン、あるいは未知文字の２値画像データを用いて、基本文字認識部１９５による手法とは別の手法で未知画像データに含まれる枠接触文字の再認識を実行する。そして、再認識により得られた文字コードを出力する。
【０２６６】
例えば、基本文字認識部１９５が、再補完部１９４によって補完された再補完パターン２０４の認識結果として、「７」の文字コードを出力した場合、枠接触状態と認識の知識参照部１９６は、再補完パターン２０４の外接矩形の位置情報と枠接触文字パターン２０３から抽出した文字枠の位置情報及び幅情報とを基に、変動の種類及び変動量を求める。この結果、変動の種類として“下位置ずれ”が算出され、この“下位置ずれ”の変動量として「ｄｙ＝５」が算出され、文字枠の幅として「ｗ＝５」が算出される。
【０２６７】
そして、枠接触状態と認識の知識参照部１９６は、変動の種類として“下位置ずれ”、“下位置ずれ”の変動量として「ｄｙ＝５」、文字枠の幅として「ｗ＝５」、及び基本文字認識部１９５から入力された文字コード「７」をキー項目として、図４５の知識テーブル１６７を検索する。この検索の結果、これらのキー項目に対応する行には誤読文字対（２、７）が登録され、基本文字認識部１９５で認識された文字コード「７」の信頼度は７７％であり、２３％の確率で「２」を「７」と読み間違えていることを知る。
【０２６８】
この場合、これらのキー項目に対応する行に登録されている信頼度は所定のしきい値よりも低いので、再文字認識部１９７は、基本文字認識部１９５による手法とは別の手法で未知画像データに含まれる枠接触文字パターン２０３の再認識を実行する。この際、再文字認識部１９７は、知識テーブル１６７のキー項目に対応する行を参照し、再認識方法を特定する。
【０２６９】
すなわち、再文字認識部１９７は、再認識方法として、「領域強調」を行うことを教えられるとともに、「領域強調」を行う場合の再認識領域として、再補完パターン２０４の上半分のｍ／２×ｎの部分領域２０５だけを強調して再認識することを教えられる。また、この場合の再認識信頼度が９５％であることも教えられる。
【０２７０】
再文字認識部１９７は、知識テーブル１６７に登録されている再認識方法に従って、再補完パターン２０４の上半分の部分領域２０５のみについての再認識を行う。そして、再補完パターン２０４の部分領域２０５は、文字コード「２」に対応する文字パターン２０６の部分領域２０７に９５％の確率で一致し、文字コード「７」に対応する文字パターン２０８の部分領域２０９に５％の確率で一致することを知り、未知の枠接触文字パターン２０３の枠に接触した文字の認識結果として、文字コード「２」を出力する。
【０２７１】
図５１は、枠接触状態と認識の知識参照部１９６の動作を示すフローチャートである。
図５１において、まず、ステップＳ１７１に示すように、未知の枠接触文字パターンから抽出した枠と、枠接触文字パターンから分離した文字パターンとに基づいて、文字の枠に対する変動量を算出し、この変動量をキー項目として知識テーブル１６７を探索する。そして、この算出された変動量に一致する変動量を登録している行が知識テーブル１６７に存在するか否かを調べる。
【０２７２】
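Thus, for example, when dy = 5 and w = 5 are calculated as the variation amounts for a character "2" with a lower-displacement variation, the top row of the knowledge table 167 shown in FIG. 45 is found.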
【０２７３】
そして、変動量が一致する行が存在する場合、ステップＳ１７２に進み、基本文字認識部１９５から入力される文字コード（文字認識コード）を誤読文字対に含んでいる行が、変動量が一致する行の中に存在するか否かを調べる。
【０２７４】
これにより、例えば、下位置ずれ変動となっている「２」の文字の場合、図４５に示す知識テーブル１６７の最上位の行が検出される。
そして、ステップＳ１７３に示すように、基本文字認識部１９５から入力される文字コードを誤読文字対に含んでいる行が、変動量が一致する行の中に存在する場合、知識テーブル１６７の該当する行に登録されている再認識信頼度と基本文字認識部１９５により算出された信頼度とを比較し、知識テーブル１６７の該当する行に登録されている再認識信頼度が基本文字認識部１９５により算出された信頼度よりも大きいか否か判別する。
【０２７５】
これにより、例えば、下位置ずれ変動となっている「２」の文字の場合、図４５に示す知識テーブル１６７の最上位の行に登録されている再認識信頼度及び基本文字認識部１９５により算出された信頼度が、それぞれ、「９５％」及び「７７％」であり、知識テーブル１６７の該当する行に登録されている再認識信頼度が基本文字認識部１９５により算出された信頼度よりも大きいと判別される。
【０２７６】
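If the re-recognition reliability registered in the corresponding row of the knowledge table 167 is greater than the reliability calculated by the basic character recognition unit 195, processing proceeds to step S174, where it is judged whether that re-recognition reliability is greater than a predetermined threshold th1. If it is greater than the threshold th1, processing proceeds to step S175, where the "re-recognition method" and "re-recognition region" registered in the row of the knowledge table 167 found in step S172 are referred to.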
【０２７７】
次に、ステップＳ１７６に示すように、文字補完部１９３によって補完された文字補完パターン又は再補完部１９４によって補完された再補完パターンから、知識テーブル１６７で示される「再認識領域」を切り出し、この切り出した領域について、知識テーブル１６７で示される「再認識方法」により文字認識を実行する。そして、その文字認識により得られた文字コードを出力する。
【０２７８】
これにより、例えば、しきい値ｔｈ１が「９５％」よりも小さい場合、基本文字認識部１９５により入力される下位置ずれ変動となっている「２」の文字の補完パターンについて、上半分の「ｍ／２×ｎ」の領域を用いた「領域強調」手法により、文字認識が再度実行され、最終的に「２」の文字コードが出力される。
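The decision flow of FIG. 51 (steps S171 to S175) can be sketched as a table lookup. The single row below uses the values quoted in the text for the misread pair (2, 7); the dictionary layout, the key names, and the function name are assumptions made for this illustration and do not reflect the actual format of the knowledge table 167.

```python
KNOWLEDGE_TABLE = [
    {
        "variation": "lower displacement",
        "amounts": {"dy": 5, "w": 5},
        "misread_pair": ("2", "7"),      # "2" is misread as "7"
        "reliability": 0.77,
        "method": "region emphasis",
        "region": "m/2 x n",
        "re_reliability": 0.95,
    },
]


def find_rerecognition(table, variation, amounts, code, basic_reliability, th1):
    """Return (method, region) if re-recognition should be run, otherwise None."""
    for row in table:
        # S171: the variation type and amounts must match the row.
        if row["variation"] != variation or row["amounts"] != amounts:
            continue
        # S172: the recognised code must appear in the misread pair.
        if code not in row["misread_pair"]:
            continue
        # S173: the re-recognition reliability must beat the basic reliability.
        if row["re_reliability"] <= basic_reliability:
            continue
        # S174: it must also exceed the threshold th1.
        if row["re_reliability"] <= th1:
            continue
        # S175: use the registered method and region.
        return row["method"], row["region"]
    return None


print(find_rerecognition(KNOWLEDGE_TABLE, "lower displacement",
                         {"dy": 5, "w": 5}, "7", 0.77, th1=0.90))
```

With th1 below 95%, the lookup returns the region emphasis method and the upper-half region, which corresponds to the case described in the text.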
【０２７９】
なお、枠接触文字の認識方法については、例えば、特願平７−２０５５６４号の明細書及び図面に記載されている。
次に、図３の文字列認識部１５の一実施例について説明する。
【０２８０】
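For the character strings extracted by the layout analysis in step S2 of FIG. 4, the character string recognition unit 15 sets the thresholds used when deciding whether to merge patterns into characters to statistically valid values rather than determining them heuristically. These thresholds apply to the parameters that serve as characteristic values when characters are cut out of the string one at a time.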
【０２８１】
具体的には、各パラメータ毎に、パラメータ値とそのパラメータ値に対する文字の統合の成功又は失敗に関する統計データをとる。そして、各パラメータを個別に評価するのではなく、全てのパラメータを多次元空間上の１点として捉え、多変量解析の手法を用いて、統合が成功した場合と統合が失敗した場合との２群を分離する判別面を上記多次元空間内で求めるようにする。
【０２８２】
すなわち、パターンの特徴を示すＰ個の特性値からなるサンプルデータを、切り出し成功を示す第１の群と切り出し失敗を示す第２の群とに分類し、第１の群と第２の群との判別面をＰ次元空間において生成するものである。
【０２８３】
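This discriminant surface can be obtained, for example, by discriminant analysis. When the discriminant surface is formed by a linear discriminant function, the coefficient vector of that function is given by
Σ⁻¹(μ₁ − μ₂) ... (3)
【０２８４】
where
Σ: the population variance-covariance matrix of the first and second groups,
μ₁: the population mean vector of the first group,
μ₂: the population mean vector of the second group.
【０２８５】
The discriminant function having the coefficient vector of expression (3) is constructed so as to be equidistant from the centroids of the first group and the second group.
The coefficient vector of this discriminant function can also be calculated on the criterion of maximizing the ratio of the between-group variation to the within-group variation of the first and second groups.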
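The coefficient vector of expression (3) can be illustrated for the two-parameter case, where the covariance matrix can be inverted by hand. The sketch estimates Σ as a pooled sample covariance and places the surface through the midpoint of the two group means so that it is equidistant from both centroids. Restricting to two dimensions, the pooled estimator, and all function names are assumptions made to keep the example self-contained.

```python
def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]


def pooled_covariance(g1, g2):
    """Pooled 2 x 2 covariance matrix of two groups of 2-D samples."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for group in (g1, g2):
        m = mean(group)
        for v in group:
            d = (v[0] - m[0], v[1] - m[1])
            for r in range(2):
                for c in range(2):
                    s[r][c] += d[r] * d[c]
    dof = len(g1) + len(g2) - 2
    return [[s[r][c] / dof for c in range(2)] for r in range(2)]


def linear_discriminant(g1, g2):
    """Return (w, bias) with w = inverse(Sigma) * (mu1 - mu2)."""
    m1, m2 = mean(g1), mean(g2)
    (a, b), (c, d) = pooled_covariance(g1, g2)
    det = a * d - b * c
    diff = (m1[0] - m2[0], m1[1] - m2[1])
    # Inverse of [[a, b], [c, d]] is (1/det) * [[d, -b], [-c, a]].
    w = ((d * diff[0] - b * diff[1]) / det, (-c * diff[0] + a * diff[1]) / det)
    mid = ((m1[0] + m2[0]) / 2, (m1[1] + m2[1]) / 2)
    bias = -(w[0] * mid[0] + w[1] * mid[1])  # surface passes through the midpoint
    return w, bias


def score(w, bias, x):
    """Positive on the side of group 1, negative on the side of group 2."""
    return w[0] * x[0] + w[1] * x[1] + bias


success = [(4, 4), (5, 5), (4, 5), (5, 4)]   # toy "merge succeeded" samples
failure = [(0, 0), (1, 1), (0, 1), (1, 0)]   # toy "merge failed" samples
w, bias = linear_discriminant(success, failure)
print(w, bias, score(w, bias, (4.5, 4.5)), score(w, bias, (0.5, 0.5)))
```

The scores of the two group means have equal magnitude and opposite sign, which is the equidistance property stated in the text.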
【０２８６】
また、文字列から文字を切り出す処理は、パターンの外接矩形の位置、サイズ、並びなどからパターン同士を統合していく統計的処理と、文字列中の濁点、分離文字などを処理するためにパターン形状に着目する非統計的処理に分けて実行する。
【０２８７】
統計的処理では、パターンの外接矩形の位置、縦横比、平均文字サイズに対するサイズ比、隣接するパターン同士の距離、統合したときのサイズ、パターン同士の重なり幅、文字列の粗密度などを切り出しパラメータとして用いる。
【０２８８】
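For example, as shown in FIG. 52, the following are used as segmentation parameters:
1) the distance a between the right edge of circumscribed rectangle 211 and the left edge of circumscribed rectangle 212,
2) the distance b between the left edge of circumscribed rectangle 211 and the right edge of circumscribed rectangle 212,
3) the ratio c of the distance a to the distance b,
4) the ratio d of the distance b to the mean circumscribed rectangle width MX,
5) the angle e between the bottom edge of circumscribed rectangle 213 and the straight line joining the midpoint of the bottom edge of circumscribed rectangle 213 to the midpoint of the bottom edge of circumscribed rectangle 214,
6) the angle f between the bottom edge of circumscribed rectangle 213 and the straight line joining the lower-right vertex of circumscribed rectangle 213 to the lower-left vertex of circumscribed rectangle 214,
7) when circumscribed rectangles 215 and 216 overlap, the ratio g of the distance p between the right edge of circumscribed rectangle 215 and the left edge of circumscribed rectangle 216 to the distance q between the left edge of circumscribed rectangle 215 and the right edge of circumscribed rectangle 216.
【０２８９】
That is,
c = a/b ... (4)
d = b/MX ... (5)
g = p/q ... (6)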
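Expressions (4) and (5) can be illustrated with rectangles given by their edge coordinates. The `Rect` type, the coordinate convention (x increasing to the right), and the function names are assumptions made for this sketch; the angle parameters e and f are not covered.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Circumscribed rectangle given by its edge coordinates."""
    left: float
    top: float
    right: float
    bottom: float


def gap_and_span(r1: Rect, r2: Rect):
    """a: distance from r1's right edge to r2's left edge; b: from r1's left edge to r2's right edge."""
    a = abs(r2.left - r1.right)
    b = abs(r2.right - r1.left)
    return a, b


def segmentation_params(r1: Rect, r2: Rect, mean_width: float):
    """Parameters c = a/b (expression 4) and d = b/MX (expression 5)."""
    a, b = gap_and_span(r1, r2)
    return {"a": a, "b": b, "c": a / b, "d": b / mean_width}


left_box = Rect(left=0, top=0, right=10, bottom=10)
right_box = Rect(left=12, top=0, right=20, bottom=10)
print(segmentation_params(left_box, right_box, mean_width=10))
```

For overlapping rectangles the same two distances play the roles of p and q, so g = p/q of expression (6) could be obtained from `gap_and_span` in the same way.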
【０２９０】
次に、統計的処理を図５３のフローチャートを参照しながら説明する。
まず、ステップＳ１８１に示すように、連結パターンの外接矩形を取り出す。次に、ステップＳ１８２に示すように、ステップＳ１８１で取り出した外接矩形の右隣に他の外接矩形があるかどうか調べる。そして、ステップＳ１８１で取り出した外接矩形の右隣に他の外接矩形がない場合、ステップＳ１８１で取り出した外接矩形を統計的処理の対象から除外する。
【０２９１】
一方、ステップＳ１８２において、ステップＳ１８１で取り出した外接矩形の右隣に他の外接矩形があると判断された場合、ステップＳ１８４に進む。
また、ステップＳ１８３に示すように、文字列の外接矩形の平均文字サイズを算出する。ここで、文字列の外接矩形の平均文字サイズを算出する場合、１文字ごとの切り出しがまだ行われていないので、厳密には、正確な平均文字サイズを算出することができない。
【０２９２】
そこで、例えば、連結パターンの外接矩形を仮統合することにより、暫定的に平均文字サイズを算出する。仮統合の方法として、近接する連結パターンを統合した際の縦横比Ｐが、例えば、
Ｎ（＝０．８）＜Ｐ＜Ｍ（＝１．２）
を満たす場合、仮統合を行う。そして、仮統合を行った後の平均文字サイズを算出する。なお、文字列の外接矩形の平均文字サイズは、外接矩形のサイズ別の頻度ヒストグラムを生成して求めるようにしてもよい。
【０２９３】
次に、ステップＳ１８４に示すように、図５２のパラメータａ〜ｇを算出する。
非統計的処理では、文字列中の濁点や分離文字などを対象にしており、分離文字処理と濁点処理とに分ける。
【０２９４】
分離文字に対する処理では、パターンの傾き、線密度、隣接するパターン同士を統合したときのサイズ、パターン同士の距離を切り出しパラメータとして用いる。
【０２９５】
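For example, as shown in FIG. 54, the following are used as segmentation parameters:
8) the ratio p of the distance a between the right edge of circumscribed rectangle 221 and the left edge of circumscribed rectangle 222 to the distance b between the left edge of circumscribed rectangle 221 and the right edge of circumscribed rectangle 222,
9) the ratio q of the distance b to the mean circumscribed rectangle width MX,
10) the ratio r of the product of the area c of circumscribed rectangle 221 and the area d of circumscribed rectangle 222 to the square of the product of the mean circumscribed rectangle width MX and the mean circumscribed rectangle height MY.
【０２９６】
That is,
p = a/b ... (7)
q = b/MX ... (8)
r = (c × d)/(MX × MY)² ... (9)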
【０２９７】
次に、分離文字処理を図５５のフローチャートを参照しながら説明する。この分離文字処理は、例えば、“ハ”又は“ル”などのように２つ以上の連結パターンから構成される分離文字を検出するものである。
【０２９８】
まず、ステップＳ１９１に示すように、連結パターンのうち、右上がりとなっているパターンがあるかどうか判断する。そして、右上がりとなっているパターンがない場合、分離文字処理の対象から除外する。
【０２９９】
一方、ステップＳ１９１において、右上がりとなっているパターンであると判断された場合、ステップＳ１９２に進み、右上がりとなっているパターンの右隣に隣接し、且つ右下がりとなっているパターン、すなわち、例えば、“ハ”に対応するパターン、又は、右上がりとなっているパターンの右隣に隣接し、且つ直角方向に探索した場合のパターンと交差する回数（直角線密度）が２となるパターン、すなわち、例えば、“ル”に対応するパターンがあるかどうか判断する。そして、これらの“ハ”又は“ル”などのような形状のパターンでなければ、分離文字処理の対象から除外する。
【０３００】
一方、ステップＳ１９２において、“ハ”又は“ル”などのような形状のパターンであると判断した場合、ステップＳ１９４に進む。
また、上記ステップＳ１９１、Ｓ１９２とは別に、ステップＳ１９３で、文字列の外接矩形の平均文字サイズを算出する。
【０３０１】
上記ステップＳ１９２とＳ１９３が終了した後、ステップＳ１９４で、図５４に示されたパラメータｐ〜ｒの値を算出する。
また、濁点処理では、濁点候補パターンに着目し、例えば、そのパターンとその隣接パターンを統合したときのサイズ、両パターン間の距離、及びそれらと平均文字サイズとの比を、切り出しパラメータとして用いる。
【０３０２】
すなわち、図５６に示すように、
１１）外接矩形２３１の右枠と外接矩形２３２の左枠との距離ａと外接矩形２３１の左枠と外接矩形２３２の右枠との距離ｂとの比ｐ、
１２）外接矩形２３１の左枠と外接矩形２３２の右枠との距離ｂと外接矩形平均幅ＭＸとの比ｑ、
１３）外接矩形２３１の面積ｃと外接矩形２３２の面積ｄとの積と外接矩形平均幅ＭＸと外接矩形平均高さＭＹとの積の平方との比ｒを、
切り出しパラメータとして用いる。
【０３０３】
すなわち、パラメータｐ〜ｒは、（７）〜（９）式と同様に表すことができる。
次に、濁点処理を図５７のフローチャートを参照しながら、説明する。
【０３０４】
まず、ステップＳ２０１で、濁点候補となるパターンを抽出する。すなわち、例えば、連結パターン抽出手段１により抽出された連結パターンが２つ隣接して存在する場合で、且つそれらを統合した時のサイズと文字列の外接矩形の平均文字サイズとの比が所定のしきい値以下、例えば、１／４以下である場合、濁点候補となるパターンとして抽出する。
【０３０５】
次に、ステップＳ２０２に示すように、濁点候補となるパターンの左隣に隣接する外接矩形があるかどうかを調べる。そして、濁点候補となるパターンの左隣に隣接する外接矩形がない場合、濁点候補となるパターンを濁点処理の対象から除外する。
【０３０６】
一方、ステップＳ２０２において、濁点候補となるパターンの左隣に隣接する外接矩形があると判断された場合、ステップＳ２０４に進む。
また、上記ステップＳ２０１、Ｓ２０２とは別に、ステップＳ２０３で、文字列の外接矩形の平均文字サイズを算出する。そして、上記ステップＳ２０２、Ｓ２０３の処理が終了した後、ステップＳ２０４で、図５６に示されたパラメータｐ〜ｒの値を算出する。
【０３０７】
次に、学習データを用いて、未知の手書き文字列に対する文字の切り出しの信頼度を算出するための判別面を設定し、パラメータ数がｎの場合、切り出しが成功した群と切り出しが失敗した群との２群をｎ次元の空間上に生成する。
【０３０８】
図５８は、切り出しの成否データの算出方法を示すフローチャートである。
図５８において、まず、ステップＳ２１１で、事前に集めた学習データに対して、着目する外接矩形とそれに隣接する外接矩形とを統合して１文字になるかどうかを目視により判断する。そして、着目する外接矩形とそれに隣接する外接矩形とを統合して１文字になる場合、ステップＳ２１２に進み、着目する外接矩形とそれに隣接する外接矩形とを統合して１文字にならない場合、ステップＳ２１３に進む。
【０３０９】
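In step S212, for the case of a successful merge, in which the circumscribed rectangle of interest and its adjacent circumscribed rectangle combine into one character, the parameter values for those two rectangles are recorded. For the statistical processing, the parameters a to g of FIG. 52 can be used; for the non-statistical processing, the parameters p to r of FIGS. 54 and 56 can be used.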
【０３１０】
また、ステップＳ２１３では、着目する外接矩形とそれに隣接する外接矩形とを統合して１文字にならない統合失敗の場合について、その着目する外接矩形とそれに隣接する外接矩形におけるパラメータの値を記録する。
【０３１１】
次に、未知の文字列について、統計的処理における切り出しパラメータと非統計的処理における切り出しパラメータの値を算出し、このパラメータの値によって定まる多次元空間上の点に対し、学習データにより得られている判別面からの距離を求め、これを切り出しの信頼度として定量化する。
【０３１２】
例えば、特徴量パラメータ数が３の場合、図５９に示すように、切り出し成功と切り出し失敗との２群を判別する判別面をＨ、判別面Ｈの単位法線ベクトルをｎとし、あるパラメータの値がｐのベクトル値をとるとき、そのパラメータの値に対応する３次元空間内の点ｐの判別面からの距離ｈは、
ｈ＝ＯＰ・ｎ・・・（１０）
と表される。ここで、ＯＰは、３次元空間内の原点Ｏから３次元空間内の点ｐに向けたベクトルである。
【０３１３】
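Whether the distance h from the discriminant surface H is positive or negative shows which group the parameter values belong to, that is, the group for which segmentation succeeded or the group for which it failed. Its magnitude shows how far the parameter values lie from the discriminant surface H.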
【０３１４】
次に、図６０に示すように、多次元空間内の学習データの全パラメータに対して、判別面Ｈからの距離ｈに基づいて、切り出し成功のヒストグラム分布２４１と切り出し失敗のヒストグラム分布２４２をとる。一般的に、このヒストグラム分布２４１、２４２は正規分布になるので、ヒストグラム分布２４１、２４２を正規分布で近似する。これらの正規分布は、通常、部分的に重なる領域が生ずる。
【０３１５】
本実施例では、この重なる領域に位置する切り出しパラメータを有する隣接パターンについての切り出しの信頼度に加え、文字認識の信頼度を加味してそれらを統合するか否かを判定する。
【０３１６】
図６１は、切り出し信頼度の算出法の一例を示すフローチャートである。
図６１において、まず、ステップＳ２２１に示すように、複数のパラメータの値によって定まる多次元空間上の点に対する判別面Ｈからの距離ｈを、前記（１０）式により算出する。
【０３１７】
次に、ステップＳ２２２に示すように、学習データにより得られた複数のパラメータの値のヒストグラム分布を正規分布で近似する。すなわち、例えば、図６２に示すように、切り出し成功のヒストグラム分布を正規分布２５１で近似し、切り出し失敗のヒストグラム分布を正規分布２５２で近似する。
【０３１８】
次に、ステップＳ２２３で、２群の重なり領域を算出する。例えば、図６２に示すように、切り出し成功の正規分布２５１と切り出し失敗の正規分布２５２とが重なる領域を２群の重なり領域２５４として算出する。また、このとき、切り出し成功の正規分布２５１の内、上記２群の重なり領域２５４以外の領域２５３を切り出し成功領域と設定する。さらに、切り出し失敗の正規分布２５２の内、上記２群の重なり領域２５４以外の領域２５５を切り出し失敗領域と設定する。
【０３１９】
次に、ステップＳ２２４に示すように、未知文字についての入力パラメータの値のヒストグラム分布上での位置を判定する。
次に、ステップＳ２２５に示すように、未知文字についての入力パラメータの値のヒストグラム分布上での位置を判定した結果、未知文字についての入力パラメータの値が２群の重なり領域２５４に含まれる場合、ステップＳ２２６に進む。そして、２群の重なり領域２５４での未知文字についての入力パラメータの値の位置に基づいて、切り出し信頼度を算出する。
【０３２０】
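If, on the other hand, it is judged in step S225 that the input parameter values for the unknown character are not contained in the overlap region 254 of the two groups, processing proceeds to step S227, where it is judged whether the input parameter values for the unknown character are contained in the segmentation success region 253.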
【０３２１】
そして、未知文字についての入力パラメータの値が切り出し成功領域２５３に含まれると判断された場合、ステップＳ２２８に進み、切り出し信頼度を“１”とし、未知文字についての入力パラメータの値が切り出し成功領域２５３に含まれないと判断された場合、ステップＳ２２９に進み、切り出し信頼度を“０”とする。
【０３２２】
例えば、図６２において、未知文字についての入力パラメータの値に対する判別面からの距離を算出した結果、未知文字についての入力パラメータの値の判別面からの距離が重なり領域２５４に含まれる場合、未知文字についての入力パラメータの値の判別面からの距離に基づいて、切り出し信頼度を算出する。また、未知文字についての入力パラメータの値の判別面からの距離が切り出し成功領域２５３に含まれる場合、その切り出し信頼度を“１”とする。また、未知文字についての入力パラメータの値の判別面からの距離が切り出し失敗領域２５５に含まれる場合、その切り出し信頼度を“０”に設定する。
【０３２３】
図６３は、２群の重なり領域算出方法の一例を示すフローチャートである。
図６３において、まず、ステップＳ２３１に示すように、学習データから得られた切り出し成功のヒストグラム分布と切り出し失敗のヒストグラム分布のそれぞれについて、ヒストグラム２６１の平均値ｍと分散値ｖとを算出する。
【０３２４】
次に、ステップＳ２３２で、切り出し成功のヒストグラム分布と切り出し失敗のヒストグラム分布について、正規分布曲線２６２とヒストグラム２６１との２乗誤差の総和ｄを算出する。
【０３２５】
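Next, in step S233, the goodness of fit T is calculated by expression (11):
T = d/S ... (11)
where S is the area of the normal distribution curve 262.
【０３２６】
Next, in step S234, the distance L from the center of the normal distribution curve 262 to its edge is calculated by expression (12):
L = k × (1 + T) × √v ... (12)
where k is a proportionality constant and √v is equal to the standard deviation.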
【０３２７】
次に、ステップＳ２３５で、正規分布曲線２６３の右端２６７から正規分布曲線２６４の左端２６６までの間の領域を、２群の重なり領域２６５として設定する。
【０３２８】
次に、切り出し文字の候補に対し、図６１の処理により求めた切り出し信頼度に基づいて認識処理を行うかどうかを決定する。この場合、例えば、切り出し信頼度が高い切り出し文字の候補に対しては認識処理を行わず、切り出し信頼度が低い切り出し文字の候補に対してのみ認識処理を行うようにする。
【０３２９】
ここで、複数の切り出し文字の候補に対して、それらに対する認識の信頼度だけでなく、切り出しの信頼度も考慮して切り出し文字を決定する。このことにより、部分的に見ると文字のように見えるが、文字列全体から見ると間違っているような候補文字を、切り出し文字から除外することができる。例えば、各隣接パターンまたは切り出し確定部の切り出し信頼度をα_i、認識信頼度をβ_i、重み係数をｊとすると、全体の信頼度Ｒは、
Ｒ＝Σ（ｊ・α_i＋β_i）・・・（１３）
と表せる。
【０３３０】
そして、複数の切り出し文字の候補の中から全体の信頼度Ｒが最も大きいものを、最終的な切り出し文字として選択する。
図６４は、“グンマ”という文字列から文字を１文字ずつ切り出す場合を示す図である。ここで、“グンマ”という文字列の切り出しを行うのに先立ち、学習データを用いて、統計的処理と非統計的処理とに対する判別面とヒストグラム値の正規分布曲線を、それぞれ、個別に求める。
【０３３１】
ここで、統計的処理では、文字列の切り出しの成功又は失敗を判定するためのパラメータとして、図５２のパラメータｃ、ｅ、ｆを用い、学習データにより得られた判別面の式は、
０．８４ｘ０＋０．４３ｘ１＋０．３３ｘ２−１４５．２５＝０・・（１４）
であるものとする。
【０３３２】
また、図６３に示す学習データの切り出し成功を示すヒストグラム分布の平均値ｍは１２８．９４２、標準偏差は３４．７７となり、適合度Ｔは（１１）式より０．１２となる。また、比例定数ｋを２とすると、分布中心から端までの距離Ｌは（１２）式より７７．８となる。
【０３３３】
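The histogram distribution indicating segmentation failure for the learning data shown in FIG. 63 has a mean m of 71.129 and a standard deviation of 36.26, and the goodness of fit T is 0.35 from expression (11). With the proportionality constant k set to 2, the distance L from the center of the distribution to its edge is 92.2 from expression (12).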
【０３３４】
図６４において、まず、ステップＳ２４１に示すように、イメージ入力により未知文字についての入力パターンを読み込む。
Next, in step S242, connected patterns are extracted by labeling, and label numbers ① to ⑥ are assigned to the extracted connected patterns as shown in FIG. 64.
【０３３５】
次に、ステップＳ２４５に示すように、ステップＳ２４３の統計的処理及びステップＳ２４４の非統計的処理に基づいて、切り出し信頼度の定量化を行う。
ステップＳ２４３の統計的処理では、互いに隣接する連結パターンを統合した場合の切り出し信頼度を、パラメータｃ、ｅ、ｆの値を有する３次元空間上の点に対する判別面からの距離ｈに基づいて算出する。この切り出し信頼度αは、例えば、
α＝（ｈ−ｗ₁）／（ｗ₂−ｗ₁）×１００・・・（１５）
で表すことができる。
【０３３６】
ここで、
ｗ₁：２群の重なり領域の左端の位置
ｗ₂：２群の重なり領域の右端の位置
である。
【０３３７】
For example, the segmentation reliability is 80 when the pattern with label number ① is merged with the pattern with label number ②, 12 when ② is merged with ③, 28 when ③ is merged with ④, 92 when ④ is merged with ⑤, and 5 when ⑤ is merged with ⑥.
【０３３８】
また、ステップＳ２４４の非統計的処理では、濁点候補を有するパターン“グ”についての切り出し信頼度を、図５６のパラメータｐ〜ｒの値を有する３次元空間上の点に対する判別面からの距離ｈに基づいて算出する。
【０３３９】
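For example, the segmentation reliability is 85 when the pattern with label number ① is merged with the voiced sound mark pattern of the confirmed segmentation part 271, which consists of the patterns with label numbers ② and ③.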
【０３４０】
このステップＳ２４４の非統計的処理における切り出し信頼度の算出方法を図６５に示す。
まず、ステップＳ２５１で、濁点候補となるパターン２７２を抽出する。例えば、連結パターンが２つ隣接して存在する場合で、且つ、それらを統合した時のサイズと文字列の外接矩形の平均文字サイズとの比が所定のしきい値以下である場合、濁点候補となるパターンとする。
【０３４１】
次に、ステップＳ２５２で、濁点候補となるパターン２７２の左隣に隣接する外接矩形２８１があるかどうかを調べ、この場合、濁点候補となるパターン２７２の左隣に隣接する外接矩形２８１があると判断された結果、ステップＳ２５３に進み、図５６のパラメータｐ〜ｒの値を算出する。
【０３４２】
図６５の例では、
ｐ＝ａ／ｂ＝０．１・・・（１６）
ｑ＝ｂ／ＭＸ＝１．３・・・（１７）
ｒ＝（ｃ×ｄ）／（ＭＸ×ＭＹ）²＝０．３・・・（１８）
となる。
【０３４３】
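where
a: the distance between the right edge of circumscribed rectangle 281 and the left edge of circumscribed rectangle 272,
b: the distance between the left edge of circumscribed rectangle 281 and the right edge of circumscribed rectangle 272,
c: the area of circumscribed rectangle 281,
d: the area of circumscribed rectangle 272,
MX: the mean circumscribed rectangle width,
MY: the mean circumscribed rectangle height.
Next, as shown in step S254, the distance from the discriminant surface 293 of the point in three-dimensional space having the values of the parameters p to r is calculated.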
【０３４４】
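To calculate the distance from the discriminant surface 293 of the point in three-dimensional space having the values of the parameters p to r, the discriminant surface 293 is computed beforehand from learning patterns. The discriminant surface 293 can be obtained by expression (3), for example on the basis of the histogram distribution 292 indicating success and the histogram distribution 291 indicating failure in segmenting the character strings of the learning patterns. When the voiced sound mark extraction parameters p to r are used, the equation of the discriminant surface 293 is, for example,
0.17x0 − 0.75x1 + 0.64x2 + 30.4 = 0 ... (19)
which is the equation of a plane in three-dimensional space.
【０３４５】
Accordingly, substituting the values of (16) to (18) into expression (19), the distance h from the discriminant surface 293 is
h = 0.17 × 0.1 − 0.75 × 1.3 + 0.64 × 0.3 + 30.4
= 29.6 ... (20)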
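The substitution of expression (20) can be checked with a few lines of code. The coefficients are taken as they appear in the substitution itself, where the second coefficient is negative; the function name is hypothetical. Because the coefficient vector has a length very close to 1, the value of the plane expression can be read directly as a signed distance.

```python
import math


def plane_value(coeffs, offset, point):
    """Value of w . x + offset; equals the signed distance when |w| = 1."""
    return sum(w * x for w, x in zip(coeffs, point)) + offset


coeffs = (0.17, -0.75, 0.64)   # coefficients as used in the substitution of (20)
offset = 30.4
point = (0.1, 1.3, 0.3)        # p, q, r from expressions (16) to (18)

h = plane_value(coeffs, offset, point)
norm = math.sqrt(sum(w * w for w in coeffs))
print(round(h, 1), round(norm, 3))  # 29.6 and a norm of about 1.0
```

The printed distance agrees with the value 29.6 used in the subsequent reliability calculation.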
【０３４６】
また、学習データの切り出し成功を示すヒストグラム分布２９２の平均値ｍは３８、標準偏差は２５となり、適合度Ｔは（１１）式より０．２となり、学習データの切り出し失敗を示すヒストグラム分布２９１の平均値ｍは−３４、標準偏差は２８となり、適合度Ｔは（１１）式より０．３となる。
【０３４７】
また、学習データの切り出し成功を示すヒストグラム分布２９２の左端ｗ₁は、比例定数ｋを２とすると、（１２）式より、
ｗ₁＝３８−２×（１＋０．２）×２５＝−２２・・・（２１）
となる。
【０３４８】
また、学習データの切り出し失敗を示すヒストグラム分布２９１の右端ｗ₂は、比例定数ｋを２とすると、（１２）式より、
ｗ₂＝−３４＋２×（１＋０．３）×２８＝３８．８・・・（２２）
となる。
【０３４９】
従って、２群の重なり領域２９４は、判別面からの距離が−２２〜３８．８の間の領域となる。
次に、ステップＳ２５５で、切り出し信頼度αを求める。この切り出し信頼度αは、（２０）〜（２２）の値を（１５）式に代入して、
α＝（２９．６−（−２２））／（３８．８−（−２２））×１００
＝８５・・・（２３）
となる。
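Expressions (12), (15), and (21) to (23) can be combined into one small calculation. The sketch below uses the values given in the text; the function names and the dictionary layout are assumptions. Outside the overlap region the reliability is clamped to 0 or 100, which expresses the "0" and "1" of FIG. 61 on the same 0 to 100 scale as expression (15); that scaling is also an assumption of this example.

```python
def half_width(k, fit, std):
    """Distance L from the centre of a fitted normal curve to its edge, expression (12)."""
    return k * (1 + fit) * std


def overlap_region(success, failure, k=2.0):
    """w1: left edge of the success distribution; w2: right edge of the failure distribution."""
    w1 = success["mean"] - half_width(k, success["fit"], success["std"])
    w2 = failure["mean"] + half_width(k, failure["fit"], failure["std"])
    return w1, w2


def segmentation_reliability(h, w1, w2):
    """Expression (15) inside the overlap region, clamped to 0 or 100 outside it."""
    if h <= w1:
        return 0.0      # only the failure distribution covers this distance
    if h >= w2:
        return 100.0    # only the success distribution covers this distance
    return (h - w1) / (w2 - w1) * 100


success = {"mean": 38, "std": 25, "fit": 0.2}    # histogram distribution 292
failure = {"mean": -34, "std": 28, "fit": 0.3}   # histogram distribution 291

w1, w2 = overlap_region(success, failure)
alpha = segmentation_reliability(29.6, w1, w2)
print(round(w1, 1), round(w2, 1), round(alpha))  # -22.0 38.8 85
```

The rounded result reproduces the reliability of 85 obtained in expression (23).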
【０３５０】
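As a result, the patterns with label numbers ② and ③ are merged to form the confirmed segmentation part 271.
Next, in step S246 of FIG. 64, the reliabilities of the statistical processing and the non-statistical processing are combined. If a confirmed segmentation part exists, it takes priority; the reliability of the confirmed segmentation part 271 is therefore combined with priority.
【０３５１】
As a result, the segmentation reliability is 85 when the pattern with label number ① is merged with the pattern of the confirmed segmentation part 271, 30 when the pattern of the confirmed segmentation part 271 is merged with the pattern with label number ④, 92 when ④ is merged with ⑤, and 5 when ⑤ is merged with ⑥.
【０３５２】
Patterns are then merged when, for example, the segmentation reliability is greater than a predetermined threshold (for example, 90), or when it is greater than a predetermined threshold (for example, 70) and its ratio to the segmentation reliability of the neighboring segmentation pattern is greater than a predetermined value (for example, 5).
【０３５３】
When the segmentation reliability is smaller than a predetermined threshold (for example, 8), the patterns are not merged.
For example, the segmentation reliability for merging the pattern with label number ① with the pattern of the confirmed segmentation part 271 is 85, and its ratio to the segmentation reliability for the neighboring pattern with label number ④ is 85/30 = 2.8, so the pattern with label number ① and the pattern of the confirmed segmentation part 271 are not merged. The segmentation reliability for merging the pattern of the confirmed segmentation part 271 with the pattern with label number ④ is 30, so these are not merged either.
【０３５４】
The segmentation reliability for merging the patterns with label numbers ④ and ⑤ is 92, so these two patterns are merged. The segmentation reliability for merging the patterns with label numbers ⑤ and ⑥ is 5, so these two patterns are not merged.
【０３５５】
This produces the circumscribed rectangle 275 corresponding to the confirmed segmentation part 273, in which the patterns with label numbers ④ and ⑤ are merged, and the circumscribed rectangle 276 corresponding to the pattern with label number ⑥.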
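The merge rule applied in this example can be written as a small predicate. The thresholds 90, 70, 5, and 8 are the example values from the text; the function name and the treatment of intermediate reliabilities as "not merged at this stage" follow the worked example and are otherwise assumptions.

```python
def should_merge(alpha, neighbour_alpha, high=90, mid=70, ratio=5, low=8):
    """Decide whether two adjacent patterns are merged at this stage.

    alpha:           segmentation reliability of the candidate merge
    neighbour_alpha: segmentation reliability of the neighbouring candidate merge
    """
    if alpha < low:
        return False                      # explicitly not merged
    if alpha > high:
        return True                       # reliable enough on its own
    if alpha > mid and neighbour_alpha > 0 and alpha / neighbour_alpha > ratio:
        return True                       # clearly better than its neighbour
    return False                          # left unmerged, as in the worked example


# Reliabilities from the "Gunma" example after combining both kinds of processing.
print(should_merge(85, 30))   # label 1 with part 271: ratio 2.8, not merged
print(should_merge(30, 92))   # part 271 with label 4: not merged
print(should_merge(92, 5))    # labels 4 and 5: merged
print(should_merge(5, 92))    # labels 5 and 6: not merged
```

Candidates that are not merged here are not discarded; they are evaluated again when the overall reliability of each segmentation candidate is compared.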
【０３５６】
また、新たに生成された切り出し確定部２７３のパターンと切り出し確定部２７１のパターンとを統合した場合の切り出し信頼度を求める。この切り出し信頼度は、図６４の例では、６０となる。
【０３５７】
次に、ステップＳ２４７に示すように、切り出し信頼度によるパターンの統合が終了した時点で、切り出し候補１及び切り出し候補２を抽出する。そして、切り出し候補１及び切り出し候補２のそれぞれの文字に対して認識処理を行い、切り出し候補１及び切り出し候補２における文字内の切り出し信頼度αと認識信頼度βとをそれぞれの文字について求め、切り出し信頼度αと認識信頼度βとの総和をとったものを全体の信頼度Ｒとする。
【０３５８】
例えば、切り出し候補１として、外接矩形２７５、２７６、２７８を切り出した場合、外接矩形２７８内のパターンに対して文字認識を行った場合の認識信頼度βは８０となり、外接矩形２７５内のパターンに対して文字認識を行った場合の認識信頼度βは９０となり、外接矩形２７６内のパターンに対して文字認識を行った場合の認識信頼度βは８５となる。
【０３５９】
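Since the segmentation reliability α is 85 when the pattern with label number ① is merged with the pattern of the confirmed segmentation part 271, the overall reliability R is 345 by expression (13), with the weighting coefficient j set to 1.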
【０３６０】
また、切り出し候補２として、外接矩形２７６、２８１、２８２を切り出した場合、外接矩形２８１内のパターンに対して文字認識を行った場合の認識信頼度βは８３となり、外接矩形２８２内のパターンに対して文字認識を行った場合の認識信頼度βは５５となり、外接矩形２７６内のパターンに対して文字認識を行った場合の認識信頼度βは８５となる。
【０３６１】
また、切り出し確定部２７１のパターンと切り出し確定部２７３のパターンとを統合した場合の切り出し信頼度αは６０であり、全体の信頼度Ｒは２８３となる。
【０３６２】
次に、ステップＳ２４８で、切り出し候補１又は切り出し候補２のうち、全体の信頼度Ｒが大きい方の切り出し候補１を切り出し成功の文字候補として選択する。この結果、“グンマ”という文字列から、“グ”、“ン”、“マ”の各文字を１文字ずつ正しく切り出すことができる。
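The selection in step S248 can be sketched as follows. Expression (13) is evaluated for segmentation candidate 2 from the α and β values given in the text; the total for candidate 1 is entered as the value stated in the text rather than recomputed. The function name and the convention that characters without a merge contribute no α term are assumptions of this example.

```python
def overall_reliability(alphas, betas, j=1.0):
    """Expression (13): weighted segmentation reliabilities plus recognition reliabilities."""
    return j * sum(alphas) + sum(betas)


# Candidate 2: one merge with alpha = 60, recognition reliabilities 83, 55 and 85.
candidate_2 = overall_reliability(alphas=[60], betas=[83, 55, 85])

totals = {
    "candidate 1": 345,          # overall reliability as stated in the text
    "candidate 2": candidate_2,  # 283, matching the text
}
best = max(totals, key=totals.get)
print(candidate_2, best)
```

The candidate with the larger overall reliability, candidate 1, is selected.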
【０３６３】
なお、文字列からの切り出し信頼度を考慮しながら文字の認識処理を行う方法については、例えば、特願平７−２３４９８２号の明細書及び図面に記載されている。
【０３６４】
次に、図３のかすれ文字認識部１９の動作について具体的に説明する。
図６６は、かすれ文字認識部１９の構成の一実施例を示すブロック図である。図６６において、特徴抽出部３０１は、かすれ文字パターンから文字の特徴を抽出し、この抽出した特徴を特徴ベクトルにより表す。一方、かすれ辞書３０２には、かすれ文字についての各カテゴリの特徴ベクトルが格納されている。そして、照合部３０３は、特徴抽出部３０１により抽出した文字パターンの特徴ベクトルを、かすれ辞書３０２に格納されている各カテゴリの特徴ベクトルと照合し、特徴空間上での特徴ベクトル間の距離Ｄ_ij（ｉは未知文字の特徴ベクトル、ｊはかすれ辞書３０２のカテゴリの特徴ベクトル）を算出する。その結果、特徴ベクトル間の距離Ｄ_ijを最小とするカテゴリｊを未知文字ｉとして認識する。
【０３６５】
ここで、特徴空間上での特徴ベクトル間の距離Ｄ_ijは、例えば、ユークリッド距離Σ（ｉ−ｊ）²、シティブロック距離Σ｜ｉ−ｊ｜、又は判別関数などの識別関数を用いて算出する。
【０３６６】
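Letting D_ij1 be the distance to the first-ranked category and D_ij2 the distance to the second-ranked category, a table 1 relating the first-ranked category j1, the second-ranked category j2, the inter-category distance (D_ij2 − D_ij1), and the reliability is prepared in advance. A table 2 relating the distance D_ij1 to the first-ranked category, the first-ranked category j1, and the reliability is also prepared in advance. The smaller of the reliabilities obtained from table 1 and table 2 is stored in the intermediate processing result table.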
【０３６７】
図３のつぶれ文字認識部２１は、かすれ文字認識部１９のかすれ辞書３０２の代わりに、つぶれ文字についての各カテゴリの特徴ベクトルを格納したつぶれ辞書を用いることを除いて、かすれ文字認識部１９と同様の構成とすることができる。
【０３６８】
次に、図３の消し線認識部２６の一実施例について説明する。この消し線認識部２６は、図４のステップＳ４の訂正解析により抽出された訂正文字の候補に対し、例えば、横方向の画素数の和をとったヒストグラムを作成し、このヒストグラム値が所定の値を越えた領域に横消し線が存在するものとして、この領域に存在している横線を除去する。
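The histogram test for horizontal deletion lines can be sketched on a small binary image. The image is a list of rows of 0/1 pixels; clearing every pixel in a row whose black-pixel count exceeds the threshold is a simplification chosen for this example, and the function name is hypothetical. The subsequent completion of the strokes broken by the removal is not shown.

```python
def remove_horizontal_lines(image, threshold):
    """Clear rows whose black-pixel count exceeds `threshold`.

    Returns the cleaned image and the indices of the rows treated as deletion lines.
    """
    counts = [sum(row) for row in image]                      # horizontal projection histogram
    struck = [y for y, count in enumerate(counts) if count > threshold]
    cleaned = [
        [0] * len(row) if y in struck else list(row)
        for y, row in enumerate(image)
    ]
    return cleaned, struck


image = [
    [0, 1, 0, 0],
    [1, 1, 1, 1],   # a horizontal line crossing a vertical stroke
    [0, 1, 0, 0],
]
cleaned, struck = remove_horizontal_lines(image, threshold=3)
print(struck)    # [1]
print(cleaned)   # the vertical stroke is now broken at row 1
```

The gap left in the vertical stroke is the kind of thinning that the completion step is meant to repair before dictionary matching.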
【０３６９】
次に、この横線を除去することによりかすれた部分を補完し、この補完後のパターンについて辞書照合を行うことにより、文字認識を行う。この結果、文字と認識されたものについては、訂正文字の候補を消し線付き文字とみなし、リジェクトされたものについては、訂正文字の候補を通常文字とみなす。
【０３７０】
例えば、図６７において、訂正文字の候補として、横二重線により訂正された状態の「５」が入力され、この横二重線を除去して補完したパターンが「５」のカテゴリとして認識された結果、入力されたパターンは訂正文字とみなされる。また、訂正文字の候補として、「５」が入力され、この「５」の横線を除去したパターンがリジェクトされた結果、入力されたパターンは訂正文字でないとみなされる。
【０３７１】
次に、図３のくせ字解析部２３の一実施例について説明する。このくせ字解析部２３は、同一のカテゴリに属すると認識された手書き文字を所定のクラスタ数にクラスタリングし、異なるカテゴリに属するクラスタ間の距離の小さいものについては、要素数が少ない方のクラスタの文字カテゴリを要素数が多い方のクラスタの文字カテゴリに修正することにより、別のカテゴリに属するものと誤って認識された手書き文字を正読化する。
【０３７２】
図６８は、「４」の文字カテゴリに属すると判定された手書き文字の特徴ベクトルによるクラスタリング処理を示す図である。
図６８には、認識辞書に格納されている「４」の文字カテゴリの特徴ベクトルとの距離が近いため、「４」の認識結果カテゴリに属すると判定された手書き文字が示されている。ここで、この認識処理では、「２」と手書きされた文字が「４」の認識結果カテゴリに属すると誤って認識されている。
【０３７３】
そして、１回目のクラスタリング処理では、「４」の文字カテゴリに属すると判定された手書き文字をそれぞれ１つのクラスタとみなし、２回目のクラスタリング処理では、クラスタとみなされた手書き文字の間での特徴ベクトルの距離を算出し、特徴ベクトルの距離が最も近いものを１つのクラスタに統合する。この結果、図６８の例では、クラスタ数が１１から１つだけ減少して１０になっている。
【０３７４】
３回目以降のクラスタリング処理においても、クラスタ間での特徴ベクトルの距離を算出し、特徴ベクトルの距離が最も近いものを統合することにより、クラスタ数を減少させ、１１回目のクラスタリング処理でクラスタ数は１となる。
【０３７５】
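When clusters are merged, the city-block distance, for example, is used to compare clusters that each have one element, that is, to compare feature vectors with each other. For clusters with multiple elements, the centroid method, for example, is used. In the centroid method, when the feature vector x_i of the i-th element (i = 1, 2, 3, ..., M) of a cluster with M elements is written x_i = (x_i1, x_i2, x_i3, ..., x_iN), the representative vector x_m of the cluster is the mean of the feature vectors x_i of its elements:
【０３７６】
【数１】
\[ x_m = \frac{1}{M} \sum_{i=1}^{M} x_i \]
【０３７７】
Distances between clusters with multiple elements are then compared by calculating the city-block distance between their representative vectors x_m.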
【０３７８】
なお、クラスタ数が１になるまでクラスタリング処理を続けると、「４」の文字カテゴリに属すると誤って認識された「２」の手書き文字も、「４」の文字カテゴリに属すると正しく認識された「４」の手書き文字と同一のクラスタに属するようになるので、クラスタリング処理を途中で打ち切るクラスタリング打ち切り条件を設定する。
【０３７９】
このクラスタリング打ち切り条件としては、例えば、
（１）最終クラスタ数が所定の数（例えば、３）になった時、
（２）クラスタ統合時のクラスタ間距離が所定のしきい値以上になった時、
（３）クラスタ統合時のクラスタ間距離の増加率が所定のしきい値以上になった時、
のいずれかの条件を用いることができる。
【０３８０】
図６９は、クラスタリング処理を示すフローチャートである。
図６９において、まず、ステップＳ２６１に示すように、ある文字カテゴリに属すると認識された手書き文字の特徴ベクトルだけを抽出し、抽出されたそれぞれの手書き文字の特徴ベクトルを１つのクラスタとみなす。
【０３８１】
次に、ステップＳ２６２に示すように、クラスタリング処理を途中で打ち切るクラスタリング打ち切り条件を設定する。
次に、ステップＳ２６３に示すように、ある文字カテゴリについての全てのクラスタの中で、最も距離の近い２つのクラスタを選択する。
【０３８２】
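Next, as shown in step S264, it is judged whether the clustering termination condition set in step S262 is satisfied. If it is not satisfied, processing proceeds to step S265, the two clusters selected in step S263 are merged, and processing returns to step S263 to repeat the cluster merging.
【０３８３】
When, as a result of repeating the cluster merging, it is judged in step S264 that the clustering termination condition is satisfied, processing proceeds to step S266, where it is judged whether clustering has been carried out for all character categories. If it has not, processing returns to step S261 and clustering is carried out for a character category that has not yet been processed.
【０３８４】
If, on the other hand, it is judged in step S266 that clustering has been carried out for all character categories, processing proceeds to step S267 and the clustering results are stored in memory.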
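The clustering loop for a single character category can be sketched as agglomerative clustering with the centroid method and the city-block distance. Only termination condition (1), a fixed final number of clusters, is implemented; the function names and the toy data are assumptions made for this illustration.

```python
def city_block(u, v):
    """City-block distance between two vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def centroid(cluster):
    """Representative vector: the mean of the feature vectors in the cluster."""
    m = len(cluster)
    return [sum(vec[k] for vec in cluster) / m for k in range(len(cluster[0]))]


def cluster_category(vectors, final_count):
    """Merge the two closest clusters repeatedly until `final_count` clusters remain."""
    clusters = [[v] for v in vectors]            # S261: one cluster per character
    while len(clusters) > final_count:           # S264: termination condition (1)
        best = None
        for i in range(len(clusters)):           # S263: find the closest pair
            for j in range(i + 1, len(clusters)):
                d = city_block(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # S265: merge the pair
        del clusters[j]
    return clusters


samples = [(0, 0), (0, 1), (10, 10), (10, 11), (50, 50)]
result = cluster_category(samples, final_count=3)
print(result)
```

In the toy data the two tight pairs are merged and the outlying vector stays in a cluster of its own, which is the kind of small cluster the later correction step examines as a misread candidate.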
【０３８５】
次に、クラスタリング処理により得られたクラスタリング結果に基づいて、別のカテゴリに属するものと誤って認識された手書き文字を正読化する。
図７０は、「２」と手書きされた文字が「４」の文字カテゴリに属すると誤って認識された認識結果を、正しい文字カテゴリ「２」に正読化する処理を示す図である。
【０３８６】
図７０には、「２」の認識結果カテゴリに属すると判定された手書き文字及び「４」の認識結果カテゴリに属すると判定された手書き文字が示されている。ここで、「３」と手書きされた文字が「２」の認識結果カテゴリに属すると誤って認識され、「２」と手書きされた文字が「４」の認識結果カテゴリに属すると誤って認識され、「４」と手書きされた文字がいずれの認識結果カテゴリにも属さないとしてリジェクトされている。
【０３８７】
次に、クラスタリング打ち切り条件を、同一カテゴリ内における最終クラスタ数が３になった時に設定して、クラスタリング処理を行うことにより、「２」の認識結果カテゴリについてはクラスタａ、ｂ、ｃが生成され、「４」の認識結果カテゴリについてはクラスタｄ、ｅ、ｆが生成され、リジェクトされた３つの「４」の手書き文字についてはそれぞれクラスタｇ、ｈ、ｉが生成されている。
【０３８８】
次に、「２」の認識結果カテゴリに属するクラスタａ、ｂ、ｃと「４」の認識結果カテゴリに属するクラスタｄ、ｅ、ｆとの中から、文字数の少ないクラスタａ、ｄを誤読候補クラスタとして抽出する。
【０３８９】
次に、誤読候補クラスタａとそれ以外のクラスタｂ、ｃ、ｄ、ｅ、ｆのそれぞれとの距離及び誤読候補クラスタｄとそれ以外のクラスタａ、ｂ、ｃ、ｅ、ｆのそれぞれとの距離を算出する。そして、誤読候補クラスタａと最も距離が近いクラスタとしてクラスタｂを抽出し、誤読候補クラスタａとクラスタｂとの間の距離が所定の値以下であるかどうかを判定し、誤読候補クラスタａとクラスタｂとの間の距離は所定の値以下でないので、誤読候補クラスタａはリジェクト化される。
【０３９０】
この結果、「２」の認識結果カテゴリに属すると誤って認識された「３」と手書きされた文字が、「２」の認識結果カテゴリから除外される。
また、誤読候補クラスタｄと最も距離が近いクラスタとしてクラスタｂを抽出し、誤読候補クラスタｄとクラスタｂとの間の距離が所定の値以下であるかどうかを判定し、誤読候補クラスタｄとクラスタｂとの間の距離は所定の値以下なので、誤読候補クラスタｄはクラスタｂと統合されクラスタｊが生成されるとともに、クラスタｊは、要素数が多い方のクラスタｂの属していた「２」の認識結果カテゴリに属すると判定されて、「４」と誤読されたために誤読候補クラスタｄに属するとされた「２」の手書き文字が正読化される。
【０３９１】
さらに、いずれの認識結果カテゴリにも属さないとしてリジェクトされたクラスタｇ、ｈ、ｉとそれ以外のクラスタａ〜ｆとの距離を算出する。そして、クラスタｇと最も距離が近いクラスタとしてクラスタｅを抽出し、クラスタｇとクラスタｅとの間の距離が所定の値以下であるかどうかを判定し、クラスタｇとクラスタｅとの間の距離は所定の値以下なので、クラスタｇはクラスタｅと統合される。
【０３９２】
また、クラスタｈと最も距離が近いクラスタとしてクラスタｅを抽出し、クラスタｈとクラスタｅとの間の距離が所定の値以下であるかどうかを判定し、クラスタｈとクラスタｅとの間の距離は所定の値以下なので、クラスタｈはクラスタｅと統合される。クラスタｇ及びクラスタｈがクラスタｅに統合された結果、クラスタｋが生成されるとともに、クラスタｋは、要素数が多い方のクラスタｅの属していた「４」の認識結果カテゴリに属すると判定されて、認識不能としてリジェクトされた「４」の手書き文字が正読化される。
【０３９３】
また、クラスタｉと最も距離が近いクラスタとしてクラスタｅを抽出し、クラスタｉとクラスタｅとの間の距離が所定の値以下であるかどうかを判定し、クラスタｉとクラスタｅとの間の距離は所定の値以下でないので、クラスタｉはクラスタｅと統合しないようにする。
【０３９４】
図７１は、文字カテゴリ認識結果修正処理を示すフローチャートである。
図７１において、まず、ステップＳ２７１に示すように、図６９のクラスタリング処理により得られたクラスタリング結果についてのデータをメモリから読み出す。
【０３９５】
次に、ステップＳ２７２に示すように、図６９のクラスタリング処理により得られた全てのカテゴリの全てのクラスタについて、各クラスタ間での距離を算出し、各クラスタ間の距離を比較する。
【０３９６】
次に、ステップＳ２７３に示すように、クラスタ間の距離がしきい値以下のクラスタが存在するかどうかを判断し、クラスタ間の距離がしきい値以下のクラスタが存在する場合、ステップＳ２７４に進んで、それらのクラスタ同士を統合し、クラスタ間の距離がしきい値以下のクラスタが存在しない場合、それらのクラスタをリジェクトする。
【０３９７】
ここで、クラスタ統合時のクラスタ間の距離のしきい値として、例えば、２つのクラスタのうち、要素数が多い方のクラスタ内のベクトル間距離の定数倍を用いる。すなわち、要素数がＭ個のクラスタＡと要素数がＮ（Ｍ＞Ｎ）個のクラスタＢとを統合する場合、クラスタＡの代表ベクトルをｘａｍ、クラスタＢの代表ベクトルをｘｂｍ、クラスタＡ内の特徴ベクトルをｘａｉ（ｉ＝１、２、・・・、Ｍ）とすると、クラスタＡ内のベクトル間距離ｄ_tｈは、
【０３９８】
【数２】

【０３９９】
で表される。
Accordingly, when the constant is set to 1.5, for example, the condition for merging the two clusters is
|xam − xbm| < 1.5 d_th.
【０４００】
次に、ステップＳ２７５に示すように、ステップＳ２７４で統合された全てのクラスタについて、クラスタ内の文字カテゴリの判定を行う。
次に、ステップＳ２７６に示すように、統合されたクラスタ同士の文字カテゴリが異なるかどうかを判断し、クラスタ同士の文字カテゴリが異なる場合、ステップＳ２７７に進み、要素数が少ない方のクラスタの文字カテゴリを要素数が多い方のクラスタの文字カテゴリに修正してから、ステップＳ２７８に進む。一方、クラスタ同士の文字カテゴリが一致する場合、ステップＳ２７７をスキップしてステップＳ２７８に進む。
【０４０１】
次に、ステップＳ２７８に示すように、クラスタ内の文字について、その文字カテゴリを出力する。
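The correction step of FIG. 71 can be illustrated with a small Python sketch. Several points are assumptions made only for this example: the `Cluster` class and the function names are hypothetical, the intra-cluster distance d_th is approximated by the mean city block distance of the members from the cluster centroid (the actual formula of Equation 2 is not reproduced in this text), and the constant is taken as 1.5.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cluster:
    category: Optional[str]        # None means the members were rejected
    vectors: List[List[float]]


def city_block(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))


def centroid(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def spread(cluster):
    """ASSUMPTION: stand-in for d_th, the mean distance of members from the centroid."""
    c = centroid(cluster.vectors)
    return sum(city_block(v, c) for v in cluster.vectors) / len(cluster.vectors)


def corrected_category(candidate, others, factor=1.5):
    """Return the category the candidate should take, or None to reject it."""
    def dist_to(other):
        return city_block(centroid(candidate.vectors), centroid(other.vectors))

    nearest = min(others, key=dist_to)           # S272: compare cluster distances
    larger = nearest if len(nearest.vectors) >= len(candidate.vectors) else candidate
    if dist_to(nearest) < factor * spread(larger):   # S273: threshold test
        return larger.category                   # S274 to S277: merge and relabel
    return None                                  # no cluster close enough: reject
```

In this sketch a misread cluster that lies close to a larger cluster takes that cluster's category, which mirrors how the "2" misread as "4" is corrected in FIG. 70.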
次に、本発明の一実施例によるパターン認識装置の動作について、図７２の帳票を処理する場合を例にとって、より具体的に説明する。
【０４０２】
図７２は、本発明の一実施例によるパターン認識装置に入力される帳票の例を示す図である。
図７２の帳票には、枠番号１のフリーピッチ枠、枠番号２、３、４の一文字枠、枠番号５のブロック枠、枠番号６の不規則な表が設けられている。また、枠番号１のフリーピッチ枠には、枠に接触した状態で且つ横二重線により訂正されている「５」、枠に接触した状態の「３」、「２」、枠に接触した状態で且つかすれた状態の「７」、くせ字の「４」、「６」、枠からはみ出した状態で且つくせ字の「４」が記入されている。
【０４０３】
枠番号２の一文字枠には「５」が記入され、枠番号３の一文字枠には「３」が記入され、枠番号４の一文字枠には枠からはみ出した状態で且つ横二重線により訂正されている「８」が記入されている。枠番号５のブロック枠のうち、枠番号５−１の枠には横二重線により訂正されているくせ字の「６」が記入され、枠番号５−２の枠には枠に接触した状態で「２」が記入され、枠番号５−３の枠にはくせ字の「４」が記入されている。
【０４０４】
枠番号６の不規則な表のうち、枠番号６−１−１の枠には、枠からはみ出した状態の「３」、「２」、「１」が記入され、枠番号６−１−２の枠には、「６」、「３」、「８」が記入され、枠番号６−１−３の枠、枠番号６−１−４−１の枠、枠番号６−１−４−２の枠、枠番号６−１−４−３の枠、枠番号６−２−１の枠、枠番号６−２−２の枠及び枠番号６−２−３の枠はそれぞれ空欄となっており、枠番号６の不規則な表全体が×印により訂正されている。
【０４０５】
次に、図３の環境認識系１１は、図７２の帳票に対し、図５〜図８の処理を行うことにより、入力画像の状態を図７２の帳票から抽出する。
例えば、図６のレイアウト解析により、図７２の帳票から、枠番号１のフリーピッチ枠、枠番号２、３、４の一文字枠、枠番号５のブロック枠及び枠番号６の不規則な表を抽出するとともに、枠番号１のフリーピッチ枠からは、８つのパターンが文字の候補として抽出され、枠番号２、３、４の一文字枠からは、それぞれ１つのパターンが文字の候補として抽出され、枠番号５のブロック枠からは、３つのパターンが文字の候補として抽出され、枠番号６−１−１の枠からは、３つのパターンが文字の候補として抽出され、枠番号６−１−２の枠からは、３つのパターンが文字の候補として抽出され、枠番号６−１−３の枠、枠番号６−１−４−１の枠、枠番号６−１−４−２の枠、枠番号６−１−４−３の枠、枠番号６−２−１の枠、枠番号６−２−２の枠及び枠番号６−２−３の枠からは、文字の候補は抽出されない。
【０４０６】
ここで、図７２の帳票から文字列を抽出するには、例えば、図１４及び図１５に示したテキスト抽出方法を使用し、図７２の帳票から罫線を抽出するには、例えば、図１６〜図２２に示した罫線抽出方法を使用し、図７２の帳票から枠や表を抽出するには、例えば、図２３及び図２４に示した枠抽出方法を使用する。
【０４０７】
さらに、枠番号１のフリーピッチ枠から抽出された第１番目のパターン、第２番目のパターン、第５番目のパターン、第８番目のパターンは、枠接触文字の候補とされる。また、枠番号４の一文字枠から抽出されたパターン、枠番号５−２の枠から抽出されたパターン、枠番号６−１−１の枠から抽出された第１番目のパターンも、枠接触文字の候補とされる。
【０４０８】
ここで、図７２の帳票から枠接触文字の候補を抽出するには、例えば、図２７及び図２８に示した枠接触文字抽出方法を使用する。
また、図７の品質解析により、図７２の帳票から、かすれ状態やつぶれ状態や高品質文字などを検出する。この例では、画像の品質は正常で、かすれ状態やつぶれ状態や高品質文字などは検出されない。
【０４０９】
また、図８の訂正解析により、図７２の帳票から訂正文字候補を抽出する。この例では、枠番号１のフリーピッチ枠から抽出された第１番目のパターン、枠番号２、４の一文字枠から抽出されたパターン、枠番号５−１の枠から抽出されたパターン及び枠番号６の不規則な表から抽出されたパターンは、訂正文字候補とされる。
【０４１０】
ここで、図７２の帳票から訂正文字の候補を抽出するには、例えば、図３０に示した特徴量抽出方法を使用する。
次に、環境認識系１１は、入力画像から抽出した文字の候補ごとに、図５〜図８の処理により帳票から抽出した状態を記入した中間処理結果テーブルを作成する。
【０４１１】
図７３は、図５〜図８の処理により帳票から抽出した状態を記入した中間処理結果テーブルを示す図である。
図７３において、枠番号１の欄には、「枠種類」として「フリーピッチ」、「文字数」として「８」が記入され、枠番号１の第１番目のパターンの欄には、「枠接触有無」として「有」、「消し線」として「有２」、「品質」として「正常」が記入され、枠番号１の第２番目のパターンの欄には、「枠接触有無」として「有」、「消し線」として「無」、「品質」として「正常」が記入され、枠番号１の第８番目のパターンの欄には、「枠接触有無」として「有」、「消し線」として「無」、「品質」として「正常」が記入されている。
【０４１２】
ここで、「消し線」の欄の「有１」は複数文字に対して消し線候補が存在していることを示し、「消し線」の欄の「有２」は一文字に対して消し線候補が存在していることを示している。
【０４１３】
枠番号２の欄には、「枠種類」として「一文字」、「枠接触有無」として「無」、「消し線」として「有２」、「品質」として「正常」、「文字数」として「１」が記入され、枠番号３の欄には、「枠種類」として「一文字」、「枠接触有無」として「無」、「消し線」として「無」、「品質」として「正常」、「文字数」として「１」が記入され、枠番号４の欄には、「枠種類」として「一文字」、「枠接触有無」として「有」、「消し線」として「有２」、「品質」として「正常」、「文字数」として「１」が記入されている。
【０４１４】
枠番号５の欄には、「枠種類」として「はしご」、「文字数」として「３」が記入され、枠番号５−１の欄には、「枠接触有無」として「無」、「消し線」として「有２」、「品質」として「正常」、「文字数」として「１」が記入され、枠番号５−２の欄には、「枠接触有無」として「有」、「消し線」として「無」、「品質」として「正常」、「文字数」として「１」が記入され、枠番号５−３の欄には、「枠接触有無」として「無」、「消し線」として「無」、「品質」として「正常」、「文字数」として「１」が記入されている。
【０４１５】
枠番号６の欄には、「枠種類」として「表」が記入され、枠番号６−１−１の欄には、「枠種類」として「フリーピッチ」、「枠接触有無」として「有」、「消し線」として「有１」、「品質」として「正常」が記入され、枠番号６−２−２の欄には、「枠種類」として「フリーピッチ」、「枠接触有無」として「無」、「消し線」として「有１」、「品質」として「正常」が記入されている。
【０４１６】
次に、環境認識系１１は、図５〜図８の処理により帳票から抽出した状態に基づいて、図９の処理を行う。
すなわち、図７３の中間処理結果テーブルに記入された入力画像の状態に基づいて、図３の文字認識部１２の基本文字認識部１７、文字列認識部１５、接触文字認識部１３、かすれ文字認識部１９、つぶれ文字認識部２１、又は非文字認識部２５の消し線認識部２６及び雑音認識部２８のいずれの処理を呼び出すかを処理順序制御ルールを参照しながら決定し、決定した処理を図７３の中間処理結果テーブルの「処理呼出し」の欄に記入する。そして、図７３の中間処理結果テーブルの「処理呼出し」の欄に記入された処理をどのような順序で実行するかを、処理順序テーブルを参照しながら決定し、決定した順序を図７３の中間処理結果テーブルの「処理順序」の欄に記入する。
【０４１７】
処理順序制御ルールの例としては、
（Ａ１）もし、ある処理対象に対し、中間処理結果テーブルの状態を示す欄が「有」で、その状態に対応する処理が実行されていないならば、その状態に対応する処理を中間処理結果テーブルの「処理呼出し」の欄に記入する、
（Ａ２）もし、ある処理対象に対し、中間処理結果テーブルの状態を示す全ての欄が「無」、または「正常」で、基本文字認識部１７の処理が実行されていないならば、中間処理結果テーブルの「処理呼出し」の欄に「基本」と記入する、
（Ａ３）もし、ある処理対象に対し、中間処理結果テーブルに記入された状態に対応する処理が複数個存在しているならば、複数個の処理の順序を決定している処理順序テーブルをアクセスして「処理呼出し」の欄の順序を並び替える、
（Ａ４）もし、ある処理対象に対し、中間処理結果テーブルに記入された状態に対応する処理が終了したならば、終了した処理を中間処理結果テーブルの「処理完了」の欄に記入するとともに、次に行うべき指示や処理の中断や終了を示す指示を中間処理結果テーブルの「処理指示」の欄に記入し、それらの情報に基づいて、中間処理結果テーブルの「処理呼出し」の欄の順序を並び替える、
などがある。
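Rules (A1) and (A2) can be read as a small mapping from the state columns of one table row to the entries of its "process call" column. The sketch below is illustrative only: the dictionary keys, the English state values ("yes", "yes 1", "yes 2", "blurred", "crushed"), and the function name are assumptions chosen for the example, not names used in the specification.

```python
def processes_to_call(row, done=()):
    """Derive the "process call" entries for one row of the intermediate table.

    row  -- dict with optional keys frame_contact, frame_type, strikethrough, quality
    done -- processes already executed for this processing target
    """
    states = []
    if row.get("frame_contact") == "yes":
        states.append("black frame")
    if row.get("frame_type") == "free pitch":
        states.append("free pitch")
    if row.get("strikethrough") in ("yes 1", "yes 2"):
        states.append("strikethrough (%s)" % row["strikethrough"])
    if row.get("quality") in ("blurred", "crushed"):
        states.append(row["quality"])

    if not states:
        # (A2): every state is "no" or "normal", so fall back to basic recognition
        return [] if "basic" in done else ["basic"]
    # (A1): call the process for each detected state that has not been run yet
    return [s for s in states if s not in done]
```

For the first pattern of frame number 1 in FIG. 73 this yields the three entries that the description later writes as "black frame / free pitch / strikethrough (yes 2)".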
【０４１８】
図７４は、処理順序テーブルの一例を示す図である。
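In FIG. 74, the processing order table stores procedures such as the following:
(B1) If, for a given processing target, only one process is entered in the "process call" column of the intermediate processing result table, that process is entered in the "processing order" column of the intermediate processing result table.
(B2) If, for a given processing target, "black frame / free pitch" is entered in the "process call" column of the intermediate processing result table, "black frame → free pitch" is entered in the "processing order" column.
(B3) If, for a given processing target, "strikethrough (yes 2) / black frame" is entered in the "process call" column, "black frame → single-character strikethrough" is entered in the "processing order" column.
(B4) If, for a given processing target, "black frame / free pitch / strikethrough (yes 2)" is entered in the "process call" column, "black frame → single-character strikethrough → free pitch" is entered in the "processing order" column.
(B5) If, for a given processing target, "black frame / free pitch / strikethrough (yes 1)" is entered in the "process call" column, "multi-character strikethrough → black frame → free pitch" is entered in the "processing order" column.
(B6) If, for a given processing target, "free pitch / strikethrough (yes 1)" is entered in the "process call" column, "multi-character strikethrough → free pitch" is entered in the "processing order" column.
(B7) If, for a given processing target, "processes A, B, C" is entered in the "process call" column and "process B → process A → process C" is entered in the "processing order" column, and "process B" is then entered in the "processing completed" column, the "processing order" column is updated to "process A → process C".
(B8) If, under the same "process call" and "processing order" entries as in (B7), "process B" is entered in the "processing completed" column and "skip to process C" is entered in the "processing instruction" column, the "processing order" column is updated to "process C".
(B9) If, under the same "process call" and "processing order" entries as in (B7), "process B" is entered in the "processing completed" column and "reverse the order of process C and process A" is entered in the "processing instruction" column, the "processing order" column is updated to "process C → process A".
(B10) If, for a given processing target, "processes B, A" is entered in the "process call" column, "process A" is entered in the "processing completed" column, and "end" is entered in the "processing instruction" column, the "processing order" column is set to "end".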
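The processing order table lends itself to a lookup keyed by the set of called processes, plus an update function for the bookkeeping entries (B7) to (B9). The sketch below is an illustration only: `ORDER_TABLE`, `processing_order`, `update_order`, and the tuple form of the instructions are hypothetical names and encodings, and the process names are English renderings of the column entries.

```python
ORDER_TABLE = {
    frozenset({"black frame", "free pitch"}):                               # (B2)
        ["black frame", "free pitch"],
    frozenset({"strikethrough (yes 2)", "black frame"}):                    # (B3)
        ["black frame", "single-character strikethrough"],
    frozenset({"black frame", "free pitch", "strikethrough (yes 2)"}):      # (B4)
        ["black frame", "single-character strikethrough", "free pitch"],
    frozenset({"black frame", "free pitch", "strikethrough (yes 1)"}):      # (B5)
        ["multi-character strikethrough", "black frame", "free pitch"],
    frozenset({"free pitch", "strikethrough (yes 1)"}):                     # (B6)
        ["multi-character strikethrough", "free pitch"],
}

# Name of the process that handles each strikethrough state.
STRIKETHROUGH_PROCESS = {
    "strikethrough (yes 2)": "single-character strikethrough",
    "strikethrough (yes 1)": "multi-character strikethrough",
}


def processing_order(calls):
    """Return the execution order for the entries of the "process call" column."""
    if len(calls) == 1:                                                     # (B1)
        return [STRIKETHROUGH_PROCESS.get(calls[0], calls[0])]
    return list(ORDER_TABLE[frozenset(calls)])  # KeyError if no rule is stored


def update_order(order, completed, instruction=None):
    """Update the "processing order" column after one process has completed."""
    remaining = [p for p in order if p != completed]                        # (B7)
    if instruction is None:
        return remaining
    kind, *args = instruction
    if kind == "skip to":                                                   # (B8)
        return remaining[remaining.index(args[0]):]
    if kind == "swap":                                                      # (B9)
        i, j = remaining.index(args[0]), remaining.index(args[1])
        remaining[i], remaining[j] = remaining[j], remaining[i]
        return remaining
    raise ValueError("unknown instruction: %r" % (kind,))
```

Using a `frozenset` as the key makes the lookup independent of the order in which the states were written into the "process call" column.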
【０４１９】
図７５は、図７３の中間処理結果テーブルに記入された入力画像の状態に基づいて呼び出す処理を「処理呼出し」の欄に記入するとともに、「処理呼出し」の欄に記入された処理を実行する順序を「処理順序」の欄に記入した例を示す図である。
【０４２０】
図７５において、枠番号１の欄には、「枠種類」として「フリーピッチ」が記入され、枠番号１の第１番目のパターンの欄には、「枠接触有無」として「有」、「消し線」として「有２」が記入されているので、処理順序制御ルールの（Ａ１）に従って「処理呼び出し」の欄に「黒枠／フリーピッチ／消し線（有２）」と記入するとともに、処理順序制御ルールの（Ａ３）に従って処理順序テーブルの（Ｂ４）を参照し、「処理順序」の欄に「黒枠→一文字消し線→フリーピッチ」と記入する。
【０４２１】
枠番号１の第２番目のパターンの欄には、「枠接触有無」として「有」、「消し線」として「無」、「品質」として「正常」が記入されているので、処理順序制御ルールの（Ａ１）に従って「処理呼び出し」の欄に「黒枠／フリーピッチ」と記入するとともに、処理順序制御ルールの（Ａ３）に従って処理順序テーブルの（Ｂ２）を参照し、「処理順序」の欄に「黒枠→フリーピッチ」と記入する。
【０４２２】
枠番号１の第８番目のパターンの欄には、「枠接触有無」として「有」、「消し線」として「無」、「品質」として「正常」が記入されているので、処理順序制御ルールの（Ａ１）に従って「処理呼び出し」の欄に「黒枠／フリーピッチ」と記入するとともに、処理順序制御ルールの（Ａ３）に従って処理順序テーブルの（Ｂ２）を参照し、「処理順序」の欄に「黒枠→フリーピッチ」と記入する。
【０４２３】
枠番号２の欄には、「枠種類」として「一文字」、「枠接触有無」として「無」、「消し線」として「有２」、「品質」として「正常」が記入されているので、処理順序制御ルールの（Ａ１）に従って「処理呼び出し」の欄に「消し線（有２）」と記入するとともに、処理順序制御ルールの（Ａ１）に従って「処理順序」の欄に「一文字消し線」と記入する。
【０４２４】
枠番号３の欄には、「枠種類」として「一文字」、「枠接触有無」として「無」、「消し線」として「無」、「品質」として「正常」が記入されているので、処理順序制御ルールの（Ａ２）に従って「処理呼び出し」の欄に「基本」と記入するとともに、処理順序制御ルールの（Ａ１）に従って「処理順序」の欄に「基本」と記入する。
【０４２５】
枠番号４の欄には、「枠種類」として「一文字」、「枠接触有無」として「有」、「消し線」として「有２」、「品質」として「正常」が記入されているので、処理順序制御ルールの（Ａ１）に従って「処理呼び出し」の欄に「黒枠／消し線（有２）」と記入するとともに、処理順序制御ルールの（Ａ３）に従って処理順序テーブルの（Ｂ３）を参照し、「処理順序」の欄に「黒枠→一文字消し線」と記入する。
【０４２６】
枠番号５の欄には、「枠種類」として「はしご」が記入され、枠番号５−１の欄には、「枠接触有無」として「無」、「消し線」として「有２」、「品質」として「正常」が記入されているので、処理順序制御ルールの（Ａ１）に従って「処理呼び出し」の欄に「消し線（有２）」と記入するとともに、処理順序制御ルールの（Ａ１）に従って「処理順序」の欄に「一文字消し線」と記入する。
【０４２７】
枠番号５−２の欄には、「枠接触有無」として「有」、「消し線」として「無」、「品質」として「正常」が記入されているので、処理順序制御ルールの（Ａ１）に従って「処理呼び出し」の欄に「黒枠」と記入するとともに、処理順序制御ルールの（Ａ１）に従って「処理順序」の欄に「黒枠」と記入する。
【０４２８】
枠番号５−３の欄には、「枠接触有無」として「無」、「消し線」として「無」、「品質」として「正常」が記入されているので、処理順序制御ルールの（Ａ２）に従って「処理呼び出し」の欄に「基本」と記入するとともに、処理順序制御ルールの（Ａ１）に従って「処理順序」の欄に「基本」と記入する。
【０４２９】
枠番号６の欄には、「枠種類」として「表」が記入され、枠番号６−１−１の欄には、「枠種類」として「フリーピッチ」、「枠接触有無」として「有」、「消し線」として「有１」、「品質」として「正常」が記入されているので、処理順序制御ルールの（Ａ１）に従って「処理呼び出し」の欄に「黒枠／フリーピッチ／消し線（有１）」と記入するとともに、処理順序制御ルールの（Ａ３）に従って処理順序テーブルの（Ｂ５）を参照し、「処理順序」の欄に「複数文字の消し線→黒枠→フリーピッチ」と記入する。
【０４３０】
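In the column for frame number 6-2-2, "free pitch" is entered as the "frame type", "no" as "frame contact", "yes 1" as "strikethrough", and "normal" as "quality". Accordingly, "free pitch / strikethrough (yes 1)" is entered in the "process call" column in accordance with processing order control rule (A1), and, in accordance with processing order control rule (A3), entry (B6) of the processing order table is consulted and "multi-character strikethrough → free pitch" is entered in the "processing order" column.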
【０４３１】
次に、「処理呼出し」の欄及び「処理順序」の欄が記入された図７５の中間処理結果テーブルに基づいて、処理実行ルールを参照しながら最初の認識処理を実行する。そして、処理が完了した認識処理を中間処理結果テーブルの「処理完了」の欄に記入するとともに、その時の認識処理で得られた信頼度を中間処理結果テーブルの「信頼度」の欄に記入する。
【０４３２】
また、中間処理結果テーブルの「処理順序」の欄を、図７４の処理順序テーブルの（Ｂ７）〜（Ｂ９）に従って更新するとともに、処理実行ルールによって指示される次の処理がある場合は、中間処理結果テーブルの「処理指示」の欄にその処理を記入する。
【０４３３】
処理実行ルールとしては、例えば、
（Ｃ１）もし、ある処理対象に対し、中間処理結果テーブルの「処理順序」の欄に記入されている処理が存在するならば、優先順位の最も高い処理を実行する。そして、実行した処理が終了したならば、中間処理結果テーブルの「処理完了」の欄に終了した処理を記入し、中間処理結果テーブルの「処理順序」の欄からその処理を削除する。また、次に実行する処理を指示する場合は、中間処理結果テーブルの「処理指示」の欄にその処理を記入する、
（Ｃ２）もし、ある処理を実行した結果、あるパターンが非文字ではなく、文字であると判断され、その文字コードが所定の値以上の信頼度で算出されたならば、「個人筆記特性」による文字認識処理を呼び出すことを中間処理結果テーブルの「処理指示」の欄に記入する、
（Ｃ３）もし、ある処理を実行した結果、あるパターンが消し線であると判断され、その消し線が所定の値以上の信頼度で算出されたならば、中間処理結果テーブルの「処理指示」の欄に「終了」と記入し、中間処理結果テーブルの「処理順序」の欄に記入されているそれ以降の処理を打ち切って、処理を終了させる、
（Ｃ４）もし、中間処理結果テーブルの「処理順序」の欄の最初に「フリーピッチ」と記入され、同じ枠番号の他の処理対象についての「フリーピッチ」より前の処理が未処理であるならば、同じ枠番号の全て処理対象の「処理順序」の欄の最初に「フリーピッチ」と記入された後、同じ枠番号の全て処理対象の「フリーピッチ」の処理を同時に実行する、
（Ｃ５）もし、中間処理結果テーブルの「処理順序」の欄に記入された全ての処理が終了し、全ての処理対象について、中間処理結果テーブルの「処理指示」の欄に「終了」と記入されるか、又は「個人筆記特性」と記入されたならば、「処理指示」の欄に「個人筆記特性」と記入されている処理対象に対して、「個人筆記特性」による文字認識処理を呼び出してその処理を実行し、「個人筆記特性」による文字認識処理が終了したならば、中間処理結果テーブルの「処理指示」の欄に「終了」と記入する、
（Ｃ６）もし、全ての処理対象について、中間処理結果テーブルの「処理指示」の欄に終了と記入されたならば、全ての処理を終了して認識結果を出力する、
などがある。
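Rules (C1) to (C3) describe one step of a table-driven execution loop, which can be sketched as below. Everything about the encoding is an assumption made for illustration: a table row is a plain dictionary, each recognizer is a hypothetical callable returning `(kind, code, confidence)`, and the reliability threshold of 0.9 is an arbitrary example value.

```python
def run_next(row, recognizers, threshold=0.9):
    """Execute the highest-priority pending process for one processing target."""
    if not row["order"]:
        return row                                   # nothing left to run
    process = row["order"].pop(0)                    # (C1): first entry has priority
    kind, code, confidence = recognizers[process](row["pattern"])
    row["completed"].append(process)                 # (C1): record completion
    row["code"] = code
    row["confidence"] = confidence
    if kind == "strikethrough" and confidence >= threshold:
        row["instruction"] = "end"                   # (C3): abandon later processes
        row["order"].clear()
    elif kind == "character" and confidence >= threshold:
        row["instruction"] = "personal handwriting"  # (C2): schedule the final pass
    return row
```

A driver would call `run_next` for every row until each "processing instruction" entry is either "end" or "personal handwriting", at which point rules (C5) and (C6) take over.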
【０４３４】
図７６は、図７５の中間処理結果テーブルに基づいて、処理実行ルールを参照しながら認識処理を実行し、その時の認識処理で得られた信頼度を中間処理結果テーブルの「信頼度」の欄に記入し、処理実行ルールに基づいて中間処理結果テーブルの「処理順序」の欄を更新するとともに、中間処理結果テーブルの「処理指示」の欄に記入を行った例を示す図である。
【０４３５】
まず、図７５の中間処理結果テーブルの枠番号１の第１番目のパターンの「処理順序」の欄において、最初に「黒枠」と指示されているので、処理実行ルールの（Ｃ１）に従って、図７２の枠番号１のフリーピッチ枠から抽出された第１番目のパターンに対し、「黒枠」に対応する図３の接触文字認識部１３の処理を実行する。
【０４３６】
この接触文字認識部１３では、例えば、図３９及び図４０に示したように、枠を除去したパターンに対して文字補完や再補完を行うことにより、枠接触文字についての文字認識を行う。また、文字補完や再補完を用いても十分な信頼度が得られないパターンについては、知識テーブル１４を参照し、図４２〜図５１に示した学習文字に対する再文字認識を行うことにより、枠接触文字についての文字認識を行う。
【０４３７】
接触文字認識部１３の文字認識処理により、図７２の枠番号１のフリーピッチ枠から抽出された第１番目のパターンの認識信頼度が２０％と算出された結果、図７２の枠番号１のフリーピッチ枠から抽出された第１番目のパターンは文字でないとみなされ、中間処理結果テーブルの「文字コード」の欄に「リジェクト」と記入されるとともに、中間処理結果テーブルの「信頼度」の欄に「２０％」と記入される。
【０４３８】
また、中間処理結果テーブルの「処理完了」の欄に「黒枠」と記入され、中間処理結果テーブルの「処理順序」の欄が「一文字消し線→フリーピッチ」に更新される。
【０４３９】
次に、図７５の中間処理結果テーブルの枠番号１の第２番目のパターンの「処理順序」の欄において、最初に「黒枠」と指示されているので、処理実行ルールの（Ｃ１）に従って、図７２の枠番号１のフリーピッチ枠から抽出された第２番目のパターンに対し、「黒枠」に対応する図３の接触文字認識部１３の処理を実行し、枠接触文字についての文字認識を行う。
【０４４０】
接触文字認識部１３の文字認識処理により、図７２の枠番号１のフリーピッチ枠から抽出された第２番目のパターンは、認識信頼度が６０％の確率で文字カテゴリ「３」であると認識され、中間処理結果テーブルの「文字コード」の欄に「３」と記入されるとともに、中間処理結果テーブルの「信頼度」の欄に「６０％」と記入される。
【０４４１】
また、中間処理結果テーブルの「処理完了」の欄に「黒枠」と記入され、中間処理結果テーブルの「処理順序」の欄が「フリーピッチ」に更新される。
次に、図７５の中間処理結果テーブルの枠番号１の第８番目のパターンの「処理順序」の欄において、最初に「黒枠」と指示されているので、処理実行ルールの（Ｃ１）に従って、図７２の枠番号１のフリーピッチ枠から抽出された第８番目のパターンに対し、「黒枠」に対応する図３の接触文字認識部１３の処理を実行し、枠接触文字についての文字認識を行う。
【０４４２】
接触文字認識部１３の文字認識処理により、図７２の枠番号１のフリーピッチ枠から抽出された第８番目のパターンは、認識信頼度が９５％の確率で文字カテゴリ「４」であると認識され、中間処理結果テーブルの「文字コード」の欄に「４」と記入されるとともに、中間処理結果テーブルの「信頼度」の欄に「９５％」と記入される。
【０４４３】
また、中間処理結果テーブルの「処理完了」の欄に「黒枠」と記入され、中間処理結果テーブルの「処理順序」の欄が「フリーピッチ」に更新される。
次に、図７５の中間処理結果テーブルの枠番号２の「処理順序」の欄において、「一文字消し線」と指示されているので、処理実行ルールの（Ｃ１）に従って、図７２の枠番号２の一文字枠から抽出されたパターンに対し、「一文字消し線」に対応する図３の消し線認識部２６の処理を実行する。
【０４４４】
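In the strikethrough recognition unit 26, for example as shown in FIG. 67, horizontal lines whose histogram value is equal to or greater than a predetermined value are removed from a pattern extracted as a corrected-character candidate. If the pattern with the horizontal lines removed is recognized as a character, the removed horizontal lines are regarded as strikethrough lines, and the candidate pattern is recognized as a corrected character. If the pattern with the horizontal lines removed is rejected, the removed horizontal lines are regarded as part of the character rather than as strikethrough lines, and the candidate pattern is recognized as a normal character.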
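The decision logic of the strikethrough recognition unit 26 can be illustrated with a deliberately crude sketch. The assumptions are significant: the image is a binary list of rows (1 = black pixel), the "histogram value" is taken to be the number of black pixels in a row, whole rows are blanked rather than only the line pixels, and `recognize` is a hypothetical callable that returns a character code or `None` when it rejects the pattern.

```python
def remove_horizontal_lines(image, min_count):
    """Blank every row whose count of black pixels is at least min_count."""
    return [
        [0] * len(row) if sum(row) >= min_count else list(row)
        for row in image
    ]


def is_corrected_character(image, min_count, recognize):
    """True if the pattern reads as a character once the long lines are removed.

    In that case the removed lines are treated as strikethrough lines; if the
    stripped pattern is rejected, the lines are treated as part of the character.
    """
    stripped = remove_horizontal_lines(image, min_count)
    return recognize(stripped) is not None
```

A practical implementation would remove only the line pixels and restore the strokes that cross them, but the two-way decision is the same.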
【０４４５】
消し線認識部２６の消し線認識処理により、図７２の枠番号２の一文字枠から抽出されたパターンの認識信頼度が１０％と算出された結果、図７２の枠番号２の一文字枠から抽出されたパターンは訂正文字でないとみなされ、中間処理結果テーブルの「信頼度」の欄に「１０％」と記入されるとともに、中間処理結果テーブルの「処理指示」の欄に「基本」と記入される。
【０４４６】
また、中間処理結果テーブルの「処理完了」の欄に「消し線」と記入され、中間処理結果テーブルの「処理順序」の欄に「基本」と記入される。
次に、図７５の中間処理結果テーブルの枠番号３の「処理順序」の欄において、「基本」と指示されているので、処理実行ルールの（Ｃ１）に従って、図７２の枠番号３の一文字枠から抽出されたパターンに対し、「基本」に対応する図３の基本文字認識部１７の処理を実行する。
【０４４７】
この基本文字認識部１７では、例えば、図３１に示したように、入力された未知文字の特徴を抽出し、この未知文字の特徴を特徴ベクトルにより表し、基本辞書に予め格納されている各カテゴリの特徴ベクトルと照合することにより、特徴空間上での特徴ベクトル間の距離を算出し、特徴ベクトル間の距離を最小とする文字カテゴリを未知文字として認識する。
【０４４８】
また、基本文字認識部１７は、未知文字の輪郭の凹凸の個数を算出することにより、未知文字の変形度を算出する。そして、未知文字の変形度が大きくて、認識率が低下する場合は、知識テーブル１８を参照し、図３４〜図３８に示した詳細識別法を用いて文字認識を実行する。
【０４４９】
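By the character recognition processing of the basic character recognition unit 17, the pattern extracted from the single-character frame with frame number 3 in FIG. 72 is recognized as character category "3" with a recognition reliability of 95%; "3" is entered in the "character code" column of the intermediate processing result table, and "95%" is entered in the "reliability" column.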
【０４５０】
また、中間処理結果テーブルの「処理完了」の欄に「基本」と記入され、中間処理結果テーブルの「処理順序」の欄は空欄となる。
次に、図７５の中間処理結果テーブルの枠番号４の「処理順序」の欄において、最初に「黒枠」と指示されているので、処理実行ルールの（Ｃ１）に従って、図７２の枠番号４の一文字枠から抽出されたパターンに対し、「黒枠」に対応する図３の接触文字認識部１３の処理を実行し、枠接触文字についての文字認識を行う。
【０４５１】
接触文字認識部１３の文字認識処理により、図７２の枠番号４の一文字枠から抽出されたパターンの認識信頼度が１５％と算出された結果、図７２の枠番号４の一文字枠から抽出されたパターンは文字でないとみなされ、中間処理結果テーブルの「文字コード」の欄に「リジェクト」と記入されるとともに、中間処理結果テーブルの「信頼度」の欄に「１５％」と記入される。
【０４５２】
また、中間処理結果テーブルの「処理完了」の欄に「黒枠」と記入され、中間処理結果テーブルの「処理順序」の欄が「一文字消し線」に更新される。
次に、図７５の中間処理結果テーブルの枠番号５−１の「処理順序」の欄において、「一文字消し線」と指示されているので、処理実行ルールの（Ｃ１）に従って、図７２の枠番号５−１の枠から抽出されたパターンに対し、「一文字消し線」に対応する図３の消し線認識部２６の処理を実行し、訂正文字の候補として抽出されたパターンの認識処理を行う。
【０４５３】
消し線認識部２６の消し線認識処理により、図７２の枠番号５−１の枠から抽出されたパターンの認識信頼度が９５％と算出された結果、図７２の枠番号５−１の枠から抽出されたパターンは訂正文字とみなされ、中間処理結果テーブルの「信頼度」の欄に「９５％」と記入されるとともに、中間処理結果テーブルの「処理完了」の欄に「消し線」と記入される。
【０４５４】
また、中間処理結果テーブルの「処理指示」の欄に「終了」と記入されるとともに、中間処理結果テーブルの「処理順序」の欄は空欄となる。
次に、図７５の中間処理結果テーブルの枠番号５−２の「処理順序」の欄において、「黒枠」と指示されているので、処理実行ルールの（Ｃ１）に従って、図７２の枠番号５−２の枠から抽出されたパターンに対し、「黒枠」に対応する図３の接触文字認識部１３の処理を実行し、枠接触文字についての文字認識を行う。
【０４５５】
ここで、図７２の枠番号５−２の枠から抽出されたパターンは、下線部分が枠と接触し、図３９の文字補完や図４０の再補完による処理では十分な信頼度が得られないので、図５０（ｂ）に示したように、図４５の知識テーブル１６７を参照することにより、誤読文字対（２、７）を獲得し、図４７に示した領域強調の手法により、再文字認識を行う。
【０４５６】
接触文字認識部１３の文字認識処理により、図７２の枠番号５−２の枠から抽出されたパターンは、認識信頼度が９５％の確率で文字カテゴリ「２」であると認識され、中間処理結果テーブルの「文字コード」の欄に「２」と記入されるとともに、中間処理結果テーブルの「信頼度」の欄に「９５％」と記入される。
【０４５７】
また、中間処理結果テーブルの「処理完了」の欄に「黒枠」と記入され、中間処理結果テーブルの「処理順序」の欄は空欄となる。
次に、図７５の中間処理結果テーブルの枠番号５−３の「処理順序」の欄において、「基本」と指示されているので、処理実行ルールの（Ｃ１）に従って、図７２の枠番号５−３の枠から抽出されたパターンに対し、「基本」に対応する図３の基本文字認識部１７の処理を実行し、基本文字についての文字認識処理を行う。
【０４５８】
基本文字認識部１７の文字認識処理により、図７２の枠番号５−３の枠から抽出されたパターンは、認識信頼度が９０％の確率で文字カテゴリ「６」であると認識され、中間処理結果テーブルの「文字コード」の欄に「６」と記入されるとともに、中間処理結果テーブルの「信頼度」の欄に「９０％」と記入される。
【０４５９】
また、中間処理結果テーブルの「処理完了」の欄に「基本」と記入され、中間処理結果テーブルの「処理順序」の欄は空欄となる。
次に、図７５の中間処理結果テーブルの枠番号６−１−１の「処理順序」の欄において、最初に「複数文字の消し線」と指示されているので、処理実行ルールの（Ｃ１）に従って、「複数文字の消し線」に対応する図３の消し線認識部２６の処理を実行し、消し線の認識処理を行う。
【０４６０】
消し線認識部２６の消し線認識処理により、枠番号６の表から消し線が抽出され、その消し線の認識信頼度が９８％と算出された結果、図７２の枠番号６−１−１の枠から抽出されたパターンは訂正文字とみなされ、中間処理結果テーブルの「文字コード」の欄に「消し線」と記入されるとともに、中間処理結果テーブルの「信頼度」の欄に「９８％」と記入され、中間処理結果テーブルの「処理完了」の欄に「消し線」と記入される。
【０４６１】
また、処理実行ルールの（Ｃ３）に従って、中間処理結果テーブルの「処理指示」の欄に「終了」と記入されるとともに、中間処理結果テーブルの「処理順序」の欄は空欄となる。
【０４６２】
次に、図７５の中間処理結果テーブルの枠番号６−２−２の「処理順序」の欄において、最初に「複数文字の消し線」と指示されているので、処理実行ルールの（Ｃ１）に従って、「複数文字の消し線」に対応する図３の消し線認識部２６の処理を実行し、消し線の認識処理を行う。
【０４６３】
消し線認識部２６の消し線認識処理により、枠番号６の表から消し線が抽出され、その消し線の認識信頼度が９８％と算出された結果、図７２の枠番号６−２−２の枠から抽出されたパターンは訂正文字とみなされ、中間処理結果テーブルの「文字コード」の欄に「消し線」と記入されるとともに、中間処理結果テーブルの「信頼度」の欄に「９８％」と記入され、中間処理結果テーブルの「処理完了」の欄に「消し線」と記入される。
【０４６４】
また、処理実行ルールの（Ｃ３）に従って、中間処理結果テーブルの「処理指示」の欄に「終了」と記入されるとともに、中間処理結果テーブルの「処理順序」の欄は空欄となる。
【０４６５】
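The above processing produces the intermediate processing result table of FIG. 76. Since the "processing order" column of the table in FIG. 76 still lists processes to be called next, processing continues in accordance with processing execution rule (C1).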
【０４６６】
図７７は、図７６の中間処理結果テーブルに基づいて認識処理を続行し、その際に得られた結果を示す図である。
まず、図７６の中間処理結果テーブルの枠番号１の第１番目のパターンの「処理順序」の欄において、最初に「一文字消し線」と指示されているので、処理実行ルールの（Ｃ１）に従って、図７２の枠番号１のフリーピッチ枠から抽出された第１番目のパターンに対し、「一文字消し線」に対応する図３の消し線認識部２６の処理を実行し、訂正文字についての認識処理を行う。
【０４６７】
消し線認識部２６の認識処理により、図７２の枠番号１のフリーピッチ枠から抽出された第１番目のパターンの認識信頼度が９６％と算出された結果、図７２の枠番号１のフリーピッチ枠から抽出された第１番目のパターンは訂正文字とみなされ、中間処理結果テーブルの「文字コード」の欄に「消し線」と記入されるとともに、中間処理結果テーブルの「信頼度」の欄に「９６％」と記入され、中間処理結果テーブルの「処理完了」の欄に「黒枠／消し線」と記入される。
【０４６８】
また、中間処理結果テーブルの「処理指示」の欄に「終了」と記入されるとともに、中間処理結果テーブルの「処理順序」の欄は空欄となる。
次に、図７５の中間処理結果テーブルの枠番号１の第２番目のパターンの「処理順序」の欄において、「フリーピッチ」と指示されているので、処理実行ルールの（Ｃ４）に従って、図７２の枠番号１のフリーピッチ枠から抽出された第２番目のパターンに対し、同じ枠番号１の他の全てのパターンの「処理順序」の欄が「フリーピッチ」となるまで待機し、枠番号１の全てのパターンの「処理順序」の欄が「フリーピッチ」となった時に、枠番号１のフリーピッチ枠から抽出された全てのパターンを対象として、「フリーピッチ」に対応する図３の文字列認識部１５の処理を実行し、文字の切り出し信頼度を考慮しながら文字認識を行う。
【０４６９】
次に、図７５の中間処理結果テーブルの枠番号１の第８番目のパターンの「処理順序」の欄において、「フリーピッチ」と指示されているので、処理実行ルールの（Ｃ４）に従って、図７２の枠番号１のフリーピッチ枠から抽出された第８番目のパターンに対し、同じ枠番号１の他の全てのパターンの「処理順序」の欄が「フリーピッチ」となるまで待機し、枠番号１の全てのパターンの「処理順序」の欄が「フリーピッチ」となった時に、枠番号１のフリーピッチ枠から抽出された全てのパターンを対象として、「フリーピッチ」に対応する図３の文字列認識部１５の処理を実行し、文字の切り出し信頼度を考慮しながら認識処理を行う。
【０４７０】
そして、枠番号１の全てのパターンの「処理順序」の欄が「フリーピッチ」となった場合、図７２の枠番号１のフリーピッチ枠から抽出された全てのパターンを対象として、文字列認識部１５の文字認識処理を行う。
【０４７１】
ここで、図７２の枠番号１のフリーピッチ枠から抽出された第１番目のパターンについては、図７７の中間処理結果テーブルの枠番号１の第１番目のパターンの「処理指示」の欄が「終了」となっているので、図７２の枠番号１のフリーピッチ枠から抽出された第１番目のパターンを文字列認識部１５の処理対象から除外し、図７２の枠番号１のフリーピッチ枠から抽出された第２番目のパターンから第８番目のパターンについて、文字列認識部１５の認識処理を実行する。
【０４７２】
この文字列認識部１５では、例えば、図５２〜図６５に示したように、文字を切り出した際の信頼度を判別面からの距離に基づいて算出し、（文字切り出しの信頼度）と（文字認識の信頼度）との積が最大となるものを、切り出し文字とする。
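The selection rule of the character string recognition unit 15 amounts to taking the candidate with the largest product of the two reliabilities. In the sketch below the `CutCandidate` class and its field names are hypothetical, and both reliabilities are assumed to be already normalized to the range 0 to 1.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class CutCandidate:
    label: str                      # description of this way of cutting out a character
    cut_reliability: float          # reliability of the segmentation, 0..1
    recognition_reliability: float  # reliability of recognizing the resulting pattern, 0..1


def best_cut(candidates: Iterable[CutCandidate]) -> CutCandidate:
    """Pick the candidate that maximizes (cut reliability) x (recognition reliability)."""
    return max(candidates, key=lambda c: c.cut_reliability * c.recognition_reliability)
```

Multiplying the two values means that a candidate has to score reasonably on both segmentation and recognition to be selected.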
【０４７３】
文字列認識部１５の認識処理により、図７２の枠番号１のフリーピッチ枠から抽出された第２番目のパターンは、認識信頼度が９５％の確率で文字カテゴリ「３」であると認識され、中間処理結果テーブルの「文字コード」の欄に「３」と記入されるとともに、中間処理結果テーブルの「信頼度」の欄に「９５％」と記入される。
【０４７４】
また、処理実行ルールの（Ｃ１）に従って、中間処理結果テーブルの「処理完了」の欄に「黒枠／フリーピッチ」と記入され、中間処理結果テーブルの「処理順序」の欄が空欄となり、処理実行ルールの（Ｃ４）に従って、中間処理結果テーブルの「処理指示」の欄に「個人筆記特性」と記入される。
【０４７５】
図７２の枠番号１のフリーピッチ枠から抽出された第８番目のパターンは、認識信頼度が９８％の確率で文字カテゴリ「４」であると認識され、中間処理結果テーブルの「文字コード」の欄に「４」と記入されるとともに、中間処理結果テーブルの「信頼度」の欄に「９８％」と記入される。
【０４７６】
また、処理実行ルールの（Ｃ１）に従って、中間処理結果テーブルの「処理完了」の欄に「黒枠／フリーピッチ」と記入され、中間処理結果テーブルの「処理順序」の欄が空欄となり、処理実行ルールの（Ｃ４）に従って、中間処理結果テーブルの「処理指示」の欄に「個人筆記特性」と記入される。
【０４７７】
また、図７２の枠番号１のフリーピッチ枠から抽出された第３番目のパターンは、文字カテゴリ「２」であると認識され、図７２の枠番号１のフリーピッチ枠から抽出された第４番目のパターンと図７２の枠番号１のフリーピッチ枠から抽出された第５番目のパターンとは、文字列認識部１５の認識処理により１つの文字に統合され、文字カテゴリ「７」であると認識され、図７２の枠番号１のフリーピッチ枠から抽出された第６番目のパターンは、文字カテゴリ「４」であると認識され、図７２の枠番号１のフリーピッチ枠から抽出された第７番目のパターンは、文字カテゴリ「６」であると認識される。
【０４７８】
この結果、図７７の中間処理結果テーブルの「文字数」の欄は「７」に変更される。
次に、図７６の中間処理結果テーブルの枠番号２の「処理順序」の欄において、「基本」と指示されているので、処理実行ルールの（Ｃ１）に従って、図７２の枠番号２の一文字枠から抽出されたパターンに対し、「基本」に対応する図３の基本文字認識部１７の処理を実行し、基本文字についての文字認識処理を行う。
【０４７９】
基本文字認識部１７の文字認識処理により、図７２の枠番号２の一文字枠から抽出されたパターンは、認識信頼度が９７％の確率で文字カテゴリ「５」であると認識され、中間処理結果テーブルの「文字コード」の欄に「５」と記入されるとともに、中間処理結果テーブルの「信頼度」の欄に「９７％」と記入される。
【０４８０】
また、中間処理結果テーブルの「処理呼び出し」の欄に「消し線（有２）／基本」と記入され、中間処理結果テーブルの「処理完了」の欄に「消し線／基本」と記入され、中間処理結果テーブルの「処理順序」の欄は空欄となり、処理実行ルールの（Ｃ４）に従って、中間処理結果テーブルの「処理指示」の欄に「個人筆記特性」と記入される。
【０４８１】
次に、図７６の中間処理結果テーブルの枠番号３の「処理順序」の欄は空欄となっているので、処理実行ルールの（Ｃ４）に従って、中間処理結果テーブルの「処理指示」の欄に「個人筆記特性」と記入される。
【０４８２】
次に、図７６の中間処理結果テーブルの枠番号４の「処理順序」の欄において、「一文字消し線」と指示されているので、処理実行ルールの（Ｃ１）に従って、図７２の枠番号４の一文字枠から抽出されたパターンに対し、「一文字消し線」に対応する図３の消し線認識部２６の処理を実行し、訂正文字の候補として抽出されたパターンの認識処理を行う。
【０４８３】
消し線認識部２６の消し線認識処理により、図７２の枠番号４の一文字枠から抽出されたパターンの認識信頼度が９５％と算出された結果、図７２の枠番号４の一文字枠から抽出されたパターンは訂正文字とみなされ、中間処理結果テーブルの「信頼度」の欄に「９５％」と記入されるとともに、中間処理結果テーブルの「処理完了」の欄に「黒枠／消し線」と記入される。
【０４８４】
また、中間処理結果テーブルの「処理指示」の欄に「終了」と記入されるとともに、中間処理結果テーブルの「処理順序」の欄は空欄となる。
次に、図７６の中間処理結果テーブルの枠番号５−１の「処理指示」の欄に「終了」と記入されているので、図７２の枠番号５−１の枠から抽出されたパターンについては、処理を行わない。
【０４８５】
次に、図７６の中間処理結果テーブルの枠番号５−２の「処理順序」の欄は空欄となっているので、処理実行ルールの（Ｃ４）に従って、中間処理結果テーブルの「処理指示」の欄に「個人筆記特性」と記入される。
【０４８６】
次に、図７６の中間処理結果テーブルの枠番号５−３の「処理順序」の欄は空欄となっているので、処理実行ルールの（Ｃ４）に従って、中間処理結果テーブルの「処理指示」の欄に「個人筆記特性」と記入される。
【０４８７】
次に、図７６の中間処理結果テーブルの枠番号６−１−１の「処理指示」の欄に「終了」と記入されているので、図７２の枠番号６−１−１の枠から抽出されたパターンについては、処理を行わない。
【０４８８】
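Next, since "end" is entered in the "processing instruction" column for frame number 6-2-2 in the intermediate processing result table of FIG. 76, no processing is performed on the pattern extracted from the frame with frame number 6-2-2 in FIG. 72.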
【０４８９】
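The above processing produces the intermediate processing result table of FIG. 77. Since "personal handwriting characteristics" is entered in some of the "processing instruction" columns of the table in FIG. 77, processing continues in accordance with processing execution rule (C5).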
【０４９０】
図７８は、図７７の中間処理結果テーブルに基づいて認識処理を続行し、その際に得られた結果を示す図である。
まず、図７６の中間処理結果テーブルの枠番号１の第１番目のパターンの「処理指示」の欄に「終了」と記入されているので、図７２の枠番号１のフリーピッチ枠から抽出された第１番目のパターンについては、処理を行わない。
【０４９１】
次に、図７５の中間処理結果テーブルの枠番号１の第２番目のパターンの「処理指示」の欄に「個人筆記特性」と記入されているので、処理実行ルールの（Ｃ５）に従って、図７２の枠番号１のフリーピッチ枠から抽出された第２番目のパターンに対し、「個人筆記特性」に対応する図３のくせ字解析部２３の処理を実行する。
【０４９２】
このくせ字解析部２３は、例えば、図６８〜図７１に示したように、同一筆者が書いた手書き文字を各カテゴリごとにクラスタリングし、クラスタリングにより得られた手書き文字の第１のクラスタと距離が近く、且つ他のカテゴリに属する第２のクラスタで要素数が少ないものを第１のクラスタに統合することにより、第２のクラスタに属する手書き文字のカテゴリを第１のクラスタのカテゴリに修正する。
【０４９３】
くせ字解析部２３の解析処理により、図７２の枠番号１のフリーピッチ枠から抽出された第２のパターンは、認識信頼度が９７％の確率で文字カテゴリ「３」であると認識され、中間処理結果テーブルの「文字コード」の欄に「３」と記入されるとともに、中間処理結果テーブルの「信頼度」の欄に「９７％」と記入される。
【０４９４】
また、中間処理結果テーブルの「処理完了」の欄に「黒枠／フリーピッチ／個人筆記特性」と記入されるとともに、中間処理結果テーブルの「処理指示」の欄に「終了」と記入される。
【０４９５】
次に、図７５の中間処理結果テーブルの枠番号１の第８番目のパターンの「処理指示」の欄に「個人筆記特性」と記入されているので、処理実行ルールの（Ｃ５）に従って、図７２の枠番号１のフリーピッチ枠から抽出された第８番目のパターンに対し、「個人筆記特性」に対応する図３のくせ字解析部２３の処理を実行する。
【０４９６】
くせ字解析部２３の解析処理により、図７２の枠番号１のフリーピッチ枠から抽出された第８番目のパターンは、認識信頼度が９８％の確率で文字カテゴリ「４」であると認識され、中間処理結果テーブルの「文字コード」の欄に「４」と記入されるとともに、中間処理結果テーブルの「信頼度」の欄に「９８％」と記入される。
【０４９７】
また、中間処理結果テーブルの「処理完了」の欄に「黒枠／フリーピッチ／個人筆記特性」と記入されるとともに、中間処理結果テーブルの「処理指示」の欄に「終了」と記入される。
【０４９８】
次に、図７６の中間処理結果テーブルの枠番号２の「処理指示」の欄に「個人筆記特性」と記入されているので、処理実行ルールの（Ｃ５）に従って、図７２の枠番号２の一文字枠から抽出されたパターンに対し、「個人筆記特性」に対応する図３のくせ字解析部２３の処理を実行する。
【０４９９】
くせ字解析部２３の解析処理により、図７２の枠番号２の一文字枠から抽出されたパターンは、認識信頼度が９７％の確率で文字カテゴリ「５」であると認識され、中間処理結果テーブルの「文字コード」の欄に「５」と記入されるとともに、中間処理結果テーブルの「信頼度」の欄に「９７％」と記入される。
【０５００】
また、中間処理結果テーブルの「処理完了」の欄に「黒枠／フリーピッチ／個人筆記特性」と記入されるとともに、中間処理結果テーブルの「処理指示」の欄に「終了」と記入される。
【０５０１】
次に、図７６の中間処理結果テーブルの枠番号３の「処理指示」の欄に「個人筆記特性」と記入されているので、処理実行ルールの（Ｃ５）に従って、図７２の枠番号３の一文字枠から抽出されたパターンに対し、「個人筆記特性」に対応する図３のくせ字解析部２３の処理を実行する。
【０５０２】
くせ字解析部２３の解析処理により、図７２の枠番号３の一文字枠から抽出されたパターンは、認識信頼度が９７％の確率で文字カテゴリ「３」であると認識され、中間処理結果テーブルの「文字コード」の欄に「３」と記入されるとともに、中間処理結果テーブルの「信頼度」の欄に「９７％」と記入される。
【０５０３】
また、中間処理結果テーブルの「処理完了」の欄に「黒枠／フリーピッチ／個人筆記特性」と記入されるとともに、中間処理結果テーブルの「処理指示」の欄に「終了」と記入される。
【０５０４】
次に、図７６の中間処理結果テーブルの枠番号４の「処理指示」の欄に「終了」と記入されているので、図７２の枠番号４の一文字枠から抽出されたパターンについては、処理を行わない。
【０５０５】
次に、図７６の中間処理結果テーブルの枠番号５−１の「処理指示」の欄に「終了」と記入されているので、図７２の枠番号５−１の枠から抽出されたパターンについては、処理を行わない。
【０５０６】
次に、図７６の中間処理結果テーブルの枠番号５−２の「処理指示」の欄に「個人筆記特性」と記入されているので、処理実行ルールの（Ｃ５）に従って、図７２の枠番号５−２の枠から抽出されたパターンに対し、「個人筆記特性」に対応する図３のくせ字解析部２３の処理を実行する。
【０５０７】
くせ字解析部２３の解析処理により、図７２の枠番号５−２の枠から抽出されたパターンは、認識信頼度が９７％の確率で文字カテゴリ「２」であると認識され、中間処理結果テーブルの「文字コード」の欄に「２」と記入されるとともに、中間処理結果テーブルの「信頼度」の欄に「９７％」と記入される。
【０５０８】
また、中間処理結果テーブルの「処理完了」の欄に「黒枠／フリーピッチ／個人筆記特性」と記入されるとともに、中間処理結果テーブルの「処理指示」の欄に「終了」と記入される。
【０５０９】
次に、図７６の中間処理結果テーブルの枠番号５−３の「処理指示」の欄に「個人筆記特性」と記入されているので、処理実行ルールの（Ｃ５）に従って、図７２の枠番号５−３の枠から抽出されたパターンに対し、「個人筆記特性」に対応する図３のくせ字解析部２３の処理を実行する。
【０５１０】
くせ字解析部２３の解析処理により、図７２の枠番号５−３の枠から抽出されたパターンは、認識信頼度が９６％の確率で文字カテゴリ「４」であると認識され、中間処理結果テーブルの「文字コード」の欄が「４」に変更されるとともに、中間処理結果テーブルの「信頼度」の欄に「９６％」と記入される。
【０５１１】
また、中間処理結果テーブルの「処理完了」の欄に「黒枠／フリーピッチ／個人筆記特性」と記入されるとともに、中間処理結果テーブルの「処理指示」の欄に「終了」と記入される。
【０５１２】
次に、図７６の中間処理結果テーブルの枠番号６−１−１の「処理指示」の欄に「終了」と記入されているので、図７２の枠番号６−１−１の枠から抽出されたパターンについては、処理を行わない。
【０５１３】
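Next, since "end" is entered in the "processing instruction" column for frame number 6-2-2 in the intermediate processing result table of FIG. 76, no processing is performed on the pattern extracted from the frame with frame number 6-2-2 in FIG. 72.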
【０５１４】
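The above processing produces the intermediate processing result table of FIG. 78. Since "end" is entered in the "processing instruction" column for every processing target in the table of FIG. 78, all processing ends in accordance with processing execution rule (C6).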
【０５１５】
以上説明したように、本発明の実施例によれば、文字認識部１２及び非文字認識部２５では、環境認識系１１で認識された入力画像の状態を処理するために適合した認識処理を行う。
【０５１６】
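For example, when the environment recognition system 11 extracts a character touching a ruled line, the contact character recognition unit 13, which is dedicated to recognizing characters touching ruled lines, is used. When it extracts a free-pitch character string, the character string recognition unit 15, which is dedicated to recognizing free-pitch character strings, is used. When it extracts a blurred character, the blurred character recognition unit 19, which is dedicated to recognizing blurred characters, is used. When it extracts a crushed character, the crushed character recognition unit 21, which is dedicated to recognizing crushed characters, is used. When it extracts a non-character, the non-character recognition unit 25, which is dedicated to recognizing non-characters, is used.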
【０５１７】
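In addition, the reliability of the recognition result of the character recognition unit 12 or the non-character recognition unit 25 is calculated. For characters and non-characters with low reliability, feedback is exchanged among the environment recognition system 11, the character recognition unit 12, and the non-character recognition unit 25 so that other processes are retried, and the overall processing ends when the reliability becomes high or when no executable process remains.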
【０５１８】
このように、本発明の実施例によれば、文字が書かれている環境に応じて、文字を認識する際に使用する特徴及び識別法をアダプティブに変化させて認識処理を実行することができるので、文書や帳票の様々な環境に対応した高精度な文字認識が可能となる。
【０５１９】
また、文字コードのみを認識結果として出力するだけでなく、環境認識系１１による環境認識結果を文字認識結果と同時に出力することができるとともに、環境認識結果と文字認識結果とが相互に一致した時に文字認識結果を出力することが可能となり、文字認識結果に対する確認機能及び信頼性を向上させることができる。
【０５２０】
さらに、非文字認識部２５を専用に設け、非文字認識を文字認識と独立して行うことができるので、文字認識及び非文字認識の信頼性を向上させることができる。
【０５２１】
さらにまた、各文字が書かれている環境に応じた独立な認識処理を行うことができるので、各認識処理における辞書や知識を増加させることにより、認識信頼度を向上させることができる。
【０５２２】
【発明の効果】
以上説明したように、本発明によれば、処理対象の状態を入力画像から抽出し、その状態に適した認識処理を処理対象ごとに選択することにより、様々な状態を有する入力画像に対し、それぞれの状態に適したパターン認識処理を行うことができ、認識処理を精度よく行うことが可能となる。また、処理対象の評価が、その状態を抽出する時と、その処理対象についての認識処理を行う時の両方で行われるので、認識処理の精度をより一層向上させることができる。
【０５２３】
また、本発明の一態様によれば、処理対象の状態を入力画像から抽出し、第１の状態を有する処理対象に対しては、第１の状態専用のパターン認識処理を行い、第２の状態を有する処理対象に対しては、第２の状態専用のパターン認識処理を行うことにより、第１の状態を有する処理対象の認識処理と第２の状態を有する処理対象の認識処理とが互いに相互作用を及ぼすことがなくなり、認識処理を精度よく行うことが可能となる。
【０５２４】
また、本発明の一態様によれば、様々な状態を有する入力画像に対し、認識辞書を使い分けることにより、それぞれの状態に対して最適な認識辞書を使用することができ、認識処理の精度を向上させることが可能となる。
【０５２５】
また、本発明の一態様によれば、様々な状態を有する入力画像に対し、識別関数を使い分けることにより、それぞれの状態に対して最適な識別関数を使用しながら認識処理を行うことができ、認識処理の精度を向上させることが可能となる。
【０５２６】
また、本発明の一態様によれば、様々な状態を有する入力画像に対し、知識を使い分けることにより、それぞれの状態に対して最適な知識を使用しながら認識処理を行うことができ、認識処理の精度を向上させることが可能となる。
【０５２７】
また、本発明の一態様によれば、認識処理による信頼度が所定の値以上となるまで、同一の処理対象に対して複数の認識処理を行うようにすることにより、認識処理の信頼度を上げることができ、認識処理の精度を向上させることができる。
【０５２８】
また、本発明の一態様によれば、非文字についての認識処理と文字についての認識処理とを別々に行うようにすることにより、文字を非文字とみなしたり、非文字を文字とみなしたりして認識処理が行われることが減少し、認識処理を精度よく行うことが可能となる。
【図面の簡単な説明】
【図１】本発明の一実施例によるパターン認識装置の機能的な構成を示すブロック図である。
【図２】図１の環境認識手段のより具体的な構成の一実施例を示すブロック図である。
【図３】図１のパターン認識装置のより具体的な構成の一実施例を示すブロック図である。
【図４】図３の環境認識系の全体的な動作の一実施例を示すフローチャートである。
【図５】図４の前処理部の動作の一実施例を示すフローチャートである。
【図６】図４のレイアウト解析部の動作の一実施例を示すフローチャートである。
【図７】図４の品質解析部の動作の一実施例を示すフローチャートである。
【図８】図４の訂正解析部の動作の一実施例を示すフローチャートである。
【図９】図４の文字認識／非文字認識への制御部の動作の一実施例を示すフローチャートである。
【図１０】本発明の一実施例によるパターン認識装置のシステム構成を示すブロック図である。
【図１１】本発明の一実施例によるパターン認識装置のより具体的なシステム構成を示すブロック図である。
【図１２】本発明の一実施例によるパターン認識装置のラベリング処理の一例を示す図である。
【図１３】本発明の一実施例によるパターン認識装置のラベリング処理の圧縮表現を示す図である。
【図１４】本発明の一実施例によるパターン認識装置のテキスト抽出処理の一例を示す図である。
【図１５】本発明の一実施例によるパターン認識装置のテキスト抽出処理における部分領域の一例を示す図である。
【図１６】本発明の一実施例によるパターン認識装置の罫線抽出処理における隣接投影法を説明する図である。
【図１７】本発明の一実施例によるパターン認識装置の罫線抽出処理におけるパターンの投影結果を示す図である。
【図１８】本発明の一実施例によるパターン認識装置の罫線抽出処理を示すフローチャートである。
【図１９】本発明の一実施例によるパターン認識装置の罫線抽出処理を示す図である。
【図２０】本発明の一実施例によるパターン認識装置の罫線抽出処理におけるかすれ罫線の補完方法を説明する図である。
【図２１】本発明の一実施例によるパターン認識装置のかすれ罫線の補完方法を示すフローチャートである。
【図２２】本発明の一実施例によるパターン認識装置のかすれ罫線の補完の際の探索方向を示す図である。
【図２３】本発明の一実施例によるパターン認識装置の一文字枠抽出処理を示すフローチャートである。
【図２４】本発明の一実施例によるパターン認識装置のブロック枠抽出処理を示すフローチャートである。
【図２５】本発明の一実施例によるパターン認識装置の枠及び表の種類を示す図である。
【図２６】本発明の一実施例によるパターン認識装置の画像縮小処理を示すフローチャートである。
【図２７】本発明の一実施例によるパターン認識装置の枠接触有無判断処理を説明する図である。
【図２８】本発明の一実施例によるパターン認識装置の枠接触有無判断処理を示すフローチャートである。
【図２９】本発明の一実施例によるパターン認識装置の消し線の種類を示す図である。
【図３０】本発明の一実施例によるパターン認識装置の訂正文字の特徴量の算出方法を説明する図である。
【図３１】図３の基本文字認識部の構成例を示すブロック図である。
【図３２】図３の基本文字認識部における特徴ベクトルの算出方法の一例を示す図である。
【図３３】図３の基本文字認識部における特徴ベクトル間の距離の算出方法の一例を示す図である。
【図３４】図３の基本文字認識部における詳細識別法の文字セグメントの抽出方法を説明する図である。
【図３５】図３の基本文字認識部における詳細識別法の端点の検出方法を説明する図である。
【図３６】図３の基本文字認識部における詳細識別法の角度変化の検出方法を説明する図である。
【図３７】図３の基本文字認識部における詳細識別法の文字セグメントの対応関係を説明する図である。
【図３８】図３の基本文字認識部における詳細識別法の処理を示すフローチャートである。
【図３９】図３の接触文字認識部における文字補完の方法を示す図である。
【図４０】図３の接触文字認識部における再補完の方法を示す図である。
【図４１】図３の接触文字認識部における補完誤読文字の例を示す図である。
【図４２】図３の接触文字認識部における文字の学習方法の一例を示すブロック図である。
【図４３】図３の接触文字認識部における枠接触文字の生成方法を説明する図である。
【図４４】図３の接触文字認識部における枠接触文字の生成例を示す図である。
【図４５】図３の接触文字認識部における知識テーブルの一例を示す図である。
【図４６】図３の接触文字認識部における知識テーブルに登録される変動種類及び変動量の一例を示す図である。
【図４７】図３の接触文字認識部の領域強調による再認識領域の一例を示す図である。
【図４８】図３の接触文字認識部の領域強調による再認識方法を説明する図である。
【図４９】図３の接触文字認識部の領域強調による再認識処理を示すフローチャートである。
【図５０】図３の接触文字認識部における文字の再認識方法の一例を示すブロック図である。
【図５１】図３の接触文字認識部における文字の再認識処理を示すフローチャートである。
【図５２】図３の文字列認識部の統計的処理によるパラメータの図形的意味を説明する図である。
【図５３】図３の文字列認識部の統計的処理を示すフローチャートである。
【図５４】図３の文字列認識部の分離文字処理によるパラメータの図形的意味を説明する図である。
【図５５】図３の文字列認識部の分離文字処理を示すフローチャートである。
【図５６】図３の文字列認識部の濁点処理によるパラメータの図形的意味を説明する図である。
【図５７】図３の文字列認識部の濁点処理を示すフローチャートである。
【図５８】図３の文字列認識部の文字切り出し成否データの算出処理を示すフローチャートである。
【図５９】図３の文字列認識部の文字切り出し信頼度の定量化方法を示す図である。
【図６０】図３の文字列認識部の度数分布の生成方法を示す図である。
【図６１】図３の文字列認識部の文字切り出し信頼度の算出方法を示すフローチャートである。
【図６２】図３の文字列認識部における文字の切り出し成功及び切り出し失敗のヒストグラム分布の一例を示す図である。
【図６３】図３の文字列認識部における文字の切り出し成功及び切り出し失敗の２群の重なり領域算出法を示す図である。
【図６４】図３の文字列認識部における文字の切り出し処理の流れを示す図である。
【図６５】図３の文字列認識部の非統計的処理における文字の切り出し処理の流れを示す図である。
【図６６】図３のかすれ文字認識部の構成例を示すブロック図である。
【図６７】図３の消し線認識部の処理の一例を示す図である。
【図６８】図３のくせ字解析部によるクラスタリング処理の流れを示す図である。
【図６９】図３のくせ字解析部によるクラスタリング処理を示すフローチャートである。
【図７０】図３のくせ字解析部による文字カテゴリ判定結果修正処理の流れを示す図である。
【図７１】図３のくせ字解析部による文字カテゴリ判定結果修正処理を示すフローチャートである。
【図７２】本発明の一実施例によるパターン認識装置の処理対象となる帳票の例を示す図である。
【図７３】本発明の一実施例によるパターン認識装置の中間処理結果テーブルの一例を示す図である。
【図７４】本発明の一実施例によるパターン認識装置の処理順序テーブルの一例を示す図である。
【図７５】本発明の一実施例によるパターン認識装置の中間処理結果テーブルの一例を示す図である。
【図７６】本発明の一実施例によるパターン認識装置の中間処理結果テーブルの一例を示す図である。
【図７７】本発明の一実施例によるパターン認識装置の中間処理結果テーブルの一例を示す図である。
【図７８】本発明の一実施例によるパターン認識装置の中間処理結果テーブルの一例を示す図である。
【図７９】従来のパターン認識装置の構成を示すブロック図である。
【符号の説明】
１環境認識手段
２第１のパターン認識手段
４第２のパターン認識手段
６第Ｎのパターン認識手段
３、５、７信頼度算出手段
１ａ状態抽出手段
１ｂ認識処理制御手段
１ｃ中間処理結果テーブル作成手段
１ｄ処理順序制御ルール格納手段
１ｅ処理実行ルール格納手段
１ｆ処理順序テーブル
１１環境認識系
１２文字認識部
１３接触文字認識部
１５文字列認識部
１７基本文字認識部
１９かすれ文字認識部
２１つぶれ文字認識部
２３くせ字解析部
２５非文字認識部
２６消し線認識部
２８雑音認識部
１４、１６、１８、２０、２２、２４、２７、２９知識テーブル
３０環境認識系
３１レイアウト解析部
３２訂正解析部
３３文字認識系／非文字認識系
３４基本文字認識部
３５黒枠接触文字認識部
３６フリーピッチ文字列認識部
３７消し線認識部
３８環境認識系
３９くせ字解析部
４０終了判定処理部
４１画像格納部
４２処理条件格納部
４３ラベル画像格納部
４４中間処理結果テーブル
５０プログラムメモリ
５１中央演算処理ユニット
５２画像メモリ
５３ワークメモリ
５４バス
５５インターフェイス回路
５６ディスプレイ
５７プリンタ
５８メモリ
５９スキャナ
６０辞書ファイル
[0001]
[Technical field to which the invention belongs]
The present invention relates to a pattern recognition apparatus and a pattern recognition method, and in particular is intended to recognize characters, figures, and symbols accurately according to the various states of an input image, not only in handwritten character recognition apparatuses but also in printed character recognition apparatuses and drawing recognition apparatuses.
[0002]
[Prior art]
Handwritten character recognition apparatuses such as OCR (Optical Character Reader) devices automatically read the characters written on accounting forms and the like and enter them automatically, saving the effort of finding the characters on the form by hand and keying them in.
[0003]
FIG. 79 is a block diagram showing a configuration of a conventional handwritten character recognition apparatus.
In FIG. 79, a form / document 311 is read by a scanner, and a multi-value image of the form / document 311 is obtained.
[0004]
Next, the pre-processing unit 312 performs binarization of the multi-valued image, noise removal, and form / document 311 inclination correction.
Next, the character cutout unit 313 cuts out characters one by one by using predefined ruled line information and character position information.
[0005]
Next, the character recognition unit 314 performs character recognition for each character and outputs a character code. This character recognition is performed by matching each feature of the unknown character pattern cut out by the character cutout unit 313, one at a time, against the features of the individual character categories registered in advance in the recognition dictionary 315.
[0006]
For example, a two-dimensional character pattern is converted into a feature vector in a feature space that represents the features of characters, and the distance between feature vectors in the feature space is calculated as the similarity between the unknown character pattern and each character category registered in advance in the recognition dictionary 315. The character category whose feature vector is closest to that of the unknown character pattern is then recognized as the category corresponding to the unknown character pattern.
[0007]
Here, a threshold is set on the distance between the two feature vectors in order to prevent non-characters such as strikethroughs, noise, and patterns from being erroneously recognized as characters and a character code from being output for a non-character. If the distance between the two feature vectors is equal to or greater than this threshold, either it is deemed impossible to tell which of the character categories registered in advance in the recognition dictionary 315 the unknown character pattern corresponds to, or the pattern is judged to be a non-character, and a reject code is output.
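The nearest-category matching with a rejection threshold described in paragraphs [0006] and [0007] can be sketched as follows. This is a minimal illustration under stated assumptions, not the conventional apparatus's implementation: the Euclidean metric, the two-dimensional feature vectors, the threshold value, and the names `classify` and `REJECT` are all introduced here for illustration.

```python
import math

REJECT = "REJECT"  # hypothetical reject code


def classify(feature, dictionary, threshold):
    """Return the category whose registered feature vector is nearest to
    `feature`, or REJECT when even the nearest one is at or beyond `threshold`."""
    best_category, best_distance = None, math.inf
    for category, registered in dictionary.items():
        distance = math.dist(feature, registered)  # Euclidean distance (assumed metric)
        if distance < best_distance:
            best_category, best_distance = category, distance
    if best_distance >= threshold:
        return REJECT
    return best_category


# Toy recognition dictionary: one feature vector per character category.
dictionary = {"0": (0.0, 0.0), "1": (1.0, 0.0), "7": (1.0, 1.0)}
```

A pattern close to a registered category is assigned to it, while a pattern far from every category (for example, noise) is rejected.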
[0008]
In addition, recognition dictionaries 315 in which the features of the character categories of high-quality characters, blurred characters, and collapsed characters are respectively registered are prepared. The recognition dictionary 315 for high-quality characters is used for high-quality characters, the recognition dictionary 315 for blurred characters is used for blurred characters, and the recognition dictionary 315 for collapsed characters is used for collapsed characters, so that differences in the quality of the characters on the form / document 311 can be handled.
[0009]
[Problems to be solved by the invention]
However, the conventional handwritten character recognition device processes each cut-out character uniformly using the same recognition dictionary 315, regardless of whether the character is blurred, collapsed, or of high quality.
[0010]
For this reason, the information on blurred characters registered in the recognition dictionary 315 adversely affects the recognition of high-quality characters, and there was a problem that high-quality characters could no longer be read because blurred characters were registered in the recognition dictionary 315.
[0011]
Moreover, characters are written in a variety of environments, including not only blurring and collapse but also contact with ruled lines. When a uniform recognition dictionary 315 is used to cope with these various environments, the environments interfere with one another, and there was a problem that a substantial improvement in the accuracy of the recognition process could not be expected.
[0012]
  Accordingly, an object of the present invention is to provide a pattern recognition device capable of accurately performing recognition processing appropriate to the environment in which characters are written.
[0013]
[Means for Solving the Problems]
  In order to solve the above-described problems, the pattern recognition apparatus according to the present invention comprises: layout analysis means for analyzing whether a processing target of an input image includes a frame contact character, that is, a character touching a frame; quality analysis means for analyzing whether the processing target of the input image includes a blurred or collapsed character; correction analysis means for analyzing whether the processing target of the input image includes a correction character corrected by a strikethrough; first pattern recognition means that has a first knowledge table storing knowledge about a method for recognizing frame contact characters and performs pattern recognition processing of the frame contact character included in the processing target based on the knowledge stored in the first knowledge table; second pattern recognition means that has a second knowledge table storing knowledge about a method for recognizing blurred or collapsed characters and performs pattern recognition processing of the blurred or collapsed character included in the processing target based on the knowledge stored in the second knowledge table; third pattern recognition means that has a third knowledge table storing knowledge about a method for recognizing correction characters and performs pattern recognition processing of the correction character included in the processing target based on the knowledge stored in the third knowledge table; fourth pattern recognition means that has a fourth knowledge table storing knowledge about a method for recognizing basic characters or character strings and performs pattern recognition of the basic character or character string included in the processing target based on the knowledge stored in the fourth knowledge table; and recognition processing control means.
  The recognition processing control means causes the first pattern recognition means to perform recognition processing when the layout analysis means analyzes that the processing target includes a frame contact character, causes the second pattern recognition means to perform recognition processing when the quality analysis means analyzes that the processing target includes a blurred or collapsed character, causes the third pattern recognition means to perform recognition processing when the correction analysis means analyzes that the processing target includes a correction character, and causes the fourth pattern recognition means to perform recognition processing when the processing target includes only basic characters or character strings.
  The recognition processing control means comprises: processing order storage means for storing a processing order indicating the order in which the recognition processes by the first to fourth pattern recognition means are executed when the same processing target includes a plurality of the states of basic character or character string, frame contact character, blurred or collapsed character, and correction character; processing order control rule storage means for storing a calling procedure indicating which of the first to fourth pattern recognition means is to be called according to the results of the analyses by the layout analysis means, the quality analysis means, and the correction analysis means; and intermediate processing result table creation means for creating, when the processing target is analyzed as including a plurality of those states, an intermediate processing result table in which the execution order of the recognition processes to be performed on the processing target by the pattern recognition means is entered, based on the calling procedure corresponding to the plurality of states stored in the processing order control rule storage means and the processing order corresponding to the plurality of states stored in the processing order storage means. The recognition processing control means causes the pattern recognition means to perform recognition processing based on the execution order entered in the intermediate processing result table created by the intermediate processing result table creation means.
  According to one aspect of the present invention, pattern recognition is performed by extracting a processing target state from an input image and selecting a recognition process suitable for the state for each processing target.
[0014]
As a result, pattern recognition processing suitable for each state can be performed on an input image having various states, and the recognition processing can be performed with high accuracy.
[0015]
In addition, according to one aspect of the present invention, the state of the processing target is extracted from the input image, the processing target having the first state is subjected to the pattern recognition processing dedicated to the first state, and the second state For a processing target having a state, a pattern recognition process dedicated to the second state is performed.
[0016]
As a result, the recognition processing for the processing target having the first state and the recognition processing for the processing target having the second state do not interact with each other, and the recognition processing can be performed with high accuracy.
[0017]
In addition, according to one aspect of the present invention, the recognition dictionary is used properly for input images having various states.
For this reason, even when blurred characters, collapsed characters, and high-quality characters are mixed in the input image, for example, recognition processing can be performed using a recognition dictionary suited to blurred characters for the blurred characters, a recognition dictionary suited to collapsed characters for the collapsed characters, and a recognition dictionary suited to high-quality characters for the high-quality characters, so that the recognition process can be performed with high accuracy.
[0018]
In addition, according to one aspect of the present invention, the discriminant function is selectively used for input images having various states.
Thus, for example, character recognition can be performed using a city block distance for a character written in a one character frame, while for characters written in a free pitch frame character recognition can be performed using a discriminant function that takes the character cutout reliability into account, so that the recognition process can be performed with high accuracy.
[0019]
Further, according to one aspect of the present invention, knowledge is selectively used for input images having various states.
Thus, for example, when the deformation of an unknown character is so large that no correspondence with the character categories stored in the recognition dictionary can be established, the correspondence between the unknown character and a character category can be established by dividing the character into character segments; when a character is cut out from a character string, the cutout reliability can be calculated using a discriminant function generated from learning patterns; and when character recognition is performed on a frame contact character, the recognition reliability of the frame contact character can be evaluated using a reliability obtained from learning patterns. The recognition process can therefore be performed with high accuracy.
[0020]
In addition, according to one aspect of the present invention, when a plurality of recognition processes are called for the same processing target, the recognition processes are performed in order of priority until the reliability obtained by a recognition process is equal to or higher than a predetermined value.
[0021]
As a result, the reliability of the recognition process can be increased, and the accuracy of the recognition process can be improved.
Further, according to one aspect of the present invention, non-characters are extracted from the input image, and recognition processing for the non-characters is performed separately from recognition processing for the characters.
[0022]
As a result, cases in which recognition processing is performed with a character mistaken for a non-character, or a non-character mistaken for a character, are reduced, and the recognition process can be performed with high accuracy.
[0023]
[Detailed Description of the Invention]
Hereinafter, a pattern recognition apparatus according to an embodiment of the present invention will be described with reference to the drawings.
[0024]
FIG. 1 is a block diagram showing a functional configuration of a pattern recognition apparatus according to an embodiment of the present invention.
In FIG. 1, the environment recognition means 1 extracts first to Nth states from an input image. The states extracted from the input image include, for example, the form in which characters are written (a one character frame, a free pitch frame, a table, and so on), the state of contact between a character and a frame, the blurred state of a character, the collapsed state of a character, and the state in which a character has been erased with a strikethrough.
[0025]
The first pattern recognition means 2 performs pattern recognition processing dedicated to processing targets having the first state, the second pattern recognition means 4 performs pattern recognition processing dedicated to processing targets having the second state, and the Nth pattern recognition means 6 performs pattern recognition processing dedicated to processing targets having the Nth state.
[0026]
Here, the first to Nth pattern recognition means 2, 4, and 6 include reliability calculation means 3, 5, and 7, respectively, which calculate the reliability of the recognition results produced by the first to Nth pattern recognition means 2, 4, and 6.
[0027]
The environment recognition means 1 then calls, from among the first to Nth pattern recognition means 2, 4, and 6, those corresponding to the extracted first to Nth states, and causes them to perform recognition processing.
For example, when the environment recognition means 1 extracts the first state from the input image, the pattern recognition processing by the first pattern recognition means 2 is called for the processing target in the first state; when the second state is extracted from the input image, the pattern recognition processing by the second pattern recognition means 4 is called for the processing target in the second state; and when the Nth state is extracted from the input image, the pattern recognition processing by the Nth pattern recognition means 6 is called for the processing target in the Nth state.
[0028]
Further, when the environment recognition unit 1 extracts, for example, the first state and the second state for the same processing target, the pattern recognition process by the first pattern recognition unit 2 and the second pattern recognition unit 4 is called for the same processing target.
[0029]
For example, assume that the first state is a state in which characters are written in a one character frame, the second state is a state in which a character string is written in a free pitch frame, the third state is a state in which a character and a frame are in contact, the fourth state is a state in which a character is blurred, the fifth state is a state in which a character is collapsed, and the sixth state is a state in which a character has been corrected with a strikethrough. In this case, the first pattern recognition means 2 performs recognition processing for characters written in a one character frame, the second pattern recognition means 4 performs recognition processing for character strings written in a free pitch frame, the third pattern recognition means performs recognition processing for frame contact characters, the fourth pattern recognition means performs recognition processing for blurred characters, the fifth pattern recognition means performs recognition processing for collapsed characters, and the sixth pattern recognition means performs recognition processing for correction characters.
[0030]
When the environment recognition means 1 extracts a one character frame from the input image, it causes the first pattern recognition means 2 to execute recognition processing on the character written in that one character frame. When it extracts a free pitch frame from the input image, it causes the second pattern recognition means 4 to execute recognition processing on the characters written in that free pitch frame. When it extracts a frame contact character from the input image, it causes the third pattern recognition means to execute recognition processing on that frame contact character. When it extracts a blurred character from the input image, it causes the fourth pattern recognition means to execute recognition processing on that blurred character. When it extracts a collapsed character from the input image, it causes the fifth pattern recognition means to execute recognition processing on that collapsed character. When it extracts a correction character candidate from the input image, it causes the sixth pattern recognition means to execute recognition processing on that candidate.
[0031]
For example, when the environment recognition means 1 extracts from the input image a frame contact character touching a free pitch frame, it causes the second pattern recognition means 4 and the third pattern recognition means to execute recognition processing on that frame contact character. When it extracts from the input image a frame contact character with a strikethrough touching a free pitch frame, it causes the second pattern recognition means 4, the third pattern recognition means, and the sixth pattern recognition means to execute recognition processing on that frame contact character.
[0032]
Here, when a plurality of states are extracted from the input image for the same processing target and a plurality of the pattern recognition means 2, 4, 6 are called in response, the order of the recognition processes by those pattern recognition means 2, 4, 6 is determined based on a processing order table that stores the order in which the plurality of pattern recognition means 2, 4, 6 are to be called. The recognition processes by the plurality of pattern recognition means 2, 4, 6 are then executed sequentially in that calling order until the reliability calculation means 3, 5, 7 obtain a reliability equal to or higher than a predetermined threshold.
[0033]
For example, when the environment recognition means 1 extracts from the input image a frame contact character touching a free pitch frame, the recognition processing by the third pattern recognition means is executed on that frame contact character first, and the recognition processing by the second pattern recognition means 4 is executed next. When a frame contact character with a strikethrough touching a free pitch frame is extracted from the input image, the recognition processing by the third pattern recognition means is executed on that frame contact character first, the recognition processing by the sixth pattern recognition means is executed next, and the recognition processing by the second pattern recognition means 4 is executed after that.
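The ordered execution described in paragraphs [0032] and [0033] can be sketched as follows. This is only an illustration under assumptions: the recognizers are stub functions returning a (result, reliability) pair, the recognizer numbers follow the six-state example above, and the contents of `processing_order`, the threshold, and the function name `recognize_in_order` are hypothetical.

```python
def recognize_in_order(target, called, processing_order, recognizers, threshold):
    """Run the called recognizers in processing-order-table order and stop as
    soon as one of them reports a reliability at or above `threshold`.
    Returns the list of (recognizer id, result, reliability) attempts made."""
    attempts = []
    for recognizer_id in processing_order:
        if recognizer_id not in called:
            continue  # this recognizer was not called for the target
        result, reliability = recognizers[recognizer_id](target)
        attempts.append((recognizer_id, result, reliability))
        if reliability >= threshold:
            break
    return attempts


# Hypothetical processing order table: frame contact (3) before strikethrough (6)
# before free pitch character string (2), matching the example in the text.
processing_order = [3, 4, 5, 6, 1, 2]

# Stub recognizers with fixed reliabilities (illustrative values only).
recognizers = {
    2: lambda target: ("ABC", 0.9),
    3: lambda target: ("?", 0.4),
    6: lambda target: ("?", 0.2),
}
```

With these stubs, a target in states 2 and 3 is processed by recognizer 3 and then 2, and a target in states 2, 3, and 6 is processed by 3, 6, and then 2.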
[0034]
FIG. 2 is a block diagram showing a configuration of an embodiment of the environment recognition means 1 of FIG. 1.
In FIG. 2, the state extraction means 1a extracts the first to Nth states from the input image.
[0035]
The recognition processing control means 1b calls one or more of the first to Nth pattern recognition means 2, 4, and 6 of FIG. 1, corresponding to the first to Nth states extracted by the state extraction means 1a, and causes them to perform recognition processing.
[0036]
The processing order table 1f stores a processing order indicating the order in which the first to Nth pattern recognition means 2, 4, and 6 are to be executed when a plurality of them are called.
[0037]
The processing order control rule storage means 1d stores a calling procedure indicating which of the first to Nth pattern recognition means 2, 4, and 6 is to be called based on the first to Nth states extracted by the state extraction means 1a.
[0038]
The intermediate processing result table creating means 1c creates an intermediate processing result table indicating the execution order of the first to Nth pattern recognition means 2, 4, and 6, based on the calling procedure stored in the processing order control rule storage means 1d and the processing order stored in the processing order table 1f.
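As a rough sketch of how the calling procedure and the processing order might be combined into an intermediate processing result table, consider the following. The state names, process names, table layout (`TableEntry`), and the function `create_intermediate_table` are all hypothetical; the embodiment's actual table format is not reproduced here.

```python
from dataclasses import dataclass, field

# Hypothetical calling procedure: extracted state -> recognition process to call.
CALLING_PROCEDURE = {
    "free_pitch_frame": "character_string_recognition",
    "frame_contact": "contact_character_recognition",
    "strikethrough": "strikethrough_recognition",
}

# Hypothetical processing order table: earlier entries are executed first.
PROCESSING_ORDER = [
    "contact_character_recognition",
    "strikethrough_recognition",
    "character_string_recognition",
]


@dataclass
class TableEntry:
    region: str                       # processing target, e.g. one frame
    states: list                      # states extracted for that target
    execution_order: list = field(default_factory=list)


def create_intermediate_table(extracted):
    """Build one table entry per region: look up the process to call for each
    extracted state, then arrange the called processes in processing order."""
    table = []
    for region, states in extracted.items():
        called = {CALLING_PROCEDURE[state] for state in states}
        order = [process for process in PROCESSING_ORDER if process in called]
        table.append(TableEntry(region, list(states), order))
    return table
```

Keeping the calling procedure and the processing order as separate tables, as the text describes, lets either one be changed without touching the other.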
[0039]
The process execution rule storage unit 1e stores a procedure for instructing execution of the next process based on the execution result of the recognition process entered in the intermediate process result table.
FIG. 3 is a block diagram showing a specific configuration of a pattern recognition apparatus according to an embodiment of the present invention.
[0040]
In FIG. 3, the environment recognition system 11 extracts the state of the input image and, based on the extracted state, calls one or more of the basic character recognition unit 17, the character string recognition unit 15, the contact character recognition unit 13, the blurred character recognition unit 19, and the collapsed character recognition unit 21 of the character recognition unit 12, and the strikethrough line recognition unit 26 and the noise recognition unit 28 of the non-character recognition unit 25. Here, layout analysis, quality analysis, and correction analysis of the input image are performed in order to extract the state of the input image.
[0041]
The character recognition unit 12 performs character recognition processing for each state of the input image, and includes a basic character recognition unit 17 that performs character recognition for characters, a character string recognition unit 15 that performs character recognition B and character cutout B for character strings, a contact character recognition unit 13 that performs character recognition A and character cutout A for characters touching a frame, a blurred character recognition unit 19 that performs character recognition C and character cutout C for blurred characters, a collapsed character recognition unit 21 that performs character recognition D and character cutout D for collapsed characters, and a habit recognition unit 23 that performs character recognition E and character cutout E for characters written with handwriting habits.
[0042]
Further, the contact character recognition unit 13, the character string recognition unit 15, the basic character recognition unit 17, the blurred character recognition unit 19, the collapsed character recognition unit 21, and the habit recognition unit 23 are provided with knowledge tables 14, 16, 18, 20, 22, and 24, respectively, each storing knowledge about the corresponding character recognition method. The knowledge table 14 stores, for example, knowledge about the frame contact state and the recognition reliability and knowledge about the overlapping partial pattern method; the knowledge table 16 stores, for example, knowledge about the cutout reliability and knowledge about a method that combines cutout and recognition; and the knowledge table 18 stores, for example, knowledge about the detailed identification method.
[0043]
The non-character recognition unit 25 performs non-character recognition processing for each state of the input image, and includes a strikethrough line recognition unit 26 that performs non-character recognition F and non-character cutout F for strikethroughs, and a noise recognition unit 28 that performs non-character recognition G and non-character cutout G for noise.
[0044]
Further, the strikethrough line recognition unit 26 and the noise recognition unit 28 include knowledge tables 27 and 29 that store knowledge about the non-character recognition method, respectively.
FIG. 4 is a flowchart showing an example of the overall processing of the environment recognition system 11.
[0045]
In FIG. 4, first, as shown in step S1, preprocessing of the input image is performed. In the preprocessing of the input image, the input image binarized by a facsimile or a scanner is labeled, and the input image and the label image are stored. Note that the input image and the label image can be accessed at any time in the subsequent processing.
[0046]
FIG. 5 is a flowchart showing preprocessing of the input image of FIG.
In FIG. 5, as shown in step S11, the binarized input image is labeled so that connected patterns are extracted and labeled, and the resulting label image and the input image are stored. At this time, memory usage is reduced by representing the labeled connected patterns in compressed form using addition and subtraction of circumscribed rectangles. With this compressed representation, for example, an A4-size document / form (about 3000 × 4000 pixels) input by a high-resolution 400 dpi scanner can be represented in a few hundred kilobytes.
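The labeling step can be illustrated with a minimal connected-pattern labeling routine that also records each pattern's circumscribed rectangle. This sketch assumes 8-connectivity and a binary image given as a list of rows with 1 for black pixels; it does not implement the compressed representation by addition and subtraction of circumscribed rectangles, and the function name `label_connected_patterns` is hypothetical.

```python
from collections import deque


def label_connected_patterns(image):
    """Label 8-connected black-pixel patterns in a binary image (1 = black).
    Returns (labels, boxes) where labels is a label image (0 = background)
    and boxes[label] = (top, left, bottom, right) is the circumscribed rectangle."""
    height, width = len(image), len(image[0])
    labels = [[0] * width for _ in range(height)]
    boxes = {}
    next_label = 1
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or labels[y][x] != 0:
                continue
            # Breadth-first flood fill from an unlabeled black pixel.
            labels[y][x] = next_label
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny in range(cy - 1, cy + 2):
                    for nx in range(cx - 1, cx + 2):
                        inside = 0 <= ny < height and 0 <= nx < width
                        if inside and image[ny][nx] == 1 and labels[ny][nx] == 0:
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
            boxes[next_label] = (top, left, bottom, right)
            next_label += 1
    return labels, boxes
```

The rectangles returned here are the kind of size and position information that the subsequent layout analysis relies on.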
[0047]
Next, layout analysis is performed as shown in step S2 of FIG. 4. Based on the size and arrangement of the labeled connected patterns, this layout analysis performs text recognition, ruled line extraction, frame extraction, frame type and table determination, determination of the presence or absence of frame contact characters, and figure recognition.
[0048]
FIG. 6 is a flowchart showing the layout analysis of FIG.
In FIG. 6, first, as shown in step S21, text recognition is performed. In this text recognition, the sizes of the labeled connected patterns are analyzed, and connected patterns of relatively small size are extracted and regarded as character candidates. Text is then extracted by merging adjacent character candidates.
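A simplified sketch of this text recognition step is shown below. It assumes the connected patterns are given as circumscribed rectangles (top, left, bottom, right), treats rectangles no larger than `max_size` in both directions as character candidates, and merges candidates that are horizontally close and vertically overlapping. The thresholds, the single left-to-right pass (which only handles one horizontal text line at a time), and the name `extract_text` are assumptions made for illustration.

```python
def extract_text(boxes, max_size, max_gap):
    """Keep small rectangles as character candidates and merge horizontally
    adjacent ones into text rectangles. Boxes are (top, left, bottom, right)."""
    candidates = sorted(
        (b for b in boxes
         if b[2] - b[0] + 1 <= max_size and b[3] - b[1] + 1 <= max_size),
        key=lambda b: b[1],  # process from left to right
    )
    texts = []
    for top, left, bottom, right in candidates:
        if texts:
            t, l, b, r = texts[-1]
            overlaps_vertically = top <= b and bottom >= t
            if overlaps_vertically and left - r <= max_gap:
                # Extend the current text rectangle to include this candidate.
                texts[-1] = (min(t, top), l, max(b, bottom), max(r, right))
                continue
        texts.append((top, left, bottom, right))
    return texts
```

Large patterns such as frames are left out of the candidates, so they remain available for the ruled line extraction that follows.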
[0049]
Next, ruled line extraction is performed as shown in step S22. In this ruled line extraction, ruled lines are extracted by searching the connected patterns that were not recognized as text in step S21 for those having a large histogram value in the vertical or horizontal direction.
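The histogram-based search can be sketched as a projection of black pixels onto the rows and columns of a pattern. This is a simplified illustration: the rule that a row or column counts as a ruled line when at least `min_ratio` of its pixels are black, and the name `find_ruled_lines`, are assumptions rather than the embodiment's actual criterion.

```python
def find_ruled_lines(image, min_ratio=0.8):
    """Return (horizontal_rows, vertical_columns) whose black-pixel histogram
    value is at least `min_ratio` of the image width or height respectively.
    `image` is a list of rows with 1 for black pixels."""
    height, width = len(image), len(image[0])
    row_histogram = [sum(row) for row in image]
    column_histogram = [sum(image[y][x] for y in range(height)) for x in range(width)]
    horizontal = [y for y, count in enumerate(row_histogram) if count >= min_ratio * width]
    vertical = [x for x, count in enumerate(column_histogram) if count >= min_ratio * height]
    return horizontal, vertical
```

Rows and columns found this way would then be candidates for the four sides searched for in the frame extraction of step S23.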
[0050]
Next, as shown in step S23, frame extraction is performed. In this frame extraction, ruled lines corresponding to four sides are found from the ruled lines extracted in step S22, and a frame is extracted.
Next, as shown in step S24, frame type / table discrimination is performed. In the frame type / table discrimination, the type of the frame is discriminated and an attribute of the type of frame is given to the frame extracted in step S23. Examples of the frame type attribute include a single character frame, a block frame, a free pitch frame, and a table.
[0051]
Next, as shown in step S25, the presence or absence of frame contact characters is determined. This determination is made by searching the inside of the frame along the frame lines and detecting whether any pattern intersects them; if an intersecting pattern exists, it is judged that a character is in contact with the frame. Even when an intersecting pattern exists, however, it may be a character protruding from a frame adjacent to the frame of interest, and in that case the pattern is not regarded as a contact character for the frame of interest.
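One way to picture the search along the frame lines is the sketch below, which scans the pixels immediately inside each frame line and reports contact when a black pixel is found there. It assumes one-pixel-wide frame lines at the given coordinates and does not handle the exclusion of characters protruding from an adjacent frame; the function name `has_frame_contact` is hypothetical.

```python
def has_frame_contact(image, frame):
    """Return True if a black pixel lies immediately inside any frame line.
    `frame` = (top, left, bottom, right) gives the coordinates of the frame
    lines themselves; `image` is a list of rows with 1 for black pixels."""
    top, left, bottom, right = frame
    # Scan just below the top line and just above the bottom line.
    for x in range(left + 1, right):
        if image[top + 1][x] == 1 or image[bottom - 1][x] == 1:
            return True
    # Scan just right of the left line and just left of the right line.
    for y in range(top + 1, bottom):
        if image[y][left + 1] == 1 or image[y][right - 1] == 1:
            return True
    return False
```

A fuller version would also have to check whether the intersecting pattern continues into the neighbouring frame before treating it as a contact character.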
[0052]
Next, as shown in step S26, figure recognition is performed. In this figure recognition, a figure attribute is assigned to connected patterns of relatively large size that have not been given an attribute such as text, frame, or table.
[0053]
Next, quality analysis is performed as shown in step S3 of FIG. This quality analysis detects whether the input image is blurred or crushed, and includes a global quality analysis and a local quality analysis.
[0054]
In this quality analysis, for a predetermined area, when the value of (the number of connected areas whose area and vertical and horizontal lengths are each equal to or less than a predetermined threshold) / (the number of all connected areas in the predetermined area) is larger than a predetermined value, it is determined that the area is blurred.
[0055]
Further, using the information obtained when partially blurred ruled lines are joined during ruled line extraction, when, for a predetermined area, the value of (the total length of the portions filled in when the blurred ruled lines are complemented) / (the total length of the ruled lines) is larger than a predetermined value, it is determined that the area is blurred.
[0056]
Further, for a predetermined area, when the value of (the number of connected areas whose black pixel density is larger than a predetermined threshold) / (the number of all connected areas in the predetermined area) is larger than a predetermined value, it is determined that the area is collapsed.
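The connected-area ratio tests for blurring and collapse in paragraphs [0054] and [0056] can be written out as follows. This is a sketch under assumptions: each connected area is summarized by the width and height of its circumscribed rectangle and its black pixel count, "area" is taken to be the rectangle's area, and all threshold values and names (`Region`, `is_blurred`, `is_collapsed`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Region:
    width: int          # horizontal length of the circumscribed rectangle
    height: int         # vertical length of the circumscribed rectangle
    black_pixels: int   # number of black pixels in the connected area

    @property
    def area(self):
        return self.width * self.height  # rectangle area (assumed definition)

    @property
    def density(self):
        return self.black_pixels / self.area


def is_blurred(regions, max_area, max_side, ratio_threshold):
    """Blurred if the share of small connected areas exceeds `ratio_threshold`."""
    if not regions:
        return False
    small = sum(1 for r in regions
                if r.area <= max_area and r.width <= max_side and r.height <= max_side)
    return small / len(regions) > ratio_threshold


def is_collapsed(regions, density_threshold, ratio_threshold):
    """Collapsed if the share of dense connected areas exceeds `ratio_threshold`."""
    if not regions:
        return False
    dense = sum(1 for r in regions if r.density > density_threshold)
    return dense / len(regions) > ratio_threshold
```

The intuition is that blurring breaks strokes into many small fragments, whereas collapse fills in strokes and raises the black pixel density.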
[0057]
FIG. 7 is a flowchart showing the quality analysis of FIG.
In FIG. 7, first, as shown in step S31, a global quality analysis is performed. This global quality analysis is performed on the entire document / form, and analyzes whether the threshold used to binarize the input image was appropriate, whether the quality is poor because noise has been added over the whole of a document / form sent by facsimile, and whether blurring or collapse has occurred.
[0058]
Next, as shown in step S32, local quality analysis is performed. This local quality analysis examines, for each area to which layout analysis has assigned an attribute such as one character frame, text, free pitch frame, or table, whether blurring or collapse has occurred and whether noise is present.
[0059]
Next, correction analysis is performed as shown in step S4 of FIG. In this correction analysis, a strike line is extracted from the input image, and a character recognition process can be omitted for a character corrected by the strike line.
[0060]
FIG. 8 is a flowchart showing the correction analysis of FIG.
In FIG. 8, first, as shown in step S41, correction feature extraction is performed. This correction feature extraction extracts features that are effective for identifying correction characters. Correction characters are roughly classified into four types: collapsed characters, characters erased with a double line, characters erased with a diagonal line, and characters erased with a cross mark (×). The features of each type of correction character are extracted by calculating the black pixel density, line density, Euler number, histogram values, and the like.
[0061]
Next, as shown in step S42, correction character candidate extraction is performed. In this correction character candidate extraction, a correction character candidate is extracted from a difference in distribution between a corrected character and an uncorrected normal character in a feature space representing the characteristics of the corrected character.
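Two of the features named above, black pixel density and line density, can be computed as in the sketch below. The definitions used here are assumptions: black pixel density is the fraction of black pixels in the pattern, and line density is the number of white-to-black transitions along one scan line. The Euler number and histogram features, and the feature-space comparison used to pick candidates, are not shown.

```python
def black_pixel_density(image):
    """Fraction of black pixels (1 = black) in a binary pattern."""
    total = len(image) * len(image[0])
    return sum(sum(row) for row in image) / total


def line_density(scan_line):
    """Number of white-to-black transitions along one scan line, i.e. how many
    separate strokes the line crosses (assumed definition)."""
    count, previous = 0, 0
    for pixel in scan_line:
        if pixel == 1 and previous == 0:
            count += 1
        previous = pixel
    return count


def max_horizontal_line_density(image):
    """Largest line density over all horizontal scan lines of the pattern."""
    return max(line_density(row) for row in image)
```

A character overwritten with extra strokes would tend to show higher values of both features than an uncorrected character of similar size, which is the kind of distribution difference the candidate extraction relies on.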
[0062]
Next, as shown in step S5 of FIG. 4, character recognition / non-character recognition control is performed. Based on the state of the input image extracted in steps S2 to S4 of FIG. 4, this control determines which of the basic character recognition unit 17, the character string recognition unit 15, the contact character recognition unit 13, the blurred character recognition unit 19, and the collapsed character recognition unit 21 of the character recognition unit 12, and the strikethrough line recognition unit 26 and the noise recognition unit 28 of the non-character recognition unit 25, is to be called. It also reads the intermediate processing result table, executes the processing order control rules, performs the end determination, and performs processing based on the process execution rules.
[0063]
Here, the processing order control rules indicate the procedure for deciding, based on the state extracted by the environment recognition system 11, which of the basic character recognition unit 17, the character string recognition unit 15, the contact character recognition unit 13, the blurred character recognition unit 19, and the collapsed character recognition unit 21 of the character recognition unit 12, and the strikethrough line recognition unit 26 and the noise recognition unit 28 of the non-character recognition unit 25, is to be called.
[0064]
The process execution rule indicates a procedure of what process is to be performed next based on the result of the recognition process called by the process order control rule.
In addition, in the intermediate processing result table, the state of the input image extracted in steps S2 to S4 of FIG. 4 is entered for each area to which layout analysis has assigned an attribute such as one character frame, text, free pitch frame, or table, and the processes called according to the processing order control rules are entered in the processing order stored in the processing order table.
[0065]
For example, when the environment recognition system 11 extracts a character, the basic character recognition unit 17 is called to perform recognition processing on that character. When the environment recognition system 11 extracts text in step S21 of FIG. 6, the character string recognition unit 15 is called to perform recognition processing on that text. When the environment recognition system 11 extracts a frame contact character in step S25 of FIG. 6, the contact character recognition unit 13 is called to perform recognition processing on that frame contact character. When the environment recognition system 11 determines in step S32 of FIG. 7 that, for a predetermined area, the value of (the number of connected areas whose area and vertical and horizontal lengths are each equal to or less than a predetermined threshold) / (the number of all connected areas in the predetermined area) is larger than a predetermined value, the blurred character recognition unit 19 is called to perform recognition processing on the characters in that area. When the environment recognition system 11 determines in step S32 of FIG. 7 that the value of (the number of connected areas whose black pixel density is larger than a predetermined threshold) / (the number of all connected areas in the predetermined area) is larger than a predetermined value, the collapsed character recognition unit 21 is called to perform recognition processing on the characters in that area. When the environment recognition system 11 extracts a correction character candidate in step S42 of FIG. 8, the strikethrough line recognition unit 26 is called to perform recognition processing on that candidate. When the environment recognition system 11 detects noise in step S32 of FIG. 7, the noise recognition unit 28 is called to perform recognition processing on that noise.
[0066]
FIG. 9 is a flowchart showing the control of character recognition / non-character recognition. In FIG. 9, first, as shown in step 51, the intermediate processing result table is read and the processing order control rule is executed.
[0067]
Next, as shown in step 52, end determination is performed. The processing is judged to be finished when all the processes in the intermediate processing result table have been completed and all the process instruction columns of the intermediate processing result table have been filled in based on the processing order control rule. If the end determination finds that the processing is not finished, the process proceeds to step 53, where processing according to the process execution rule is executed. The process then returns to step 51, and the above processing is repeated until step 52 determines that the processing is finished.
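The loop of steps 51 to 53 can be sketched as follows. This is only a minimal illustration under stated assumptions: the table layout, the `order_rule` callback, and the `executors` mapping are hypothetical names introduced here, and each table row is assumed to hold a queue of pending process names.

```python
from collections import deque

def run_control_loop(regions, order_rule, executors):
    # Hypothetical table: one row per region with its extracted state,
    # its pending processes, and the results obtained so far.
    table = [
        {"region": name, "state": state, "pending": None, "results": []}
        for name, state in regions
    ]
    while True:
        # Step 51: read the table and apply the processing order control
        # rule to every row whose process instructions are not yet filled in.
        for row in table:
            if row["pending"] is None:
                row["pending"] = deque(order_rule(row["state"]))
        # Step 52: end determination (all instructions filled in and executed).
        if all(not row["pending"] for row in table):
            return table
        # Step 53: execute the next pending process of each row.
        for row in table:
            if row["pending"]:
                process = row["pending"].popleft()
                row["results"].append(executors[process](row["region"]))
```

Here `order_rule` plays the role of the processing order control rule (state to list of process names) and `executors` plays the role of the process execution rule.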
[0068]
FIG. 10 is a block diagram showing a system configuration of a pattern recognition apparatus according to an embodiment of the present invention.
In FIG. 10, an image storage unit 41 stores a form image, and a processing condition storage unit 42 stores a definition structure such as a layout structure of a form and read character information, for example, a frame position, type, size, character type, and number of characters. The label image storage unit 43 stores the labeled label image in a compressed expression.
[0069]
The environment recognition system 30 includes a layout analysis unit 31 and a correction analysis unit 32; the environment recognition system 38 includes a habit character analysis unit 39 and an end determination processing unit 40; and the character recognition system / non-character recognition system 33 includes a basic character recognition unit 34, a black frame contact character recognition unit 35, a free pitch character string recognition unit 36, and a strikethrough line recognition unit 37.
[0070]
The layout analysis unit 31 performs ruled line extraction, frame extraction, and black frame contact character extraction on the label image stored in the label image storage unit 43 while referring to the definition body stored in the processing condition storage unit 42. Methods in which format information such as the position and size of the frame and information on the inclination are stored in advance as form data, and ruled lines and frames are extracted based on this form data, are disclosed in, for example, Japanese Patent Laid-Open No. Sho 62-21288 and JP-A-3-126186.
[0071]
Alternatively, ruled line extraction and frame extraction may be performed without requiring input of format information such as frame position and size, as described in, for example, JP-A-6-309498 and JP-A-7-28937.
[0072]
The correction analysis unit 32 extracts strikethrough candidates. The habit character analysis unit 39 analyzes habit characters that reflect an individual's writing characteristics. The end determination processing unit 40 performs the end determination for character recognition and, when it determines that the processing is finished, outputs the character recognition result.
[0073]
The basic character recognition unit 34 recognizes characters that have been cut out one by one. The black frame contact character recognition unit 35 removes the frame from a black frame contact character and complements the character where it has become blurred as a result of the frame removal. The free pitch character string recognition unit 36 performs character recognition on a character string while taking into account the extraction reliability obtained when characters are cut out from the character string. The strikethrough line recognition unit 37 recognizes strikethroughs based on the black pixel density, line density, Euler number, histogram, etc. of the corrected character.
[0074]
The intermediate processing result table 44 stores a processing order indicating which processing of the character recognition system / non-character recognition system 33 is executed based on the state extracted by the environment recognition systems 30 and 38, and the processing result thereof.
[0075]
FIG. 11 is a block diagram showing a specific configuration of a character recognition system to which the pattern recognition apparatus of FIGS.
In FIG. 11, 51 is a central processing unit (CPU) that performs overall processing; 52 is a program memory that stores programs executed by the CPU 51; 53 is an image memory that stores image data in a bitmap format; 54 is a work memory used for image processing; 55 is a scanner that optically reads an image; 56 is a memory that temporarily stores information read by the scanner 55; 57 is a dictionary file that stores the features of each character image; 58 is a display that displays the recognition result; 59 is a printer that prints the recognition result; 60 is an input / output interface for the display 58 and the printer 59; 61 is a bus connecting the CPU 51, program memory 52, image memory 53, work memory 54, memory 56, dictionary file 57, input / output interface 60, and driver 64; 62 is a communication interface for transmitting and receiving data and programs via the communication network 63; 64 is a driver; 65 is a hard disk; 66 is an IC memory card; 67 is a magnetic tape; 68 is a floppy disk; and 69 is an optical disc such as a CD-ROM or a DVD-ROM.
[0076]
The character recognition system temporarily stores image data read by the scanner 55 in the memory 56 and develops the image data in the image memory 53 in a bitmap format. Then, pattern extraction processing is performed on the binary image data copied from the image memory 53 to the work memory 54. Based on the result, the character image is cut out from the image data read by the scanner 55, the feature of the cut out character image is compared with the feature data stored in the dictionary file 57, and the character is recognized. Thereafter, the recognition result is output to the display 58 or the printer 59.
[0077]
In this character recognition system, the pattern extraction apparatus of FIGS. 1 to 3 is realized as a function of the CPU 51, which performs processing in accordance with a program stored in the program memory 52. The program for performing the pattern extraction processing can be stored in advance in the ROM of the program memory 52. Alternatively, the program for performing the pattern extraction processing may be loaded into the RAM of the program memory 52 from a storage medium such as the hard disk 65, IC memory card 66, magnetic tape 67, floppy disk 68, or optical disc 69, and then executed by the CPU 51.
[0078]
Furthermore, the program for performing the pattern extraction processing can be retrieved from the communication network 63 via the communication interface 62. As the communication network 63 connected to the communication interface 62, for example, a LAN (Local Area Network), a WAN (Wide Area Network), the Internet, an analog telephone network, a digital telephone network (ISDN: Integrated Services Digital Network), or a wireless communication network such as a PHS (Personal Handyphone System) or satellite communication can be used.
[0079]
Hereinafter, the configuration of the environment recognition system 11, the character recognition unit 12, and the non-character recognition unit 25 of FIG. 3 will be described in more detail.
FIG. 12 is a diagram for explaining the labeling process in step S11 of FIG.
[0080]
In FIG. 12, when a binary image composed of “0” and “1” is input to the labeling processing unit 70, the labeling processing unit 70 extracts connection patterns composed of connected pixels, generates a label image in which a label is attached to each connection pattern, and stores it in the label image storage unit 71. For example, when a binary image 72 composed of “0” and “1” is input, labels “1”, “2”, and “3” are attached to the respective connection patterns to generate a label image 73.
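The labeling step can be illustrated with a short sketch. It is not taken from the specification: the breadth-first traversal and the use of 8-connectivity are assumptions made here for illustration, and `label_image` is a hypothetical name.

```python
from collections import deque

def label_image(binary):
    """Attach a label 1, 2, 3, ... to each connection pattern of "1" pixels.

    Assumption: pixels are connected when they touch horizontally,
    vertically, or diagonally (8-connectivity).
    """
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or labels[y][x] != 0:
                continue
            count += 1                      # start a new connection pattern
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count
```

The returned label image has the same size as the input, which is why its storage cost grows with the number of bits needed per label.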
[0081]
Here, for example, if 255 connection patterns exist in one image, 255 labels are required, so 8 bits are required per pixel. The storage capacity required for the label image storage unit 71 is then eight times the number of pixels in one image (in bits), so a large storage capacity is required to store the label image.
[0082]
FIG. 13 is a diagram for explaining a method of reducing the storage capacity required for the label image storage unit 71 by compressing and expressing the label image 73 of FIG.
In FIG. 13, for example, a label "1" and a label "2" are attached to connection pattern A₁ and connection pattern A₂, respectively, as shown in FIG. 13B, and a circumscribed rectangle B₁ circumscribing connection pattern A₁ and a circumscribed rectangle B₂ circumscribing connection pattern A₂ have been generated. The circumscribed rectangles B₁ and B₂ can each be specified by the coordinates (x₁, y₁) of the upper left vertex and the coordinates (x₂, y₂) of the lower right vertex.
[0083]
Then, it is determined whether the circumscribed rectangle B₁ circumscribing connection pattern A₁ and the circumscribed rectangle B₂ circumscribing connection pattern A₂ overlap. If they do not overlap, the coordinates (x₁, y₁) of the upper left vertex and the coordinates (x₂, y₂) of the lower right vertex of each of the circumscribed rectangles B₁ and B₂ are stored.
[0084]
On the other hand, if the circumscribed rectangle B₁ circumscribing connection pattern A₁ and the circumscribed rectangle B₂ circumscribing connection pattern A₂ overlap, the circumscribed rectangles B₁ and B₂ are subdivided into smaller rectangular areas so that none overlaps another circumscribed rectangle, it is determined to which of the original circumscribed rectangles B₁ and B₂ each subdivided rectangular area belongs, and connection patterns A₁ and A₂ are expressed by operations such as sums and differences of the subdivided rectangular areas.
[0085]
For example, in FIG. 13, connection pattern A₁ can be expressed, using the rectangular area (1-1) belonging to connection pattern A₁ and the rectangular area (1-2) included in the rectangular area (1-1), as
A₁ = (1-1) − (1-2)
that is, by the difference between the rectangular area (1-1) and the rectangular area (1-2).
[0086]
In addition, connection pattern A₂ can be expressed, using the rectangular area (2-1) belonging to connection pattern A₂, the rectangular area (2-2) included in the rectangular area (2-1), and the rectangular area (2-3) included in the rectangular area (2-2), as
A₂ = (2-1) − (2-2) + (2-3)
that is, by the difference between the rectangular area (2-1) and the rectangular area (2-2) plus the rectangular area (2-3).
[0087]
Thus, by expressing the connection pattern with the circumscribed rectangles of the pixels to be connected, the amount of information expressing the connection pattern can be reduced, and the storage capacity required to store the label image can be reduced.
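A small sketch may make the sum-and-difference expression concrete. It is an illustration only: the signed-term list, the coordinate values, and the name `rasterize` are hypothetical, and the rectangles are assumed to be nested as in the A₁ and A₂ examples, so that applying the terms in order reproduces the pattern.

```python
def rasterize(terms, width, height):
    """Rebuild a connection pattern from signed rectangles.

    Each term is (sign, (x1, y1, x2, y2)) with inclusive corner
    coordinates; "+" paints the rectangle black and "-" clears it.
    Terms are applied in order, which is sufficient when each
    rectangle is contained in the previous one (an assumption here).
    """
    grid = [[0] * width for _ in range(height)]
    for sign, (x1, y1, x2, y2) in terms:
        value = 1 if sign == "+" else 0
        for y in range(y1, y2 + 1):
            for x in range(x1, x2 + 1):
                grid[y][x] = value
    return grid

# A1 = (1-1) - (1-2): a hollow square stored as two rectangles.
A1_TERMS = [("+", (0, 0, 4, 4)), ("-", (1, 1, 3, 3))]
# A2 = (2-1) - (2-2) + (2-3): the same ring with a dot in the middle.
A2_TERMS = [("+", (0, 0, 4, 4)), ("-", (1, 1, 3, 3)), ("+", (2, 2, 2, 2))]
```

In this toy case each pattern is held as two or three coordinate quadruples instead of a 5 × 5 array of labels.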
[0088]
The method for compressing and expressing the label image is described in, for example, Japanese Patent Application Laid-Open No. 8-55219.
FIG. 14 is a flowchart showing an embodiment of the text recognition process in step S21 of FIG. 6.
[0089]
In FIG. 14, first, as shown in step S61, a document is read by a scanner, and image data of the read document is stored in a memory.
Next, as shown in step S62, in the image data read in step S61, attention is paid only to a strip-shaped partial area in a specific section in the horizontal direction, labeling is performed within that partial area, and the circumscribed rectangles of black connected pixels are obtained.
[0090]
For example, suppose that there are a plurality of documents A, B, and C as processing targets, and that, as shown in FIG. 15 (d), the area of the character string 81 of the document A in FIG. 15A is within the range of the section A, the area of the character string 82 of the document B in FIG. 15B is within the range of the section A, and the area of the character string 83 of the document C in FIG. 15C is within the range of the section B. In this case, attention is paid only to the partial areas of the sections A and B, and the labeling process is performed only within these strip-shaped partial areas to obtain the circumscribed rectangles of black connected pixels.
[0091]
Next, as shown in step S63, from the circumscribed rectangles obtained in step S62, only those are extracted for which the difference between the height of the circumscribed rectangle and a predetermined rectangle height ylen is within the threshold thy and the difference between the width of the circumscribed rectangle and a predetermined rectangle width xlen is within the threshold thx. Then, the coordinates in the y direction (vertical direction) at which these circumscribed rectangles exist are obtained and stored in the memory.
[0092]
Next, as shown in step S64, attention is paid to a horizontally long partial region that is centered on the y coordinate obtained in step S63, includes the rectangle extracted in step S62, and has a length in the left-right direction equal to the image width.
[0093]
Next, as shown in step S65, the circumscribed rectangle of the black connected pixel is obtained by labeling the horizontally long partial region obtained in step S64.
Next, as shown in step S66, from the circumscribed rectangles obtained in step S65, only those are extracted for which the difference between the height of the circumscribed rectangle and the predetermined rectangle height ylen is within the threshold thy and the difference between the width of the circumscribed rectangle and the predetermined rectangle width xlen is within the threshold thx, and the extracted rectangles are stored in the memory.
[0094]
Next, as shown in step S67, the rectangles extracted in step S66 are sorted by x coordinate, and the pitch is calculated from the intervals between the center lines of the extracted rectangles. A sequence in which a predetermined number th or more of rectangles are arranged in the horizontal direction, with the difference between the calculated pitch and the predetermined pitch, denoted pitch, within the threshold thpitch, is output as text.
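The size filter of steps S63 and S66 and the pitch check of step S67 can be sketched as follows. This is a simplified illustration: rectangles are assumed to be tuples (x1, y1, x2, y2), width and height are taken as coordinate differences, and the function names and the `min_count` parameter (standing in for the predetermined number) are hypothetical.

```python
def filter_by_size(rects, xlen, ylen, thx, thy):
    """Keep rectangles whose width and height are close to xlen and ylen."""
    return [
        r for r in rects
        if abs((r[2] - r[0]) - xlen) <= thx and abs((r[3] - r[1]) - ylen) <= thy
    ]

def extract_text_runs(rects, pitch, thpitch, min_count):
    """Group rectangles, sorted by x, whose center-to-center distance is
    within thpitch of pitch; keep runs with at least min_count rectangles."""
    if not rects:
        return []
    ordered = sorted(rects, key=lambda r: r[0])
    runs, current = [], [ordered[0]]
    for prev, cur in zip(ordered, ordered[1:]):
        gap = (cur[0] + cur[2]) / 2 - (prev[0] + prev[2]) / 2
        if abs(gap - pitch) <= thpitch:
            current.append(cur)
        else:
            if len(current) >= min_count:
                runs.append(current)
            current = [cur]
    if len(current) >= min_count:
        runs.append(current)
    return runs
```

Filtering by size before measuring pitch removes noise and oversized patterns, so that only character-sized rectangles contribute to the pitch check.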
[0095]
This text extraction method is described in, for example, Japanese Patent Laid-Open No. 8-171609.
Next, an example of the ruled line extraction process in step S22 of FIG. 6 will be described more specifically.
[0096]
This ruled line extraction process divides a connection pattern obtained by labeling into a plurality of parts in the horizontal and vertical directions, calculates adjacent projection values of the connection pattern within each of the ranges divided in the horizontal and vertical directions, and extracts a ruled line by detecting a part of a line segment or a part of a straight line by rectangular approximation.
[0097]
Here, the adjacent projection is obtained by adding the projection value of the surrounding row or column to the projection value of the attention row or column. The projected value of the target row or target column is the sum of the black pixels existing in that row or column.
[0098]
FIG. 16 is a diagram for explaining this adjacent projection processing.
In FIG. 16, when the projection value of row i is p(i), the adjacent projection value P(i) can be calculated by equation (1).
[0099]
P(i) = p(i−j) + ... + p(i) + ... + p(i+j)   (1)
In the example shown in FIG. 16, j = 1 is set in the equation (1).
FIG. 17 is a diagram illustrating an example of projection values of partial patterns.
[0100]
In FIG. 17, for the rectangle 84 whose vertical length is L_Y and whose horizontal length is L_X, let HP(i) be the projection value Ph(i) in the horizontal direction j and VP(j) be the projection value Pv(j) in the vertical direction i. Then HP(1) = HP(n) = m, HP(2) to HP(n−1) = 2, VP(1) = VP(m) = n, and VP(2) to VP(m−1) = 2.
[0101]
Thus, since the projection value becomes large in the part where the straight line constituting the rectangle 84 exists, the straight line constituting the ruled line can be extracted by calculating this projection value.
[0102]
For example, by detecting a partial pattern in which the ratio between the adjacent projection value and the vertical and horizontal division lengths is greater than or equal to a predetermined threshold, it is possible to extract candidates for straight lines constituting the ruled line.
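Equation (1) and the ratio test can be illustrated with a short sketch for the horizontal direction. The function names are hypothetical, rows near the image border simply sum over the rows that exist, and the full image width is used as the division length; these are assumptions made here.

```python
def row_projection(image):
    """p(i): number of black pixels in row i."""
    return [sum(row) for row in image]

def adjacent_projection(p, j=1):
    """P(i) = p(i-j) + ... + p(i) + ... + p(i+j), clipped at the borders."""
    n = len(p)
    return [sum(p[max(0, i - j):min(n, i + j + 1)]) for i in range(n)]

def line_candidate_rows(image, ratio_threshold, j=1):
    """Rows whose adjacent projection value divided by the division
    length (here the image width) reaches the threshold."""
    width = len(image[0])
    P = adjacent_projection(row_projection(image), j)
    return [i for i, value in enumerate(P) if value / width >= ratio_threshold]

# A line that steps down by one row halfway across an 8-pixel-wide image.
STEPPED_LINE = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
```

For `STEPPED_LINE`, the plain projection of each of rows 2 and 3 is only half the width, while their adjacent projection values with j = 1 equal the full width, so both rows pass a threshold of 0.9.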
[0103]
FIG. 18 is a flowchart showing ruled line extraction processing.
In FIG. 18, first, as shown in step S601, it is determined whether or not the ratio between the adjacent projection value and the vertical and horizontal division lengths is equal to or greater than a predetermined threshold value. If it is determined that this ratio is not equal to or greater than the predetermined threshold value, the process proceeds to step S602, and it is considered that there is no line segment constituting the ruled line.
[0104]
On the other hand, if it is determined in step S601 that the ratio between the adjacent projection value and the vertical and horizontal division lengths is equal to or greater than the predetermined threshold value, the process proceeds to step S603, and it is considered that there is a line segment constituting the ruled line.
[0105]
Next, in step S604, it is determined whether the pattern regarded as a line segment in step S603 is in contact with line segments existing above and below it. If it is determined that the pattern is not in contact with the upper and lower line segments, the process proceeds to step S605, where the pattern is regarded as a rectangular line segment.
[0106]
On the other hand, if it is determined in step S604 that the pattern regarded as a line segment in step S603 is in contact with line segments existing above and below it, the process proceeds to step S606, and the pattern is integrated with the line segments existing above and below it. In step S607, the line segments integrated in step S606 are detected as a rectangular line segment. For example, three rectangular line segments 85 as shown in FIG. 19A are integrated to obtain one rectangular line segment 86 shown in FIG. 19B. Thereafter, a ruled line is extracted by searching for the rectangular line segments obtained in step S605 or step S607.
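The integration of steps S604 to S607 can be sketched in a strongly simplified form. Here each detected line segment is modeled only by its row index within one division, and rows that touch (consecutive indices) are merged into one rectangular line segment. This model and the name `integrate_rows` are assumptions made for illustration.

```python
def integrate_rows(candidate_rows):
    """Merge vertically touching line-segment rows into (top, bottom) ranges."""
    groups = []
    for row in sorted(candidate_rows):
        if groups and row == groups[-1][1] + 1:
            groups[-1][1] = row          # touches the segment above: integrate
        else:
            groups.append([row, row])    # isolated segment: its own rectangle
    return [tuple(g) for g in groups]
```

With rows 2, 3, and 4 detected as line segments, the three segments become one rectangular line segment spanning rows 2 to 4, in the manner of FIG. 19.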
[0107]
This ruled line extraction process is described in, for example, Japanese Patent Laid-Open No. 6-309498.
FIG. 20 is a diagram for explaining a method of performing a search while complementing a blurred ruled line in the ruled line extraction process in step S22 of FIG. 6.
[0108]
In this method of complementing a blurred ruled line, when searching for a pattern constituting a straight line, even if there is a blank area having no pattern in the search direction, the search proceeds on the assumption that a pattern exists in any blank area of no more than a certain number of pixels.
[0109]
For example, as shown in FIG. 20, when the pixels 92 constituting the straight line 91 are searched for, the search is performed on the assumption that a pixel 92 exists in a blank area 93 of no more than a certain number of pixels.
[0110]
FIG. 21 is a flowchart illustrating a method for complementing the blurred ruled line in the ruled line extraction process.
In FIG. 21, first, as shown in step S71, the X coordinate of the thinnest portion of the pattern within a predetermined rectangular range is calculated.
[0111]
Next, as shown in step S72, the center point of the pattern at the X coordinate calculated in step S71 is calculated. Then, as shown in step S73, the center point of the pattern calculated in step S72 is set as the search start point. Here, the reason why the starting point of the search is the thinnest part of the pattern is that it is unlikely that the thinnest part is a character, so that the straight line that becomes the frame can be searched more reliably.
[0112]
In step S74, the straight line search direction is set to the right.
Next, as shown in step S75, the initial value of variable K for counting the length of the blank area is set to zero.
[0113]
Next, as shown in step S76, the starting point obtained in step S73 is set as the current location of the pattern search.
Next, as shown in step S77, it is determined whether the current location of the search set in step S76 is within the rectangular range noted in step S71. If the current location of the search is not within that rectangular range, the process proceeds to step S86.
[0114]
On the other hand, if it is determined in step S77 that the current location of the search is within the rectangular range noted in step S71, the process proceeds to step S78 to determine whether there is a pattern adjacent in the search direction as viewed from the current location of the search. Here, a pattern being adjacent in the search direction as viewed from the current location of the search means that a pattern 102 exists at the position adjacent on the right as viewed from the pattern 101. If it is determined that there is a pattern 102 adjacent in the search direction as viewed from the current location of the search, the process proceeds to step S81, and the pattern 102 adjacent in the search direction is set as the current location of the search.
[0115]
On the other hand, if it is determined in step S78 that there is no pattern next to the search direction as viewed from the current location of the search, the process proceeds to step S79 to determine whether or not there is a pattern diagonally adjacent to the search direction as viewed from the current location of the search.
[0116]
Here, a pattern being diagonally adjacent in the search direction as viewed from the current location of the search means that the pattern 104a or 104b exists at a position diagonally adjacent on the right as viewed from the pattern 103. If it is determined that the pattern 104a or 104b is diagonally adjacent in the search direction as viewed from the current location of the search, the process proceeds to step S83, and the pattern 104a or 104b diagonally adjacent in the search direction is set as the current location of the search. When there are two patterns 104a and 104b that are diagonally adjacent in the search direction, one of the patterns 104a and 104b is set as the current location of the search.
On the other hand, if it is determined in step S79 that there is no pattern 104a, 104b diagonally adjacent in the search direction as viewed from the current location of the search, the process proceeds to step S80, and it is determined whether the variable K for counting the length of the blank area is equal to or less than the threshold value. If the variable K for counting the length of the blank area is equal to or less than the threshold value, the process proceeds to step S84, and the pixel that is adjacent in the search direction as viewed from the current location of the search and does not constitute a pattern is set as the current location. For example, in FIG. 20, the search is performed on the assumption that a pattern exists in the blank area 93 of no more than a certain number of pixels.
[0117]
Next, as shown in step S85, the value of the variable K for counting the length of the blank area is increased by 1 dot, and the process returns to step S77.
On the other hand, if it is determined in step S80 that the variable K for counting the length of the blank area is not less than or equal to the threshold value, the process proceeds to step S86, and it is determined whether or not the search direction is set to the right. If the search direction is not set to the right, the process ends.
[0118]
If the search direction is set to the right in step S86, the process proceeds to step S87, and the search direction is set to the left. The processing of steps S75 to S85 is then repeated in the same way as when the search direction was set to the right.
[0119]
Here, when processing is performed with the search direction set to the left, a pattern being adjacent in the search direction as viewed from the current location of the search means that a pattern exists at the position adjacent on the left as viewed from the current pattern. In addition, a pattern being diagonally adjacent in the search direction as viewed from the current location of the search means that a pattern 108a or a pattern 108b exists at a position diagonally adjacent on the left as viewed from the pattern 107.
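The search of steps S74 to S87 can be sketched as follows. This is an illustrative simplification, not the flowchart itself: the image is assumed to be a list of rows of 0/1 values covering the rectangular range, the range check is made on the next position rather than the current one, and the blank counter K is assumed to be reset to zero whenever a pattern pixel is found (the text above only states that K counts the length of the blank area). The name `search_line` and the return value are hypothetical.

```python
def search_line(image, start, blank_threshold):
    """Search right and then left from start = (y, x), bridging short blanks.

    Returns (leftmost_x, rightmost_x) of the pattern pixels reached.
    """
    h, w = len(image), len(image[0])
    y0, x0 = start
    left = right = x0
    for dx in (1, -1):                      # S74: right first, S87: then left
        y, x, k = y0, x0, 0                 # S75, S76
        while 0 <= x + dx < w:              # S77 (checked on the next position)
            nx = x + dx
            if image[y][nx]:                # S78: pattern directly adjacent
                x, k = nx, 0
            else:
                diagonal = [ny for ny in (y - 1, y + 1)
                            if 0 <= ny < h and image[ny][nx]]
                if diagonal:                # S79: pattern diagonally adjacent
                    y, x, k = diagonal[0], nx, 0
                elif k <= blank_threshold:  # S80: blank is still short enough
                    x, k = nx, k + 1        # S84, S85: step over the blank pixel
                    continue
                else:
                    break                   # blank too long: stop this direction
            left, right = min(left, x), max(right, x)
    return left, right
```

Because the comparison in step S80 is "equal to or less than", a threshold of T lets the search step over up to T + 1 consecutive blank pixels in this sketch.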
[0120]
The method for complementing this faint ruled line is described, for example, in the specification and drawings of Japanese Patent Application No. 8-107568.
Next, the frame extraction process in step S23 of FIG. 6 will be described.
[0121]
FIG. 23 is a flowchart showing an embodiment of the single character frame extraction process.
In FIG. 23, first, as shown in step S91, the pattern detected as a rectangular line segment by the process of FIG. 18 is searched. At this time, as shown in the flowchart of FIG. 21, the search treats a blank area of up to a predetermined length as if a pattern were present, so that blurring is complemented.
[0122]
Next, as shown in step S92, it is determined from the result of the search in step S91 whether the pattern is interrupted at a predetermined length. If the pattern is not interrupted at a predetermined length, the process proceeds to the block frame extraction process of FIG. 24. On the other hand, if the pattern is interrupted at a predetermined length, the process proceeds to step S93, where the searched line segments are integrated to detect a straight line.
[0123]
Next, as shown in step S94, straight lines surrounding four sides are extracted from the straight lines detected in step S93.
Next, as shown in step S95, it is determined whether or not the size of the portion surrounded on four sides by straight lines is within a predetermined range of the size of a one character frame in the same image. If the size of the surrounded portion is within the predetermined range, the process proceeds to step S96, and the portion surrounded on four sides is regarded as a one character frame. If the size of the surrounded portion is not within the predetermined range, the process proceeds to step S97, and the portion surrounded on four sides is regarded as not being a one character frame.
[0124]
FIG. 24 is a flowchart illustrating an example of block frame extraction processing. In FIG. 24, first, as shown in step S101, it is determined whether or not the horizontal line detected by the search has a length greater than or equal to a predetermined value. If the length of the horizontal line detected by the search is smaller than the predetermined value, the process proceeds to step S102, and the horizontal line is regarded as not being a horizontal frame. On the other hand, if the length of the horizontal line detected by the search is greater than or equal to the predetermined value, the process proceeds to step S103, and the horizontal line detected by the search is regarded as a horizontal frame.
[0125]
Next, as shown in step S104, two adjacent horizontal frames are extracted from the horizontal frame extracted in step S103.
Next, as shown in step S105, the range sandwiched between the two horizontal frames taken out in step S104 is regarded as one block frame.
[0126]
Next, as shown in step S106, vertical lines are detected by extracting vertical rectangular line segments from the rectangular line segments detected by the process of FIG. 18.
Next, as shown in step S107, the vertical line detected in step S106 is searched. In step S108, it is determined whether the vertical line has reached the upper and lower horizontal frames taken out in step S104. If the vertical line does not reach the upper and lower horizontal frames, the process advances to step S109 to exclude the vertical line from the vertical frame candidates. On the other hand, when the vertical line reaches the upper and lower horizontal frames, the process proceeds to step S110, and the vertical line is set as a vertical frame candidate.
[0127]
Next, as shown in step S111, it is determined whether the processing target is a regular tabular block frame or an irregular tabular block frame. If the processing target is a regular tabular block frame, the process advances to step S112, where the intervals between the vertical lines regarded as vertical frame candidates in step S110 are calculated, and a histogram indicating the relationship between the calculated intervals and their appearance frequency is computed.
[0128]
Next, as shown in step S113, among the vertical lines within the range sandwiched between two adjacent horizontal frames, vertical lines that form intervals different from those of the other vertical lines are excluded from the vertical frame candidates, and the process ends with the remaining vertical lines regarded as vertical frames.
[0129]
On the other hand, if it is determined in step S111 that the processing target is an irregular tabular block frame, the processing ends with all the vertical frame candidates determined in step S110 as vertical frames.
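For the regular tabular case of steps S112 and S113, the interval histogram can be sketched as follows. The description only states that lines forming different intervals are excluded; the concrete selection rule used here (keep the leftmost candidate, then keep each line lying one dominant interval from the last kept line) and the names `select_vertical_frames` and `tol` are assumptions made for illustration.

```python
from collections import Counter

def select_vertical_frames(xs, tol=1):
    """Drop vertical-line candidates that break the dominant spacing."""
    xs = sorted(xs)
    if len(xs) < 3:
        return xs
    # S112: histogram of intervals between neighboring candidates.
    intervals = [b - a for a, b in zip(xs, xs[1:])]
    dominant = Counter(intervals).most_common(1)[0][0]
    # S113 (assumed rule): keep lines spaced by the dominant interval.
    kept = [xs[0]]
    for x in xs[1:]:
        if abs((x - kept[-1]) - dominant) <= tol:
            kept.append(x)
    return kept
```

For candidates at x = 10, 30, 42, 50, 70, 90 the dominant interval is 20, so the stray line at x = 42 (for example a vertical stroke of a character) is dropped.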
[0130]
Next, the frame type / table discrimination process in step S24 of FIG. 6 will be described.
FIG. 25 is a diagram illustrating an example of a frame and a table extracted by the frame extraction process in step S23 of FIG.
[0131]
In FIG. 25, FIG. 25 (a) shows a single character frame, FIG. 25 (b) a free pitch frame, FIG. 25 (c) a block frame, FIG. 25 (d) a regular table, and FIG. 25 (e) an irregular table. A one character frame attribute is assigned to a single character frame, a free pitch frame attribute is assigned to a free pitch frame, a block frame attribute is assigned to a block frame, and a table attribute is assigned to a table.
[0132]
The frame extraction process and the frame type / table discrimination process are described in, for example, Japanese Patent Laid-Open No. 7-28937.
Next, the frame contact presence / absence determination process in step S25 of FIG. 6 will be described. Here, an example will be described in which the original input image is reduced at a reduction ratio of 1 / n by OR processing and the frame contact presence / absence determination processing is then performed. Coordinates are set corresponding to each pixel of the image, with the X coordinate in the horizontal direction of the image and the Y coordinate in the vertical direction of the image; the X coordinate increases rightward, and the Y coordinate increases downward.
[0133]
FIG. 26 is a flowchart illustrating an embodiment of the input image reduction process. In FIG. 26, first, an original image is input as shown in step S121. Next, as shown in step S122, a range of horizontal n pixels × vertical n pixels from the upper left of the original image (upper left coordinates (1, 1), lower right coordinates (X, Y)) is set.
[0134]
Next, as shown in step S123, it is determined whether or not there is a black pixel in the set range of the original image. If there is a black pixel in the set range of the original image, the process proceeds to step S124, and the pixel at coordinates (X / n, Y / n) of the reduced image is set to a black pixel. If there is no black pixel within the set range of the original image, the process proceeds to step S125, and the pixel at coordinates (X / n, Y / n) of the reduced image is set to a white pixel.
[0135]
Next, as shown in step S126, it is determined whether or not the processing has been completed down to the lower right of the original image. If the processing has not been completed down to the lower right of the original image, the process proceeds to step S127, and it is determined whether the right end of the original image has been reached.
[0136]
If the right end of the original image has not been reached, a range of horizontal n pixels × vertical n pixels (upper left coordinates (x, y), lower right coordinates (X, Y)) is set immediately to the right of the processed range. If the right end of the original image has been reached, a range of horizontal n pixels × vertical n pixels (upper left coordinates (x, y), lower right coordinates (X, Y)) is set below the processed range, starting from the left end of the original image. The process then returns to step S123, and the above processing is repeated until the reduction processing has been completed for the entire range of the original image.
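The reduction loop of FIG. 26 can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function name `reduce_by_or`, the list-of-rows image representation (1 = black, 0 = white), the 0-based indexing, and the handling of partial blocks at the right and bottom edges are all assumptions.

```python
def reduce_by_or(image, n):
    """Reduce a binary image (list of rows, 1 = black, 0 = white) by 1/n.

    Each n x n block of the original becomes one pixel of the reduced
    image: black if any pixel in the block is black, otherwise white.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    reduced = []
    # Walk the blocks left to right, then top to bottom (steps S126, S127).
    for top in range(0, height, n):
        row = []
        for left in range(0, width, n):
            # Step S123: is there a black pixel in the current range?
            block_has_black = any(
                image[y][x] == 1
                for y in range(top, min(top + n, height))
                for x in range(left, min(left + n, width))
            )
            # Step S124 (black) or step S125 (white).
            row.append(1 if block_has_black else 0)
        reduced.append(row)
    return reduced


original = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
reduced = reduce_by_or(original, 2)
```

Because a single black pixel is enough to turn a reduced pixel black, thin frame lines and strokes survive the reduction, which is what the later search along the frame relies on.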
[0137]
Next, in the compressed image data produced by the input image reduction process, a search is made along the inside of the frame line to determine whether a character is in contact with the frame. For each side with which a character is in contact, the rectangular area is enlarged outward by a predetermined distance, and the coordinates of the enlarged rectangular area are converted into coordinates in the original image data.
[0138]
For example, as shown in FIG. 27A, a frame line range 110 of the compressed image data is extracted, and a character 112 of “4” exists in a rectangular area surrounded by the frame line. It is assumed that the character 112 touches the lower frame line 111.
[0139]
Next, as shown in FIG. 27B, a search is made in a straight line along the inside of the frame line. If the search crosses a pattern partway, a character exists near the frame line and is likely to be in contact with it, so the character 112 of “4” in the rectangular area surrounded by the frame line is regarded as being in contact with the frame. In this example, the character 112 of “4” is regarded as being in contact with the lower frame line 111.
[0140]
Next, when the search along the inside of the frame line 111 leads to the assumption that the character 112 is in contact with the frame line 111, the rectangular area surrounded by the frame line is expanded outward from the frame line 111 with which the character 112 is in contact, as shown in FIG. 27C, and the enlarged rectangular area 113 is set as the character area in which the character 112 exists. If it is determined that the character is not in contact with the frame line, the inside of the frame is used as the character area as it is.
[0141]
Next, in order to obtain the character area in the original image data from the character area in the compressed image data, the coordinates of the rectangular area 113 in FIG. 27C are converted to the coordinates in the original image data. Thereby, as shown in FIG. 27D, a rectangular area 116 in the original image data can be obtained.
[0142]
Next, projection processing is performed on the frame line 114 in the rectangular area 116 of the original image data, and the frame coordinates of the frame line 114 are calculated from the original image data. At this time, the frame line 114 is represented by a strip-like rectangle having a predetermined length. Then, as shown in FIG. 27 (e), the pattern existing in the rectangular area 116 is sent to the character complementing process, and complementing processing of the character 115 in contact with the frame line 114 is performed based on the frame coordinates of the frame line 114 calculated from the original image data.
[0143]
FIG. 28 is a flowchart illustrating an example of a frame contact presence / absence determination process. In FIG. 28, first, as shown in step S131, rectangular representation by compressed image data is performed, for example, by the processing of FIG.
[0144]
Next, as shown in step S132, a rectangular portion surrounded by four vertical and horizontal straight lines is extracted.
Next, as shown in step S133, the coordinates indicating the upper left and lower right of the rectangle indicating the inside of the straight line are calculated.
[0145]
Next, as shown in step S134, a compressed image is searched for along four sides (upper horizontal frame, lower horizontal frame, right vertical frame, left vertical frame) indicating the inside of the frame.
Next, as shown in step S135, when the image pattern is intersected during the search, it is assumed that the character is in contact with the side on which the search was performed.
[0146]
Next, as shown in step S136, the rectangular area in the original image data is calculated from the rectangular area in the compressed image data by converting the rectangular coordinate value indicating the inside of the frame into the coordinate value on the original image.
[0147]
Next, as shown in step S137, the rectangular area calculated in step S136 is set as a character area in the original image data.
Next, as shown in step S138, it is determined whether or not the character is in contact with the frame by the processing in step S135. If the character is in contact with the frame, the contact character range acquisition processing in steps S139 to S143 is performed. .
[0148]
In the contact character range acquisition process, first, in step S139, the character area is expanded outward from the side with which the character is in contact, and a position a certain distance outside the position of the character area calculated in step S137 is set as the edge of the character area.
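Steps S134 to S139 can be sketched as follows. This is only an illustration under stated assumptions: the function names, the inclusive rectangle `(left, top, right, bottom)` given in reduced-image pixels, the 0-based indexing, and a `margin` expressed in original-image pixels are hypothetical choices, and no clamping to the image bounds is performed.

```python
def find_contact_sides(reduced, left, top, right, bottom):
    """Search along the four inner sides of a frame (steps S134, S135).

    `reduced` is the reduced binary image (1 = black). The rectangle is
    the inclusive area just inside the frame lines. A side is reported
    when the search along it crosses a black pixel.
    """
    sides = set()
    if any(reduced[top][x] for x in range(left, right + 1)):
        sides.add("top")
    if any(reduced[bottom][x] for x in range(left, right + 1)):
        sides.add("bottom")
    if any(reduced[y][left] for y in range(top, bottom + 1)):
        sides.add("left")
    if any(reduced[y][right] for y in range(top, bottom + 1)):
        sides.add("right")
    return sides


def character_area(reduced, rect, n, margin):
    """Return the character area in original-image coordinates and the contact sides."""
    left, top, right, bottom = rect
    sides = find_contact_sides(reduced, left, top, right, bottom)
    # Step S136: convert reduced coordinates to original coordinates (ratio 1/n).
    x0, y0 = left * n, top * n
    x1, y1 = (right + 1) * n - 1, (bottom + 1) * n - 1
    # Step S139: push each contacted side outward by a fixed distance.
    if "left" in sides:
        x0 -= margin
    if "top" in sides:
        y0 -= margin
    if "right" in sides:
        x1 += margin
    if "bottom" in sides:
        y1 += margin
    return (x0, y0, x1, y1), sides


reduced = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [1, 0, 1, 0, 1],
    [1, 1, 1, 1, 1],
]
area, sides = character_area(reduced, (1, 1, 3, 3), n=4, margin=3)
```

In this example only the search along the lower side crosses a pattern, so only the lower edge of the character area is moved outward.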
[0149]
Next, as shown in step S140, the position coordinates of the frame line included in the character area calculated in step S139 are converted into coordinate values on the original image, so that the position coordinates of the frame line in the original image data are calculated from the position coordinates of the frame line in the compressed image data.
[0150]
Next, as shown in step S141, projection processing is performed on the frame line area of the original image data acquired based on the position coordinates of the frame line in the original image data calculated in step S140: in the horizontal direction for a horizontal frame line and in the vertical direction for a vertical frame line.
[0151]
Next, as shown in step S142, an area having a projection value equal to or larger than a certain value is set as frame coordinates on the original image.
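The projection of steps S141 and S142 can be illustrated with a short sketch. The function names and the representation of the frame line area as a list of rows (1 = black) are assumptions; the threshold corresponds to the "certain value" of step S142 and is chosen arbitrarily here.

```python
def horizontal_frame_rows(region, threshold):
    """Rows whose horizontal projection (black pixels per row) reaches the threshold."""
    return [y for y, row in enumerate(region) if sum(row) >= threshold]


def vertical_frame_columns(region, threshold):
    """Columns whose vertical projection (black pixels per column) reaches the threshold."""
    width = len(region[0]) if region else 0
    return [
        x for x in range(width)
        if sum(row[x] for row in region) >= threshold
    ]


# A horizontal frame line two pixels thick, touched by a character stroke.
region = [
    [0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]
frame_rows = horizontal_frame_rows(region, threshold=4)
```

Rows belonging to the frame line have a high projection value, while rows crossed only by a character stroke stay below the threshold.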
Next, as shown in step S143, the calculated coordinate value indicating the character area on the original image and the coordinate value indicating the position of the frame line in the character area are passed to the character complementing process.
[0152]
Next, as shown in step S144, the calculated coordinate value indicating the character area on the original image is set as the character area.
Note that the frame contact presence / absence determination processing is described, for example, in the specification and drawings of Japanese Patent Application No. 8-107568.
[0153]
Next, the correction feature extraction process in step S41 and the correction character candidate extraction process in step S42 in FIG. 8 will be described.
FIG. 29 is a diagram illustrating an example of a corrected character.
[0154]
In FIG. 29, a corrected character is a character that has been struck out with a deletion line. Corrected characters take various forms: a character erased with an “x” mark as shown in FIG. 29 (a), a character erased with a horizontal double line as shown in FIG. 29 (b), a character erased with an oblique line as shown in FIG. 29 (c), a character erased with a wavy line as shown in FIG. 29 (d), and a character erased by being painted black, as also shown in FIG. 29.
[0155]
For such a corrected character, a characteristic peculiar to the corrected character is extracted. Features unique to this corrected character include “linear density in a predetermined direction”, “Euler number”, “black pixel density”, and the like.
[0156]
“Linear density in a predetermined direction” is a value obtained by counting the number of times of changing from a white pixel to a black pixel (or from a black pixel to a white pixel) when an image in a rectangle is scanned along a predetermined fixed direction. Further, the predetermined direction is set in a direction perpendicular to the direction of the line segment assumed as a strike line.
[0157]
For example, FIG. 30A shows an example in which the maximum linear density in the vertical direction is counted for the character “6”. In this case, the maximum linear density in the vertical direction is 3.
The “linear density in a predetermined direction” of a corrected character tends to be larger than that of a normal character. By calculating this “linear density in a predetermined direction”, candidates for corrected characters can be extracted.
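As an illustration of the linear density, the following sketch counts white-to-black transitions along each column and returns the maximum. The function name, the list-of-rows image (1 = black), and the small bitmap standing in for the character “6” are assumptions made for the example.

```python
def max_vertical_line_density(image):
    """Largest number of white-to-black transitions found in any column."""
    height = len(image)
    width = len(image[0]) if height else 0
    best = 0
    for x in range(width):
        transitions = 0
        previous = 0  # the area above the rectangle is treated as white
        for y in range(height):
            if previous == 0 and image[y][x] == 1:
                transitions += 1
            previous = image[y][x]
        best = max(best, transitions)
    return best


# Rough 5 x 4 bitmap of the character "6" (hypothetical data).
six = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]
density = max_vertical_line_density(six)
```

A vertical scan through the middle of this bitmap crosses three strokes, so the maximum vertical linear density is 3; a deletion line drawn across the character would add further crossings.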
[0158]
“Euler number” E is a value obtained by subtracting the number H of holes in the image from the number C of connected components connected to each other in the image.
For example, FIG. 30B shows an example in which there are two connected components in the image and one hole in the image. The Euler number E in this example is E = C − H = 2 − 1 = 1.
[0159]
The “Euler number” of the corrected character tends to be a negative value having a large absolute value, and the “Euler number” of the normal character tends to be a small value (2 to −1). Therefore, by calculating the “Euler number”, correction character candidates can be extracted.
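The Euler number E = C − H can be computed with a simple flood fill, as sketched below. The choice of 8-connectivity for black pixels and 4-connectivity for white pixels, the function names, and the sample bitmap are assumptions; the source does not specify them.

```python
FOUR = [(-1, 0), (1, 0), (0, -1), (0, 1)]
EIGHT = FOUR + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_components(grid, value, neighbours):
    """Count connected regions of `value` using the given neighbourhood."""
    height, width = len(grid), len(grid[0])
    seen = [[False] * width for _ in range(height)]
    count = 0
    for sy in range(height):
        for sx in range(width):
            if grid[sy][sx] != value or seen[sy][sx]:
                continue
            count += 1
            seen[sy][sx] = True
            stack = [(sy, sx)]
            while stack:
                y, x = stack.pop()
                for dy, dx in neighbours:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < height and 0 <= nx < width
                            and grid[ny][nx] == value and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return count


def euler_number(image):
    """E = C - H for a non-empty binary image (1 = black)."""
    width = len(image[0])
    # Pad with white so that the outer background forms a single region.
    padded = ([[0] * (width + 2)]
              + [[0] + list(row) + [0] for row in image]
              + [[0] * (width + 2)])
    components = count_components(padded, 1, EIGHT)
    holes = count_components(padded, 0, FOUR) - 1  # exclude the outer background
    return components - holes


# Two connected components, one of which encloses a hole (C = 2, H = 1).
sample = [
    [1, 1, 1, 0, 1],
    [1, 0, 1, 0, 1],
    [1, 1, 1, 0, 1],
]
e = euler_number(sample)
```

Deletion lines crossing a character tend to create many small enclosed regions, which increases H and drives E toward a large negative value.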
[0160]
“Black pixel density” D is the ratio of the area B (number of black pixels) of the image of interest to the area S of the circumscribed rectangle of the image of interest.
For example, FIG. 30C shows an example in which the black pixel density D is calculated for the character “4”. If the area of the rectangle circumscribing the character “4” is S and the area of the character “4” is B, then D = B / S.
[0161]
The “black pixel density” of the corrected character tends to be larger than the “black pixel density” of the normal character. By calculating this “black pixel density”, a candidate for the corrected character can be extracted.
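The black pixel density D = B / S can be sketched as follows. The function name and the sample bitmap are assumptions; the circumscribed rectangle is taken as the smallest axis-aligned rectangle containing all black pixels.

```python
def black_pixel_density(image):
    """D = B / S: black pixel count divided by the area of the circumscribed rectangle."""
    coords = [
        (y, x)
        for y, row in enumerate(image)
        for x, value in enumerate(row)
        if value == 1
    ]
    if not coords:
        return 0.0  # no black pixels, so no circumscribed rectangle
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    area = (max(ys) - min(ys) + 1) * (max(xs) - min(xs) + 1)
    return len(coords) / area


# Hypothetical bitmap: 6 black pixels inside a 3 x 3 circumscribed rectangle.
sample = [
    [0, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
]
d = black_pixel_density(sample)
```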
[0162]
Next, the basic character recognition unit 17 in FIG. 3 will be specifically described.
FIG. 31 is a block diagram showing an example of the configuration of the basic character recognition unit 17.
In FIG. 31, a feature extraction unit 121 extracts a character feature from an input unknown character pattern, and represents the extracted feature by a feature vector. On the other hand, the basic dictionary 122 stores feature vectors of each character category.
[0163]
Then, the matching unit 123 matches the feature vector of the unknown character pattern extracted by the feature extraction unit 121 against the feature vector of each character category stored in the basic dictionary 122, and calculates the distance $D_{ij}$ between the feature vectors in the feature space (i is the feature vector of the unknown character, j is the feature vector of a category in the basic dictionary 122). As a result, the category j that gives the smallest distance $D_{ij}$ between feature vectors is recognized as the unknown character i.
[0164]
Here, the distance $D_{ij}$ between feature vectors in the feature space is calculated by, for example, the Euclidean distance $\sum (i - j)^2$, the city block distance $\sum |i - j|$, or a discriminant function.
[0165]
Let $D_{ij1}$ be the distance to the first category and $D_{ij2}$ the distance to the second category. A table 1 relating the first category j1, the second category j2, the distance between categories $(D_{ij2} - D_{ij1})$, and the reliability is created in advance. A table 2 relating the distance $D_{ij1}$ to the first category, the first category j1, and the reliability is also created in advance. Then, the smaller of the reliabilities obtained from table 1 and table 2 is stored in the intermediate processing result table.
[0166]
FIG. 32 is a diagram illustrating an example of calculating a feature vector.
In this example, the character “2” is written in a grid of 20 cells in total, 5 cells vertically by 4 cells horizontally, as shown in FIG. 32. A cell painted black is set to “1” and a white cell is set to “0”. The cells are scanned from the upper left to the lower right, and the values of “1” or “0” that appear are arranged in that order to form the feature vector.
[0167]
For example, the feature vector vectorA in the case of FIG. 32B is vectorA = (1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1), the feature vector vectorB in the case of FIG. 32C is vectorB = (0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1), and the feature vector vectorC in the case of FIG. 32D is vectorC = (1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1).
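The scan from the upper left to the lower right amounts to flattening the grid row by row. In the sketch below the function name is hypothetical, and the 5 × 4 grid is reconstructed from vectorA by assuming four cells per row; it is not copied from the figure.

```python
def grid_to_feature_vector(grid):
    """Read the cells row by row, upper left to lower right."""
    return tuple(cell for row in grid for cell in row)


# 5 rows x 4 columns, 1 = black cell, 0 = white cell.
grid_a = [
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]
vector_a = grid_to_feature_vector(grid_a)
```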
[0168]
FIG. 33 is a diagram showing an example of calculating the distance $D_{ij}$ between feature vectors by the city block distance d (i, j).
Here, let N be the number of dimensions of the feature vector and i the feature vector number, so that the i-th feature vector is $x_i = (x_{i1}, x_{i2}, x_{i3}, \ldots, x_{iN})$ and the j-th feature vector is $x_j = (x_{j1}, x_{j2}, x_{j3}, \ldots, x_{jN})$. The city block distance d (i, j) between the i-th feature vector $x_i$ and the j-th feature vector $x_j$ is defined as
$$d(i, j) = |x_i - x_j| = \sum_{k=1}^{N} |x_{ik} - x_{jk}| \qquad (2)$$
[0169]
For example, in FIG. 33, it is assumed that feature vectors of the character categories “1”, “2”, “3”, and “4” are registered in the basic dictionary 122. Here, the feature vector vector1 of the character category “1” is vector1 = (0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0), the feature vector vector2 of the character category “2” is vector2 = (1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1), the feature vector vector3 of the character category “3” is vector3 = (1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1), and the feature vector vector4 of the character category “4” is vector4 = (1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0).
[0170]
Then, when an unknown character whose feature vector is vector = (0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1) is input, the city block distance d (i, j) between this feature vector vector and each of the feature vector vector1 of the character category “1”, the feature vector vector2 of the character category “2”, the feature vector vector3 of the character category “3”, and the feature vector vector4 of the character category “4” registered in the basic dictionary 122 is calculated by equation (2).
[0171]
That is, the city block distance d (i, j) between the feature vector vector of unknown characters and the feature vector vector1 of the character category “1” is d (i, j) = | vector-vector1 | = | 0− 0 | + | 1-1 | + | 1-1 | + | 1-0 | + | 0-0 | + | 0-1 | + | 0-1 | + | 1-0 | + | 1-0 | + | 1-1 | + | 1-1 | + | 1-0 | + | 1-0 | + | 0-1 | + | 0-1 | + | 0-0 | + | 1-0 | + | 1-1 | + | 1-1 | + | 1-0 | = 11.
[0172]
Similarly, the city block distance d (i, j) between the feature vector vector of the unknown character and the feature vector vector2 of the character category “2” is d (i, j) = | vector-vector2 | = 1, The city block distance d (i, j) between the feature vector vector of the unknown character and the feature vector vector3 of the character category “3” is d (i, j) = | vector-vector3 | = 3, The city block distance d (i, j) between the feature vector vector and the feature vector vector4 of the character category “4” is d (i, j) = | vector-vector4 | = 11.
[0173]
Here, among the city block distances d (i, j) between the feature vector vector of the unknown character and each of the feature vector vector1 of the character category “1”, the feature vector vector2 of the character category “2”, the feature vector vector3 of the character category “3”, and the feature vector vector4 of the character category “4”, the city block distance d (i, j) between the feature vector vector of the unknown character and the feature vector vector2 of the character category “2” is the smallest.
[0174]
Therefore, the unknown character whose feature vector is vector = (0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1) is determined to belong to the character category “2”.
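The worked example of FIG. 33 can be reproduced with a few lines of code. The vectors are taken from the description; the function name `city_block_distance` and the dictionary layout keyed by category name are assumptions made for the illustration.

```python
def city_block_distance(a, b):
    """Sum of absolute element-wise differences between two feature vectors."""
    return sum(abs(p - q) for p, q in zip(a, b))


# Feature vectors registered in the basic dictionary, keyed by character category.
basic_dictionary = {
    "1": (0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0),
    "2": (1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1),
    "3": (1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1),
    "4": (1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0),
}

# Feature vector of the unknown character.
unknown = (0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1)

distances = {
    category: city_block_distance(unknown, vector)
    for category, vector in basic_dictionary.items()
}
# The category with the smallest distance is taken as the recognition result.
recognized = min(distances, key=distances.get)
```

The computed distances are 11, 1, 3, and 11 for the categories “1” to “4”, so the unknown character is assigned to the category “2”.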
[0175]
Next, the detailed identification method stored in the knowledge table 18 of the basic character recognition unit 17 in FIG. 3 will be described. In this detailed identification method, local partial patterns of each character category are extracted as character segments, and the character segment position and angle change amount of unknown characters and the character segment position and angle change amount stored in advance in the segment dictionary By comparing, the character is recognized while taking correspondence between the unknown character and the character category.
[0176]
FIG. 34 is a diagram for explaining a character segment extraction method.
FIG. 34A shows a binary image pattern for the character “2”, and the hatched portion indicates the character portion represented by black pixels.
[0177]
FIG. 34B shows a contour line extracted from the binary image pattern of FIG. 34A, and the dotted line portion shows the original binary image pattern.
FIG. 34C shows a state in which the contour line of FIG. 34B is divided into character segments S1 and S2 and end point portions T1 and T2. The end point portions T1 and T2 correspond to the start and end of writing of the character “2” in FIG.
[0178]
FIG. 35 is a diagram for explaining an endpoint detection method.
In FIG. 35, an end point is detected as a place where the inclination of the contour line changes abruptly. Specifically, three points A, B, and C separated by a predetermined interval are placed on the contour line S, and a position on the contour line where the angle θ formed by the three points A, B, and C, with the middle point A as the vertex, is equal to or less than a predetermined value is detected as an end point.
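The end point test can be sketched as follows, assuming the contour is available as a closed list of (x, y) points. The function names, the interval measured in contour samples, and the 60-degree threshold used in the example are hypothetical; the source only states that the interval and the threshold are predetermined.

```python
import math


def angle_at(vertex, p, q):
    """Angle in degrees at `vertex` between the directions toward p and q."""
    v1 = (p[0] - vertex[0], p[1] - vertex[1])
    v2 = (q[0] - vertex[0], q[1] - vertex[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        return 180.0  # degenerate case: treat as a straight contour
    cosine = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


def detect_end_points(contour, interval, max_angle_deg):
    """Indices of contour points A where the angle B-A-C is at most the threshold.

    B and C lie `interval` samples before and after A on the closed contour.
    """
    n = len(contour)
    result = []
    for i in range(n):
        b = contour[(i - interval) % n]
        c = contour[(i + interval) % n]
        if angle_at(contour[i], b, c) <= max_angle_deg:
            result.append(i)
    return result


# Contour of a thin horizontal stroke: along the top edge, then back along the bottom.
contour = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0),
           (5, 1), (4, 1), (3, 1), (2, 1), (1, 1), (0, 1)]
end_points = detect_end_points(contour, interval=2, max_angle_deg=60)
```

Along the straight parts of the stroke the angle stays near 180 degrees, while at the two tips the contour doubles back and the angle drops to about 45 degrees, so only the tips are reported.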
[0179]
After the character segments have been extracted from the binary image pattern by dividing the contour line of the character at the end points, representative points X, Y, and Z are taken, for example, at regular intervals on each character segment. Then, the angle formed by successive representative points X, Y, and Z is obtained, and the cumulative value of the angle change from the first representative point on the character segment to each representative point is obtained as the feature amount at each of the representative points X, Y, and Z.
[0180]
FIG. 36 is a diagram for explaining a method of detecting an angle change.
In FIG. 36, representative points X, Y, and Z separated by an arbitrary interval are placed on the contour line S, and a vector XY from the representative point X to the representative point Y and a vector YZ from the representative point Y to the representative point Z are generated. The angle θ₂ between the vector XY and the vector YZ is the angle change at the representative point Y.
[0181]
The angle change at the representative point X on the contour line S, which is the initial value of the angle change, is the angle θ₁ between the vector GX drawn from the center of gravity G of the character to the representative point X and the vector XY.
[0182]
The feature amount at each of the representative points X, Y, and Z is the value obtained by accumulating the angle change from the representative point X, which holds the initial value of the angle change, to each of the representative points Y and Z. For example, the feature amount at the representative point Y is θ₁ + θ₂.
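The accumulation of angle changes can be sketched as follows. The use of a signed turning angle (counterclockwise positive in a conventional x-y plane) is an assumption, since the source only speaks of the angle between two vectors; the function names are hypothetical, and the last representative point receives no feature here because it has no outgoing vector.

```python
import math


def turn_angle(u, v):
    """Signed angle in degrees from vector u to vector v, in [-180, 180]."""
    cross = u[0] * v[1] - u[1] * v[0]
    dot = u[0] * v[0] + u[1] * v[1]
    return math.degrees(math.atan2(cross, dot))


def cumulative_angle_features(centroid, points):
    """Cumulative angle change at each representative point except the last.

    The first angle (theta_1) is measured between the vector from the
    centroid G to the first point and the vector to the second point.
    """
    features = []
    total = 0.0
    previous = centroid
    for current, following in zip(points, points[1:]):
        incoming = (current[0] - previous[0], current[1] - previous[1])
        outgoing = (following[0] - current[0], following[1] - current[1])
        total += turn_angle(incoming, outgoing)
        features.append(total)
        previous = current
    return features


g = (0, 0)                       # center of gravity G
xyz = [(1, 0), (1, 1), (0, 1)]   # representative points X, Y, Z
features = cumulative_angle_features(g, xyz)
```

In this example θ₁ = 90 degrees at X and θ₂ = 90 degrees at Y, so the feature amount at Y is θ₁ + θ₂ = 180 degrees.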
[0183]
After the cumulative value of the angle change at each representative point on the character segment of the unknown character has been calculated, the representative points of the character segment of the unknown character are matched with the representative points of the character segment stored in the segment dictionary. That is, the distance between the cumulative angle change at a representative point of the character segment of the unknown character and the cumulative angle change at each representative point of the character segment stored in the segment dictionary is calculated, and the representative point of the segment dictionary that gives the smallest distance is associated with the representative point of the character segment of the unknown character.
[0184]
FIG. 37A is a diagram illustrating a correspondence relationship between representative points of character segments of unknown characters and representative points of character segments of the segment dictionary.
In FIG. 37 (a), representative points a₁ to a₈ are representative points on a character segment of the unknown character, and representative points b₁ to b₈ are representative points on a character segment stored in the segment dictionary. Each of the representative points a₁ to a₈ on the character segment of the unknown character corresponds to one of the representative points b₁ to b₈ of the character segment stored in the segment dictionary.
[0185]
After the correspondence between the representative points of the character segment of the unknown character and the representative points of the character segment of the segment dictionary has been obtained, the representative point of the character segment of the unknown character that corresponds to a reference point on the character segment stored in the segment dictionary is set as an inspection point.
[0186]
FIG. 37B is a diagram showing a correspondence relationship between the reference points and the inspection points.
In FIG. 37 (b), the reference points d₁ and d₂ of the character segment stored in the segment dictionary correspond to the inspection points c₁ and c₂ of the character segment of the unknown character, respectively.
[0187]
After the correspondence between the reference points and the inspection points has been obtained, inspection information about the inspection points c₁ and c₂ of the character segment of the unknown character is calculated.
This inspection information includes, for example, for a single inspection point, absolute position information indicating where the inspection point lies in the entire character image; for two inspection points, relative position information such as the distance and direction between them; and for two or more inspection points, information on the angle change and linearity between them.
[0188]
As a result of calculating the inspection information about the inspection point, when a predetermined determination condition is satisfied, the character category of the character segment stored in the segment dictionary that satisfies the determination condition is output as the recognition result of the unknown character.
[0189]
For example, suppose that the determination condition uses, as inspection information, the angle change along the character segment from the inspection point c₁ to the inspection point c₂, and states that a character image whose character segment has an angle change of 60 degrees or more belongs to the character category “2” stored in the segment dictionary for that character segment. In this case, by calculating the angle change along the character segment from the inspection point c₁ to the inspection point c₂, the character pattern of FIG. 34A can be recognized as belonging to the character category “2”.
[0190]
FIG. 38 is a flowchart showing character recognition processing by the detailed identification method.
In FIG. 38, first, as shown in step S150, a form or the like to be character-recognized is scanned with a scanner, and the read character image is binarized into a monochrome binary image.
[0191]
Next, as shown in step S151, character segments are extracted from the binary image data obtained in step S150.
Next, as shown in step S152, a character segment that is not associated with a character segment of an unknown character is extracted from the plurality of character segments stored in the segment dictionary.
[0192]
Next, as shown in step S153, a correspondence relationship is established between the character segment extracted from the segment dictionary and the character segment of the unknown character.
Next, as shown in step S154, an inspection point is determined from the representative points taken on the character segment of the unknown character, and inspection information about this inspection point is calculated.
[0193]
Next, as shown in step S155, based on the inspection information calculated in step S154, the character segment extracted from the segment dictionary is compared with the character segment of the unknown character, and the character segment inspection information extracted from the segment dictionary By determining whether the character segment inspection information of the unknown character matches, character candidate determination processing for the unknown character is performed.
Next, as shown in step S156, when the character candidate is determined in the character candidate determination process for the unknown character, the character category corresponding to the character segment extracted in step S153 is output as the recognition result. On the other hand, if the character candidate is not determined, the process proceeds to step S157, where it is determined whether there is an unprocessed character segment that is not associated with the character segment of the unknown character in the segment dictionary. If it is in the segment dictionary, the process returns to step S152 to repeat the above processing.
[0194]
On the other hand, if there is no unprocessed character segment in the segment dictionary that has not yet been associated with the character segment of the unknown character, the input unknown character is judged to be unrecognizable, and a recognition result indicating that it cannot be recognized is output.
[0195]
The detailed identification method is described in, for example, Japanese Patent Laid-Open No. 6-309501.
Next, an embodiment of the contact character recognition unit 13 in FIG. 3 will be described.
[0196]
FIG. 39 is a diagram for explaining the character complementing process of the contact character recognition unit 13.
In this character complementing process, only the frame is extracted from the binary image of the frame contact character, and this frame is removed. At this time, the part of each character line segment that was in contact with the frame is lost, and the character line segment is broken into several pieces. A label is therefore assigned to each broken character line segment, and the segments are complemented by evaluating geometric structure such as the distance and directionality between the labeled character line segments.
[0197]
For example, as shown in FIG. 39A, the label “1” is assigned to the binary image that forms a single connected pattern because the character pattern 131 representing “3” and the frame 132 are in contact with each other. Then, when the frame 132 is extracted from the binary image of FIG. 39A and removed, the character pattern 131 representing “3” is divided into three parts as shown in FIG. 39B, generating three character line segments to which the label “1”, the label “2”, and the label “3” are assigned.
[0198]
For the three character line segments to which the label “1”, the label “2”, and the label “3” are assigned, geometric structure such as the distance and directionality between the labeled character line segments is evaluated, and the segments are complemented. As a result, the three character line segments to which the label “1”, the label “2”, and the label “3” are assigned are connected, and a character complement pattern 132 representing “3”, to which the label “1” is assigned, is generated as shown in FIG. 39.
[0199]
The character restored by the character complementing process is recognized as a recognized character candidate. In this recognition processing, the code of the character category with the smallest difference is output in comparison with the standard pattern registered in the character category dictionary.
[0200]
FIG. 40 is a diagram for explaining the re-complementation process of the contact character recognition unit 13.
In this re-complementation process, a character line segment that is parallel to the frame, was in contact with the frame, and disappeared when the frame was removed is complemented. Frame contact characters are extracted in advance using connectivity based on labeling, and the character line segment parallel to the frame is complemented by checking the connectivity of the character complement pattern produced by the character complementing process against the connectivity of the frame contact character.
[0201]
For example, as shown in FIG. 40A, the label “1” is assigned to the binary image that forms a single connected pattern because the character pattern 141 representing “7” and the frame 142 are in contact with each other. Then, when the frame 142 is extracted from the binary image of FIG. 40A and removed, the character pattern 141 representing “7” is divided into three parts as shown in FIG. 40B, generating three character line segments to which the label “1”, the label “2”, and the label “3” are assigned.
[0202]
For the three character line segments to which the label “1”, the label “2”, and the label “3” are assigned, geometric structure such as the distance and directionality between the labeled character line segments is evaluated, and the segments are complemented. As a result, the character line segment with the label “1” and the character line segment with the label “2” are connected, and a character complement pattern 142 consisting of two character line segments, to which the label “1” and the label “2” are assigned, is generated as shown in FIG. 40C.
[0203]
In this case, the character complementing process complements only the gap between the part with the label “1” and the part with the label “2” in FIG. 40B; the gap between the part with the label “1” and the part with the label “3” in FIG. 40B cannot be complemented by it. The gap between the part with the label “1” and the part with the label “3” in FIG. 40B is complemented by the re-complementation process.
[0204]
In this re-complementation process, frame contact characters are extracted in advance using connectivity based on labeling, and the character line segment parallel to the frame is complemented by checking the connectivity of the pattern in FIG. 40 (c) against the connectivity of the frame contact character. That is, the pattern with the label “1” and the pattern with the label “2” in FIG. 40C were connected to each other before the frame was removed, as shown in FIG. 40A. Therefore, the pattern with the label “1” and the pattern with the label “2” in FIG. 40C are connected to each other using a line segment parallel to the frame.
[0205]
As a result, the binary image of “7” divided into the two character line segments with the label “1” and the label “2” in FIG. 40C is complemented, and a re-complementation pattern 143 representing “7”, to which the label “1” is assigned, is generated as shown in FIG. 40D.
[0206]
The character restored by this re-complementation process is recognized as a recognized character candidate. In this recognition processing, the code of the character category with the smallest difference is output in comparison with the standard pattern registered in the character category dictionary.
[0207]
That is, in the example shown in FIG. 40, the character complement pattern 142 shown in FIG. 40C is recognized as belonging to the character category “Li”. Further, the re-complementation pattern 143 shown in FIG. 40D is recognized as belonging to the character category “7”. Then, it is determined that “7” is less dissimilar than “Li”, and is finally recognized as “7”, and its character code is output.
[0208]
Next, a case where the contact character recognition unit 13 in FIG. 3 performs recognition processing while referring to the knowledge table 14 will be described.
FIG. 41 is a diagram for explaining an example of recognizing a frame contact character by learning a misread character pair and registering it in the knowledge table 14.
[0209]
In this example, as shown in FIG. 41(a), the label “1” is attached to the binary image that forms a single connected component because the character pattern 151 representing “2” and the frame 152 are in contact with each other. Then, by extracting the frame 152 from the binary image of FIG. 41(a) and removing it, the character pattern 151 representing “2” is separated into two partial patterns labeled “1” and “2”, as shown in FIG. 41(b).
[0210]
Next, as shown in FIG. 41(c), the two partial patterns labeled “1” and “2” in FIG. 41(b) are connected by the character complementing process to generate a character complement pattern 153.
[0211]
In this case, the lower-side portion of the character pattern 151 representing “2” contacts the frame 152, and the contact portion almost completely overlaps the frame 152. For this reason, even if the re-complementation process is used, the lower-side portion of the character pattern 151 representing “2” cannot be complemented, and the possibility that the character “2” is erroneously recognized as “7” increases.
[0212]
Thus, when part of a frame contact character does not protrude from the frame but completely overlaps it, so that the character is misrecognized as another character, the frame contact character can still be recognized correctly by learning and registering misread character pairs.
[0213]
Hereinafter, a method of recognizing frame contact characters by learning and registering misread character pairs will be described.
FIG. 42 is a block diagram showing a configuration for learning misread character pairs in the contact character recognition unit 13 of FIG. 3.
[0214]
The frame contact character automatic generation unit 161 generates a frame contact character by superimposing a frame on an input learning character that is not in contact with a frame. Here, a plurality of frame contact characters are generated for the same learning character by varying the learning character relative to the frame. FIG. 42 shows an example in which a learning character 168 representing “2” is input to the frame contact character automatic generation unit 161 and a frame contact character 169, in which the lower side of the character “2” overlaps the lower frame, is generated. The information generated by the frame contact character automatic generation unit 161 is registered in the knowledge table 167.
[0215]
There are, for example, two types of variation when superimposing a frame on a learning character: “variations of the character with respect to the character frame” and “variations of the character frame”. Variations of the character with respect to the character frame include, for example, “position shift”, “size variation”, and “tilt variation”. Variations of the character frame include, for example, “tilt variation”, “frame width variation”, “size variation”, and “frame unevenness”.
[0216]
Further, there are the following parameters as parameters representing the amount of variation for these variations. Note that the x-axis is set in the vertical direction and the y-axis is set in the horizontal direction.
1. Variation of characters with respect to character frame
Misalignment: dx, dy,
Here, dx (position indicated by a black circle in FIG. 43) and dy (position indicated by an x mark in FIG. 43) represent the magnitudes in the x direction and y direction, respectively, of the difference between the center of gravity of the character and the center of gravity of the character frame.
[0217]
Size variation: dsx, dsy,
Here, dsx and dsy represent the size of the character in the x and y directions, respectively.
[0218]
Tilt variation: dα
Here, dα represents the inclination angle of the character with respect to the vertical line.
2. Variation of character frame
Inclination fluctuation: fα,
Here, fα represents the inclination angle of the character frame with respect to the vertical line.
[0219]
Frame width fluctuation: w,
Here, w represents the width of the character frame.
Size variation: fsx, fsy,
Here, fsx and fsy represent the size of the character frame in the x and y directions, respectively.
[0220]
Unevenness of the frame: fδ
Here, fδ is a parameter for controlling the unevenness of the character frame, taking into account, for example, the quality deterioration of a character frame printed by a facsimile or the like. For example, if the perimeter of the character frame is L, fδ is expressed as an array fδ[L] of size L, and each element fδ[i] (i = 1, 2, 3, ...) of this array takes an integer value in the range of −β to +β determined by random number generation.
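As an illustration of the unevenness parameter, the following minimal sketch draws one integer in the range −β to +β for each position along the perimeter. The function name and the optional `seed` argument are assumptions added for reproducibility; the text specifies only the array size L and the value range.

```python
import random


def make_frame_unevenness(perimeter_len, beta, seed=None):
    """Return a list f_delta of length L with integer entries in [-beta, +beta]."""
    rng = random.Random(seed)  # seed is a hypothetical convenience, not in the text
    return [rng.randint(-beta, beta) for _ in range(perimeter_len)]


f_delta = make_frame_unevenness(perimeter_len=40, beta=2, seed=0)
print(len(f_delta), min(f_delta), max(f_delta))
```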
[0221]
A frame contact character is generated by applying an operation F(dx, dy, dsx, dsy, dα, w, fsx, fsy, fα, fδ) to the learning character based on the type and amount of variation.
[0222]
FIG. 43 is a diagram illustrating an example in which a frame contact character is generated by synthesizing the frame 172 with the learning character 171 representing “7”.
By applying the conversion operation F(dx, dy, dsx, dsy, dα, w, fsx, fsy, fα, fδ) to the learning character 171 representing “7” shown in FIG. 43(a), the frame contact character “7” that contacts the frame 172 is generated, as shown in FIG. 43(b).
[0223]
That is, by applying the conversion operation F(dx, dy, dsx, dsy, dα, w, fsx, fsy, fα, fδ) to the learning character 171 and the frame 172, the learning character 171 and the frame 172 are superimposed and a frame contact character is generated. In this case, for example, the conversion operation F(dx, dy, dsx, dsy, dα, w, fsx, fsy, fα, fδ) is executed while keeping the position of the center of gravity of the frame 172 fixed.
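A heavily simplified sketch of such a superimposing operation is given below. It covers only a position shift and the frame width w; size, tilt, and unevenness variations are omitted. The function names, the bitmap representation (lists of 0/1 rows), and the use of bounding-box centers instead of centers of gravity are all assumptions made for brevity, not the operation F of the specification.

```python
def make_frame(height, width, w):
    """Binary image of a rectangular frame whose line is w pixels thick."""
    return [[1 if (r < w or r >= height - w or c < w or c >= width - w) else 0
             for c in range(width)]
            for r in range(height)]


def superimpose(char, frame, d_row=0, d_col=0):
    """OR a character bitmap onto a frame image.

    The frame stays fixed; the character is centred on the frame image
    (bounding-box centre, a simplification) and then shifted by
    (d_row, d_col). Character pixels outside the image are dropped.
    """
    fh, fw = len(frame), len(frame[0])
    ch, cw = len(char), len(char[0])
    top = (fh - ch) // 2 + d_row
    left = (fw - cw) // 2 + d_col
    out = [row[:] for row in frame]  # copy so the frame itself is unchanged
    for r in range(ch):
        for c in range(cw):
            rr, cc = top + r, left + c
            if char[r][c] and 0 <= rr < fh and 0 <= cc < fw:
                out[rr][cc] = 1
    return out


char = [[1, 1],
        [0, 1]]                      # toy 2 x 2 "character"
frame = make_frame(6, 6, w=1)
contact = superimpose(char, frame, d_row=2)  # shift down onto the lower frame
for row in contact:
    print("".join("#" if v else "." for v in row))
```

Calling `superimpose` repeatedly with different shifts yields several frame contact characters from one learning character, which is the purpose of the generation step.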
[0224]
FIG. 44 shows examples of various frame contact characters generated for the learning character “3” when the size variation fsx in the x direction and the size variation fsy in the y direction are held constant, that is, when the size of the frame is fixed.
[0225]
FIG. 44A shows an example of the case where the type of variation is “position shift”, where the amount of variation is dx = 0 and dy> 0. In this case, the character “3” protrudes below the frame (lower position fluctuation).
[0226]
FIG. 44(b) shows an example in which the type of variation is “size variation”, where the variation amounts are dsx = fsx and dsy = fsy. In this case, the character “3” contacts the top, bottom, left, and right of the frame, and the circumscribed rectangle of “3” coincides with the frame.
[0227]
FIG. 44C shows an example in which the type of fluctuation is “character inclination fluctuation”, and the fluctuation amount is dα = 10 degrees.
FIG. 44D is an example in which the type of variation is “character frame tilt variation”, and the amount of variation is fα = −10 degrees.
[0228]
FIG. 44E shows an example in which the variation type is “frame width variation”, and the variation amount is w = 5.
FIG. 44F shows an example in which the type of variation is “frame irregularities”, in which each element fδ [i] of the variation fδ [L] is controlled.
[0229]
Next, the frame removal unit 162 in FIG. 42 extracts only the frame from the frame contact character generated by the frame contact character automatic generation unit 161, removes the frame, and outputs the image data of the resulting blurred character to the character complementing unit 163.
[0230]
The character complementing unit 163 complements the image data of the character from which the frame has been removed by the frame removing unit 162 by evaluating the geometric structure such as the distance and directionality between the character line segments to which the labels are attached. FIG. 42 shows an example in which a character complement pattern 170 is generated by removing a frame from the frame contact character 169 generated by the automatic frame contact character generation unit 161 and then complementing it by the character complementing unit 163.
[0231]
For a region that could not be complemented by the character complementing unit 163, the re-complementing unit 164 extracts the frame contact character in advance using connectivity by labeling, and complements the character line segments parallel to the frame by detecting that the connectivity of the pattern complemented by the character complementing unit 163 matches the connectivity of the frame contact character.
[0232]
The character completion pattern supplemented by the character complementing unit 163 and the recomplementation pattern supplemented by the recomplementing unit 164 are input to the basic character recognition unit 165.
The basic character recognition unit 165 performs character recognition on the character complement pattern supplemented by the character complementing unit 163 and on the re-complementation pattern supplemented by the re-complementing unit 164. The recognition result for each learning character is then output to the knowledge acquisition unit 166 for the frame contact state and recognition.
[0233]
The knowledge acquisition unit 166 for the frame contact state and recognition compares the recognition result output from the basic character recognition unit 165 with correct data given in advance to obtain a recognition rate over all sample data. The recognition rate is then registered in the knowledge table 167 as the reliability, and each combination of a misread character (a character recognized in error) and the correct character is registered as a misread character pair. Note that the misread character pairs are registered, for example, as character codes. Further, the knowledge acquisition unit 166 for the frame contact state and recognition extracts parameters indicating the features of the contact state between the frame and the character, and registers these parameters in the knowledge table 167 as well.
[0234]
In this way, in the knowledge table 167, for each character category, the recognition rate for the character in various contact states between the frame and the character is registered together with the misread character pair.
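The bookkeeping performed by the knowledge acquisition step can be sketched as follows, assuming the recognition results for one contact state are available as (correct character, recognized character) pairs. The function name and the dictionary layout are hypothetical and do not reproduce the actual layout of the knowledge table 167.

```python
from collections import Counter


def acquire_knowledge(samples):
    """Summarise recognition results for ONE contact state.

    samples: list of (correct_char, recognized_char) pairs.
    Returns {correct_char: {"reliability": rate, "misread_pairs": [...]}},
    where each misread pair is (correct_char, misread_as).
    """
    totals = Counter(correct for correct, _ in samples)
    hits = Counter(correct for correct, got in samples if correct == got)
    misreads = Counter((correct, got) for correct, got in samples if correct != got)
    table = {}
    for correct, total in totals.items():
        table[correct] = {
            "reliability": hits[correct] / total,  # recognition rate
            "misread_pairs": sorted(p for p in misreads if p[0] == correct),
        }
    return table


# Toy data: 100 generated samples of "2", of which 23 are read as "7".
samples = [("2", "2")] * 77 + [("2", "7")] * 23
print(acquire_knowledge(samples))
```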
[0235]
FIG. 45 is a diagram illustrating an example of the knowledge table 167 generated by learning. In FIG. 45, the knowledge table 167 registers, for example, the misread character pair (2, 7) and a reliability of 77% together with the variation amounts dy = 5, w = 5, and so on of the “lower position deviation”. This shows that, for a frame contact character “2” with a “lower position deviation” of dy = 5 and w = 5, the basic character recognition unit 165 erroneously recognizes “2” as “7” with a probability of 23%. That is, in this case, even if the basic character recognition unit 165 outputs “7”, referring to the knowledge table 167 shows that the reliability of that result is 77% and that the possibility that the actual character is “2” is 23%.
[0236]
Similarly, for other character pairs that are easily misread, the “variation amount”, the “line width of the frame”, the “misread character pair”, and the reliability are registered in the knowledge table 167 by the knowledge acquisition unit 166 for the frame contact state and recognition.
[0237]
The misread character pair (L1, L2) indicates a case where the character “L1” is erroneously recognized as the character “L2”. In the entries “L1” and “L2”, for example, the character codes of the corresponding characters are registered.
[0238]
In the knowledge table 167, in addition to the “lower position deviation” with the variation amounts dy = 5 and w = 5 shown in FIG. 45, the “character inclination variation with respect to the character frame” (in this case, with contact at the left frame) and the other variations of FIG. 43 are also registered for each character category.
[0239]
That is, for example, as shown in FIG. 46, for the “lower position deviation” variation, dx = “−3” to “+3”, dy = 5, w = 5, dsy = 1, dα = “−10” to “+10”, and fα = “−10” to “+10” are registered. Thus, even for the same “lower position deviation” variation, the variation amounts registered in the knowledge table 167 are not limited to the position shift dx in the x direction and the position shift dy in the y direction; other variation amounts may also be registered. For the “character inclination variation with respect to the character frame in contact with the left frame”, for example, dx = “−3” to “+3”, dy = “−3” to “+3”, w = 5, dsy = 1, dα = “−20” to “+20”, and fα = “−10” to “+10” are registered.
[0240]
In addition, for a misread character pair (L1, L2) whose reliability is at or below a predetermined threshold (for example, 90%), a character recognition method that raises the reliability to at least the predetermined threshold is learned, and the learned character recognition method is registered in the knowledge table 167.
[0241]
For example, as shown in FIG. 45, the character recognition reliability of the frame contact character “2” in the “lower position deviation” state of dy = 5, w = 5 is 77%, and the probability of it being erroneously recognized as “7” is high. Therefore, the fact that the recognition rate can be improved by re-recognizing the character complement pattern supplemented by the character complementing unit 163 or the re-complementation pattern supplemented by the re-complementing unit 164 by, for example, the region emphasis method is registered in the knowledge table 167.
[0242]
The region emphasis method in the case of this (2, 7) misread character pair will be described with reference to FIG. 47.
First, as shown in FIG. 47(a), a circumscribed rectangle 180 of the character complement pattern supplemented by the character complementing unit 163 or of the re-complementation pattern supplemented by the re-complementing unit 164 is divided into m × n regions, with m rows vertically and n columns horizontally. Then, as shown by hatching in FIG. 47(b), character recognition is performed again with particular emphasis on the m/2 × n regions in the upper half of the circumscribed rectangle 180.
[0243]
In other words, the feature parameters of the m/2 × n regions are extracted, and it is determined whether the character complement pattern supplemented by the character complementing unit 163 or the re-complementation pattern supplemented by the re-complementing unit 164 is “2” or “7”. This region emphasis method improves the recognition rate to 95%. In the knowledge table 167 of FIG. 45, in the row of the misread character pair (2, 7), “region emphasis” is registered as the re-recognition method, “m/2 × n” as the re-recognition region, and “95%” as the re-recognition reliability.
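The division of the circumscribed rectangle and the selection of the upper-half regions can be sketched as follows. The bitmap representation (a list of pixel rows), the function names, and the rounding used for cell boundaries are assumptions; the text does not specify how cell boundaries are chosen or how the feature parameters are computed from the selected regions.

```python
def cell_bounds(size, parts):
    """Pixel boundaries that split `size` pixels into `parts` nearly equal cells."""
    return [size * k // parts for k in range(parts + 1)]


def crop_cells(bitmap, m, n, row_cells, col_cells):
    """Crop the pixels covered by a range of cells in an m x n grid.

    row_cells and col_cells are (start, stop) cell indices, stop exclusive.
    """
    rb = cell_bounds(len(bitmap), m)
    cb = cell_bounds(len(bitmap[0]), n)
    r0, r1 = rb[row_cells[0]], rb[row_cells[1]]
    c0, c1 = cb[col_cells[0]], cb[col_cells[1]]
    return [row[c0:c1] for row in bitmap[r0:r1]]


def upper_half(bitmap, m, n):
    """The m/2 x n re-recognition region used for the (2, 7) pair."""
    return crop_cells(bitmap, m, n, (0, m // 2), (0, n))


# Toy 8 x 4 bitmap whose pixel value is its row index, so the crop is visible.
bitmap = [[r] * 4 for r in range(8)]
print(upper_half(bitmap, m=4, n=2))
```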
[0244]
This region emphasis method is also effective in the case of a frame contact character as shown in FIG. 48. FIG. 48(a) shows an example in which the lower part of the character pattern representing “2” is in contact with the character frame 182.
[0245]
In this case, the character complementing unit 163 obtains a character complement pattern 183 similar to “7”, as shown in FIG. 48(b). A circumscribed rectangle 184 shown in FIG. 48(c) is calculated for this character complement pattern 183. Then, after this circumscribed rectangle 184 is divided into m × n regions as shown in FIG. 47, character recognition is performed with particular emphasis on the m/2 × n partial regions 185 in the upper half. As a result, it is learned that the complement pattern 183 is recognized as “2” with high probability, that is, that the accuracy rate (reliability) is high, and the region emphasis method is registered in the knowledge table 167 as the re-recognition method for the misread character pair (2, 7) caused by frame contact.
[0246]
FIG. 49 is a flowchart showing a method for re-recognizing a character pattern by region emphasis.
In FIG. 49, first, as shown in step S601, the data of misread character pairs with low reliability is extracted from the knowledge table 167. Then, for the character registered on the left side of the misread character pair, a character pattern serving as binary learning data and the character complement pattern supplemented by the character complementing unit 163 or the re-complementation pattern supplemented by the re-complementing unit 164 are input.
[0247]
This character completion pattern or recomplementation pattern is a pattern defined by the variation amount parameter registered in the knowledge table 167, and can take a plurality of patterns even in the same category.
[0248]
Next, as shown in step S602, the character pattern serving as the learning data input in step S601 and the character complement pattern supplemented by the character complementing unit 163 or the re-complementation pattern supplemented by the re-complementing unit 164 are each divided into m × n regions.
[0249]
Then, as shown in step S603, character recognition is performed on the X × Y partial pattern in the m × n area. Then, the recognition rate z in this case is obtained. The X × Y partial pattern is a re-recognition region. At this time, X and Y are variables representing the lengths of the m × n region in the X direction and the Y direction, respectively, and X ≦ m and Y ≦ n. The recognition rate z is a probability of a correct answer when character recognition is performed using the X × Y partial pattern.
[0250]
That is, the character recognition result for the partial pattern of the character pattern serving as learning data is regarded as the correct answer. Character recognition is then performed on the partial patterns of a plurality of character complement patterns supplemented by the character complementing unit 163 or re-complementation patterns supplemented by the re-complementing unit 164, and the results are compared with this correct answer to obtain the partial-pattern recognition rate z for the character complement pattern or the re-complementation pattern.
[0251]
Subsequently, as shown in step S604, it is determined whether or not the recognition rate z is larger than the maximum recognition rate max. The maximum recognition rate max is a variable for storing the maximum value of the recognition rate z as the X × Y partial pattern is changed, and is initially set to a certain initial value (for example, “0”).
[0252]
If the recognition rate z is larger than the maximum recognition rate max, the process proceeds to step S605, and the recognition rate z is substituted into the maximum recognition rate max. The process then proceeds to step S606, where it is checked whether the lengths X and Y can be changed. On the other hand, if the recognition rate z is equal to or less than the maximum recognition rate max in step S604, the process immediately proceeds to step S606.
[0253]
The operation of changing the lengths X and Y changes, for example, the values of X and Y themselves. It may further include changing the position of the X × Y partial pattern within the m × n region.
[0254]
If it is determined in step S606 that the lengths X and Y can be changed, the process returns to step S603, the lengths X and Y are changed, a new X × Y partial pattern is determined, and character recognition is performed on this partial pattern.
[0255]
The processes in steps S603 to S606 described above are repeated until it is determined in step S606 that the lengths X and Y cannot be changed. When it is determined in step S606 that the lengths X and Y cannot be changed, the maximum recognition rate max and the X × Y partial pattern from which the maximum recognition rate max was obtained are registered in the knowledge table 167 as the re-recognition reliability and the re-recognition region, respectively. In addition, “region emphasis” is registered in the knowledge table 167 as the re-recognition method.
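The loop of steps S603 to S606 amounts to a search for the partial region with the highest recognition rate. The following minimal sketch assumes a caller-supplied function `recognition_rate(X, Y)` that returns the rate z for an X × Y partial pattern; that function, the exhaustive enumeration order, and the returned dictionary keys are hypothetical. Changing the position of the partial pattern, which the text allows, is not covered.

```python
def learn_rerecognition_region(m, n, recognition_rate):
    """Search X x Y partial patterns (X <= m, Y <= n) for the best rate.

    recognition_rate(X, Y) -> z is a hypothetical callback standing in for
    character recognition on the partial pattern (step S603).
    """
    best_rate, best_region = 0.0, None           # max starts at the initial value 0
    for X in range(1, m + 1):                    # S606: change X and Y while possible
        for Y in range(1, n + 1):
            z = recognition_rate(X, Y)           # S603: recognise the partial pattern
            if z > best_rate:                    # S604: compare z with max
                best_rate, best_region = z, (X, Y)   # S605: update max
    return {
        "re_recognition_method": "region emphasis",
        "re_recognition_region": best_region,
        "re_recognition_reliability": best_rate,
    }


# Toy rate function: the upper-half region (X = m/2, Y = n) scores best.
def toy_rate(X, Y):
    return 0.95 if (X, Y) == (2, 4) else 0.5


print(learn_rerecognition_region(4, 4, toy_rate))
```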
[0256]
The flowchart in FIG. 49 is an example of learning a re-recognition method based on the “region emphasis” method. However, re-recognition methods other than the “region emphasis” method may also be learned.
[0257]
FIG. 50 is a block diagram showing a configuration for performing character recognition of frame contact characters using the knowledge table 167 obtained by learning.
In FIG. 50, the frame contact state detection unit 191 detects the contact state between the frame and the character for an input unknown frame contact character. Here, an example is shown in which a frame contact character pattern 201 of FIG. 50(a), in which the lower frame partially overlaps the lower side of “2”, and a frame contact character pattern 203 of FIG. 50(b), in which the lower frame completely overlaps the lower side of “2”, are input. The frame contact state detection unit 191 detects the frame contact character pattern 201 and the frame contact character pattern 203.
[0258]
The frame removal unit 192 removes the frame from the frame contact character pattern detected by the frame contact state detection unit 191.
The character complementing unit 193 complements the character pattern from which the frame has been removed by the frame removing unit 192 by evaluating the geometric structure, such as the distance and directionality between the labeled character line segments.
[0259]
For a region that could not be completely complemented by the character complementing unit 193, the re-complementation unit 194 extracts the frame contact character in advance using connectivity by labeling, and complements the character line segments parallel to the frame by detecting that the connectivity of the pattern complemented by the character complementing unit 193 matches the connectivity of the frame contact character. Here, the re-complementation pattern 202 is the pattern complemented by the re-complementation process of the re-complementation unit 194 for the frame contact character pattern 201 of FIG. 50(a), and the re-complementation pattern 204 is a pattern that could not be complemented by the re-complementation process of the re-complementation unit 194 for the frame contact character pattern 203 of FIG. 50(b).
[0260]
The basic character recognition unit 195 performs character recognition on each of the character complement pattern supplemented by the character complementing unit 193 and the re-complementation pattern supplemented by the re-complementation unit 194. As a result, for example, the character code “2” is output for the re-complementation pattern 202 in FIG. 50(a), and the character code “7” is output for the re-complementation pattern 204 in FIG. 50(b). The character code obtained as the recognition result is then output to the knowledge reference unit 196 for the frame contact state and recognition.
[0261]
The knowledge reference unit 196 for the frame contact state and recognition obtains the type of variation based on the position information of the circumscribed rectangle of the character complement pattern supplemented by the character complementing unit 193 or of the re-complementation pattern supplemented by the re-complementation unit 194, and on the position information and width information of the character frame extracted from the frame contact character pattern 201 of FIG. 50(a) or the frame contact character pattern 203 of FIG. 50(b).
[0262]
That is, it determines which variation applies: a variation of the character with respect to the character frame, such as “position shift”, “size variation”, or “tilt variation” as shown in FIG. 43, or a variation of the character frame, such as “tilt variation”, “frame width variation”, or “frame unevenness”. Further, the variation amounts dx, dy, dsx, dsy, dα, w, fsx, fsy, fα, and fδ are calculated for each variation type obtained.
[0263]
Next, the knowledge table 167 is searched using the calculated variation type information and variation amount information and the character code input from the basic character recognition unit 195 as key items, and it is checked whether a row whose variation type information, variation amount information, and misread character pair match the key items is registered in the knowledge table 167.
[0264]
When there is a row that matches the key items, it is determined whether or not the reliability registered in this row is equal to or higher than a predetermined threshold. When the reliability is lower than the threshold, the character complement pattern supplemented by the character complementing unit 193 or the re-complementation pattern supplemented by the re-complementation unit 194 is output to the re-character recognition unit 197, and character recognition is performed again according to the re-recognition method registered in that row.
[0265]
That is, the frame contact character included in the unknown image data is re-recognized by a method different from that of the basic character recognition unit 195, using the character complement pattern supplemented by the character complementing unit 193, the re-complementation pattern supplemented by the re-complementation unit 194, or the binary image data of the unknown character. The character code obtained by the re-recognition is then output.
[0266]
For example, when the basic character recognition unit 195 outputs the character code “7” as the recognition result of the re-complementation pattern 204 supplemented by the re-complementation unit 194, the knowledge reference unit 196 for the frame contact state and recognition Based on the position information of the circumscribed rectangle of the complement pattern 204 and the position information and width information of the character frame extracted from the frame contact character pattern 203, the type and amount of change are obtained. As a result, “lower position deviation” is calculated as the type of fluctuation, “dy = 5” is calculated as the fluctuation amount of this “lower position deviation”, and “w = 5” is calculated as the width of the character frame.
[0267]
The knowledge reference unit 196 for the frame contact state and recognition searches the knowledge table 167 of FIG. 45 using, as key items, “lower position deviation” as the variation type, “dy = 5” as the variation amount of the “lower position deviation”, “w = 5” as the width of the character frame, and the character code “7” input from the basic character recognition unit 195. As a result of this search, it finds that the misread character pair (2, 7) is registered in the row corresponding to these key items, that the reliability of the character code “7” recognized by the basic character recognition unit 195 is 77%, and that “2” is misread as “7” with a probability of 23%.
[0268]
In this case, since the reliability registered in the row corresponding to these key items is lower than the predetermined threshold, the re-character recognition unit 197 re-recognizes the frame contact character pattern 203 included in the unknown image data by a method different from that of the basic character recognition unit 195. At this time, the re-character recognition unit 197 refers to the row corresponding to the key items in the knowledge table 167 and identifies the re-recognition method.
[0269]
That is, the re-character recognition unit 197 is instructed to use “region emphasis” as the re-recognition method and, as the re-recognition region for “region emphasis”, to emphasize and re-recognize only the m/2 × n partial region 205 in the upper half of the re-complementation pattern 204. It is also informed that the re-recognition reliability in this case is 95%.
[0270]
The re-character recognition unit 197 re-recognizes only the upper-half partial region 205 according to the re-recognition method registered in the knowledge table 167. It then finds that the partial region 205 of the re-complementation pattern 204 matches the partial region 207 of the character pattern 206 corresponding to the character code “2” with a probability of 95%, and matches the partial region 209 of the character pattern 208 corresponding to the character code “7” with a probability of 5%. It therefore outputs the character code “2” as the recognition result for the character touching the frame in the unknown frame contact character pattern 203.
[0271]
FIG. 51 is a flowchart showing the operation of the knowledge reference unit 196 for frame contact state and recognition.
In FIG. 51, first, as shown in step S171, based on the frame extracted from the unknown frame contact character pattern and the character pattern separated from the frame contact character pattern, the variation amount for the character frame is calculated. The knowledge table 167 is searched using the variation amount as a key item. Then, it is checked whether or not there is a row in the knowledge table 167 in which a variation amount that matches the calculated variation amount exists.
[0272]
As a result, for example, when dy = 5 and w = 5 are calculated as the variation amounts for the character “2” in the lower position deviation state, the top row of the knowledge table 167 shown in FIG. 45 is detected.
[0273]
If a row with matching variation amounts exists, the process proceeds to step S172, where it is checked whether, among the rows with matching variation amounts, there is a row whose misread character pair contains the character code (character recognition code) input from the basic character recognition unit 195.
[0274]
As a result, for example, in the case of the character “2” in the lower position deviation state, the top row of the knowledge table 167 shown in FIG. 45 is detected.
Then, as shown in step S173, when a row whose misread character pair contains the character code input from the basic character recognition unit 195 exists among the rows with matching variation amounts, the re-recognition reliability registered in the corresponding row of the knowledge table 167 is compared with the reliability calculated by the basic character recognition unit 195, and it is determined whether or not the re-recognition reliability registered in the corresponding row of the knowledge table 167 is greater than the reliability calculated by the basic character recognition unit 195.
[0275]
As a result, for example, in the case of the character “2” in the lower position deviation state, the re-recognition reliability registered in the top row of the knowledge table 167 shown in FIG. 45 and the reliability calculated by the basic character recognition unit 195 are “95%” and “77%”, respectively, and the re-recognition reliability registered in the corresponding row of the knowledge table 167 is determined to be greater than the reliability calculated by the basic character recognition unit 195.
[0276]
When the re-recognition reliability registered in the corresponding row of the knowledge table 167 is greater than the reliability calculated by the basic character recognition unit 195, the process proceeds to step S174, where it is determined whether or not the re-recognition reliability registered in the corresponding row of the knowledge table 167 is greater than a predetermined threshold th1. If the re-recognition reliability is greater than the threshold th1, the process proceeds to step S175, where the “re-recognition method” and “re-recognition region” registered in the row of the knowledge table 167 detected in step S172 are referred to.
[0277]
Next, as shown in step S176, the “re-recognition region” indicated by the knowledge table 167 is cut out from the character complement pattern supplemented by the character complementing unit 193 or the re-complementation pattern supplemented by the re-complementation unit 194, and character recognition is performed on this cut-out region by the “re-recognition method” indicated by the knowledge table 167. The character code obtained by this character recognition is then output.
[0278]
Thereby, for example, when the threshold th1 is smaller than “95%”, the complement pattern of the character “2” in the lower position deviation state, which was input to the basic character recognition unit 195, is recognized again by the “region emphasis” method using the “m/2 × n” region in its upper half, and the character code “2” is finally output.
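The decision flow of steps S171 to S176 can be summarised in a minimal sketch. The row layout, the key names, and the `rerecognize` callback are hypothetical. The text does not say what is output when one of the checks fails, so the sketch assumes that the character code from the basic character recognition unit is kept in that case.

```python
def refer_knowledge(table, variation, code, basic_reliability, th1, rerecognize):
    """Return the final character code following steps S171 to S176.

    table: list of row dicts (hypothetical layout of the knowledge table).
    rerecognize(method, region) -> code is a hypothetical callback that
    cuts out the region and re-recognises it with the given method.
    """
    for row in table:
        if row["variation"] != variation:               # S171: variation amounts match?
            continue
        if code not in row["misread_pair"]:             # S172: code in misread pair?
            continue
        if row["re_reliability"] <= basic_reliability:  # S173: better than basic result?
            continue
        if row["re_reliability"] <= th1:                # S174: above threshold th1?
            continue
        # S175: read method and region; S176: re-recognise the cut-out region.
        return rerecognize(row["re_method"], row["re_region"])
    return code  # assumption: keep the basic recognition result otherwise


TABLE = [{
    "variation": {"type": "lower position deviation", "dy": 5, "w": 5},
    "misread_pair": ("2", "7"),
    "re_reliability": 0.95,
    "re_method": "region emphasis",
    "re_region": "m/2 x n",
}]

result = refer_knowledge(
    TABLE,
    {"type": "lower position deviation", "dy": 5, "w": 5},
    code="7", basic_reliability=0.77, th1=0.90,
    rerecognize=lambda method, region: "2",  # stand-in for real re-recognition
)
print(result)
```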
[0279]
The frame contact character recognition method is described, for example, in the specification and drawings of Japanese Patent Application No. 7-205564.
Next, an embodiment of the character string recognition unit 15 in FIG. 3 will be described.
[0280]
For the character string extracted by the layout analysis in step S2 of FIG. 4, the character string recognition unit 15 decides whether to integrate patterns into characters by using parameters, that is, feature values used when cutting out characters one by one from the character string. Instead of determining the thresholds for these parameters heuristically, statistically reasonable values are set.
[0281]
Specifically, for each parameter, statistical data are collected on the parameter values and on whether character integration succeeded or failed for those values. Then, instead of evaluating each parameter individually, all the parameters are regarded as one point in a multidimensional space, and a discriminant plane that separates the two groups, integration success and integration failure, is obtained in the multidimensional space by a multivariate analysis method.
[0282]
That is, the sample data composed of P feature values indicating the features of a pattern are classified into a first group indicating successful extraction and a second group indicating failed extraction, and a discriminant plane separating the first group and the second group is generated in the P-dimensional space.
[0283]
This discriminant plane can be obtained by, for example, a discriminant analysis method. That is, when the discriminant plane is configured by a linear discriminant function, the coefficient vector of the discriminant function is given by
$\Sigma^{-1}(\mu_1 - \mu_2)$ (3)
[0284]
Here,
Σ: population covariance matrix of the first group and the second group,
μ₁: population mean vector of the first group,
μ₂: population mean vector of the second group.
[0285]
The discriminant function having the coefficient vector of equation (3) is configured to be equidistant from the centroids of the first group and the second group.
Note that the coefficient vector of this discriminant function can also be calculated based on the criterion of maximizing the ratio of inter-group variation to intra-group variation between the first group and the second group.
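As an illustration of equation (3), the following is a minimal sketch in plain Python. It is not taken from the specification: the function names are hypothetical, the covariance is assumed to be the pooled within-group covariance, and the constant term is chosen so that the plane passes through the midpoint of the two group centroids, which is one way to make it equidistant from them. The covariance matrix is assumed to be non-singular.

```python
def mean_vector(samples):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]


def pooled_covariance(group1, group2):
    """Pooled covariance of two groups (assumed estimator for Sigma)."""
    m1, m2 = mean_vector(group1), mean_vector(group2)
    p = len(m1)
    cov = [[0.0] * p for _ in range(p)]
    for samples, m in ((group1, m1), (group2, m2)):
        for x in samples:
            d = [xi - mi for xi, mi in zip(x, m)]
            for i in range(p):
                for j in range(p):
                    cov[i][j] += d[i] * d[j]
    dof = len(group1) + len(group2) - 2
    return [[v / dof for v in row] for row in cov]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def discriminant(group1, group2):
    """Return (w, bias) of the linear discriminant w.x + bias = 0."""
    m1, m2 = mean_vector(group1), mean_vector(group2)
    diff = [a - b for a, b in zip(m1, m2)]
    w = solve(pooled_covariance(group1, group2), diff)  # Sigma^-1 (mu1 - mu2)
    mid = [(a + b) / 2 for a, b in zip(m1, m2)]
    bias = -sum(wi * ci for wi, ci in zip(w, mid))
    return w, bias
```

With this sign convention, points of the first group (successful extraction) tend to give a positive value of w.x + bias and points of the second group a negative value.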
[0286]
In addition, the process of cutting out characters from a character string is divided into statistical processing, which integrates patterns based on the position, size, arrangement, etc. of the circumscribed rectangles of the patterns, and non-statistical processing, which focuses on the shape of the patterns in order to handle muddy points (dakuten, that is, voiced sound marks) and separated characters in the character string.
[0287]
In the statistical processing, the position of the circumscribed rectangle of a pattern, its aspect ratio, its size ratio to the average character size, the distance between adjacent patterns, the size when merged, the overlap width of patterns, the density of the character string, and so on are used as cut-out parameters.
[0288]
For example, as shown in FIG.
1) The distance a between the right frame of the circumscribed rectangle 211 and the left frame of the circumscribed rectangle 212,
2) The distance b between the left frame of the circumscribed rectangle 211 and the right frame of the circumscribed rectangle 212,
3) Ratio c between the distance a between the right frame of the circumscribed rectangle 211 and the left frame of the circumscribed rectangle 212 and the distance b between the left frame of the circumscribed rectangle 211 and the right frame of the circumscribed rectangle 212;
4) The ratio d between the distance b between the left frame of the circumscribed rectangle 211 and the right frame of the circumscribed rectangle 212 and the circumscribed rectangle average width MX;
5) An angle e formed between a lower frame of the circumscribed rectangle 213 and a straight line connecting the midpoint of the lower frame of the circumscribed rectangle 213 and the midpoint of the lower frame of the circumscribed rectangle 214,
6) An angle f formed between the lower frame of the circumscribed rectangle 213 and a straight line connecting the lower right vertex of the circumscribed rectangle 213 to the lower left vertex of the circumscribed rectangle 214,
7) When the circumscribed rectangle 215 and the circumscribed rectangle 216 overlap, the ratio g of the distance p between the right frame of the circumscribed rectangle 215 and the left frame of the circumscribed rectangle 216 to the distance q between the left frame of the circumscribed rectangle 215 and the right frame of the circumscribed rectangle 216,
are used as cut-out parameters.
[0289]
That is,
c = a / b (4)
d = b / MX (5)
g = p / q (6)
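The parameters of equations (4) and (5) and the angle e can be sketched as follows. This is only an illustrative sketch: the rectangle layout `(left, top, right, bottom)`, the image coordinate convention (y grows downward), the sign of the angle, and the function names are all assumptions, not details given in the specification. Equation (6) has the same form as equation (4) and is not repeated.

```python
import math


def statistical_parameters(r1, r2, mean_width):
    """Distances a, b and ratios c, d for two adjacent rectangles.

    r1 is the left rectangle, r2 the right one, each given as
    (left, top, right, bottom); mean_width is the average width MX.
    """
    a = r2[0] - r1[2]      # right frame of r1 to left frame of r2
    b = r2[2] - r1[0]      # left frame of r1 to right frame of r2
    c = a / b              # equation (4)
    d = b / mean_width     # equation (5)
    return a, b, c, d


def baseline_angle(r1, r2):
    """Angle e in degrees between the lower frame of r1 and the line
    joining the midpoints of the lower frames of r1 and r2.
    Positive when r2 sits higher than r1 (assumed convention)."""
    mid1 = (r1[0] + r1[2]) / 2
    mid2 = (r2[0] + r2[2]) / 2
    return math.degrees(math.atan2(r1[3] - r2[3], mid2 - mid1))
```

A pair of rectangles with a small gap relative to their combined span gives a small value of c, which is the kind of evidence the discriminant plane later weighs together with the other parameters.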
[0290]
Next, statistical processing will be described with reference to the flowchart of FIG.
First, as shown in step S181, a circumscribed rectangle of a connected pattern is extracted. Next, as shown in step S182, it is checked whether there is another circumscribed rectangle on the right side of the circumscribed rectangle extracted in step S181. If there is no other circumscribed rectangle on the right side, the circumscribed rectangle extracted in step S181 is excluded from the target of statistical processing.
[0291]
On the other hand, if it is determined in step S182 that there is another circumscribed rectangle to the right of the circumscribed rectangle extracted in step S181, the process proceeds to step S184.
Also, as shown in step S183, the average character size of the circumscribed rectangle of the character string is calculated. Here, when the average character size of the circumscribed rectangle of the character string is calculated, each character has not been cut out yet, so strictly speaking, an accurate average character size cannot be calculated.
[0292]
Therefore, for example, the average character size is provisionally calculated by provisionally integrating the circumscribed rectangles of the connected patterns. As a provisional integration method, for example, adjacent connected patterns are provisionally integrated if the aspect ratio P of the integrated pattern satisfies
N (= 0.8) < P < M (= 1.2)
Then, the average character size after provisional integration is calculated. The average character size of the circumscribed rectangles of the character string may also be obtained by generating a frequency histogram for each circumscribed rectangle size.
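The provisional integration can be sketched as below. This is a hedged illustration rather than the method of the specification: it assumes rectangles given as `(left, top, right, bottom)` and sorted from left to right, takes the aspect ratio P as width divided by height, and merges greedily in pairs; none of these choices is stated in the text.

```python
def merge(r1, r2):
    """Bounding rectangle of two rectangles (left, top, right, bottom)."""
    return (min(r1[0], r2[0]), min(r1[1], r2[1]),
            max(r1[2], r2[2]), max(r1[3], r2[3]))


def aspect_ratio(r):
    """Width / height; the direction of the ratio is an assumption."""
    return (r[2] - r[0]) / (r[3] - r[1])


def provisional_average_size(rects, low=0.8, high=1.2):
    """Provisionally merge adjacent pairs whose merged aspect ratio P
    satisfies low < P < high, then return (merged, mean_w, mean_h)."""
    merged = []
    i = 0
    while i < len(rects):
        current = rects[i]
        if i + 1 < len(rects):
            candidate = merge(current, rects[i + 1])
            if low < aspect_ratio(candidate) < high:
                current = candidate
                i += 1  # the right neighbour has been absorbed
        merged.append(current)
        i += 1
    n = len(merged)
    mean_w = sum(r[2] - r[0] for r in merged) / n
    mean_h = sum(r[3] - r[1] for r in merged) / n
    return merged, mean_w, mean_h
```

For example, two narrow rectangles that together form a roughly square box are merged before the averages MX and MY are taken, so that fragments of one character do not drag the average width down.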
[0293]
Next, as shown in step S184, parameters a to g in FIG. 52 are calculated.
The non-statistical processing targets muddy points and separated characters in the character string, and is divided into separated character processing and muddy point processing.
[0294]
In the processing for separated characters, pattern inclination, line density, size when adjacent patterns are integrated, and distance between patterns are used as cut-out parameters.
[0295]
For example, as shown in FIG.
8) Ratio p between the distance a between the right frame of the circumscribed rectangle 221 and the left frame of the circumscribed rectangle 222 and the distance b between the left frame of the circumscribed rectangle 221 and the right frame of the circumscribed rectangle 222;
9) The ratio q between the distance b between the left frame of the circumscribed rectangle 221 and the right frame of the circumscribed rectangle 222 and the circumscribed rectangle average width MX;
10) Ratio r of the product of the area c of the circumscribed rectangle 221 and the area d of the circumscribed rectangle 222 to the square of the product of the circumscribed rectangle average width MX and the circumscribed rectangle average height MY,
are used as cut-out parameters.
[0296]
That is,
p = a / b (7)
q = b / MX (8)
r = (c × d) / (MX × MY)² ... (9)
It is.
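Equations (7) to (9) can be written out as a small helper. The rectangle layout `(left, top, right, bottom)` and the function names are assumptions made for this sketch only.

```python
def area(r):
    """Area of a rectangle given as (left, top, right, bottom)."""
    return (r[2] - r[0]) * (r[3] - r[1])


def shape_parameters(r1, r2, mean_width, mean_height):
    """Return (p, q, r) of equations (7) to (9) for the left
    rectangle r1 and the right rectangle r2."""
    a = r2[0] - r1[2]                      # gap between the rectangles
    b = r2[2] - r1[0]                      # total span of both rectangles
    p = a / b                              # equation (7)
    q = b / mean_width                     # equation (8)
    r = (area(r1) * area(r2)) / (mean_width * mean_height) ** 2  # equation (9)
    return p, q, r
```

Because r divides by the square of the average character area, it is dimensionless and does not depend on the scanning resolution.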
[0297]
Next, the separation character processing will be described with reference to the flowchart of FIG. In this separation character processing, for example, a separation character composed of two or more connected patterns such as “c” or “le” is detected.
[0298]
First, as shown in step S191, it is determined whether there is a pattern that rises to the right among the connected patterns. Then, if there is no pattern that rises to the right, it is excluded from the target of separation character processing.
[0299]
On the other hand, if it is determined in step S191 that there is a pattern rising to the right, the process proceeds to step S192, where it is determined whether there is a pattern that is adjacent on the right side of the right-rising pattern and falls to the right, that is, for example, a pattern corresponding to “c”, or a pattern that is adjacent on the right side of the right-rising pattern and whose number of intersections with the pattern when searched in the perpendicular direction (perpendicular line density) is 2, that is, for example, a pattern corresponding to “le”. If the pattern does not have a shape such as “c” or “le”, it is excluded from the target of separation character processing.
[0300]
On the other hand, if it is determined in step S192 that the pattern has a shape such as “c” or “le”, the process proceeds to step S194.
In addition to steps S191 and S192, in step S193, the average character size of the circumscribed rectangle of the character string is calculated.
[0301]
After steps S192 and S193 are completed, the values of the parameters p to r shown in FIG. 54 are calculated in step S194.
In the muddy point processing, attention is paid to the muddy point candidate pattern, and, for example, the size when that pattern and the adjacent pattern are integrated, the distance between the two patterns, and the ratio of that size to the average character size are used as cut-out parameters.
[0302]
That is, as shown in FIG.
11) Ratio p between the distance a between the right frame of the circumscribed rectangle 231 and the left frame of the circumscribed rectangle 232 and the distance b between the left frame of the circumscribed rectangle 231 and the right frame of the circumscribed rectangle 232,
12) The ratio q between the distance b between the left frame of the circumscribed rectangle 231 and the right frame of the circumscribed rectangle 232 and the circumscribed rectangle average width MX;
13) The ratio r of the product of the area c of the circumscribed rectangle 231 and the area d of the circumscribed rectangle 232 to the square of the product of the circumscribed rectangle average width MX and the circumscribed rectangle average height MY,
are used as cut-out parameters.
[0303]
That is, the parameters p to r can be expressed in the same manner as the equations (7) to (9).
Next, muddy point processing will be described with reference to the flowchart of FIG.
[0304]
First, in step S201, a pattern that is a muddy point candidate is extracted. For example, when two connected patterns extracted by the connected pattern extracting unit 1 are adjacent to each other, and the ratio of their combined size to the average character size of the circumscribed rectangles of the character string is equal to or less than a predetermined threshold, for example ¼, the patterns are extracted as a muddy point candidate.
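The candidate test of step S201 might look like the following sketch. The specification does not say how "size" is measured; here it is assumed to be the area of the bounding rectangle of the two patterns, compared with the average character area MX × MY. The function name and rectangle layout `(left, top, right, bottom)` are also hypothetical.

```python
def is_muddy_point_candidate(r1, r2, mean_width, mean_height, threshold=0.25):
    """True when the merged size of two adjacent connected patterns is
    at most `threshold` times the average character size.

    Size is taken as bounding-box area, which is an assumption.
    """
    left = min(r1[0], r2[0])
    top = min(r1[1], r2[1])
    right = max(r1[2], r2[2])
    bottom = max(r1[3], r2[3])
    merged_area = (right - left) * (bottom - top)
    return merged_area / (mean_width * mean_height) <= threshold
```

Two small marks of about 2 × 2 pixels next to each other in a string of roughly 10 × 10 characters would pass this test, whereas two full-sized strokes would not.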
[0305]
Next, as shown in step S202, it is checked whether there is an adjacent circumscribed rectangle on the left side of the pattern that is a muddy point candidate. When there is no circumscribed rectangle adjacent to the left of the pattern that is a muddy point candidate, the pattern that is a muddy point candidate is excluded from the target of muddy point processing.
[0306]
On the other hand, if it is determined in step S202 that there is a circumscribed rectangle adjacent to the left of the pattern that is a muddy point candidate, the process proceeds to step S204.
In addition to steps S201 and S202, the average character size of the circumscribed rectangle of the character string is calculated in step S203. Then, after the processing of steps S202 and S203 is completed, the values of parameters p to r shown in FIG. 56 are calculated in step S204.
[0307]
Next, using the learning data, a discriminant plane for calculating the reliability of character extraction for an unknown handwritten character string is set. When the number of parameters is n, a discriminant plane separating the group in which extraction succeeded from the group in which extraction failed is generated in an n-dimensional space.
[0308]
FIG. 58 is a flowchart showing a method for calculating the success / failure data for cutout.
In FIG. 58, first, in step S211, it is visually determined, for the learning data collected in advance, whether or not the circumscribed rectangle of interest and the circumscribed rectangle adjacent to it form one character when integrated. When integrating the two circumscribed rectangles forms one character, the process proceeds to step S212; when it does not, the process proceeds to step S213.
[0309]
In step S212, for the case of successful integration, in which the circumscribed rectangle of interest and the adjacent circumscribed rectangle are integrated into one character, the parameter values for the two circumscribed rectangles are recorded. Here, the parameters a to g in FIG. 52 can be used for the statistical processing, and the parameters p to r in FIGS. 54 and 56 can be used for the non-statistical processing.
[0310]
In step S213, for the case of unsuccessful integration, in which the circumscribed rectangle of interest and the adjacent circumscribed rectangle do not form one character when integrated, the parameter values for the two circumscribed rectangles are recorded.
[0311]
Next, for an unknown character string, the cut-out parameter values in the statistical processing and in the non-statistical processing are calculated, the distance from the discriminant plane obtained from the learning data to the point in the multidimensional space determined by those parameter values is obtained, and this distance is quantified as the extraction reliability.
[0312]
For example, when the number of feature parameters is 3, as shown in FIG. 59, let H be the discriminant plane that discriminates between the two groups of cutout success and cutout failure, let n be the unit normal vector of the discriminant plane H, and let p be the vector of parameter values. Then the distance h from the discriminant plane to the point p in the three-dimensional space corresponding to the parameter values is expressed as
h = OP · n (10)
Here, OP is the vector from the origin O of the three-dimensional space toward the point p in the three-dimensional space.
[0313]
The sign of the distance h from the discriminant plane H indicates which group the parameter values belong to, that is, the group that succeeds in cutting or the group that fails in cutting, and its magnitude indicates how far the parameter values are from the discriminant plane H.
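The signed distance and the group assignment can be sketched as follows. The plane is assumed here to be written in the form w·x + offset = 0, as in the numerical plane equations given later in this description, and the convention that positive distances mean successful cutting is an assumption; the function names are hypothetical.

```python
import math


def signed_distance(coeffs, offset, point):
    """Signed distance from the plane coeffs.x + offset = 0 to point.

    Dividing by the length of coeffs makes the normal vector a unit
    vector, so the result is a true geometric distance.
    """
    norm = math.sqrt(sum(w * w for w in coeffs))
    return (sum(w * x for w, x in zip(coeffs, point)) + offset) / norm


def assign_group(h):
    """Assumed convention: positive side is 'success', otherwise 'failure'."""
    return "success" if h > 0 else "failure"
```

For the plane z − 2 = 0, the point (5, 5, 5) lies at distance 3 on the positive side and is assigned to the success group under this convention.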
[0314]
Next, as shown in FIG. 60, a histogram distribution 241 indicating successful extraction and a histogram distribution 242 indicating failed extraction are obtained based on the distance h from the discriminant plane H for all parameters of the learning data in the multidimensional space. Since the histogram distributions 241 and 242 generally follow normal distributions, they are approximated by normal distributions. These normal distributions usually partially overlap.
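Approximating each group of distances by a normal distribution amounts to estimating a mean and a variance per group. The sketch below uses Python's `statistics` module; the function names and the use of the population variance are assumptions for illustration.

```python
import math
import statistics


def fit_normal(distances):
    """Return (mean, variance) of the distances h of one group."""
    m = statistics.fmean(distances)
    v = statistics.pvariance(distances, mu=m)
    return m, v


def normal_pdf(x, m, v):
    """Density of the normal distribution with mean m and variance v."""
    return math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
```

Fitting the success group and the failure group separately gives two curves whose tails overlap near the discriminant plane, which is the region treated specially in the following paragraphs.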
[0315]
In the present embodiment, for adjacent patterns whose extraction parameters are located in the overlapping region, whether or not to integrate them is determined in consideration of the reliability of character recognition in addition to the extraction reliability.
[0316]
FIG. 61 is a flowchart illustrating an example of a method for calculating the extraction reliability.
In FIG. 61, first, as shown in step S221, the distance h from the discrimination surface H to a point on the multidimensional space determined by the values of a plurality of parameters is calculated by the equation (10).
[0317]
Next, as shown in step S222, the histogram distribution of a plurality of parameter values obtained from the learning data is approximated by a normal distribution. That is, for example, as shown in FIG. 62, the successful distribution of the histogram is approximated by a normal distribution 251, and the failed distribution of the histogram is approximated by a normal distribution 252.
[0318]
Next, in step S223, the overlap region of the two groups is calculated. For example, as shown in FIG. 62, the region where the normal distribution 251 of successful extraction overlaps the normal distribution 252 of failed extraction is calculated as the two-group overlap region 254. The region 253 of the normal distribution 251 of successful extraction outside the two-group overlap region 254 is set as the successful extraction region, and the region 255 of the normal distribution 252 of failed extraction outside the two-group overlap region 254 is set as the failed extraction region.
[0319]
Next, as shown in step S224, the position of the input parameter value for the unknown character on the histogram distribution is determined.
Next, as shown in step S225, if the result of determining the position of the input parameter value for the unknown character on the histogram distribution shows that the input parameter value is included in the two-group overlap region 254, the process proceeds to step S226, where the extraction reliability is calculated based on the position of the input parameter value within the two-group overlap region 254.
[0320]
On the other hand, if it is determined in step S225 that the input parameter value for the unknown character is not included in the two-group overlap region 254, the process proceeds to step S227, where it is determined whether the input parameter value for the unknown character is included in the successful extraction region 253.
[0321]
If it is determined that the input parameter value for the unknown character is included in the successful extraction region 253, the process proceeds to step S228, and the extraction reliability is set to “1”. If it is determined that the input parameter value is not included in the successful extraction region 253, the process proceeds to step S229, and the extraction reliability is set to “0”.
[0322]
For example, in FIG. 62, when the distance from the discrimination plane for the unknown character to the input parameter value is calculated and the distance from the discrimination plane for the input parameter value for the unknown character is included in the overlap area 254, the unknown character The extraction reliability is calculated based on the distance from the discrimination surface of the input parameter value for. In addition, when the distance from the determination surface of the input parameter value for the unknown character is included in the extraction success area 253, the extraction reliability is set to “1”. Further, when the distance from the determination surface of the input parameter value for the unknown character is included in the cut-out failure area 255, the cut-out reliability is set to “0”.
[0323]
FIG. 63 is a flowchart illustrating an example of a method of calculating two groups of overlapping areas.
In FIG. 63, first, as shown in step S231, the average value m and the variance value v of the histogram 261 are calculated for each of the successful extraction histogram distribution and the unsuccessful histogram distribution obtained from the learning data.
[0324]
Next, in step S232, the sum d of the square errors of the normal distribution curve 262 and the histogram 261 is calculated for the histogram distribution of successful extraction and the histogram distribution of failure of extraction.
[0325]
Next, in step S233, the fitness T is calculated by the following equation (11).
T = d / S (11)
Here, S is the area of the normal distribution curve 262.
[0326]
Next, in step S234, the distance L from the center to the end of the normal distribution curve 262 is calculated by the following equation (12).
$L = k \times (1 + T) \times v^{1/2}$ (12)
Here, k is a proportionality constant, and $v^{1/2}$ is equal to the standard deviation.
[0327]
Next, in step S235, the region from the right end 267 of the normal distribution curve 263 to the left end 266 of the normal distribution curve 264 is set as the two-group overlap region 265.
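Equation (12) and the resulting overlap region can be sketched as follows. The sketch assumes, as in the numerical example given later in this description, that the success distribution lies at larger distances than the failure distribution, so that the overlap runs from the left end of the success curve to the right end of the failure curve. The fitness T is taken as an input, and the function names are hypothetical.

```python
import math


def half_width(variance, fitness, k=2.0):
    """Distance L from the centre of a distribution to its end,
    L = k * (1 + T) * sqrt(v), as in equation (12)."""
    return k * (1.0 + fitness) * math.sqrt(variance)


def overlap_region(success, failure, k=2.0):
    """Return (w1, w2), the ends of the two-group overlap region,
    or None when the two distributions do not overlap.

    success and failure are (mean, variance, fitness) triples.
    """
    ms, vs, ts = success
    mf, vf, tf = failure
    w1 = ms - half_width(vs, ts, k)   # left end of the success curve
    w2 = mf + half_width(vf, tf, k)   # right end of the failure curve
    return (w1, w2) if w1 < w2 else None
```

A larger T, meaning a poorer fit of the normal curve to the histogram, widens the distribution and therefore widens the region in which the extraction decision is treated as uncertain.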
[0328]
Next, it is determined whether or not the recognition process is performed on the extracted character candidate based on the extraction reliability obtained by the process of FIG. In this case, for example, the recognition process is not performed on a candidate for a cutout character having a high cutout reliability, and the recognition process is performed only on a candidate for a cutout character having a low cutout reliability.
[0329]
Here, for a plurality of cutout character candidates, the cutout characters are determined in consideration of not only the recognition reliability but also the cutout reliability. As a result, candidates that look like characters when viewed locally but are incorrect when the whole character string is viewed can be excluded from the cutout characters. For example, when the extraction reliability of each adjacent pattern or extraction confirmation unit is $\alpha_i$, the recognition reliability is $\beta_i$, and the weighting factor is j, the overall reliability R can be expressed as
$R = \sum_i (j \cdot \alpha_i + \beta_i)$ (13)
[0330]
Then, the one having the highest overall reliability R is selected as the final cut character from among a plurality of cut character candidates.
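Equation (13) and the selection of the best candidate can be sketched as below. The data layout, a list of (α, β) pairs per candidate held in a dictionary, and the function names are assumptions made for this illustration.

```python
def overall_reliability(units, weight=1.0):
    """R = sum over units of (j * alpha_i + beta_i), equation (13).

    units is a list of (alpha, beta) pairs; weight is the factor j.
    """
    return sum(weight * alpha + beta for alpha, beta in units)


def best_candidate(candidates, weight=1.0):
    """Name of the candidate with the highest overall reliability.

    candidates maps a candidate name to its list of (alpha, beta) pairs.
    """
    return max(candidates,
               key=lambda name: overall_reliability(candidates[name], weight))
```

Raising the weight j makes the choice depend more on how plausible the segmentation is and less on how confidently the individual characters were recognized.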
FIG. 64 is a diagram showing a case where characters are cut out one by one from the character string “Gunma”. Here, prior to cutting out the character string “Gunma”, the discriminant plane for the statistical processing and the non-statistical processing and the normal distribution curve of the histogram value are obtained individually using the learning data.
[0331]
Here, suppose that in the statistical processing the parameters c, e, and f of FIG. 52 are used as parameters for determining success or failure of character string segmentation, and that the equation of the discriminant plane obtained from the learning data is
0.84x0 + 0.43x1 + 0.33x2 − 145.25 = 0 (14)
[0332]
Further, the average value m of the histogram distribution indicating the success in cutting out the learning data shown in FIG. 63 is 128.942, the standard deviation is 34.77, and the fitness T is 0.12 from the equation (11). If the proportionality constant k is 2, the distance L from the center of the distribution to the end is 77.8 from the equation (12).
[0333]
Further, the average value m of the histogram distribution indicating the failure to cut out the learning data shown in FIG. 63 is 71.129, the standard deviation is 36.26, and the fitness T is 0.35 from the equation (11). If the proportionality constant k is 2, the distance L from the center of the distribution to the end is 92.2 from the equation (12).
[0334]
In FIG. 64, first, as shown in step S241, an input pattern for an unknown character is read by image input.
Next, in step S242, connected patterns are extracted by labeling, and label numbers (1) to (6) are attached to the extracted connected patterns as shown in FIG.
[0335]
Next, as shown in step S245, the extraction reliability is quantified based on the statistical processing in step S243 and the non-statistical processing in step S244.
In the statistical processing in step S243, the extraction reliability when adjacent connected patterns are integrated is calculated based on the distance h from the discriminant plane to the point in the three-dimensional space having the values of the parameters c, e, and f. This extraction reliability α can be expressed as, for example,
$\alpha = (h - w_1) / (w_2 - w_1) \times 100$ (15)
[0336]
Here,
w₁: position of the left end of the overlap region of the two groups,
w₂: position of the right end of the overlap region of the two groups.
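Equation (15) can be combined with the region test described earlier into one small function. Clamping to 0 and 100 outside the overlap region is an assumption that mirrors the reliabilities “0” and “1” assigned to the failure and success regions on the 0 to 100 scale of equation (15); it also assumes that larger distances h correspond to successful extraction. The function name is hypothetical.

```python
def extraction_reliability(h, w1, w2):
    """Extraction reliability on a 0-100 scale.

    h  : signed distance of the input point from the discriminant plane
    w1 : left end of the two-group overlap region
    w2 : right end of the two-group overlap region
    """
    if h <= w1:
        return 0.0      # failure region (assumed to lie at small h)
    if h >= w2:
        return 100.0    # success region (assumed to lie at large h)
    return (h - w1) / (w2 - w1) * 100.0   # equation (15)
```

Inside the overlap region the reliability grows linearly from 0 at w₁ to 100 at w₂, so a point in the middle of the region receives a reliability of 50.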
[0337]
For example, the extraction reliability is 80 when the pattern of label number (1) and the pattern of label number (2) are integrated, 12 when the pattern of label number (2) and the pattern of label number (3) are integrated, 28 when the pattern of label number (3) and the pattern of label number (4) are integrated, 92 when the pattern of label number (4) and the pattern of label number (5) are integrated, and 5 when the pattern of label number (5) and the pattern of label number (6) are integrated.
[0338]
Further, in the non-statistical processing in step S244, the extraction reliability for the pattern “g” having a muddy point candidate is calculated based on the distance h from the discriminant plane to the point in the three-dimensional space having the values of the parameters p to r in FIG.
[0339]
For example, the cutout reliability is 85 when the pattern of label number (1) and the muddy dot pattern of the cutout determination unit 271 including the pattern of label number (2) and the pattern of label number (3) are integrated.
[0340]
FIG. 65 shows a method of calculating the extraction reliability in the non-statistical processing in step S244.
First, in step S251, a pattern 272 that is a muddy point candidate is extracted. For example, when two connected patterns are adjacent to each other and the ratio of their combined size to the average character size of the circumscribed rectangles of the character string is equal to or less than a predetermined threshold, the patterns are regarded as a muddy point candidate.
[0341]
Next, in step S252, it is checked whether there is a circumscribed rectangle 281 adjacent to the left of the pattern 272 that is a muddy point candidate. In this case, it is determined that there is a circumscribed rectangle 281 adjacent to the left of the pattern 272, so the process proceeds to step S253, and the values of the parameters p to r in FIG. 56 are calculated.
[0342]
In the example of FIG.
p = a / b = 0.1 (16)
q = b / MX = 1.3 (17)
r = (c × d) / (MX × MY)²= 0.3 (18)
It becomes.
[0343]
Here,
a: distance between the right frame of the circumscribed rectangle 281 and the left frame of the circumscribed rectangle 272,
b: distance between the left frame of the circumscribed rectangle 281 and the right frame of the circumscribed rectangle 272,
c: area of the circumscribed rectangle 281,
d: area of the circumscribed rectangle 272,
MX: circumscribed rectangle average width,
MY: circumscribed rectangle average height.
Next, as shown in step S254, the distance from the discrimination surface 293 to the point in the three-dimensional space having the values of the parameters p to r is calculated.
[0344]
In order to calculate the distance from the discriminant plane 293 to a point in the three-dimensional space having the values of the parameters p to r, the discriminant plane 293 is first calculated based on the learning patterns. This discriminant plane 293 can be obtained by, for example, equation (3) based on the histogram distribution 292 indicating successful extraction of the character strings of the learning patterns and the histogram distribution 291 indicating failure. For the muddy point extraction parameters p to r, for example, the equation of the discriminant plane 293 is
0.17x0 − 0.75x1 + 0.64x2 + 30.4 = 0 (19)
which is the equation of a plane in a three-dimensional space.
[0345]
Therefore, the distance h from the discrimination surface 293 is obtained by substituting the values of (16) to (18) into the equation (19),
h = 0.17 × 0.1−0.75 × 1.3 + 0.64 × 0.3 + 30.4
= 29.6 (20)
It becomes.
[0346]
In addition, the histogram distribution 292 indicating successful extraction of the learning data has an average value m of 38, a standard deviation of 25, and a fitness T of 0.2 from equation (11). The histogram distribution 291 indicating failed extraction of the learning data has an average value m of −34, a standard deviation of 28, and a fitness T of 0.3 from equation (11).
[0347]
In addition, when the proportionality constant k is 2, the left end w₁ of the histogram distribution 292 indicating successful extraction of the learning data is, from equation (12),
w₁ = 38 − 2 × (1 + 0.2) × 25 = −22 (21)
[0348]
In addition, when the proportionality constant k is 2, the right end w₂ of the histogram distribution 291 indicating failed extraction of the learning data is, from equation (12),
w₂ = −34 + 2 × (1 + 0.3) × 28 = 38.8 (22)
[0349]
Accordingly, the two-group overlap region 294 is the region whose distance from the discriminant plane is between −22 and 38.8.
Next, in step S255, the extraction reliability α is obtained. This extraction reliability α is obtained by substituting the values of (20) to (22) into the equation (15),
α = (29.6 − (− 22)) / (38.8 − (− 22)) × 100
= 85 (23)
It becomes.
[0350]
As a result, the pattern of label number (2) and the pattern of label number (3) are integrated to form the cutout determination unit 271.
Next, in step S246 of FIG. 64, the reliability of statistical processing and non-statistical processing is synthesized. At this time, if there is a cutout determination unit, it is given priority. Therefore, the reliability of the cutout determination unit 271 is preferentially combined.
[0351]
As a result, the cutout reliability is 85 when the pattern of label number (1) and the pattern of the cutout determination unit 271 are integrated, 30 when the pattern of the cutout determination unit 271 and the pattern of label number (4) are integrated, 92 when the pattern of label number (4) and the pattern of label number (5) are integrated, and 5 when the pattern of label number (5) and the pattern of label number (6) are integrated.
[0352]
For example, pattern integration is performed when the cutout reliability is greater than a predetermined threshold (for example, 90), or when the cutout reliability is greater than a predetermined threshold (for example, 70) and its ratio to the cutout reliability of the adjacent cutout pattern is greater than a predetermined value (for example, 5).
[0353]
Further, when the extraction reliability is smaller than a predetermined threshold value (for example, 8), pattern integration is not performed.
For example, when the pattern of label number (1) and the pattern of the cutout determination unit 271 are integrated, the cutout reliability is 85, and its ratio to the cutout reliability for the adjacent label number (4) is 85/30 = 2.8, so the pattern of label number (1) and the pattern of the cutout determination unit 271 are not integrated. When the pattern of the cutout determination unit 271 and the pattern of label number (4) are integrated, the cutout reliability is 30, so the pattern of the cutout determination unit 271 and the pattern of label number (4) are not integrated.
[0354]
Further, since the extraction reliability is 92 when the pattern of label number (4) and the pattern of label number (5) are integrated, the pattern of label number (4) and the pattern of label number (5) are integrated. When the pattern of label number (5) and the pattern of label number (6) are integrated, the extraction reliability is 5, so the pattern of label number (5) and the pattern of label number (6) are not integrated.
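The integration decision in this example can be sketched as a single predicate. The reading that the second condition compares the ratio of the cutout reliability to that of the adjacent cutout pattern is inferred from the 85/30 = 2.8 calculation above and is an assumption, as are the function and parameter names; the default thresholds 90, 70, 5, and 8 are the example values given in the text.

```python
def should_integrate(alpha, neighbour_alpha,
                     high=90.0, mid=70.0, ratio=5.0, low=8.0):
    """Decide whether two adjacent patterns are integrated.

    alpha           : cutout reliability of the pair under test
    neighbour_alpha : cutout reliability of the adjacent pair
    """
    if alpha < low:
        return False                      # clearly not one character
    if alpha > high:
        return True                       # clearly one character
    # Intermediate case: integrate only if this pair is much more
    # reliable than the adjacent one (assumed ratio test).
    return (alpha > mid and neighbour_alpha > 0
            and alpha / neighbour_alpha > ratio)
```

Applied to the example values, the pair with reliability 92 is integrated, while the pairs with reliabilities 85 (ratio 2.8 to its neighbour), 30, and 5 are left separate, which matches the outcome described above.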
[0355]
Thus, a circumscribed rectangle 275 corresponding to the cutout determination unit 273, which integrates the pattern of label number (4) and the pattern of label number (5), and a circumscribed rectangle 276 corresponding to the pattern of label number (6) are generated.
[0356]
In addition, the extraction reliability when the newly generated pattern of the cutout determination unit 273 and the pattern of the cutout determination unit 271 are integrated is obtained. This extraction reliability is 60 in the example of FIG.
[0357]
Next, as shown in step S247, when the pattern integration based on the cutout reliability is completed, cutout candidate 1 and cutout candidate 2 are extracted. Recognition processing is then performed on each character of cutout candidate 1 and cutout candidate 2, the cutout reliability α and the recognition reliability β are obtained for each character in each candidate, and the sum of the cutout reliability α and the recognition reliability β is defined as the overall reliability R.
[0358]
For example, when the circumscribed rectangles 275, 276, and 278 are extracted as cutout candidate 1, the recognition reliability β is 80 for the pattern in the circumscribed rectangle 278, 90 for the pattern in the circumscribed rectangle 275, and 85 for the pattern in the circumscribed rectangle 276.
[0359]
Further, since the cutout reliability α when the pattern of label number (1) and the pattern of the cutout determination unit 271 are integrated is 85, the overall reliability R obtained from formula (13) with the weighting factor j set to 1 is 345.
[0360]
When the circumscribed rectangles 276, 281, and 282 are extracted as cutout candidate 2, the recognition reliability β is 83 for the pattern in the circumscribed rectangle 281, 55 for the pattern in the circumscribed rectangle 282, and 85 for the pattern in the circumscribed rectangle 276.
[0361]
Further, when the pattern of the cutout determination unit 271 and the pattern of the cutout determination unit 273 are integrated, the cutout reliability α is 60, and the overall reliability R is 283.
[0362]
Next, in step S248, cutout candidate 1, which has the larger overall reliability R of cutout candidates 1 and 2, is selected as the successful cutout candidate. As a result, the characters “G”, “N”, and “MA” can be correctly cut out one by one from the character string “Gunma”.
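The selection in step S248 can be sketched as follows. Formula (13) is not reproduced in this part of the description, so the weighted sum R = j·Σα + Σβ used here is an assumption, as are the function and variable names. With the quoted figures it gives 283 for cutout candidate 2, matching the text, and a value slightly below the quoted 345 for cutout candidate 1; the ranking is the same.

```python
def overall_reliability(cutout_alphas, recognition_betas, j=1.0):
    """Assumed form of the overall reliability: weighted alpha plus beta."""
    return j * sum(cutout_alphas) + sum(recognition_betas)


# Figures quoted in the example for the two cutout candidates.
candidates = {
    "cutout candidate 1": {"alpha": [85], "beta": [80, 90, 85]},
    "cutout candidate 2": {"alpha": [60], "beta": [83, 55, 85]},
}

scores = {
    name: overall_reliability(c["alpha"], c["beta"])
    for name, c in candidates.items()
}
# The candidate with the larger overall reliability is kept.
best = max(scores, key=scores.get)
print(scores, best)
```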
[0363]
A method for performing character recognition processing in consideration of the reliability of extraction from a character string is described in, for example, the specification and drawings of Japanese Patent Application No. 7-234982.
[0364]
Next, the operation of the blurred character recognition unit 19 in FIG. 3 will be specifically described.
FIG. 66 is a block diagram showing an example of the configuration of the blurred character recognition unit 19. In FIG. 66, a feature extraction unit 301 extracts the features of a character from a blurred character pattern and represents the extracted features as a feature vector. The blurred dictionary 302 stores the feature vectors of each category for blurred characters. The collation unit 303 collates the feature vector of the character pattern extracted by the feature extraction unit 301 with the feature vector of each category stored in the blurred dictionary 302, and calculates the distance D_ij between the feature vectors in the feature space (i is the feature vector of the unknown character, and j is the feature vector of a category in the blurred dictionary 302). The category j that minimizes the distance D_ij between the feature vectors is then recognized as the category of the unknown character i.
[0365]
Here, the distance D_ij between feature vectors in the feature space is calculated using, for example, the Euclidean distance Σ(i − j)², the city block distance Σ|i − j|, or a discriminant function.
[0366]
Let the distance to the first-ranked category be D_ij1 and the distance to the second-ranked category be D_ij2. A Table 1 relating the first category j1, the second category j2, and the distance difference between the categories (D_ij2 − D_ij1) to the reliability is created in advance. A Table 2 relating the distance D_ij1 to the first category and the first category j1 to the reliability is also created in advance. The smaller of the reliabilities obtained from Table 1 and Table 2 is then stored in the intermediate processing result table.
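A minimal sketch of this matching step, using the city block distance, is given below. The contents of Table 1 and Table 2 are not given in this section, so they are passed in as hypothetical lookup functions; the dictionary values and all names are illustrative assumptions.

```python
def city_block(u, v):
    """City block distance between two feature vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def recognize(feature, dictionary, table1, table2):
    """Return the nearest category and the smaller of the two reliabilities.

    dictionary: category -> feature vector (at least two categories).
    table1(j1, j2, margin) and table2(j1, d1) stand in for Table 1 and
    Table 2; their real contents are not known here.
    """
    ranked = sorted((city_block(feature, vec), cat)
                    for cat, vec in dictionary.items())
    (d1, j1), (d2, j2) = ranked[0], ranked[1]
    reliability = min(table1(j1, j2, d2 - d1), table2(j1, d1))
    return j1, reliability


# Hypothetical dictionary and reliability tables, for illustration only.
blurred_dictionary = {"2": [0, 0], "4": [10, 0]}
table1 = lambda j1, j2, margin: min(100, margin * 10)
table2 = lambda j1, d1: max(0, 100 - d1 * 5)

category, reliability = recognize([1, 0], blurred_dictionary, table1, table2)
print(category, reliability)
```

Taking the minimum means a result is only trusted when it is both close to the first category and well separated from the second.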
[0367]
The crushed character recognition unit 21 in FIG. 3 can have the same configuration as the blurred character recognition unit 19, except that it uses a crushed dictionary storing the feature vectors of each category for crushed characters.
[0368]
Next, an example of the strikethrough recognition unit 26 in FIG. 3 will be described. For example, the strikethrough recognition unit 26 creates a histogram of each correction character candidate extracted by the correction analysis in step S4 of FIG. Assuming that a horizontal strikethrough line exists in a region where the histogram value exceeds a predetermined value, it removes the horizontal line existing in that region.
[0369]
Next, the portion that became blurred when the horizontal line was removed is complemented, and the complemented pattern is collated with the dictionary to perform character recognition. If the pattern is recognized as a character, the correction character candidate is regarded as a character with a strikethrough line; if it is rejected, the correction character candidate is regarded as a normal character.
[0370]
For example, in FIG. 67, a “5” corrected by a horizontal double line is input as a correction character candidate, and the pattern complemented after removing the horizontal double line is recognized as the category “5”, so the input pattern is regarded as a correction character. A plain “5” is also input as a correction character candidate, and the pattern obtained by removing the horizontal stroke of the “5” is rejected, so that input pattern is regarded as not being a correction character.
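The histogram step can be sketched on a small binary image. This is an illustration only: the threshold is expressed here as a fraction of the pattern width, which is an assumption, and the complementation and dictionary collation that follow are not shown.

```python
def remove_horizontal_lines(image, ratio=0.8):
    """Clear rows whose black-pixel count exceeds ratio * width.

    image: list of rows, each a list of 0/1 pixels (1 = black).
    Returns the cleaned image and the indices of the removed rows.
    """
    width = len(image[0])
    histogram = [sum(row) for row in image]      # black pixels per row
    line_rows = [y for y, count in enumerate(histogram)
                 if count > ratio * width]
    cleaned = [[0] * width if y in line_rows else list(row)
               for y, row in enumerate(image)]
    return cleaned, line_rows


# A toy pattern crossed by two full-width horizontal lines (rows 1 and 3).
pattern = [
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 1, 0],
]
cleaned, line_rows = remove_horizontal_lines(pattern)
print(line_rows)
```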
[0371]
Next, an example of the spurious character analysis unit 23 in FIG. 3 will be described. The spurious character analysis unit 23 clusters the handwritten characters recognized as belonging to the same category into a predetermined number of clusters. For clusters that belong to different categories but are separated by a small inter-cluster distance, it corrects the character category of the cluster with fewer elements to the character category of the cluster with more elements, so that handwritten characters mistakenly recognized as belonging to another category are read correctly.
[0372]
FIG. 68 is a diagram showing a clustering process using feature vectors of handwritten characters determined to belong to the character category “4”.
FIG. 68 shows handwritten characters determined to belong to the recognition result category “4” because the distance from the feature vector of the character category “4” stored in the recognition dictionary is short. In this recognition process, the character handwritten as “2” is erroneously recognized as belonging to the recognition result category of “4”.
[0373]
In the first clustering process, each handwritten character determined to belong to the character category “4” is regarded as one cluster. In the second clustering process, the distances between the feature vectors of the handwritten characters regarded as clusters are calculated, and the two clusters whose feature vectors are closest are integrated into one cluster. As a result, in the example of FIG. 68, the number of clusters is reduced by one, from 11 to 10.
[0374]
In the third and subsequent clustering processes as well, the feature vector distances between the clusters are calculated and the closest clusters are integrated, reducing the number of clusters. In the eleventh clustering process, the number of clusters becomes 1.
[0375]
Here, when integrating clusters, the city block distance, for example, is used to compare distances between clusters having one element, that is, between feature vectors. For clusters having a plurality of elements, the centroid method, for example, is used. In the centroid method, when the feature vector x_i of the i-th element (i = 1, 2, 3, ..., M) of a cluster having M elements is x_i = (x_i1, x_i2, x_i3, ..., x_iN), the representative vector x_m representing the cluster is expressed as the average of the feature vectors x_i of the elements of the cluster:
[0376]
[Expression 1]
$$x_m = \frac{1}{M}\sum_{i=1}^{M} x_i$$
[0377]
The distances between clusters having a plurality of elements are then compared by calculating the city block distance between their representative vectors x_m.
[0378]
If the clustering process is continued until the number of clusters becomes 1, the handwritten “2” that was mistakenly recognized as belonging to the character category “4” ends up in the same cluster as the handwritten “4” characters that were correctly recognized as belonging to the character category “4”. A clustering abort condition for aborting the clustering process is therefore set.
[0379]
As this clustering abort condition, any of the following conditions can be used, for example:
(1) the final number of clusters reaches a predetermined number (for example, 3);
(2) the distance between clusters at the time of cluster integration exceeds a predetermined threshold value;
(3) the rate of increase of the inter-cluster distance at the time of cluster integration exceeds a predetermined threshold value.
[0380]
FIG. 69 is a flowchart showing the clustering process.
In FIG. 69, first, as shown in step S261, only the feature vectors of handwritten characters recognized as belonging to a certain character category are extracted, and the extracted feature vectors of each handwritten character are regarded as one cluster.
[0381]
Next, as shown in step S262, a clustering abort condition for aborting the clustering process is set.
Next, as shown in step S263, two clusters having the shortest distance are selected from all the clusters for a certain character category.
[0382]
Next, as shown in step S264, it is determined whether or not the clustering abort condition set in step S262 is satisfied. If the clustering abort condition is not satisfied, the process proceeds to step S265, the two clusters selected in step S263 are integrated, and the process returns to step S263 to repeat the process of integrating clusters.
[0383]
If, as a result of repeating the process of integrating clusters, it is determined in step S264 that the clustering abort condition is satisfied, the process proceeds to step S266, where it is determined whether the clustering process has been performed for all character categories. If it has not, the process returns to step S261 and the clustering process is performed for a character category that has not yet been processed.
[0384]
On the other hand, if it is determined in step S266 that the clustering process has been performed for all character categories, the process proceeds to step S267 and the clustering result is stored in the memory.
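The loop of steps S261 to S265 for a single character category can be sketched as follows, using the centroid method with the city block distance. It supports abort conditions (1) and (2) only, and the function names, parameters, and sample vectors are assumptions made for illustration.

```python
def city_block(u, v):
    """City block distance between two vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def centroid(cluster):
    """Representative vector: the mean of the cluster's feature vectors."""
    m = len(cluster)
    return [sum(column) / m for column in zip(*cluster)]


def cluster_category(vectors, target_count=3, max_distance=None):
    """Merge the closest clusters until an abort condition holds."""
    clusters = [[v] for v in vectors]             # S261: one cluster each
    while len(clusters) > max(target_count, 1):   # abort condition (1)
        best = None
        for a in range(len(clusters)):            # S263: closest pair
            for b in range(a + 1, len(clusters)):
                d = city_block(centroid(clusters[a]), centroid(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if max_distance is not None and d > max_distance:
            break                                 # abort condition (2)
        clusters[a] = clusters[a] + clusters[b]   # S265: integrate
        del clusters[b]
    return clusters


# Toy feature vectors for one recognition result category.
samples = [[0, 0], [1, 0], [0, 1], [10, 10], [11, 10], [30, 0]]
result = cluster_category(samples, target_count=3)
print([len(c) for c in result])
```

Stopping at three clusters keeps the outlier [30, 0] in a cluster of its own, which is what later allows a misread character to be singled out.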
[0385]
Next, based on the clustering result obtained by the clustering process, the handwritten characters that are mistakenly recognized as belonging to another category are read correctly.
FIG. 70 is a diagram illustrating a process of correcting a recognition result in which a character handwritten as “2” was erroneously recognized as belonging to the character category “4” into the correct character category “2”.
[0386]
FIG. 70 shows handwritten characters determined to belong to the recognition result category “2” and handwritten characters determined to belong to the recognition result category “4”. Here, a character handwritten as “3” is erroneously recognized as belonging to the recognition result category “2”, a character handwritten as “2” is erroneously recognized as belonging to the recognition result category “4”, and characters handwritten as “4” are rejected as not belonging to any recognition result category.
[0387]
Next, the clustering abort condition is set so that clustering stops when the number of clusters in the same category becomes 3, and the clustering process is performed. As a result, clusters a, b, and c are generated for the recognition result category “2”, clusters d, e, and f are generated for the recognition result category “4”, and clusters g, h, and i are generated for the three rejected handwritten “4” characters.
[0388]
Next, among the clusters a, b, and c belonging to the recognition result category “2” and the clusters d, e, and f belonging to the recognition result category “4”, the clusters a and d, which have a small number of characters, are extracted as misreading candidate clusters.
[0389]
Next, the distance between the misreading candidate cluster a and each of the other clusters b, c, d, e, and f, and the distance between the misreading candidate cluster d and each of the other clusters a, b, c, e, and f, are calculated. The cluster b is extracted as the cluster closest to the misreading candidate cluster a, and it is determined whether the distance between the misreading candidate cluster a and the cluster b is equal to or less than a predetermined value. Since this distance is not equal to or less than the predetermined value, the misreading candidate cluster a is rejected.
[0390]
As a result, the character handwritten as “3” mistakenly recognized as belonging to the recognition result category of “2” is excluded from the recognition result category of “2”.
Further, the cluster b is extracted as the cluster closest to the misreading candidate cluster d, and it is determined whether or not the distance between the misreading candidate cluster d and the cluster b is equal to or less than a predetermined value. Since this distance is equal to or less than the predetermined value, the misreading candidate cluster d is integrated with the cluster b to generate the cluster j, and the cluster j is determined to belong to the recognition result category “2”, to which the cluster b with the larger number of elements belonged. Thus, the handwritten “2” that was misread as “4” and placed in the misreading candidate cluster d is read correctly.
[0391]
Further, the distances between the clusters g, h, and i, which were rejected as not belonging to any recognition result category, and the other clusters a to f are calculated. The cluster e is extracted as the cluster closest to the cluster g, and it is determined whether the distance between the cluster g and the cluster e is equal to or less than a predetermined value. Since this distance is equal to or less than the predetermined value, the cluster g is integrated with the cluster e.
[0392]
Further, the cluster e is extracted as the cluster closest to the cluster h, and it is determined whether or not the distance between the cluster h and the cluster e is equal to or less than a predetermined value. Since this distance is equal to or less than the predetermined value, the cluster h is integrated with the cluster e. As a result of the cluster g and the cluster h being integrated into the cluster e, the cluster k is generated, and the cluster k is determined to belong to the recognition result category “4”, to which the cluster e with the larger number of elements belonged. Thus, the handwritten “4” characters rejected as unrecognizable are read correctly.
[0393]
Further, the cluster e is extracted as the cluster closest to the cluster i, and it is determined whether or not the distance between the cluster i and the cluster e is equal to or less than a predetermined value. Since this distance is not equal to or less than the predetermined value, the cluster i is not integrated with the cluster e.
[0394]
FIG. 71 is a flowchart showing a character category recognition result correction process.
In FIG. 71, first, as shown in step S271, data about the clustering result obtained by the clustering process of FIG. 69 is read from the memory.
[0395]
Next, as shown in step S272, distances between the clusters are calculated for all clusters of all categories obtained by the clustering process of FIG. 69, and the distances between the clusters are compared.
[0396]
Next, as shown in step S273, it is determined whether or not there are clusters whose inter-cluster distance is equal to or less than a threshold value. If there are such clusters, the process proceeds to step S274 and those clusters are integrated. If there is no cluster whose inter-cluster distance is equal to or less than the threshold value, the clusters concerned are rejected.
[0397]
Here, as the threshold value of the inter-cluster distance at the time of cluster integration, for example, a constant multiple of the inter-vector distance within the cluster having the larger number of elements of the two clusters is used. That is, when a cluster A having M elements and a cluster B having N elements (M > N) are to be integrated, if the representative vector of cluster A is xam, the representative vector of cluster B is xbm, and the feature vectors of the elements of cluster A are xai (i = 1, 2, ..., M), the inter-vector distance d_th within cluster A
[0398]
[Expression 2]

[0399]
is represented by the above expression.
Therefore, when the constant is set to 1.5, for example, the condition for integrating the clusters is
|xam − xbm| < 1.5 d_th
[0400]
Next, as shown in step S275, the character category in the cluster is determined for all clusters integrated in step S274.
Next, as shown in step S276, it is determined whether or not the character categories of the integrated clusters differ. If they differ, the process proceeds to step S277, where the character category of the cluster having the smaller number of elements is corrected to the character category of the cluster having the larger number of elements, and then the process proceeds to step S278. If the character categories of the clusters match, the process skips step S277 and proceeds to step S278.
[0401]
Next, as shown in step S278, the character category is output for the characters in the cluster.
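The integration test and category correction of steps S273 to S277 can be sketched as below. Expression 2 is not reproduced in this section, so d_th is assumed here to be the mean city block distance of the larger cluster's elements from its representative vector. The rejection branch is omitted, and all names and sample data are illustrative.

```python
def city_block(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))


def centroid(cluster):
    m = len(cluster)
    return [sum(column) / m for column in zip(*cluster)]


def spread(cluster):
    """Assumed d_th: mean distance of the elements from the centroid."""
    c = centroid(cluster)
    return sum(city_block(v, c) for v in cluster) / len(cluster)


def correct_categories(clusters, factor=1.5):
    """clusters: name -> (category, vectors). Returns name -> category."""
    corrected = {}
    for name, (category, vectors) in clusters.items():
        # Distances to every cluster that has more elements than this one.
        larger = [(city_block(centroid(vectors), centroid(v2)), n2)
                  for n2, (_, v2) in clusters.items()
                  if n2 != name and len(v2) > len(vectors)]
        corrected[name] = category
        if larger:
            distance, nearest = min(larger)
            big_category, big_vectors = clusters[nearest]
            # Integration condition: |xam - xbm| < factor * d_th
            if distance < factor * spread(big_vectors):
                corrected[name] = big_category   # S277: adopt larger cluster
    return corrected


clusters = {
    "b": ("2", [[0, 0], [2, 0], [0, 2], [2, 2]]),
    "d": ("4", [[2, 1]]),                         # a "2" misread as "4"
    "e": ("4", [[20, 20], [22, 20], [20, 22], [22, 22]]),
}
print(correct_categories(clusters))
```

Here the single-element cluster d lies within the threshold of cluster b, so its category is corrected from “4” to “2”.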
Next, the operation of the pattern recognition apparatus according to the embodiment of the present invention will be described more specifically by taking the case of processing the form of FIG. 72 as an example.
[0402]
FIG. 72 is a diagram showing an example of a form input to the pattern recognition apparatus according to one embodiment of the present invention.
The form shown in FIG. 72 is provided with a free pitch frame with frame number 1, one-character frames with frame numbers 2, 3, and 4, a block frame with frame number 5, and an irregular table with frame number 6. The free pitch frame of frame number 1 contains a “5” that is in contact with the frame and corrected by a horizontal double line, a “3” and a “2” in contact with the frame, a “7” that is in contact with the frame and blurred, a “4” and a “6” written as spurious characters, and a “4” that protrudes from the frame.
[0403]
“5” is entered in the one-character frame of frame number 2, “3” is entered in the one-character frame of frame number 3, and an “8” that protrudes from the frame and is corrected by a horizontal double line is entered in the one-character frame of frame number 4. In the block frame of frame number 5, the frame of frame number 5-1 contains a “6” corrected by a horizontal double line, the frame of frame number 5-2 contains a “2” in contact with the frame, and the frame of frame number 5-3 contains a “4” written as a spurious character.
[0404]
In the irregular table of frame number 6, “3”, “2”, and “1” are entered in the frame of frame number 6-1-1 in a state of protruding from the frame, and “6”, “3”, and “8” are entered in the frame of frame number 6-1-2. The frames of frame numbers 6-1-3, 6-1-4-1, 6-1-4-2, 6-1-4-3, 6-2-1, 6-2-2, and 6-2-3 are each blank. The entire irregular table of frame number 6 has been corrected with a cross mark.
[0405]
Next, the environment recognition system 11 in FIG. 3 extracts the state of the input image from the form of FIG. 72 by performing the processes of FIGS. 5 to 8 on that form.
For example, by the layout analysis of FIG. 6, the free pitch frame of frame number 1, the one-character frames of frame numbers 2, 3, and 4, the block frame of frame number 5, and the irregular table of frame number 6 are extracted from the form. At the same time, eight patterns are extracted as character candidates from the free pitch frame of frame number 1, and one pattern each is extracted as a character candidate from the one-character frames of frame numbers 2, 3, and 4. Three patterns are extracted as character candidates from the block frame of frame number 5, three from the frame of frame number 6-1-1, and three from the frame of frame number 6-1-2. No character candidates are extracted from the frames of frame numbers 6-1-3, 6-1-4-1, 6-1-4-2, 6-1-4-3, 6-2-1, 6-2-2, and 6-2-3.
[0406]
Here, to extract character strings from the form of FIG. 72, the text extraction method shown in FIGS. 14 and 15 is used, for example. To extract ruled lines from the form of FIG. 72, the ruled line extraction method shown in FIG. 22 is used, for example. To extract frames and tables from the form of FIG. 72, the frame extraction method shown in FIGS. 23 and 24 is used, for example.
[0407]
Further, the first, second, fifth, and eighth patterns extracted from the free pitch frame of frame number 1 are regarded as frame contact character candidates. The pattern extracted from the one-character frame of frame number 4, the pattern extracted from the frame of frame number 5-2, and the first pattern extracted from the frame of frame number 6-1-1 are also regarded as frame contact character candidates.
[0408]
Here, in order to extract frame contact character candidates from the form of FIG. 72, for example, the frame contact character extraction method shown in FIGS. 27 and 28 is used.
Also, the quality analysis of FIG. 7 detects a blurred state, a crushed state, high-quality characters, and the like from the form. In this example, the quality of the image is normal, and no blurred state, crushed state, high-quality characters, or the like are detected.
[0409]
Further, correction character candidates are extracted from the form shown in FIG. 72 by the correction analysis shown in FIG. In this example, the first pattern extracted from the free pitch frame of frame number 1, the patterns extracted from the one-character frames of frame numbers 2 and 4, the pattern extracted from the frame of frame number 5-1, and the pattern extracted from the irregular table of frame number 6 are regarded as correction character candidates.
[0410]
Here, in order to extract correction character candidates from the form shown in FIG. 72, for example, the feature amount extraction method shown in FIG. 30 is used.
Next, the environment recognition system 11 creates an intermediate processing result table in which the states extracted from the form by the processes of FIGS. 5 to 8 are entered for each character candidate extracted from the input image.
[0411]
FIG. 73 is a diagram showing an intermediate processing result table in which the states extracted from the forms by the processes of FIGS. 5 to 8 are entered.
In FIG. 73, “free pitch” is entered as the “frame type” and “8” as the “number of characters” in the column for frame number 1. In the column for the first pattern of frame number 1, “Yes” is entered as the “presence/absence of frame contact”, “Yes 2” as the “strikethrough”, and “Normal” as the “quality”. In the column for another pattern of frame number 1, “None” is entered as the “strikethrough” and “Normal” as the “quality”. In the column for the eighth pattern of frame number 1, “Yes” is entered as the “presence/absence of frame contact”, “None” as the “strikethrough”, and “Normal” as the “quality”.
[0412]
Here, “Yes 1” in the “strikethrough” column indicates that there is a strikethrough candidate covering a plurality of characters, and “Yes 2” in the “strikethrough” column indicates that there is a strikethrough candidate for one character.
[0413]
In the column for frame number 2, “one character” is entered as the “frame type”, “None” as the “presence/absence of frame contact”, “Yes 2” as the “strikethrough”, “Normal” as the “quality”, and “1” as the “number of characters”. In the column for frame number 3, “one character” is entered as the “frame type”, “None” as the “presence/absence of frame contact”, “None” as the “strikethrough”, “Normal” as the “quality”, and “1” as the “number of characters”. In the column for frame number 4, “one character” is entered as the “frame type”, “Yes” as the “presence/absence of frame contact”, “Yes 2” as the “strikethrough”, “Normal” as the “quality”, and “1” as the “number of characters”.
[0414]
In the column for frame number 5, “ladder” is entered as the “frame type” and “3” as the “number of characters”. In the column for frame number 5-1, “None” is entered as the “presence/absence of frame contact”, “Yes” as the “strikethrough”, “Normal” as the “quality”, and “1” as the “number of characters”. In the column for frame number 5-2, “Yes” is entered as the “presence/absence of frame contact”, “None” as the “strikethrough”, “Normal” as the “quality”, and “1” as the “number of characters”. In the column for frame number 5-3, “None” is entered as the “presence/absence of frame contact”, “None” as the “strikethrough”, “Normal” as the “quality”, and “1” as the “number of characters”.
[0415]
In the column for frame number 6, “table” is entered as the “frame type”. In the column for frame number 6-1-1, “free pitch” is entered as the “frame type”, “Yes” as the “presence/absence of frame contact”, “Yes 1” as the “strikethrough”, and “Normal” as the “quality”. In the column for frame number 6-2-2, “free pitch” is entered as the “frame type”, “None” as the “presence/absence of frame contact”, “Yes 1” as the “strikethrough”, and “Normal” as the “quality”.
[0416]
Next, the environment recognition system 11 performs the process of FIG. 9 based on the states extracted from the form by the above processes.
That is, based on the state of the input image entered in the intermediate processing result table of FIG. 73, and with reference to the processing order control rules, it is determined which processing is to be called: that of the basic character recognition unit 17, the character string recognition unit 15, the contact character recognition unit 13, the blurred character recognition unit 19, or the crushed character recognition unit 21 of the character recognition unit 12 in FIG. 3, or that of the strikethrough recognition unit 26 or the noise recognition unit 28 of the non-character recognition unit 25. The result is entered in the “process call” column of the intermediate processing result table of FIG. 73. Then, the order in which the processes entered in the “process call” column of the intermediate processing result table of FIG. 73 are executed is determined with reference to the processing order table, and the determined order is entered in the “processing order” column of the intermediate processing result table of FIG. 73.
[0417]
Examples of the processing order control rules include the following:
(A1) If, for a certain processing target, a column indicating a state in the intermediate processing result table is “Yes” and the processing corresponding to that state has not been executed, enter the processing corresponding to that state in the “process call” column of the intermediate processing result table.
(A2) If, for a certain processing target, all the columns indicating states in the intermediate processing result table are “None” or “Normal” and the processing of the basic character recognition unit 17 has not been executed, enter “basic” in the “process call” column of the intermediate processing result table.
(A3) If, for a certain processing target, there are a plurality of processes corresponding to the states entered in the intermediate processing result table, access the processing order table that determines the order of the plurality of processes and rearrange the order in the “process call” column.
(A4) When, for a certain processing target, a process corresponding to a state entered in the intermediate processing result table is completed, enter the completed process in the “processing completed” column of the intermediate processing result table, enter the instruction for the process to be performed next, or an instruction indicating interruption or termination of processing, in the “processing instruction” column of the intermediate processing result table, and rearrange the order in the “process call” column of the intermediate processing result table based on that information.
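Rules (A1) and (A2) can be sketched as a small function over one row of the intermediate processing result table. The mapping from state columns to process names is a hypothetical one chosen to match the labels used in the processing order table (frame contact to “black frame”, strikethrough to “strikethrough (Yes 1)” or “strikethrough (Yes 2)”); the handling of frame type and of quality states other than “Normal” is omitted.

```python
# Hypothetical mapping from a state column to the process it triggers.
COLUMN_TO_PROCESS = {
    "frame contact": "black frame",
    "strikethrough": "strikethrough",
}


def process_calls(states, completed=()):
    """Apply rules (A1) and (A2) to one row of the table.

    states: column name -> entry such as "Yes", "Yes 2", "None", "Normal".
    completed: names of processes that have already been executed.
    """
    calls = []
    for column, value in states.items():
        if column in COLUMN_TO_PROCESS and value.startswith("Yes"):
            process = COLUMN_TO_PROCESS[column]
            if column == "strikethrough":
                process = f"{process} ({value})"  # keep Yes 1 / Yes 2
            if process not in completed:          # (A1)
                calls.append(process)
    all_clear = all(v in ("None", "Normal") for v in states.values())
    if all_clear and "basic" not in completed:    # (A2)
        calls.append("basic")
    return calls


first_pattern = {"frame contact": "Yes", "strikethrough": "Yes 2",
                 "quality": "Normal"}
plain_pattern = {"frame contact": "None", "strikethrough": "None",
                 "quality": "Normal"}
print(process_calls(first_pattern), process_calls(plain_pattern))
```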
[0418]
FIG. 74 is a diagram illustrating an example of the processing order table.
In FIG. 74, the processing order table includes, for example,
(B1) For a given processing target, if only one process is entered in the “process call” column of the intermediate processing result table, that process is entered in the “processing order” column of the intermediate processing result table.
(B2) For a given processing target, if “black frame / free pitch” is entered in the “process call” column of the intermediate processing result table, “black frame → free pitch” is entered in the “processing order” column of the intermediate processing result table.
(B3) For a given processing target, if “strikethrough (Yes 2) / black frame” is entered in the “process call” column of the intermediate processing result table, “black frame → single-character strikethrough” is entered in the “processing order” column of the intermediate processing result table.
(B4) For a given processing target, if “black frame / free pitch / strikethrough (Yes 2)” is entered in the “process call” column of the intermediate processing result table, “black frame → single-character strikethrough → free pitch” is entered in the “processing order” column of the intermediate processing result table.
(B5) For a given processing target, if “black frame / free pitch / strikethrough (Yes 1)” is entered in the “process call” column of the intermediate processing result table, “multiple-character strikethrough → black frame → free pitch” is entered in the “processing order” column of the intermediate processing result table.
(B6) For a given processing target, if “free pitch / strikethrough (Yes 1)” is entered in the “process call” column of the intermediate processing result table, “multiple-character strikethrough → free pitch” is entered in the “processing order” column of the intermediate processing result table.
(B7) For a given processing target, if “process A, B, C” is entered in the “process call” column of the intermediate processing result table, “process B → process A → process C” is entered in the “processing order” column, and “process B” is entered in the “processing completed” column, the “processing order” column of the intermediate processing result table is updated to “process A → process C”.
(B8) For a given processing target, if “process A, B, C” is entered in the “process call” column of the intermediate processing result table, “process B → process A → process C” is entered in the “processing order” column, “process B” is entered in the “processing completed” column, and “skip to process C” is entered in the “processing instruction” column, the “processing order” column of the intermediate processing result table is updated to “process C”.
(B9) For a given processing target, if “process A, B, C” is entered in the “process call” column of the intermediate processing result table, “process B → process A → process C” is entered in the “processing order” column, “process B” is entered in the “processing completed” column, and “reverse the order of process C and process A” is entered in the “processing instruction” column, the “processing order” column of the intermediate processing result table is updated to “process C → process A”.
(B10) For a given processing target, if “process B, A” is entered in the “process call” column of the intermediate processing result table, “process A” is entered in the “processing completed” column, and “End” is entered in the “processing instruction” column, the “processing order” column of the intermediate processing result table is set to “End”.
Procedures such as the above are stored.
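The rules (B1) to (B10) can be read as a lookup from the set of called processes to an ordered list, plus an update step applied whenever a process completes. The following is only a minimal sketch under that reading. The dictionary layout, the function names `initial_order` and `update_order`, the tuple encoding of instructions, and the normalized process names are all illustrative assumptions, not part of the described device.

```python
# Hypothetical encoding of the processing order table; names are illustrative.
ORDER_TABLE = {
    frozenset({"black frame", "free pitch"}):
        ["black frame", "free pitch"],                                    # (B2)
    frozenset({"black frame", "strikethrough (Yes 2)"}):
        ["black frame", "single-character strikethrough"],                # (B3)
    frozenset({"black frame", "free pitch", "strikethrough (Yes 2)"}):
        ["black frame", "single-character strikethrough", "free pitch"],  # (B4)
    frozenset({"black frame", "free pitch", "strikethrough (Yes 1)"}):
        ["multiple-character strikethrough", "black frame", "free pitch"],  # (B5)
    frozenset({"free pitch", "strikethrough (Yes 1)"}):
        ["multiple-character strikethrough", "free pitch"],               # (B6)
}

# Assumed name mapping, taken from the FIG. 75 examples where a lone
# "strikethrough (Yes 2)" call is ordered as a single-character strikethrough.
SINGLE_NAME = {
    "strikethrough (Yes 2)": "single-character strikethrough",
    "strikethrough (Yes 1)": "multiple-character strikethrough",
}


def initial_order(calls):
    """Return the processing order for a list of called processes."""
    if len(calls) == 1:                                    # (B1)
        return [SINGLE_NAME.get(calls[0], calls[0])]
    return list(ORDER_TABLE[frozenset(calls)])             # (B2)-(B6)


def update_order(order, completed, instruction=None):
    """Update the processing order after `completed` has finished."""
    remaining = [p for p in order if p != completed]       # (B7)
    if instruction is None:
        return remaining
    kind = instruction[0]
    if kind == "end":                                      # (B10)
        return ["End"]
    if kind == "skip":                                     # (B8)
        return remaining[remaining.index(instruction[1]):]
    if kind == "swap":                                     # (B9)
        i = remaining.index(instruction[1])
        j = remaining.index(instruction[2])
        remaining[i], remaining[j] = remaining[j], remaining[i]
        return remaining
    raise ValueError(f"unknown instruction: {instruction!r}")
```

Using a `frozenset` key makes the lookup independent of the order in which the called processes happen to be listed in the “process call” column.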
[0419]
FIG. 75 is a diagram showing an example in which the processes to be called, determined from the state of the input image entered in the intermediate processing result table of FIG. 73, are entered in the “process call” column, and the order in which the processes entered in the “process call” column are to be executed is entered in the “processing order” column.
[0420]
In FIG. 75, “free pitch” is entered as the “frame type” in the column of frame number 1, and in the column of the first pattern of frame number 1, “present” is entered as “presence / absence of frame contact” and “Yes 2” is entered as “strikethrough”. Therefore, in accordance with (A1) of the processing order control rule, “black frame / free pitch / strikethrough (Yes 2)” is entered in the “process call” column, and in accordance with (A3) of the processing order control rule, (B4) of the processing order table is referred to and “black frame → single-character strikethrough → free pitch” is entered in the “processing order” column.
[0421]
In the column of the second pattern of frame number 1, “present” is entered as “presence / absence of frame contact”, “none” as “strikethrough”, and “normal” as “quality”. Therefore, in accordance with (A1) of the processing order control rule, “black frame / free pitch” is entered in the “process call” column, and in accordance with (A3) of the processing order control rule, (B2) of the processing order table is referred to and “black frame → free pitch” is entered in the “processing order” column.
[0422]
In the column of the eighth pattern of frame number 1, “present” is entered as “presence / absence of frame contact”, “none” as “strikethrough”, and “normal” as “quality”. Therefore, in accordance with (A1) of the processing order control rule, “black frame / free pitch” is entered in the “process call” column, and in accordance with (A3) of the processing order control rule, (B2) of the processing order table is referred to and “black frame → free pitch” is entered in the “processing order” column.
[0423]
In the column of frame number 2, “one character” is entered as the “frame type”, “none” as “presence / absence of frame contact”, “Yes 2” as “strikethrough”, and “normal” as “quality”. Therefore, in accordance with (A1) of the processing order control rule, “strikethrough (Yes 2)” is entered in the “process call” column, and in accordance with (A1) of the processing order control rule, “single-character strikethrough” is entered in the “processing order” column.
[0424]
In the column of frame number 3, “one character” is entered as the “frame type”, “none” as “presence / absence of frame contact”, “none” as “strikethrough”, and “normal” as “quality”. Therefore, in accordance with (A2) of the processing order control rule, “basic” is entered in the “process call” column, and in accordance with (A1) of the processing order control rule, “basic” is entered in the “processing order” column.
[0425]
In the column of frame number 4, “one character” is entered as the “frame type”, “present” as “presence / absence of frame contact”, “Yes 2” as “strikethrough”, and “normal” as “quality”. Therefore, in accordance with (A1) of the processing order control rule, “black frame / strikethrough (Yes 2)” is entered in the “process call” column, and in accordance with (A3) of the processing order control rule, (B3) of the processing order table is referred to and “black frame → single-character strikethrough” is entered in the “processing order” column.
[0426]
In the column of frame number 5, “ladder” is entered as the “frame type”, and in the column of frame number 5-1, “none” is entered as “presence / absence of frame contact”, “Yes 2” as “strikethrough”, and “normal” as “quality”. Therefore, in accordance with (A1) of the processing order control rule, “strikethrough (Yes 2)” is entered in the “process call” column, and in accordance with (A1) of the processing order control rule, “single-character strikethrough” is entered in the “processing order” column.
[0427]
In the column of frame number 5-2, “present” is entered as “presence / absence of frame contact”, “none” as “strikethrough”, and “normal” as “quality”. Therefore, in accordance with the processing order control rule, “black frame” is entered in the “process call” column, and in accordance with (A1) of the processing order control rule, “black frame” is entered in the “processing order” column.
[0428]
In the column of frame number 5-3, “none” is entered as “presence / absence of frame contact”, “none” as “strikethrough”, and “normal” as “quality”. Therefore, in accordance with the processing order control rule, “basic” is entered in the “process call” column, and in accordance with (A1) of the processing order control rule, “basic” is entered in the “processing order” column.
[0429]
In the column of frame number 6, “table” is entered as the “frame type”, and in the column of frame number 6-1-1, “free pitch” is entered as the “frame type”, “present” as “presence / absence of frame contact”, “Yes 1” as “strikethrough”, and “normal” as “quality”. Therefore, in accordance with (A1) of the processing order control rule, “black frame / free pitch / strikethrough (Yes 1)” is entered in the “process call” column, and in accordance with (A3) of the processing order control rule, (B5) of the processing order table is referred to and “multiple-character strikethrough → black frame → free pitch” is entered in the “processing order” column.
[0430]
In the column of frame number 6-2-2, “free pitch” is entered as the “frame type”, “none” as “presence / absence of frame contact”, “Yes 1” as “strikethrough”, and “normal” as “quality”. Therefore, in accordance with (A1) of the processing order control rule, “free pitch / strikethrough (Yes 1)” is entered in the “process call” column, and in accordance with (A3) of the processing order control rule, (B6) of the processing order table is referred to and “multiple-character strikethrough → free pitch” is entered in the “processing order” column.
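The FIG. 75 entries above follow a regular pattern from the recorded state to the “process call” column. The sketch below reproduces only that pattern as it appears in these examples. It is not the text of rules (A1) to (A3), which this passage does not restate, and the function name `process_call` and its argument values are hypothetical. All examples here have “normal” quality, so quality is not modeled.

```python
def process_call(frame_type, frame_contact, strikethrough):
    """Derive the called processes from one row of state (illustrative only).

    frame_type:    e.g. "free pitch" or "one character"
    frame_contact: "present" or "none"
    strikethrough: "Yes 1", "Yes 2" or "none"
    """
    calls = []
    if frame_contact == "present":
        calls.append("black frame")          # frame-contact character handling
    if frame_type == "free pitch":
        calls.append("free pitch")           # character-string (free pitch) handling
    if strikethrough in ("Yes 1", "Yes 2"):
        calls.append(f"strikethrough ({strikethrough})")
    # With no special condition, only basic recognition is called.
    return calls or ["basic"]
```

For instance, the row for frame number 4 (one-character frame, frame contact present, strikethrough “Yes 2”) yields the two calls “black frame” and “strikethrough (Yes 2)”.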
[0431]
Next, based on the intermediate processing result table of FIG. 75, in which the “process call” and “processing order” columns have been filled in, the first recognition process is executed with reference to the processing execution rule. The recognition process that has been completed is then entered in the “processing completed” column of the intermediate processing result table, and the reliability obtained by that recognition process is entered in the “reliability” column of the intermediate processing result table.
[0432]
In addition, the “processing order” column of the intermediate processing result table is updated according to (B7) to (B9) of the processing order table of FIG. 74, and if the processing execution rule specifies a process to be executed next, that process is entered in the “processing instruction” column of the intermediate processing result table.
[0433]
As a process execution rule, for example,
(C1) If, for a given processing target, a process is entered in the “processing order” column of the intermediate processing result table, the process with the highest priority is executed. When the executed process is completed, the completed process is entered in the “processing completed” column of the intermediate processing result table and deleted from the “processing order” column of the intermediate processing result table. When a process to be executed next is to be specified, that process is entered in the “processing instruction” column of the intermediate processing result table.
(C2) If, as a result of executing a process, a pattern is determined to be a character rather than a non-character and its character code is calculated with a reliability equal to or higher than a predetermined value, an instruction to call the character recognition process based on “personal writing characteristics” is entered in the “processing instruction” column of the intermediate processing result table.
(C3) If, as a result of executing a process, a pattern is determined to be a strikethrough and the strikethrough is calculated with a reliability equal to or higher than a predetermined value, “End” is entered in the “processing instruction” column of the intermediate processing result table, the subsequent processes entered in the “processing order” column of the intermediate processing result table are cancelled, and the processing ends.
(C4) If “free pitch” is entered at the beginning of the “processing order” column of the intermediate processing result table, execution waits while any other processing target of the same frame number still has an unexecuted process that precedes “free pitch”. Once “free pitch” is at the beginning of the “processing order” column for all processing targets of the same frame number, the “free pitch” process is executed simultaneously for all processing targets of that frame number.
(C5) If all processes entered in the “processing order” column of the intermediate processing result table have been completed and, for every processing target, either “End” or “personal writing characteristics” is entered in the “processing instruction” column of the intermediate processing result table, the character recognition process based on “personal writing characteristics” is called and executed for each processing target whose “processing instruction” column contains “personal writing characteristics”. When the character recognition process based on “personal writing characteristics” is completed, “End” is entered in the “processing instruction” column of the intermediate processing result table.
(C6) If “End” is entered in the “processing instruction” column of the intermediate processing result table for all processing targets, all processing ends and the recognition result is output.
and so on.
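Rules (C1), (C3) and (C6) together describe a simple per-target execution loop with an early exit. The following is a hedged sketch of that loop only. The row layout (a dictionary with `order`, `completed`, `instruction`, `code`, `reliability` keys), the recognizer callbacks, and the threshold value 0.9 are assumptions made for illustration; the source only speaks of “a predetermined value”. Rules (C2), (C4) and (C5) are deliberately left out.

```python
def run_target(row, recognizers, strike_threshold=0.9):
    """Apply (C1) repeatedly to one processing target, stopping early per (C3).

    `recognizers` maps a process name to a callable returning
    (character_code, reliability); these callables are hypothetical stand-ins
    for the recognition units.
    """
    while row["order"]:
        process = row["order"].pop(0)               # (C1): highest-priority process
        code, reliability = recognizers[process](row["pattern"])
        row["completed"].append(process)            # record in "processing completed"
        row["code"], row["reliability"] = code, reliability
        if code == "strikethrough" and reliability >= strike_threshold:
            row["instruction"] = "End"              # (C3): cancel remaining processes
            row["order"].clear()
    return row


def all_finished(rows):
    """(C6): processing ends once every target has the instruction "End"."""
    return all(r["instruction"] == "End" for r in rows)
```

Popping from the front of `order` mirrors the deletion of a completed process from the “processing order” column, so the list always shows what is still pending.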
[0434]
FIG. 76 is a diagram showing an example in which the recognition process is executed with reference to the processing execution rule based on the intermediate processing result table of FIG. 75, the reliability obtained by that recognition process is entered in the “reliability” column of the intermediate processing result table, the “processing order” column of the intermediate processing result table is updated based on the processing execution rule, and the “processing instruction” column of the intermediate processing result table is filled in.
[0435]
First, since “black frame” is designated first in the “processing order” column of the first pattern of frame number 1 in the intermediate processing result table of FIG. 75, the process of the contact character recognition unit 13 of FIG. 3 corresponding to “black frame” is executed on the first pattern extracted from the free pitch frame of frame number 1 in FIG. 72.
[0436]
For example, as shown in FIGS. 39 and 40, the contact character recognition unit 13 recognizes a frame-contact character by performing character complementation or re-complementation on the pattern from which the frame has been removed. Further, for a pattern for which sufficient reliability cannot be obtained even with character complementation or re-complementation, the knowledge table 14 is referred to and re-recognition is performed using the learned characters shown in the drawings, whereby the frame-contact character is recognized.
[0437]
The character recognition processing of the contact character recognition unit 13 calculates the recognition reliability of the first pattern extracted from the free pitch frame of frame number 1 in FIG. 72 as 20%. As a result, the first pattern extracted from the free pitch frame of frame number 1 in FIG. 72 is regarded as not being a character, “Reject” is entered in the “character code” column of the intermediate processing result table, and “20%” is entered in the “reliability” column of the intermediate processing result table.
[0438]
Further, “black frame” is entered in the “processing completed” column of the intermediate processing result table, and the “processing order” column of the intermediate processing result table is updated to “single-character strikethrough → free pitch”.
[0439]
Next, since “black frame” is designated first in the “processing order” column of the second pattern of frame number 1 in the intermediate processing result table of FIG. 75, in accordance with (C1) of the processing execution rule, the process of the contact character recognition unit 13 of FIG. 3 corresponding to “black frame” is executed on the second pattern extracted from the free pitch frame of frame number 1 in FIG. 72, and character recognition of the frame-contact character is performed.
[0440]
By the character recognition processing of the contact character recognition unit 13, the second pattern extracted from the free pitch frame of frame number 1 in FIG. 72 is recognized as the character category “3” with a recognition reliability of 60%. “3” is entered in the “character code” column of the intermediate processing result table, and “60%” is entered in the “reliability” column of the intermediate processing result table.
[0441]
In addition, “black frame” is entered in the “processing completed” column of the intermediate processing result table, and the “processing order” column of the intermediate processing result table is updated to “free pitch”.
Next, since “black frame” is designated first in the “processing order” column of the eighth pattern of frame number 1 in the intermediate processing result table of FIG. 75, in accordance with (C1) of the processing execution rule, the process of the contact character recognition unit 13 of FIG. 3 corresponding to “black frame” is executed on the eighth pattern extracted from the free pitch frame of frame number 1 in FIG. 72, and character recognition of the frame-contact character is performed.
[0442]
By the character recognition processing of the contact character recognition unit 13, the eighth pattern extracted from the free pitch frame of frame number 1 in FIG. 72 is recognized as the character category “4” with a recognition reliability of 95%. “4” is entered in the “character code” column of the intermediate processing result table, and “95%” is entered in the “reliability” column of the intermediate processing result table.
[0443]
In addition, “black frame” is entered in the “processing completed” column of the intermediate processing result table, and the “processing order” column of the intermediate processing result table is updated to “free pitch”.
Next, since “single-character strikethrough” is designated in the “processing order” column of frame number 2 in the intermediate processing result table of FIG. 75, in accordance with (C1) of the processing execution rule, the process of the strikethrough recognition unit 26 of FIG. 3 corresponding to “single-character strikethrough” is executed on the pattern extracted from the one-character frame of frame number 2 in FIG. 72.
[0444]
For example, as shown in FIG. 67, the strikethrough recognition unit 26 removes, from a pattern extracted as a corrected-character candidate, any horizontal line whose histogram value is equal to or greater than a predetermined value. If the pattern from which the horizontal line has been removed is recognized as a character, the pattern extracted as a corrected-character candidate is recognized as a corrected character. If the pattern from which the horizontal line has been removed is rejected, the removed horizontal line is regarded as part of the character rather than as a strikethrough, and the pattern extracted as a corrected-character candidate is recognized as a normal character.
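The histogram-based decision described above can be sketched as follows. This is a simplified illustration, assuming a binary image given as a list of rows of 0/1 pixels and a caller-supplied `recognize` callback that returns a character code or `None` for a reject. The function names, the image representation, and the callback are hypothetical and stand in for the processing of the strikethrough recognition unit 26.

```python
def remove_horizontal_lines(image, threshold):
    """Blank every row whose horizontal histogram value (count of black
    pixels in the row) is equal to or greater than `threshold`."""
    return [[0] * len(row) if sum(row) >= threshold else list(row)
            for row in image]


def classify_candidate(image, threshold, recognize):
    """Decide whether a corrected-character candidate is really struck out.

    If the pattern left after line removal is recognized as a character, the
    candidate is treated as a corrected character. If it is rejected, the
    removed line is taken to be part of the character and the candidate is
    treated as a normal character.
    """
    stripped = remove_horizontal_lines(image, threshold)
    if recognize(stripped) is not None:
        return "corrected character"
    return "normal character"
```

A character such as “7” or “5” has a genuine horizontal stroke, which is why a reject after removal points back to a normal character rather than to a strikethrough.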
[0445]
The strikethrough recognition processing of the strikethrough recognition unit 26 calculates the recognition reliability of the pattern extracted from the one-character frame of frame number 2 in FIG. 72 as 10%. As a result, the pattern extracted from the one-character frame of frame number 2 in FIG. 72 is regarded as not being a corrected character, “10%” is entered in the “reliability” column of the intermediate processing result table, and “basic” is entered in the “processing instruction” column of the intermediate processing result table.
[0446]
In addition, “Strikethrough” is entered in the “Processing completed” column of the intermediate processing result table, and “Basic” is entered in the “Processing order” column of the intermediate processing result table.
Next, since “basic” is designated in the “processing order” column of frame number 3 in the intermediate processing result table of FIG. 75, in accordance with (C1) of the processing execution rule, the process of the basic character recognition unit 17 of FIG. 3 corresponding to “basic” is executed on the pattern extracted from the one-character frame of frame number 3 in FIG. 72.
[0447]
For example, as shown in FIG. 31, the basic character recognition unit 17 extracts the features of the input unknown character, represents them as a feature vector, and compares this feature vector with the feature vector of each category stored in advance in the basic dictionary. The distance between the feature vectors in the feature space is thereby calculated, and the character category that minimizes this distance is recognized as the category of the unknown character.
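The minimum-distance matching described above can be illustrated with a short sketch. It assumes a Euclidean distance and a dictionary holding one reference feature vector per category. The choice of metric, the function name `nearest_category`, and the toy vectors are assumptions for illustration, since the passage does not fix the distance measure or the feature definition.

```python
import math


def nearest_category(feature, dictionary):
    """Return (category, distance) for the dictionary entry closest to `feature`.

    `dictionary` maps a category label to its reference feature vector.
    Euclidean distance is an assumed choice of metric.
    """
    best = min(dictionary, key=lambda c: math.dist(feature, dictionary[c]))
    return best, math.dist(feature, dictionary[best])


# Toy two-dimensional feature vectors, purely illustrative.
basic_dictionary = {"3": (1.0, 0.0), "8": (0.0, 1.0)}
category, distance = nearest_category((0.9, 0.1), basic_dictionary)
```

Returning the distance alongside the category allows a caller to apply a reject threshold of the kind described for conventional devices.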
[0448]
The basic character recognition unit 17 calculates the degree of deformation of the unknown character by calculating the number of contours of the contour of the unknown character. If the degree of deformation of the unknown character is large and the recognition rate decreases, the knowledge table 18 is referred to and character recognition is performed using the detailed identification method shown in the drawings.
[0449]
By the character recognition processing of the basic character recognition unit 17, the pattern extracted from the one-character frame of frame number 3 in FIG. 72 is recognized as the character category “3” with a recognition reliability of 95%. “3” is entered in the “character code” column of the intermediate processing result table, and “95%” is entered in the “reliability” column of the intermediate processing result table.
[0450]
Further, “basic” is entered in the “processing completed” column of the intermediate processing result table, and the “processing order” column of the intermediate processing result table is blank.
Next, since “black frame” is designated first in the “processing order” column of frame number 4 in the intermediate processing result table of FIG. 75, in accordance with (C1) of the processing execution rule, the process of the contact character recognition unit 13 of FIG. 3 corresponding to “black frame” is executed on the pattern extracted from the one-character frame of frame number 4 in FIG. 72, and character recognition of the frame-contact character is performed.
[0451]
The character recognition processing of the contact character recognition unit 13 calculates the recognition reliability of the pattern extracted from the one-character frame of frame number 4 in FIG. 72 as 15%. As a result, the pattern extracted from the one-character frame of frame number 4 in FIG. 72 is regarded as not being a character, “Reject” is entered in the “character code” column of the intermediate processing result table, and “15%” is entered in the “reliability” column of the intermediate processing result table.
[0452]
In addition, “black frame” is entered in the “processing completed” column of the intermediate processing result table, and the “processing order” column of the intermediate processing result table is updated to “single-character strikethrough”.
Next, since “single-character strikethrough” is designated in the “processing order” column of frame number 5-1 in the intermediate processing result table of FIG. 75, in accordance with (C1) of the processing execution rule, the process of the strikethrough recognition unit 26 of FIG. 3 corresponding to “single-character strikethrough” is executed on the pattern extracted from the frame of frame number 5-1 in FIG. 72, and recognition processing is performed on the pattern extracted as a corrected-character candidate.
[0453]
The strikethrough recognition processing of the strikethrough recognition unit 26 calculates the recognition reliability of the pattern extracted from the frame of frame number 5-1 in FIG. 72 as 95%. As a result, the pattern extracted from the frame of frame number 5-1 in FIG. 72 is regarded as a corrected character, “95%” is entered in the “reliability” column of the intermediate processing result table, and “strikethrough” is entered in the “processing completed” column of the intermediate processing result table.
[0454]
In addition, “End” is entered in the “Processing instruction” column of the intermediate processing result table, and the “Processing order” column of the intermediate processing result table is blank.
Next, since “black frame” is designated in the “processing order” column of frame number 5-2 in the intermediate processing result table of FIG. 75, in accordance with (C1) of the processing execution rule, the process of the contact character recognition unit 13 of FIG. 3 corresponding to “black frame” is executed on the pattern extracted from the frame of frame number 5-2 in FIG. 72, and character recognition of the frame-contact character is performed.
[0455]
Here, the underlined portion of the pattern extracted from the frame of frame number 5-2 in FIG. 72 is in contact with the frame, and sufficient reliability cannot be obtained by the character complementation of FIG. 39 or the re-complementation of FIG. 40. Therefore, as shown in FIG. 50 (b), the knowledge table 167 of FIG. 45 is referred to, the easily confused character pair (2, 7) is acquired, and re-recognition is performed by the region emphasis method shown in the drawings.
[0456]
By the character recognition processing of the contact character recognition unit 13, the pattern extracted from the frame of frame number 5-2 in FIG. 72 is recognized as the character category “2” with a recognition reliability of 95%. “2” is entered in the “character code” column of the intermediate processing result table, and “95%” is entered in the “reliability” column of the intermediate processing result table.
[0457]
Further, “black frame” is entered in the “processing completed” column of the intermediate processing result table, and the “processing order” column of the intermediate processing result table is blank.
Next, since “basic” is designated in the “processing order” column of frame number 5-3 in the intermediate processing result table of FIG. 75, in accordance with (C1) of the processing execution rule, the process of the basic character recognition unit 17 of FIG. 3 corresponding to “basic” is executed on the pattern extracted from the frame of frame number 5-3 in FIG. 72, and character recognition of the basic character is performed.
[0458]
By the character recognition processing of the basic character recognition unit 17, the pattern extracted from the frame of frame number 5-3 in FIG. 72 is recognized as the character category “6” with a recognition reliability of 90%. “6” is entered in the “character code” column of the intermediate processing result table, and “90%” is entered in the “reliability” column of the intermediate processing result table.
[0459]
Further, “basic” is entered in the “processing completed” column of the intermediate processing result table, and the “processing order” column of the intermediate processing result table is blank.
Next, since “multiple-character strikethrough” is designated first in the “processing order” column of frame number 6-1-1 in the intermediate processing result table of FIG. 75, in accordance with (C1) of the processing execution rule, the process of the strikethrough recognition unit 26 of FIG. 3 corresponding to “multiple-character strikethrough” is executed, and strikethrough recognition processing is performed.
[0460]
The strikethrough recognition processing of the strikethrough recognition unit 26 extracts a strikethrough from the table of frame number 6 and calculates the recognition reliability of the strikethrough as 98%. As a result, the pattern extracted from the frame of frame number 6-1-1 in FIG. 72 is regarded as a corrected character, “strikethrough” is entered in the “character code” column of the intermediate processing result table, “98%” is entered in the “reliability” column of the intermediate processing result table, and “strikethrough” is entered in the “processing completed” column of the intermediate processing result table.
[0461]
Further, according to (C3) of the process execution rule, “End” is entered in the “Processing instruction” column of the intermediate processing result table, and the “Processing order” column of the intermediate processing result table is blank.
[0462]
Next, since “multiple-character strikethrough” is designated first in the “processing order” column of frame number 6-2-2 in the intermediate processing result table of FIG. 75, in accordance with (C1) of the processing execution rule, the process of the strikethrough recognition unit 26 of FIG. 3 corresponding to “multiple-character strikethrough” is executed, and strikethrough recognition processing is performed.
[0463]
The strikethrough recognition processing of the strikethrough recognition unit 26 extracts a strikethrough from the table of frame number 6 and calculates the recognition reliability of the strikethrough as 98%. As a result, the pattern extracted from the frame of frame number 6-2-2 in FIG. 72 is regarded as a corrected character, “strikethrough” is entered in the “character code” column of the intermediate processing result table, “98%” is entered in the “reliability” column of the intermediate processing result table, and “strikethrough” is entered in the “processing completed” column of the intermediate processing result table.
[0464]
Further, according to (C3) of the process execution rule, “End” is entered in the “Processing instruction” column of the intermediate processing result table, and the “Processing order” column of the intermediate processing result table is blank.
[0465]
Through the above processing, the intermediate processing result table of FIG. 76 is generated. Here, since the processing to be called next is entered in the “processing order” column of the intermediate processing result table of FIG. 76, the processing is continued according to the processing execution rule (C1).
[0466]
FIG. 77 is a diagram illustrating a result obtained when the recognition process is continued based on the intermediate process result table of FIG. 76.
First, since “single-character strikethrough” is designated first in the “processing order” column of the first pattern of frame number 1 in the intermediate processing result table of FIG. 76, in accordance with (C1) of the processing execution rule, the process of the strikethrough recognition unit 26 of FIG. 3 corresponding to “single-character strikethrough” is executed on the first pattern extracted from the free pitch frame of frame number 1 in FIG. 72, and corrected-character recognition processing is performed.
[0467]
The recognition processing of the strikethrough recognition unit 26 calculates the recognition reliability of the first pattern extracted from the free pitch frame of frame number 1 in FIG. 72 as 96%. As a result, the first pattern extracted from the free pitch frame of frame number 1 in FIG. 72 is regarded as a corrected character, “strikethrough” is entered in the “character code” column of the intermediate processing result table, “96%” is entered in the “reliability” column of the intermediate processing result table, and “black frame / strikethrough” is entered in the “processing completed” column of the intermediate processing result table.
[0468]
In addition, “End” is entered in the “Processing instruction” column of the intermediate processing result table, and the “Processing order” column of the intermediate processing result table is blank.
Next, since “free pitch” is designated in the “processing order” column of the second pattern of frame number 1 in the intermediate processing result table of FIG. 75, in accordance with (C4) of the processing execution rule, the second pattern extracted from the free pitch frame of frame number 1 in FIG. 72 waits until the “processing order” column of all other patterns of the same frame number 1 becomes “free pitch”. When the “processing order” column of all patterns of frame number 1 has become “free pitch”, the process of the character string recognition unit 15 of FIG. 3 corresponding to “free pitch” is executed on all patterns extracted from the free pitch frame of frame number 1, and character recognition is performed in consideration of the character cut-out reliability.
[0469]
Next, since “free pitch” is designated in the “processing order” column of the eighth pattern of frame number 1 in the intermediate processing result table of FIG. 75, in accordance with (C4) of the processing execution rule, the eighth pattern extracted from the free pitch frame of frame number 1 in FIG. 72 waits until the “processing order” column of all other patterns of the same frame number 1 becomes “free pitch”. When the “processing order” column of all patterns of frame number 1 has become “free pitch”, the process of the character string recognition unit 15 of FIG. 3 corresponding to “free pitch” is executed on all patterns extracted from the free pitch frame of frame number 1, and character recognition is performed in consideration of the character cut-out reliability.
[0470]
When the “processing order” column of all patterns of frame number 1 has become “free pitch”, the character recognition processing of the character string recognition unit 15 is performed on all patterns extracted from the free pitch frame of frame number 1 in FIG. 72.
[0471]
Here, since “End” is entered in the “processing instruction” column of the first pattern of frame number 1 in the intermediate processing result table, the first pattern extracted from the free pitch frame of frame number 1 in FIG. 72 is excluded from the processing targets of the character string recognition unit 15, and the recognition processing of the character string recognition unit 15 is executed on the second to eighth patterns extracted from the free pitch frame of frame number 1 in FIG. 72.
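The waiting behaviour of rule (C4), together with the exclusion of patterns already marked “End”, amounts to a barrier over all patterns of one frame. The sketch below is one possible reading, with a hypothetical row layout (`pattern`, `order`, `instruction` keys) and a hypothetical function name `free_pitch_batch`.

```python
def free_pitch_batch(rows):
    """Return the patterns of one frame that are ready for joint free pitch
    processing, or an empty list if the frame must still wait.

    Rows whose instruction is "End" (for example, a pattern already
    recognized as a strikethrough) are excluded from the batch.
    """
    active = [r for r in rows if r["instruction"] != "End"]
    # (C4): every remaining pattern must have "free pitch" at the head of
    # its processing order before any of them is processed.
    if active and all(r["order"][:1] == ["free pitch"] for r in active):
        return [r["pattern"] for r in active]
    return []
```

Checking `order[:1]` rather than `order[0]` keeps the test safe for a row whose processing order is already empty.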
[0472]
For example, as shown in FIGS. 52 to 65, the character string recognition unit 15 calculates the reliability of each character cut-out based on the distance from the discriminant surface, and adopts as the cut-out character the candidate for which the product of (character cut-out reliability) and (character recognition reliability) is maximum.
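The selection criterion above is a maximization of the product of two reliabilities. A minimal sketch follows, assuming each cut-out candidate is given as a tuple of (label, cut-out reliability, recognition reliability) with both reliabilities in the range 0 to 1. The tuple layout, the function name `best_cutout`, and the numbers are illustrative assumptions only.

```python
def best_cutout(candidates):
    """Pick the candidate maximizing
    (character cut-out reliability) x (character recognition reliability).

    Each candidate is a tuple (label, cutout_reliability, recognition_reliability).
    """
    return max(candidates, key=lambda c: c[1] * c[2])


# Illustrative values only: a well-cut but poorly recognized candidate loses
# to one that scores reasonably on both measures.
candidates = [("3", 0.90, 0.95), ("8", 0.99, 0.50)]
chosen = best_cutout(candidates)
```

Using the product means a candidate must do well on both segmentation and recognition; a high score on one cannot fully compensate for a low score on the other.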
[0473]
The second pattern extracted from the free pitch frame of frame number 1 in FIG. 72 by the recognition processing of the character string recognition unit 15 is recognized as the character category “3” with a recognition reliability of 95%. “3” is entered in the “character code” column of the intermediate processing result table, and “95%” is entered in the “reliability” column of the intermediate processing result table.
[0474]
Further, in accordance with (C1) of the processing execution rule, “black frame / free pitch” is entered in the “processing completed” column of the intermediate processing result table and the “processing order” column of the intermediate processing result table is made blank, and, in accordance with (C4) of the processing execution rule, “personal writing characteristics” is entered in the “processing instruction” column of the intermediate processing result table.
[0475]
The eighth pattern extracted from the free pitch frame of frame number 1 in FIG. 72 is recognized as the character category “4” with a recognition reliability of 98%, “4” is entered in the “character code” column of the intermediate processing result table, and “98%” is entered in the “reliability” column of the intermediate processing result table.
[0476]
Further, in accordance with (C1) of the processing execution rule, “black frame / free pitch” is entered in the “processing completed” column of the intermediate processing result table and the “processing order” column of the intermediate processing result table is made blank, and, in accordance with (C4) of the processing execution rule, “personal writing characteristics” is entered in the “processing instruction” column of the intermediate processing result table.
[0477]
Also, the third pattern extracted from the free pitch frame of frame number 1 in FIG. 72 is recognized as the character category “2”. The fourth pattern and the fifth pattern extracted from the free pitch frame of frame number 1 in FIG. 72 are integrated into one character by the recognition processing of the character string recognition unit 15 and recognized as the character category “7”. The sixth pattern extracted from the free pitch frame of frame number 1 in FIG. 72 is recognized as the character category “4”, and the seventh pattern extracted from the free pitch frame of frame number 1 in FIG. 72 is recognized as the character category “6”.
[0478]
As a result, the “number of characters” field in the intermediate processing result table of FIG. 77 is changed to “7”.
Next, since “basic” is designated in the “processing order” field of frame number 2 in the intermediate processing result table of FIG. 76, in accordance with (C1) of the processing execution rule, the processing of the basic character recognition unit 17 of FIG. 3 corresponding to “basic” is executed on the pattern extracted from the one-character frame of frame number 2 in FIG. 72, and character recognition processing for basic characters is performed.
[0479]
By the character recognition processing of the basic character recognition unit 17, the pattern extracted from the one-character frame of frame number 2 in FIG. 72 is recognized as the character category “5” with a recognition reliability of 97%, “5” is entered in the “character code” column of the intermediate processing result table, and “97%” is entered in the “reliability” column of the intermediate processing result table.
[0480]
In addition, “strikethrough (Yes 2) / basic” is entered in the “processing call” column of the intermediate processing result table, “strikethrough / basic” is entered in the “processing completed” column of the intermediate processing result table, the “processing order” column of the intermediate processing result table is made blank, and “individual writing characteristics” is entered in the “processing instruction” column of the intermediate processing result table in accordance with (C4) of the processing execution rule.
[0481]
Next, since the “processing order” field of frame number 3 in the intermediate processing result table of FIG. 76 is blank, “personal writing characteristics” is entered in the “processing instruction” field of the intermediate processing result table in accordance with (C4) of the processing execution rule.
[0482]
Next, since “one-character strikethrough” is designated in the “processing order” column of frame number 4 in the intermediate processing result table of FIG. 76, the processing of the strikethrough line recognition unit 26 of FIG. 3 corresponding to “one-character strikethrough” is executed on the pattern extracted from the one-character frame of frame number 4 in FIG. 72, and the pattern extracted as a candidate for a corrected character is recognized.
[0483]
As a result of calculating the recognition reliability of the pattern extracted from the one-character frame of frame number 4 in FIG. 72 by the strikethrough recognition processing of the strikethrough line recognition unit 26, the pattern extracted from the one-character frame of frame number 4 in FIG. 72 is regarded as a corrected character, “95%” is entered in the “reliability” column of the intermediate processing result table, and “black frame / strikethrough” is entered in the “processing completed” column of the intermediate processing result table.
[0484]
In addition, “End” is entered in the “Processing instruction” column of the intermediate processing result table, and the “Processing order” column of the intermediate processing result table is blank.
Next, since “End” is entered in the “processing instruction” field of frame number 5-1 in the intermediate processing result table of FIG. 76, the pattern extracted from the frame of frame number 5-1 in FIG. 72 is not processed.
[0485]
Next, since the “processing order” field of frame number 5-2 in the intermediate processing result table of FIG. 76 is blank, “personal writing characteristics” is entered in the “processing instruction” column of the intermediate processing result table in accordance with (C4) of the processing execution rule.
[0486]
Next, since the “processing order” field of frame number 5-3 in the intermediate processing result table of FIG. 76 is blank, “personal writing characteristics” is entered in the “processing instruction” column of the intermediate processing result table in accordance with (C4) of the processing execution rule.
[0487]
Next, since “End” is entered in the “processing instruction” field of frame number 6-1-1 in the intermediate processing result table of FIG. 76, the pattern extracted from the frame of frame number 6-1-1 in FIG. 72 is not processed.
[0488]
Next, since “End” is entered in the “processing instruction” field of frame number 6-2-2 in the intermediate processing result table of FIG. 76, the pattern extracted from the corresponding frame in FIG. 72 is not processed.
[0489]
Through the above processing, the intermediate processing result table of FIG. 77 is generated. Here, in the “processing instruction” column of the intermediate processing result table of FIG. 77, there are items in which “individual writing characteristics” are entered, so the processing is continued according to the processing execution rule (C5).
[0490]
FIG. 78 is a diagram showing a result obtained when the recognition process is continued based on the intermediate process result table of FIG. 77.
First, since “End” is entered in the “processing instruction” field of the first pattern of frame number 1 in the intermediate processing result table of FIG. 76, the first pattern extracted from the free pitch frame of frame number 1 in FIG. 72 is not processed.
[0491]
Next, since “individual writing characteristics” is entered in the “processing instruction” field of the second pattern of frame number 1 in the intermediate processing result table of FIG. 75, in accordance with (C5) of the processing execution rule, the processing of the handwriting-habit analysis unit 23 of FIG. 3 corresponding to “individual writing characteristics” is executed on the second pattern extracted from the free pitch frame of frame number 1 in FIG. 72.
[0492]
For example, as shown in FIGS. 68 to 71, the handwriting-habit analysis unit 23 clusters the handwritten characters written by the same writer for each category. When a second cluster that has a small number of elements and belongs to a different category lies at a short distance from a first cluster obtained by the clustering, the second cluster is integrated into the first cluster, so that the category of the handwritten characters belonging to the second cluster is corrected to the category of the first cluster.
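The category correction by cluster integration can be sketched as below. This is a minimal illustration under several assumptions that are not taken from FIGS. 68 to 71: clusters are compared by the Euclidean distance between their centroids, “small” means at most `max_small` elements, and “close” means a centroid distance of at most `max_dist`. The function names and the dictionary layout are hypothetical.

```python
import math


def centroid(points):
    """Mean feature vector of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def correct_categories(clusters, max_small=2, max_dist=1.0):
    """Return the corrected category of each cluster.

    clusters: list of {"category": str, "points": [tuple, ...]} built from
    one writer's characters. A small cluster that lies close to a larger
    cluster of another category takes over that cluster's category.
    """
    cents = [centroid(c["points"]) for c in clusters]
    corrected = [c["category"] for c in clusters]
    for i, small in enumerate(clusters):
        if len(small["points"]) > max_small:
            continue  # only small clusters are candidates for correction
        best = None
        for j, big in enumerate(clusters):
            if i == j or big["category"] == small["category"]:
                continue
            if len(big["points"]) <= len(small["points"]):
                continue  # merge only into a larger cluster
            d = math.dist(cents[i], cents[j])
            if d <= max_dist and (best is None or d < best[0]):
                best = (d, big["category"])
        if best is not None:
            corrected[i] = best[1]
    return corrected
```

For instance, a single sample first labelled “9” that sits next to the same writer's dense cluster of “4” samples would be relabelled “4”, while a separate, well-populated “9” cluster far away would be left unchanged.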
[0493]
By the analysis processing of the handwriting-habit analysis unit 23, the second pattern extracted from the free pitch frame of frame number 1 in FIG. 72 is recognized as the character category “3” with a recognition reliability of 97%, “3” is entered in the “character code” column of the intermediate processing result table, and “97%” is entered in the “reliability” column of the intermediate processing result table.
[0494]
Further, “black frame / free pitch / individual writing characteristics” is entered in the “processing completed” column of the intermediate processing result table, and “end” is entered in the “processing instruction” column of the intermediate processing result table.
[0495]
Next, since “individual writing characteristics” is entered in the “processing instruction” field of the eighth pattern of frame number 1 in the intermediate processing result table of FIG. 75, in accordance with (C5) of the processing execution rule, the processing of the handwriting-habit analysis unit 23 of FIG. 3 corresponding to “individual writing characteristics” is executed on the eighth pattern extracted from the free pitch frame of frame number 1 in FIG. 72.
[0496]
By the analysis processing of the handwriting-habit analysis unit 23, the eighth pattern extracted from the free pitch frame of frame number 1 in FIG. 72 is recognized as the character category “4” with a recognition reliability of 98%, “4” is entered in the “character code” column of the intermediate processing result table, and “98%” is entered in the “reliability” column of the intermediate processing result table.
[0497]
Further, “black frame / free pitch / individual writing characteristics” is entered in the “processing completed” column of the intermediate processing result table, and “end” is entered in the “processing instruction” column of the intermediate processing result table.
[0498]
Next, since “individual writing characteristics” is entered in the “processing instruction” field of frame number 2 in the intermediate processing result table of FIG. 76, in accordance with (C5) of the processing execution rule, the processing of the handwriting-habit analysis unit 23 of FIG. 3 corresponding to “individual writing characteristics” is executed on the pattern extracted from the one-character frame of frame number 2 in FIG. 72.
[0499]
By the analysis processing of the handwriting-habit analysis unit 23, the pattern extracted from the one-character frame of frame number 2 in FIG. 72 is recognized as the character category “5” with a recognition reliability of 97%, “5” is entered in the “character code” column of the intermediate processing result table, and “97%” is entered in the “reliability” column of the intermediate processing result table.
[0500]
Further, “black frame / free pitch / individual writing characteristics” is entered in the “processing completed” column of the intermediate processing result table, and “end” is entered in the “processing instruction” column of the intermediate processing result table.
[0501]
Next, since “individual writing characteristics” is entered in the “processing instruction” field of frame number 3 in the intermediate processing result table of FIG. 76, in accordance with (C5) of the processing execution rule, the processing of the handwriting-habit analysis unit 23 of FIG. 3 corresponding to “individual writing characteristics” is executed on the pattern extracted from the one-character frame of frame number 3 in FIG. 72.
[0502]
By the analysis processing of the handwriting-habit analysis unit 23, the pattern extracted from the one-character frame of frame number 3 in FIG. 72 is recognized as the character category “3” with a recognition reliability of 97%, “3” is entered in the “character code” column of the intermediate processing result table, and “97%” is entered in the “reliability” column of the intermediate processing result table.
[0503]
Further, “black frame / free pitch / individual writing characteristics” is entered in the “processing completed” column of the intermediate processing result table, and “end” is entered in the “processing instruction” column of the intermediate processing result table.
[0504]
Next, since “End” is entered in the “processing instruction” field of frame number 4 in the intermediate processing result table of FIG. 76, the pattern extracted from the one-character frame of frame number 4 in FIG. 72 is not processed.
[0505]
Next, since “End” is entered in the “processing instruction” field of frame number 5-1 in the intermediate processing result table of FIG. 76, the pattern extracted from the frame of frame number 5-1 in FIG. 72 is not processed.
[0506]
Next, since “individual writing characteristics” is entered in the “processing instruction” field of frame number 5-2 in the intermediate processing result table of FIG. 76, in accordance with (C5) of the processing execution rule, the processing of the handwriting-habit analysis unit 23 of FIG. 3 corresponding to “individual writing characteristics” is executed on the pattern extracted from the frame of frame number 5-2 in FIG. 72.
[0507]
By the analysis processing of the handwriting-habit analysis unit 23, the pattern extracted from the frame of frame number 5-2 in FIG. 72 is recognized as the character category “2” with a recognition reliability of 97%, “2” is entered in the “character code” column of the intermediate processing result table, and “97%” is entered in the “reliability” column of the intermediate processing result table.
[0508]
Further, “black frame / free pitch / individual writing characteristics” is entered in the “processing completed” column of the intermediate processing result table, and “end” is entered in the “processing instruction” column of the intermediate processing result table.
[0509]
Next, since “individual writing characteristics” is entered in the “processing instruction” field of frame number 5-3 in the intermediate processing result table of FIG. 76, in accordance with (C5) of the processing execution rule, the processing of the handwriting-habit analysis unit 23 of FIG. 3 corresponding to “individual writing characteristics” is executed on the pattern extracted from the frame of frame number 5-3 in FIG. 72.
[0510]
By the analysis processing of the handwriting-habit analysis unit 23, the pattern extracted from the frame of frame number 5-3 in FIG. 72 is recognized as the character category “4” with a recognition reliability of 96%, the “character code” column of the intermediate processing result table is changed to “4”, and “96%” is entered in the “reliability” column of the intermediate processing result table.
[0511]
Further, “black frame / free pitch / individual writing characteristics” is entered in the “processing completed” column of the intermediate processing result table, and “end” is entered in the “processing instruction” column of the intermediate processing result table.
[0512]
Next, since “End” is entered in the “processing instruction” field of frame number 6-1-1 in the intermediate processing result table of FIG. 76, the pattern extracted from the frame of frame number 6-1-1 in FIG. 72 is not processed.
[0513]
Next, since “End” is entered in the “processing instruction” field of frame number 6-2-2 in the intermediate processing result table of FIG. 76, the pattern extracted from the corresponding frame in FIG. 72 is not processed.
[0514]
Through the above processing, the intermediate processing result table of FIG. 78 is generated. Here, since “End” is entered for all processing targets in the “processing instruction” column of the intermediate processing result table of FIG. 78, all processing is ended in accordance with (C6) of the processing execution rule.
[0515]
As described above, according to the embodiment of the present invention, the character recognition unit 12 and the non-character recognition unit 25 perform recognition processing suited to the state of the input image recognized by the environment recognition system 11.
[0516]
For example, when the environment recognition system 11 extracts a character that touches a ruled line, the contact character recognition unit 13, which performs recognition processing dedicated to characters touching ruled lines, is used. When the environment recognition system 11 extracts a free pitch character string, the character string recognition unit 15, which performs recognition processing dedicated to free pitch character strings, is used. When the environment recognition system 11 extracts a blurred character, the blurred character recognition unit 19, which performs recognition processing dedicated to blurred characters, is used. When the environment recognition system 11 extracts a collapsed character, a recognition unit that performs recognition processing dedicated to collapsed characters is used. When the environment recognition system 11 extracts a non-character, the non-character recognition unit 25, which performs recognition processing dedicated to non-characters, is used.
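The correspondence between the extracted state and the dedicated recognition unit can be pictured as a lookup table. The sketch below is illustrative only: the state keys and the function name `select_recognizer` are hypothetical, the unit names are those given in the text, and falling back to the basic character recognition unit 17 for other states is an assumption.

```python
# Hypothetical state keys mapped to the recognition units named in the text.
RECOGNIZERS = {
    "touching ruled line": "contact character recognition unit 13",
    "free pitch": "character string recognition unit 15",
    "blurred": "blurred character recognition unit 19",
    "non-character": "non-character recognition unit 25",
}


def select_recognizer(state, default="basic character recognition unit 17"):
    """Return the recognition unit assumed to handle the given state."""
    return RECOGNIZERS.get(state, default)
```

Keeping one entry per state reflects the point of the embodiment that each state is handled by its own dedicated processing rather than by one uniform recognizer.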
[0517]
Further, the reliability of the recognition result of the character recognition unit 12 or the non-character recognition unit 25 is calculated, and, for characters and non-characters with low reliability, other processing is performed again through mutual feedback among the environment recognition system 11, the character recognition unit 12, and the non-character recognition unit 25. The entire processing is terminated when the reliability becomes high or when no executable processing remains.
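The termination condition of this feedback (stop when the reliability is high enough or when no executable processing remains) can be sketched as a loop. The function name, the representation of each process as a callable returning a (character code, reliability) pair, and the threshold value are all assumptions made for illustration.

```python
def recognize_with_feedback(pattern, processes, threshold=0.9):
    """Try recognition processes in order until one is reliable enough.

    processes: list of (name, fn) pairs, where fn(pattern) returns a
    (code, reliability) tuple; this interface is hypothetical.
    Returns (best code, best reliability, names of processes executed).
    """
    best_code, best_rel = None, 0.0
    executed = []
    for name, fn in processes:
        code, rel = fn(pattern)
        executed.append(name)
        if rel > best_rel:
            best_code, best_rel = code, rel
        if best_rel >= threshold:
            break  # reliability is high enough, so stop early
    # If the loop ends without a break, no executable process remains.
    return best_code, best_rel, executed
```

A caller might pass, for example, a basic recognizer followed by a writer-specific analysis step, and treat a final reliability below the threshold as a reject.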
[0518]
As described above, according to the embodiment of the present invention, the recognition process can be executed by adaptively changing the characteristics and the identification method used when recognizing the character according to the environment in which the character is written. Therefore, highly accurate character recognition corresponding to various environments of documents and forms becomes possible.
[0519]
In addition to outputting only the character code as the recognition result, the environment recognition result obtained by the environment recognition system 11 can be output together with the character recognition result, and the character recognition result can be output when the environment recognition result and the character recognition result coincide with each other, so that the confirmation function and the reliability of the character recognition result can be improved.
[0520]
Furthermore, since the non-character recognition unit 25 is provided exclusively and non-character recognition can be performed independently of character recognition, the reliability of character recognition and non-character recognition can be improved.
[0521]
Furthermore, since an independent recognition process according to the environment in which each character is written can be performed, the recognition reliability can be improved by increasing the dictionary and knowledge in each recognition process.
[0522]
[Effects of the invention]
As described above, according to the present invention, by extracting the processing target state from the input image and selecting a recognition process suitable for the state for each processing target, for input images having various states, Pattern recognition processing suitable for each state can be performed, and recognition processing can be performed with high accuracy. Moreover, since the evaluation of the processing object is performed both when the state is extracted and when the recognition process for the processing object is performed, the accuracy of the recognition process can be further improved.
[0523]
In addition, according to one aspect of the present invention, the state of the processing target is extracted from the input image, a processing target having a first state is subjected to pattern recognition processing dedicated to the first state, and a processing target having a second state is subjected to pattern recognition processing dedicated to the second state. As a result, the recognition processing for the processing target having the first state and the recognition processing for the processing target having the second state do not interact with each other, and the recognition processing can be performed with high accuracy.
[0524]
Further, according to one aspect of the present invention, by using different recognition dictionaries for input images having various states, the optimum recognition dictionary can be used for each state, and the accuracy of recognition processing can be improved.
[0525]
Further, according to one aspect of the present invention, it is possible to perform recognition processing while using an optimum discrimination function for each state by using different discrimination functions for input images having various states. The accuracy of the recognition process can be improved.
[0526]
Further, according to one aspect of the present invention, by using different knowledge for input images having various states, recognition processing can be performed while using the optimum knowledge for each state, and the accuracy of the recognition processing can be improved.
[0527]
In addition, according to one aspect of the present invention, by performing a plurality of recognition processes on the same processing target until the reliability of the recognition processing reaches a predetermined value or more, the reliability of the recognition processing is increased and its accuracy can be improved.
[0528]
Further, according to one aspect of the present invention, by performing recognition processing for non-characters and recognition processing for characters separately, cases in which characters are mistaken for non-characters or non-characters are mistaken for characters are reduced, and the recognition processing can be performed with high accuracy.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a functional configuration of a pattern recognition apparatus according to an embodiment of the present invention.
FIG. 2 is a block diagram showing an embodiment of a more specific configuration of the environment recognition means of FIG.
FIG. 3 is a block diagram showing an example of a more specific configuration of the pattern recognition apparatus of FIG. 1;
FIG. 4 is a flowchart showing an example of the overall operation of the environment recognition system of FIG. 3.
FIG. 5 is a flowchart showing an embodiment of the operation of the preprocessing unit in FIG. 4.
FIG. 6 is a flowchart showing an embodiment of the operation of the layout analysis unit of FIG.
FIG. 7 is a flowchart showing an embodiment of the operation of the quality analysis unit of FIG.
FIG. 8 is a flowchart showing an example of the operation of the correction analysis unit of FIG.
FIG. 9 is a flowchart showing an embodiment of the operation of the control unit for character recognition / non-character recognition in FIG. 4;
FIG. 10 is a block diagram showing a system configuration of a pattern recognition apparatus according to an embodiment of the present invention.
FIG. 11 is a block diagram showing a more specific system configuration of a pattern recognition apparatus according to an embodiment of the present invention.
FIG. 12 is a diagram illustrating an example of a labeling process of the pattern recognition apparatus according to the embodiment of the present invention.
FIG. 13 is a diagram showing a compressed representation of the labeling process of the pattern recognition apparatus according to the embodiment of the present invention.
FIG. 14 is a diagram illustrating an example of text extraction processing of the pattern recognition apparatus according to the embodiment of the present invention.
FIG. 15 is a diagram illustrating an example of a partial area in text extraction processing of the pattern recognition apparatus according to the embodiment of the present invention.
FIG. 16 is a diagram illustrating an adjacent projection method in ruled line extraction processing of the pattern recognition apparatus according to an embodiment of the present invention.
FIG. 17 is a diagram showing a pattern projection result in ruled line extraction processing of the pattern recognition apparatus according to the embodiment of the present invention;
FIG. 18 is a flowchart showing ruled line extraction processing of the pattern recognition apparatus according to the embodiment of the present invention.
FIG. 19 is a diagram illustrating ruled line extraction processing of the pattern recognition apparatus according to the embodiment of the present invention.
FIG. 20 is a diagram illustrating a method for complementing a blurred rule line in the ruled line extraction process of the pattern recognition apparatus according to the embodiment of the present invention.
FIG. 21 is a flowchart illustrating a method for complementing a blurred rule line in the pattern recognition apparatus according to the embodiment of the present invention.
FIG. 22 is a diagram illustrating a search direction when complementing a faint ruled line in the pattern recognition apparatus according to the embodiment of the present invention.
FIG. 23 is a flowchart showing a single character frame extraction process of the pattern recognition apparatus according to the embodiment of the present invention;
FIG. 24 is a flowchart showing block frame extraction processing of the pattern recognition apparatus according to the embodiment of the present invention.
FIG. 25 is a diagram showing frame and table types of the pattern recognition apparatus according to an embodiment of the present invention.
FIG. 26 is a flowchart showing image reduction processing of the pattern recognition apparatus according to the embodiment of the present invention.
FIG. 27 is a diagram illustrating frame contact presence / absence determination processing of the pattern recognition apparatus according to the embodiment of the present invention.
FIG. 28 is a flowchart showing frame contact presence / absence determination processing of the pattern recognition apparatus according to the embodiment of the present invention.
FIG. 29 is a diagram showing the types of strikethrough of the pattern recognition apparatus according to one embodiment of the present invention.
FIG. 30 is a diagram illustrating a method for calculating a feature amount of a corrected character in the pattern recognition apparatus according to an embodiment of the present invention.
31 is a block diagram illustrating a configuration example of a basic character recognition unit in FIG. 3;
32 is a diagram illustrating an example of a feature vector calculation method in the basic character recognition unit in FIG. 3;
33 is a diagram illustrating an example of a method for calculating a distance between feature vectors in the basic character recognition unit in FIG. 3;
34 is a diagram for explaining a character segment extraction method of the detailed identification method in the basic character recognition unit of FIG. 3; FIG.
35 is a diagram for explaining an endpoint detection method of the detailed identification method in the basic character recognition unit of FIG. 3;
36 is a diagram for explaining a method of detecting an angle change of the detailed identification method in the basic character recognition unit of FIG. 3;
FIG. 37 is a diagram for explaining a correspondence relationship between character segments in the detailed identification method in the basic character recognition unit in FIG. 3;
38 is a flowchart showing processing of a detailed identification method in the basic character recognition unit of FIG. 3;
FIG. 39 is a diagram illustrating a character complementing method in the contact character recognition unit of FIG. 3;
40 is a diagram illustrating a re-complementation method in the contact character recognition unit of FIG. 3;
41 is a diagram illustrating an example of a complementary misread character in the contact character recognition unit of FIG. 3;
42 is a block diagram illustrating an example of a character learning method in the contact character recognition unit of FIG. 3;
43 is a diagram for explaining a method for generating a frame contact character in the contact character recognition unit of FIG. 3;
44 is a diagram showing a generation example of a frame contact character in the contact character recognition unit of FIG. 3;
45 is a diagram illustrating an example of a knowledge table in the contact character recognition unit of FIG. 3;
46 is a diagram illustrating an example of a variation type and a variation amount registered in a knowledge table in the contact character recognition unit in FIG. 3;
47 is a diagram showing an example of a re-recognition region by region enhancement of the contact character recognition unit in FIG. 3;
48 is a diagram for explaining a re-recognition method by region emphasis of the contact character recognition unit in FIG. 3; FIG.
49 is a flowchart showing re-recognition processing by region emphasis of the contact character recognition unit of FIG. 3;
50 is a block diagram illustrating an example of a character re-recognition method in the contact character recognition unit of FIG. 3;
51 is a flowchart showing a character re-recognition process in the contact character recognition unit of FIG. 3; FIG.
52 is a diagram for explaining the graphical meaning of parameters by statistical processing of the character string recognition unit in FIG. 3;
53 is a flowchart showing statistical processing of the character string recognition unit in FIG. 3. FIG.
54 is a diagram for explaining the graphical meaning of parameters by the separated character processing of the character string recognition unit in FIG. 3;
55 is a flowchart showing separation character processing of the character string recognition unit in FIG. 3; FIG.
56 is a diagram for explaining the graphical meanings of parameters by the muddy point processing of the character string recognition unit in FIG. 3;
57 is a flowchart showing a muddy point process of the character string recognition unit in FIG. 3. FIG.
58 is a flowchart showing a process of calculating character cutout success / failure data of the character string recognition unit in FIG. 3;
59 is a diagram showing a method of quantifying the character cutout reliability of the character string recognition unit in FIG. 3;
60 is a diagram illustrating a method of generating a frequency distribution of the character string recognition unit in FIG. 3;
61 is a flowchart showing a method of calculating the character cutout reliability of the character string recognition unit in FIG. 3;
FIG. 62 is a diagram illustrating an example of a histogram distribution of character extraction success and extraction failure in the character string recognition unit of FIG. 3;
FIG. 63 is a diagram illustrating a method of calculating two groups of overlapping areas, that is, character extraction success and extraction failure in the character string recognition unit of FIG. 3;
FIG. 64 is a diagram showing a flow of character cutout processing in the character string recognition unit of FIG. 3;
FIG. 65 is a diagram showing the flow of character cutout processing in the non-statistical processing of the character string recognition unit of FIG. 3.
FIG. 66 is a block diagram illustrating a configuration example of the blurred character recognition unit of FIG. 3.
FIG. 67 is a diagram illustrating an example of the processing of the strikethrough recognition unit of FIG. 3.
FIG. 68 is a diagram showing the flow of clustering processing by the habit analysis unit of FIG. 3.
FIG. 69 is a flowchart showing the clustering processing by the habit analysis unit of FIG. 3.
FIG. 70 is a diagram showing the flow of the character category determination result correction process by the habit analysis unit of FIG. 3.
FIG. 71 is a flowchart showing the character category determination result correction process by the habit analysis unit of FIG. 3.
FIG. 72 is a diagram showing an example of a form to be processed by the pattern recognition apparatus according to the embodiment of the present invention.
FIG. 73 is a diagram showing an example of an intermediate processing result table of the pattern recognition apparatus according to the embodiment of the present invention.
FIG. 74 is a diagram showing an example of a processing order table of the pattern recognition apparatus according to the embodiment of the present invention.
FIG. 75 is a diagram showing an example of an intermediate processing result table of the pattern recognition apparatus according to the embodiment of the present invention.
FIG. 76 is a diagram showing an example of an intermediate processing result table of the pattern recognition apparatus according to the embodiment of the present invention.
FIG. 77 is a diagram showing an example of an intermediate processing result table of the pattern recognition apparatus according to the embodiment of the present invention.
FIG. 78 is a diagram showing an example of an intermediate processing result table of the pattern recognition apparatus according to the embodiment of the present invention.
FIG. 79 is a block diagram showing a configuration of a conventional pattern recognition apparatus.
[Explanation of symbols]
1 Environment recognition means
2 First pattern recognition means
4 Second pattern recognition means
6 Nth pattern recognition means
3, 5, 7 Reliability calculation means
1a State extraction means
1b Recognition processing control means
1c Intermediate processing result table creation means
1d Processing order control rule storage means
1e Process execution rule storage means
1f Processing order table
11 Environment recognition system
12 Character recognition unit
13 Contact character recognition unit
15 Character recognition unit
17 Basic character recognition unit
19 Blurred character recognition unit
21 Crushed character recognition unit
23 Habit analysis unit
25 Non-character recognition unit
26 Strikethrough recognition unit
28 Noise recognition unit
14, 16, 18, 20, 22, 24, 27, 29 Knowledge table
30 Environment recognition system
31 Layout analysis unit
32 Correction analysis unit
33 Character recognition / non-character recognition unit
34 Basic character recognition unit
35 Black frame contact character recognition unit
36 Free pitch character string recognition unit
37 Strikethrough recognition unit
38 Environment recognition system
39 Habit analysis unit
40 End determination processing unit
41 Image storage
42 Processing condition storage
43 Label image storage
44 Intermediate processing result table
50 Program memory
51 Central processing unit
52 Image memory
53 Work memory
54 Bus
55 Interface circuit
56 Display
57 Printer
58 Memory
59 Scanner
60 Dictionary files

Claims

Layout analysis means for analyzing whether a processing target of an input image includes a frame contact character, that is, a character in contact with a frame line;
Quality analysis means for analyzing whether the processing target of the input image includes blurred or crushed characters;
Correction analysis means for analyzing whether the processing target of the input image includes a corrected character, that is, a character corrected by a strikethrough;
First pattern recognition means that has a first knowledge table storing knowledge about a method for recognizing the frame contact character, and that performs pattern recognition processing of the frame contact character included in the processing target based on the knowledge stored in the first knowledge table;
Second pattern recognition means that has a second knowledge table storing knowledge about a method for recognizing a blurred or crushed character, and that performs pattern recognition processing of the blurred or crushed character included in the processing target based on the knowledge stored in the second knowledge table;
Third pattern recognition means that has a third knowledge table storing knowledge about a method for recognizing the corrected character, and that performs pattern recognition processing of the corrected character included in the processing target based on the knowledge stored in the third knowledge table;
Fourth pattern recognition means that has a fourth knowledge table storing knowledge about a method for recognizing a basic character or character string, and that performs pattern recognition processing of the basic character or character string included in the processing target based on the knowledge stored in the fourth knowledge table;
Recognition processing control means for causing the first pattern recognition means to perform recognition processing when the layout analysis means analyzes that the processing target includes a frame contact character, causing the second pattern recognition means to perform recognition processing when the quality analysis means analyzes that the processing target includes a blurred or crushed character, causing the third pattern recognition means to perform recognition processing when the correction analysis means analyzes that the processing target includes a corrected character, and causing the fourth pattern recognition means to perform recognition processing when the processing target includes only basic characters or character strings,
wherein the recognition processing control means includes:
Processing order storage means for storing a processing order indicating the order in which the recognition processes of the first to fourth pattern recognition means are executed when the same processing target includes a plurality of states among basic characters or character strings, frame contact characters, blurred or crushed characters, and corrected characters;
Processing order control rule storage means for storing a calling procedure, which indicates which of the first to fourth pattern recognition means is to be called, in correspondence with the results of the analyses by the layout analysis means, the quality analysis means, and the correction analysis means; and
Intermediate processing result table creation means for creating, when the processing target is analyzed as including a plurality of states among basic characters or character strings, frame contact characters, blurred or crushed characters, and corrected characters, an intermediate processing result table in which the execution order of the recognition processes to be performed on the processing target by the pattern recognition means is entered, based on the calling procedure corresponding to the plurality of states stored in the processing order control rule storage means and the processing order corresponding to the plurality of states stored in the processing order storage means,
the pattern recognition apparatus being characterized in that the pattern recognition means are caused to perform recognition processing based on the execution order entered in the intermediate processing result table created by the intermediate processing result table creation means.
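As a non-limiting illustration, the way an execution order could be derived from the analysis results might be sketched as follows. The state names, the recognizer names, and the particular processing order are assumptions introduced only for this sketch; the claim does not fix any of them.

```python
from typing import Dict, List, Set

# Hypothetical calling procedure: analysed state -> recognition means to call.
CALLING_RULES: Dict[str, str] = {
    "frame_contact": "first_recognizer",
    "blurred_or_crushed": "second_recognizer",
    "corrected": "third_recognizer",
    "basic": "fourth_recognizer",
}

# Hypothetical processing order used when several states coexist (assumed).
PROCESSING_ORDER: List[str] = [
    "frame_contact",
    "blurred_or_crushed",
    "corrected",
    "basic",
]


def build_execution_order(states: Set[str]) -> List[str]:
    """Return the recognizers to run, in order, for the analysed states."""
    if not states:
        # No special state was found: only basic characters are present.
        states = {"basic"}
    ordered_states = [s for s in PROCESSING_ORDER if s in states]
    return [CALLING_RULES[s] for s in ordered_states]
```

In this sketch the returned list plays the role of the execution order entered in the intermediate processing result table.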

The pattern recognition apparatus according to claim 1, wherein the first to fourth pattern recognition means include pattern cutout means for cutting out a recognition target.

The pattern recognition apparatus according to claim 1, wherein, when the same processing target is analyzed as including a plurality of states among basic characters or character strings, frame contact characters, blurred or crushed characters, and corrected characters, the recognition processing control means causes the recognition processes to be performed in accordance with a priority order until the reliability obtained by the recognition process becomes equal to or higher than a predetermined value.
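As a non-limiting illustration, running the recognition processes in priority order until the reliability reaches a predetermined value might be sketched as follows. The function name, the recognizer signature, and the choice to return `None` when no recognizer reaches the threshold are assumptions made for this sketch only.

```python
from typing import Callable, List, Optional, Tuple

# A recognizer takes a processing target and returns (result, reliability).
Recognizer = Callable[[object], Tuple[str, float]]


def recognize_by_priority(
    target: object,
    recognizers: List[Recognizer],
    min_reliability: float,
) -> Optional[Tuple[str, float]]:
    """Try recognizers in priority order; stop at the first reliable result."""
    for recognize in recognizers:
        result, reliability = recognize(target)
        if reliability >= min_reliability:
            return result, reliability
    # Assumed behaviour: no recognizer was reliable enough.
    return None
```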

The pattern recognition apparatus according to claim 1, wherein the quality analysis means determines, for a predetermined region, that a character is a blurred character when the value of (the number of connected regions whose area, vertical length, and horizontal length are each equal to or less than a predetermined threshold) / (the total number of connected regions in the predetermined region) is larger than a predetermined value.
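As a non-limiting illustration, the ratio test for blurred characters might be sketched as follows. The `Region` type, the parameter names, and the handling of an empty region list are assumptions made for this sketch; extraction of the connected regions themselves is outside its scope.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Region:
    """Hypothetical summary of one connected region."""
    area: int
    height: int
    width: int


def is_blurred(
    regions: Sequence[Region],
    max_area: int,
    max_height: int,
    max_width: int,
    ratio_threshold: float,
) -> bool:
    """True when the share of small connected regions exceeds the threshold."""
    if not regions:
        return False  # assumed: nothing to judge in an empty region
    small = sum(
        1
        for r in regions
        if r.area <= max_area and r.height <= max_height and r.width <= max_width
    )
    return small / len(regions) > ratio_threshold
```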

The pattern recognition apparatus according to claim 1, wherein the quality analysis means determines, for a predetermined region, that a character is a blurred character when the value of (the sum of the lengths of the portions filled in when blurred ruled lines are complemented) / (the sum of the lengths of the ruled lines) is larger than a predetermined value.
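As a non-limiting illustration, the ruled-line variant of the blur test might be sketched as follows. The function names and the treatment of a region with no ruled lines are assumptions made for this sketch.

```python
from typing import Sequence


def complemented_ratio(
    complemented_lengths: Sequence[float],
    ruled_line_lengths: Sequence[float],
) -> float:
    """Sum of complemented portion lengths over the sum of ruled line lengths."""
    total = sum(ruled_line_lengths)
    if total == 0:
        return 0.0  # assumed: no ruled lines means no evidence of blur
    return sum(complemented_lengths) / total


def is_blurred_by_ruled_lines(
    complemented_lengths: Sequence[float],
    ruled_line_lengths: Sequence[float],
    ratio_threshold: float,
) -> bool:
    """True when the complemented share of the ruled lines exceeds the threshold."""
    return complemented_ratio(complemented_lengths, ruled_line_lengths) > ratio_threshold
```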

The pattern recognition apparatus according to claim 1, wherein the quality analysis means determines, for a predetermined region, that a character is a crushed character when the value of (the number of connected regions whose black pixel density is larger than a predetermined threshold) / (the total number of connected regions in the predetermined region) is larger than a predetermined value.
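As a non-limiting illustration, the ratio test for crushed characters might be sketched as follows. The claim does not define black pixel density; this sketch assumes it is the number of black pixels divided by the bounding-box area of the connected region. The `InkRegion` type and all parameter names are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class InkRegion:
    """Hypothetical summary of one connected region."""
    black_pixels: int
    width: int
    height: int

    def density(self) -> float:
        # Assumed definition: black pixels per bounding-box pixel.
        box = self.width * self.height
        return self.black_pixels / box if box > 0 else 0.0


def is_crushed(
    regions: Sequence[InkRegion],
    density_threshold: float,
    ratio_threshold: float,
) -> bool:
    """True when the share of dense connected regions exceeds the threshold."""
    if not regions:
        return False  # assumed: nothing to judge in an empty region
    dense = sum(1 for r in regions if r.density() > density_threshold)
    return dense / len(regions) > ratio_threshold
```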