JP4116688B2

JP4116688B2 - Dictionary learning method and character recognition device

Info

Publication number: JP4116688B2
Application number: JP36583497A
Authority: JP
Inventors: 慶司吉崎
Original assignee: 株式会社日本デジタル研究所
Priority date: 1997-12-22
Filing date: 1997-12-22
Publication date: 2008-07-09
Anticipated expiration: 2017-12-22
Also published as: JPH11184976A

Description

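[0001]
[Technical Field of the Invention]
The present invention relates to a character recognition device, and more particularly to a dictionary learning method and a character recognition device that create a feature amount from the image data of an input character and obtain the recognized character code by comparing it with the feature amounts in a recognition dictionary.
[0002]
[Prior Art]
A character recognition device reads a character image, converts it into an electrical signal to obtain image data, cuts the image into individual characters, and applies character recognition using a recognition dictionary to obtain character codes.
In the recognition process, a feature amount is first extracted from each cut-out character and compared with the feature amount of each template in the recognition dictionary. The candidate character whose feature amount has the highest similarity is taken as the correct character, and its character code is obtained.
However, even instances of the same character differ in shape, size, and stroke thickness, or have faint strokes, so they cannot always be recognized correctly.
For handwritten characters in particular, factors such as the writer's individual habits are added to the above, and the recognition rate falls.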
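[0003]
One way to address these problems is to update (modify or add to) the standard patterns registered in the recognition dictionary when an unreadable or misrecognized character is corrected by input, so-called "learning of the recognition dictionary", and thereby improve the recognition rate. Conventional techniques for this kind of learning include: (1) schemes in which, at each correction input, the operator decides and indicates whether the character should be learned, as in the "Character recognition device" of Japanese Patent Laid-Open No. 2-186484 and the "Learning method of recognition dictionary" of Japanese Patent Laid-Open No. 9-185682; (2) schemes that learn at the same time as the correction, as in the "Character recognition device and its dictionary learning method and dictionary creation method" of Japanese Patent Laid-Open No. 7-129724; and (3) schemes that check the validity of the recognition dictionary after learning by a recognition simulation, as in the "Handwritten character recognition dictionary creation method and device" of Japanese Patent Laid-Open No. 7-271917 and the "Optical character reader" of Japanese Patent Laid-Open No. 8-287191.
[0004]
[Problems to Be Solved by the Invention]
However, technique (1) increases the operator's workload, and accurate learning of the recognition dictionary is impossible unless the operator has the specialist knowledge needed to judge whether a character should be learned. Technique (2) does not consider whether a learning character is valid, so learning is inefficient and the recognition dictionary after learning lacks stability. Technique (3) needs time to simulate the validity of the dictionary after learning and requires additional storage means for the data used by the recognition simulation.
[0005]
The present invention was made in view of these problems and drawbacks. Its object is to provide a dictionary learning method and a character recognition device that learn the recognition dictionary efficiently and give the dictionary after learning high validity and stability.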
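[0006]
[Means for Solving the Problems]
To solve the above problems, the dictionary learning method of the invention according to claim 1 is a dictionary learning method for a character recognition device that performs learning processing of a recognition dictionary, characterized by comprising, executed by the computer of the character recognition device: a feature amount extraction step of extracting the feature amount of a learning character c, obtained by excluding from the characters entered as corrections to a recognition result any character that meets any of exclusion conditions (1) to (5) below; a recognition distance calculation step of calculating, for each character code, a minimum recognition distance and a maximum recognition distance from the feature amount of the learning character c and the feature amounts of the templates in the recognition dictionary; and a filter step of obtaining learning characters C by excluding the learning character c when, among the minimum recognition distances calculated for each character code in the recognition distance calculation step, the normalized distance of the minimum recognition distance A for a character code different from that of the learning character c is smaller than a threshold ω, and excluding the learning character c when, among the calculated maximum recognition distances, the normalized distance of the maximum recognition distance B for the same character code as the learning character c is larger than the threshold ω.
Exclusion conditions: (1) a character inserted by correction input; (2) a correction target character deleted by correction input; (3) a correction target character that touches the character before or after it; (4) a correction target character whose size is smaller than a threshold; (5) a correction target character that is faint.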
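[0007]
The invention according to claim 2 is the dictionary learning method according to claim 1, further comprising: a selection step of classifying the learning characters C obtained in the filter step by learning target template and selecting from them a learning character D that has not yet undergone learning processing and whose minimum recognition distance is smallest; and a feature amount synthesis step of synthesizing, as vector quantities according to formula G below, the feature amount of the learning character D and the feature amount of the learning target template while varying the synthesis ratio R of the learning character D, thereby bringing the feature amount of the classified template closer to that of the learning character D.
Formula G: $F'_{dic} = (1 - R)\,F_{dic} + R\,F_{unk}$
where $F'_{dic}$ is the feature amount after synthesis, $F_{dic}$ is the feature amount of the learning target template in the recognition dictionary, $F_{unk}$ is the feature amount of the character, and $R$ is the synthesis ratio ($0 < R < 1$).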
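[0008]
The invention according to claim 3 is the dictionary learning method according to claim 2, further comprising: an update/add step of determining, when learning by the dictionary learning step ends, whether the learning character D is correctly recognized using the minimum recognition distance of the learning character D, and updating or adding recognition dictionary data according to the result; and a filter data update step of changing the filter data used by the learning character filter means on the basis of recognition distance data obtained by running recognition of the learning character with the templates whose recognition dictionary data was updated or added in the update/add step.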
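[0009]
The character recognition device of the invention according to claim 4 is a character recognition device that performs learning processing of a recognition dictionary, comprising: feature amount extraction means for extracting the feature amount of a learning character c, obtained by excluding from the characters entered as corrections to a recognition result any character that meets any of exclusion conditions (1) to (5) below; recognition distance calculation means for calculating, for each character code, a minimum recognition distance and a maximum recognition distance from the feature amount of the learning character c and the feature amounts of the templates in the recognition dictionary; filter means for obtaining learning characters C by excluding the learning character c when, among the minimum recognition distances calculated for each character code by the recognition distance calculation means, the normalized distance of the minimum recognition distance A for a character code different from that of the learning character c is smaller than a threshold ω, and excluding the learning character c when, among the calculated maximum recognition distances, the normalized distance of the maximum recognition distance B for the same character code as the learning character c is larger than the threshold ω; selection means for classifying the learning characters C obtained by the filter means by learning target template and selecting from them a learning character D that has not yet undergone learning processing and whose minimum recognition distance is smallest; feature amount synthesis means for synthesizing, as vector quantities according to formula G below, the feature amount of the learning character D and the feature amount of the learning target template while varying the synthesis ratio R of the learning character D, thereby bringing the feature amount of the classified template closer to that of the learning character D; update/add means for determining, when the synthesis processing by the feature amount synthesis means ends, whether the learning character D is correctly recognized using the minimum recognition distance of the learning character D, and updating or adding recognition dictionary data according to the result; and filter data update means for changing the filter data used by the learning character filter means on the basis of recognition distance data obtained by running recognition of the learning character with the templates whose recognition dictionary data was updated or added by the update/add means.
Exclusion conditions: (1) a character inserted by correction input; (2) a correction target character deleted by correction input; (3) a correction target character that touches the character before or after it; (4) a correction target character whose size is smaller than a threshold; (5) a correction target character that is faint.
Formula G: $F'_{dic} = (1 - R)\,F_{dic} + R\,F_{unk}$
where $F'_{dic}$ is the feature amount after synthesis, $F_{dic}$ is the feature amount of the learning target template in the recognition dictionary, $F_{unk}$ is the feature amount of the character, and $R$ is the synthesis ratio ($0 < R < 1$).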
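[0010]
The computer-readable recording medium of the invention according to claim 5 records a program that, in a character recognition device that performs learning processing of a recognition dictionary, causes a computer to function as: feature amount extraction means for extracting the feature amount of a learning character c, obtained by excluding from the characters entered as corrections to a recognition result any character that meets any of exclusion conditions (1) to (5) below; recognition distance calculation means for calculating, for each character code, a minimum recognition distance and a maximum recognition distance from the feature amount of the learning character c and the feature amounts of the templates in the recognition dictionary; filter means for obtaining learning characters C by excluding the learning character c when, among the minimum recognition distances calculated for each character code by the recognition distance calculation means, the normalized distance of the minimum recognition distance A for a character code different from that of the learning character c is smaller than a threshold ω, and excluding the learning character c when, among the calculated maximum recognition distances, the normalized distance of the maximum recognition distance B for the same character code as the learning character c is larger than the threshold ω; selection means for classifying the learning characters C obtained by the filter means by learning target template and selecting from them a learning character D that has not yet undergone learning processing and whose minimum recognition distance is smallest; feature amount synthesis means for synthesizing, as vector quantities according to formula G below, the feature amount of the learning character D and the feature amount of the learning target template while varying the synthesis ratio R of the learning character D, thereby bringing the feature amount of the classified template closer to that of the learning character D; update/add means for determining, when the synthesis processing by the feature amount synthesis means ends, whether the learning character D is correctly recognized using the minimum recognition distance of the learning character D, and updating or adding recognition dictionary data according to the result; and filter data update means for changing the filter data used when obtaining the learning characters C on the basis of recognition distance data obtained by running recognition of the learning character with the templates whose recognition dictionary data was updated or added by the update/add means.
Exclusion conditions: (1) a character inserted by correction input; (2) a correction target character deleted by correction input; (3) a correction target character that touches the character before or after it; (4) a correction target character whose size is smaller than a threshold; (5) a correction target character that is faint.
Formula G: $F'_{dic} = (1 - R)\,F_{dic} + R\,F_{unk}$
where $F'_{dic}$ is the feature amount after synthesis, $F_{dic}$ is the feature amount of the learning target template in the recognition dictionary, $F_{unk}$ is the feature amount of the character, and $R$ is the synthesis ratio ($0 < R < 1$).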
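[0015]
[Embodiments of the Invention]
<Configuration example of the character recognition device>
FIG. 1 is a block diagram showing a configuration example of the character recognition device of the present invention: (a) shows a configuration example of the character recognition device, and (b) shows a configuration example of the dictionary learning unit, which is the main part of the invention.
In FIG. 1(a), the character recognition device 100 has: a character input unit 1, consisting of an optical reader, a handwritten character reader, or the like, that reads characters (or lines and dots), converts them into electrical signals, and digitizes them to obtain image data; a character cutout unit 2 that splits the image data from the character input unit 1 into character images of one character each; a feature extraction unit 3 that extracts the features of each cut-out character image to obtain a feature amount; and a character recognition unit 4 that compares the feature amount of the input character with the feature amount of each template in the recognition dictionary 9 and outputs a recognition result (for example, a character code).
[0016]
The character recognition device 100 further has: a recognition result storage unit 5 that temporarily stores the recognition results output from the character recognition unit 4 when the recognition dictionary 9 is to be learned; a recognition result correction unit 6 that displays the recognition results as characters on a monitor screen and lets the operator correct unreadable or misrecognized characters from the keyboard; a learning character extraction unit 7; a dictionary learning unit 8; and the recognition dictionary 9. The learning character extraction unit 7, the dictionary learning unit 8, and the recognition dictionary 9 are described in detail below.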
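[0017]
The character recognition device 100 also has a control unit (not shown). The control unit has a microprocessor configuration with a CPU, RAM, ROM, and so on, and controls the operation of the whole character recognition device.
The character input unit 1 may have its own dedicated control unit as a reading device. The character cutout unit 2 through the dictionary learning unit 8 can be built as hardware circuits, but some of these modules may be hardware circuits and the rest programs. When some of the modules from the character cutout unit 2 through the dictionary learning unit 8 are programs, each such module is recorded on a recording medium such as a ROM and executed by the CPU of the control unit under the control of a control program to carry out its processing.
[0018]
In this embodiment, the character cutout unit 2 through the character recognition unit 4, part of the recognition result correction unit 6, the learning character extraction unit 7, and the dictionary learning unit 8 are programs. The recognition result storage unit 5 is internal memory such as RAM, and the hardware part of the recognition result correction unit 6 consists of a display device with a monitor screen, a keyboard, and the like.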
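[0019]
<Learning character extraction unit>
The learning character extraction unit 7 automatically excludes, from the characters corrected in the recognition result correction unit 6 (hereinafter, correction target characters), those that meet certain exclusion conditions, stores the remaining characters as learning characters that need to be learned, and outputs that data.
In the embodiment, the exclusion conditions in the learning character extraction unit 7 are the following five, and every character that meets none of them is treated as a learning character.
[0020]
Exclusion conditions:
(a) a character inserted by correction input;
(b) a correction target character deleted by correction input;
(c) a correction target character that touches the character before or after it;
(d) a correction target character whose size is smaller than a threshold;
(e) a correction target character that is faint.
Conditions (a), (b), and (c) exist because cutout may have failed for these characters, which makes them unsuitable as learning characters. Conditions (a) and (b) can be judged from information obtained from the recognition result correction unit 6, and condition (c) from information obtained from the character cutout unit 2. Conditions (d) and (e) exist because the character shape can be expected to be poor, which likewise makes the character unsuitable as a learning character. Condition (d) can be judged from information obtained from the character cutout unit 2, and condition (e) from the number of black points (dots) in the correction target character.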
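The five exclusion conditions can be read as a simple predicate over each corrected character. The following is a minimal sketch under stated assumptions: the `CorrectedChar` fields, the `min_size` and `min_black_ratio` thresholds, and the use of a black-pixel ratio to detect faintness are all illustrative and are not specified in the patent, which only says that condition (e) is judged from the number of black dots.

```python
from dataclasses import dataclass

@dataclass
class CorrectedChar:
    """One correction target character (all field names are hypothetical)."""
    inserted: bool          # (a) inserted by correction input
    deleted: bool           # (b) deleted by correction input
    touches_neighbor: bool  # (c) touches the previous or next character
    height: int             # bounding-box height in pixels
    width: int              # bounding-box width in pixels
    black_pixels: int       # number of black dots in the character image

def is_learning_char(ch, min_size=8, min_black_ratio=0.05):
    """Return True when none of exclusion conditions (a)-(e) applies."""
    if ch.inserted or ch.deleted or ch.touches_neighbor:
        return False  # (a), (b), (c): cutout may have failed
    if min(ch.height, ch.width) < min_size:
        return False  # (d): character too small
    if ch.black_pixels < min_black_ratio * ch.height * ch.width:
        return False  # (e): too few black dots, treated as faint
    return True
```

Only characters for which `is_learning_char` returns `True` would be passed on to the dictionary learning unit.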
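[0021]
<Recognition dictionary>
The recognition dictionary is recorded on a removable recording medium such as a hard disk, floppy disk, or optical disk, and the read/write device for that medium, that is, a magnetic disk device, floppy disk device, optical disk device, or the like, forms part of the character recognition device.
[0022]
FIG. 2 shows the structure of the recognition dictionary 9, which has a template group 91 and a filter data group 92. Each template has a character code, the feature amount of the character for that code, and a template number. The filter data consists, for each character code, of the mean and the mean square of the maximum recognition distances, the mean and the mean square of the minimum recognition distances, and the number of characters used to compute these values; the mean squares are stored so that the variances can be derived. The filter data is obtained by running the character data used to create the recognition dictionary 9 through the same recognition processing as the character recognition unit 4, using that dictionary. Here, recognition distance means the distance computed by treating as vectors the feature amount of the character being recognized and a feature amount in the recognition dictionary with the same character code.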
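The dictionary layout described above (templates plus per-code filter data) might be modeled as follows. This is only a sketch: the class and field names are hypothetical, and the patent does not prescribe any particular in-memory representation.

```python
from dataclasses import dataclass, field

@dataclass
class Template:
    char_code: str      # character code of the template
    template_no: int    # template number
    features: list      # feature amount, held as a vector of floats

@dataclass
class FilterData:
    """Per-character-code statistics of recognition distances."""
    mean_min: float = 0.0     # mean of the minimum recognition distances
    mean_sq_min: float = 0.0  # mean square of the minimum recognition distances
    mean_max: float = 0.0     # mean of the maximum recognition distances
    mean_sq_max: float = 0.0  # mean square of the maximum recognition distances
    count: int = 0            # number of characters behind the statistics

@dataclass
class RecognitionDictionary:
    templates: list = field(default_factory=list)    # template group 91
    filter_data: dict = field(default_factory=dict)  # filter data group 92, keyed by char_code
```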
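[0023]
In the dictionary learning scheme of the present invention, a learned recognition dictionary is saved separately from the initial recognition dictionary (the standard dictionary), according to the type of source of the input characters. If the input characters come from a source learned before, the dictionary saved at that time can be used for recognition; if they are unlearned, the standard dictionary is used, so stable recognition conditions are always obtained.
As for source types, when the input characters are obtained by reading handwritten forms or from a handwriting input device, the writer can be treated as the source (because handwriting has individual traits such as writing habits); for printed type, the book title or the font can be treated as the source.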
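[0024]
<Dictionary learning unit>

[Configuration] In FIG. 1(b), the dictionary learning unit 8 has a feature extraction unit 11, a recognition distance calculation unit 12, a learning character filter unit 13, a learning character selection unit 14, a learning character synthesis unit 15, a template addition unit 16, and a filter data change unit 17. The feature extraction unit 11 through the filter data change unit 17 can be built as hardware circuits, but in this embodiment they are programs. Some of these modules may be hardware circuits and the rest programs. When the feature extraction unit 11 through the filter data change unit 17 are programs, each module is recorded on a recording medium such as a ROM and executed by the CPU of the control unit under the control of a control program to carry out the learning processing of the recognition dictionary.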
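[0025]
[Dictionary learning operation]
FIG. 3 is a flowchart of the dictionary learning operation of the dictionary learning unit 8. For each of the learning characters output from the learning character extraction unit 7 of FIG. 1(a), the feature extraction unit 11 checks whether the learning character's data already contains a feature amount (S1). If it does, processing moves to S3; if it does not, the feature amount is extracted (created) from the learning character's data (S2).
[0026]
Next, the recognition distance calculation unit 12 creates, from the feature amount of the learning character and the feature amounts of the templates in the recognition dictionary 9, recognition distance data consisting, for each character code, of the minimum recognition distance with its template number and the maximum recognition distance with its template number (S3). FIG. 4 shows the structure of the recognition distance data.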
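Step S3 amounts to scanning all templates and keeping, per character code, the nearest and farthest template. The sketch below assumes Euclidean distance between feature vectors and a tuple layout for templates; neither is specified in the patent, which only says the distance is computed on the feature amounts as vectors.

```python
import math

def recognition_distances(features, templates):
    """Build recognition distance data for one learning character.

    templates: iterable of (char_code, template_no, feature_vector).
    Returns {char_code: {"min": (distance, template_no),
                         "max": (distance, template_no)}}.
    """
    data = {}
    for code, no, vec in templates:
        d = math.dist(features, vec)  # Euclidean distance (an assumption)
        entry = data.setdefault(code, {"min": (d, no), "max": (d, no)})
        if d < entry["min"][0]:
            entry["min"] = (d, no)
        if d > entry["max"][0]:
            entry["max"] = (d, no)
    return data
```

The template number stored with each minimum is what later identifies the learning target template for the character's own code.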
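[0027]
The learning character filter unit 13 automatically excludes learning characters that are unsuited to learning, without requiring the operator's judgment.
Specifically, among the normalized minimum recognition distances of each character code derived by Equation 2 below (the normalization equation for recognition distances), those for character codes different from the learning character's are compared with a threshold, and if any is smaller than the threshold the learning character is excluded. Such a learning character lies inside the normal distribution of minimum recognition distances of a different character code, so adding it to the recognition dictionary could affect recognition of that other character code. Likewise, the normalized maximum recognition distance for the learning character's own character code is obtained from Equation 2 and compared with the threshold, and if it is larger than the threshold the learning character is excluded. Such a learning character lies outside the normal distribution of maximum recognition distances for its character code, so it is judged unreliable and excluded (S4). If no learning characters remain after this filtering, processing ends (S5).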
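[0028]
The comparison against character codes different from the learning character's, performed in the learning character filter unit 13, checks whether the learning character is statistically far enough from the dictionary templates of those other character codes. The minimum recognition distance of a learning character for a given character code is the feature-vector distance between the learning character and the template of that code nearest to it, so a short distance means the learning character is close to that character code (see FIG. 6).
If a learning character that is too close to another character code were learned, recognition of that other character code would suffer (an input character of the other code would be recognized as the learning character's code), so the learning character is excluded.
[0029]
The comparison for the learning character's own character code checks whether the learning character falls within the statistical distribution of that code. The maximum recognition distance within a character code approximates the radius of that code's feature space, so a value far from the mean indicates that the character lies far from the feature space of the target character code (see FIG. 7). A character far from the features of the other characters of the same code has unreliable features and is unsuited to learning, so it is excluded from learning.
[0030]
With these two exclusion conditions, characters unsuited to learning, namely those that would affect recognition of other character codes and those far from the average features of their own code, are excluded automatically without the operator's judgment.
Characters close to another code's features or far from their own code's average features arise when the character image is poor because of faintness, blotting, skew, or deformation, or when the operator enters a character with the wrong character code while correcting a recognition result.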
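The two filter tests can be sketched as a single predicate over already-normalized distances. The function name, argument layout, and the default `omega=1.96` (borrowed from the 95% example given later in the text) are assumptions for illustration; the patent leaves the threshold value open.

```python
def passes_filter(own_code, norm_min, norm_max_own, omega=1.96):
    """Decide whether a learning character survives filter step S4.

    norm_min: {char_code: normalized minimum recognition distance}.
    norm_max_own: normalized maximum recognition distance for own_code.
    """
    # Test 1: too close to the distribution of some other character code.
    too_close_to_other = any(
        d < omega for code, d in norm_min.items() if code != own_code
    )
    # Test 2: too far from the distribution of its own character code.
    too_far_from_own = norm_max_own > omega
    return not (too_close_to_other or too_far_from_own)
```

A character labeled "A" whose normalized minimum distance to "B" is 1.0 would be dropped, since learning it could pull "B" inputs toward "A".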
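[0031]
The learning character selection unit 14 automatically selects effective learning characters, without requiring the operator's judgment.
Specifically, it classifies the learning characters that passed the filter processing in the learning character filter unit 13 by the recognition dictionary template to be learned (the learning target template), and from each group selects, one after another, the learning character that has not yet undergone learning processing and has the smallest correct recognition distance (S6).
[0032]
Here, the learning target template is the template that gives the minimum recognition distance for the learning character's own character code in the recognition distance data, that is, the nearest template with the same character code as the learning character. The minimum recognition distance at that time is called the correct recognition distance.
If the minimum recognition distance for the same character code as the character being recognized is the correct recognition distance, and the smallest of the minimum recognition distances for different character codes is the misrecognition distance, then the recognition result is a misrecognition when correct recognition distance > misrecognition distance.
The result is a reject when misrecognition distance ≥ correct recognition distance ≥ misrecognition distance - α (α: threshold for reject determination), and a correct recognition when misrecognition distance - α > correct recognition distance.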
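The three-way outcome rule and the selection step S6 can be sketched as below. The dictionary keys `target`, `correct_d`, and `learned` are hypothetical names for the learning target template number, the correct recognition distance, and a "learning already done" flag.

```python
def classify(correct_d, wrong_d, alpha):
    """Classify a recognition result from the two distances and reject threshold alpha."""
    if correct_d > wrong_d:
        return "misrecognized"
    if correct_d >= wrong_d - alpha:
        return "rejected"
    return "correct"

def select_next(chars):
    """For each learning target template, pick the not-yet-learned character
    with the smallest correct recognition distance.

    chars: list of dicts with keys 'target', 'correct_d', 'learned'.
    Returns {target: chosen_char}.
    """
    best = {}
    for ch in chars:
        if ch["learned"]:
            continue
        cur = best.get(ch["target"])
        if cur is None or ch["correct_d"] < cur["correct_d"]:
            best[ch["target"]] = ch
    return best
```

Picking the nearest character first is what lets the template drift in small steps rather than jump toward an outlier.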
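[0033]
Learning characters are grouped by learning target template because learning characters of the same character code from the same source that share the same correct-recognition-distance dictionary template (learning target template) have feature vectors that, seen from the feature vector of that template, point in the same direction, as shown in FIG. 8.
[0034]
Learning therefore starts from the learning characters nearest the learning target template, so that the template's feature amount approaches theirs gradually until the learning characters are recognized, which keeps the change to the learning target template to a minimum. A recognition dictionary in its initial (unlearned) state normally has well-balanced templates, and this learning method makes it possible to learn without greatly upsetting that balance. Because no operator judgment is needed, the operator does not need specialist knowledge of character features or learning.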
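[0035]
The learning character synthesis unit 15 synthesizes, as vector quantities, the feature amount of the learning character chosen by the learning character selection unit 14 and the feature amount of the learning target template, using the feature amount synthesis method described below (Equation 3; see FIG. 5) while gradually changing the learning character's synthesis ratio from its initial value, so that the feature amount of the learning target template approaches the learning character. The feature amount synthesis processing ends when, during the repeated synthesis, any of the three synthesis termination conditions described below is met (S7).
[0036]
When the repeated feature amount synthesis ends, the template addition unit 16 checks whether the learning character is now correctly recognized (S8). If it is, processing moves to S10; if it is not, the feature amount of the learning character is added to the recognition dictionary as a new template (S9). Learning by feature amount synthesis alone adds no templates and so does not affect recognition speed after learning, but the recognition rate may fall if the dictionary changes greatly from the original or a template moves too close to another character code. Learning by template addition alone does not lower the recognition rate as long as the new template is not too close to another character code, but an increase in templates slows recognition. The present invention therefore gives priority to learning by feature amount synthesis (S7) and switches to learning by template addition when synthesis termination condition (B) (Equation 5) or (C) (Equation 6) described below holds, treating synthesis as unsuitable in that case. The most suitable learning method for each learning character is thus chosen without the operator's judgment.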
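[0037]
The filter data change unit 17 corrects the filter data of the recognition dictionary 9 when S7 and S8 (learning character synthesis unit 15) or S9 (template addition unit 16) has made the learning character correctly recognized.
Specifically, the learning character is recognized with the recognition dictionary after learning, and the resulting recognition distance data is added into the means of the minimum and maximum recognition distances, their mean squares, and the character counts used by the learning character filter unit 13, so that these values are adjusted to suit the correction target characters (S10).
[0038]
After the learning processing (S7 to S9) has finished for all learning characters selected by the learning character selection unit 14, the recognition distance calculation unit 12 again creates recognition distance data using the recognition dictionary after learning. Then, only for learning characters that have not undergone learning processing and are still misrecognized, the learning processing of S3 to S10 (recognition distance calculation unit 12 through filter data change unit 17) is repeated until all stored learning characters are correctly recognized or no character passes the learning character filter unit 13 (S11).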
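Step S10 is an incremental update of a mean, a mean square, and a count. A minimal sketch of one such update follows; the function name and the idea of applying it separately to the minimum-distance and maximum-distance statistics are illustrative assumptions.

```python
def add_sample(mean, mean_sq, count, d):
    """Fold one new recognition distance d into running statistics.

    Returns the updated (mean, mean_square, count).
    """
    n = count + 1
    new_mean = (mean * count + d) / n
    new_mean_sq = (mean_sq * count + d * d) / n
    return new_mean, new_mean_sq, n
```

Because only the mean, mean square, and count are stored, the variance needed by the filter can be recomputed at any time without keeping the individual distances.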
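[0039]
[Normalization of recognition distances, etc.]
The normalized minimum and maximum recognition distances that the learning character filter unit 13 compares with the threshold are obtained as follows.
The variance $\mathrm{var}(d_{co})$ of the recognition distance $d_{co}$ for character code co is obtained by Equation 1 from the mean $\mathrm{ave}(d_{co})$ and the mean square $\mathrm{ave}(d_{co}^2)$ of the recognition distance stored in the recognition dictionary 9.
$\mathrm{var}(d_{co}) = \mathrm{ave}(d_{co}^2) - (\mathrm{ave}(d_{co}))^2$ (Equation 1)
Using the variance of the minimum recognition distance of each character code obtained by Equation 1, the minimum recognition distance for each character code in the learning character's recognition distance data from the recognition distance calculation unit 12 is normalized by Equation 2.
$d'_{co} = (d_{co} - \mathrm{ave}(d_{co})) / \sqrt{\mathrm{var}(d_{co})}$ (Equation 2)
Here, $d_{co}$ is the minimum recognition distance of the learning character for character code co, $d'_{co}$ is the normalized minimum recognition distance of the learning character for character code co, $\mathrm{ave}(d_{co})$ is the mean of the minimum recognition distance for character code co stored in the recognition dictionary 9, and $\mathrm{var}(d_{co})$ is the variance of the minimum recognition distance for character code co calculated by Equation 1.
[0040]
The maximum recognition distance for each character code in the learning character's recognition distance data can be normalized in the same way, using the variance of the maximum recognition distance of each character code obtained by Equation 1 together with Equation 2.
In this case, Equation 2 is applied with $d_{co}$ as the maximum recognition distance of the learning character for character code co, $d'_{co}$ as its normalized distance, $\mathrm{ave}(d_{co})$ as the mean of the maximum recognition distance for character code co stored in the recognition dictionary 9, and $\mathrm{var}(d_{co})$ as the variance of the maximum recognition distance for character code co calculated by Equation 1.
[0041]
In this way, the range of observed values in a normal distribution can be expressed as a range of values normalized by Equation 2 using the sample mean and variance. For example, at a confidence level of 95%, the normal distribution table gives -1.96 to 1.96 as the range of normalized observed values (the confidence interval). This makes it possible to judge whether a recognition distance lies within the normal distribution of recognition distances for a given character code.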
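Equations 1 and 2 translate directly into code. The sketch below assumes the stored statistics yield a strictly positive variance; the function names and the default `z=1.96` (taken from the 95% example above) are illustrative.

```python
import math

def normalize(d, mean, mean_sq):
    """Normalize a recognition distance d using the stored mean and mean square."""
    var = mean_sq - mean ** 2            # Equation 1
    return (d - mean) / math.sqrt(var)   # Equation 2 (requires var > 0)

def within_distribution(d, mean, mean_sq, z=1.96):
    """True when the normalized distance lies inside the interval [-z, z]."""
    return abs(normalize(d, mean, mean_sq)) <= z
```

For instance, with a stored mean of 3.0 and mean square of 10.0 the variance is 1.0, so a distance of 5.0 normalizes to 2.0 and falls just outside the 95% interval.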
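[0042]
[Feature amount synthesis method and synthesis termination conditions]
(1) Feature amount synthesis method
The feature amount of the learning character and the feature amount of the learning target template are synthesized as vector quantities by Equation 3 while the learning character's synthesis ratio is gradually changed from its initial value. FIG. 5 shows how the feature amount moves as learning by synthesis proceeds.
$F'_{dic} = (1 - R) \times F_{dic} + R \times F_{unk}$ (Equation 3)
Here, $F_{unk}$ is the feature amount of the learning character, $F_{dic}$ is the feature amount of the learning target template in the recognition dictionary, $F'_{dic}$ is the feature amount of that template after synthesis, and $R$ is the synthesis ratio ($0 < R < 1$).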
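[0043]
(2) Synthesis termination conditions
As stated above, processing in the learning character synthesis unit 15 ends when any of the following three conditions is met during the repeated synthesis by Equation 3.
(A) The learning character is correctly recognized and the difference between the misrecognition distance and the correct recognition distance is larger than a threshold, that is, Equation 4 holds:
$|F_{dicerr} - F_{unk}| - |F'_{dic} - F_{unk}| > \alpha_1$ (Equation 4)
where $F_{dicerr}$ is the feature vector of the template that gives the misrecognition distance and $\alpha_1$ (> 0) is a threshold.
(B) The amount by which the learning target template moves between before and after synthesis is larger than a threshold, that is, Equation 5 holds:
$|F'_{dic} - F_{dic}| > \alpha_2$ (Equation 5)
where $\alpha_2$ (> 0) is a threshold.
(C) The distance between the feature vector of the learning target template after learning and the feature vector of a template in the recognition dictionary whose character code differs from that of the learning character is smaller than a threshold, that is, Equation 6 holds:
$|F_{dicerr} - F'_{dic}| < \alpha_3$ (Equation 6)
where $\alpha_3$ (> 0) is a threshold.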
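The synthesis loop with its three termination conditions might look like the sketch below. Several points are assumptions rather than statements from the patent: distances are Euclidean, each trial blends the original template with an increasing ratio R (rather than re-blending the previous result), R advances in `steps` equal increments, and conditions (B) and (C) are checked before (A) so that an unsafe move is never reported as a success.

```python
import math

def blend(f_dic, f_unk, r):
    """Equation 3: move the template toward the learning character by ratio r."""
    return [(1 - r) * a + r * b for a, b in zip(f_dic, f_unk)]

def synthesize(f_dic, f_unk, f_err, a1, a2, a3, steps=10):
    """Return (new_template, reason); reason is 'A', 'B', 'C', or 'none'.

    f_err: feature vector of the nearest template of a different character code.
    a1, a2, a3: thresholds of termination conditions (A), (B), (C).
    """
    new = list(f_dic)
    for k in range(1, steps):
        r = k / steps  # keeps 0 < r < 1
        new = blend(f_dic, f_unk, r)
        if math.dist(new, f_dic) > a2:
            return new, "B"  # template moved too far (Equation 5)
        if math.dist(f_err, new) < a3:
            return new, "C"  # too close to another code (Equation 6)
        if math.dist(f_err, f_unk) - math.dist(new, f_unk) > a1:
            return new, "A"  # recognized with margin (Equation 4)
    return new, "none"
```

A caller would keep the synthesized template when the reason is `'A'` and fall back to adding the learning character as a new template when it is `'B'` or `'C'`, mirroring the switch described for S8 and S9.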
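[0044]
Synthesis termination condition (A) requires the gap between the correct recognition distance and the misrecognition distance to exceed the threshold $\alpha_1$. The reason is that characters of the same character code from the same source still vary to some extent in their feature amounts, so each learning character is learned with some margin in its correct recognition distance in order to absorb that variation.
[0045]
Synthesis termination condition (B) exists because, as stated above, a recognition dictionary in its initial state normally has well-balanced templates; if an existing template moves a long way, that balance is badly upset and the recognition rate for the character code may fall. The condition prevents this.
[0046]
Synthesis termination condition (C) prevents the learning target template from moving so close to a template of another character code that recognition of that character code suffers.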
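[0047]
[Minimum recognition distance]
FIG. 6 illustrates the minimum recognition distance. In FIG. 6, the feature vector of learning character X is A (triangle); the distribution range of the feature vectors of character code 41 of FIG. 4 is the right-hand circle a (broken line), and that of character code 42 of FIG. 4 is the large left-hand circle d (broken line). For character code 41, the feature vectors are b (crosses), the dictionary template feature vectors are c (circles), and the dictionary template nearest the learning character is B (double circle). For character code 42, the feature vectors are e (crosses), the dictionary template feature vectors are f (circles), and the dictionary template nearest the learning character is D (double circle). The minimum recognition distance between learning character X and character code 41 is then the length of line segment C, and that between learning character X and character code 42 is the length of line segment E. Since C < E, the minimum recognition distance of learning character X is the length of line segment C.
[0048]
[Maximum recognition distance]
FIG. 7 illustrates the maximum recognition distance. In FIG. 7, the learning character is A (triangle), the distribution range of the feature vectors of character code 41 of FIG. 4 is circle a (broken line), the feature vectors of character code 41 are b (crosses), the dictionary template feature vectors are c (circles), the maximum recognition space (approximately the feature space radius) is d, and the dictionary template farthest from learning character A is B (double circle). The maximum recognition distance of learning character A for character code 41 is then the length of line segment BA.
[0049]
[Directionality of learning characters]
FIG. 8 illustrates the directionality of learning characters. In FIG. 8, the distribution range of the feature vectors of character code 41 of FIG. 4 is the right-hand circle a (broken line), the feature vectors of character code 41 are b (crosses), and the dictionary template feature vectors are c (circles). The group of learning characters A whose minimum-recognition-distance template is a given template (filled circle) lies within a certain angular range centered on that template, and the group of learning characters B whose minimum-recognition-distance template is another template (double circle) lies within a certain angular range centered on that template (that is, each group has a directionality).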
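[0050]
[Effects of the Invention]
As described above, according to the dictionary learning method of claim 1, the character recognition device of claim 4, and the invention of claim 5, feature amounts are extracted from learning characters obtained by excluding, from the characters entered as corrections to a recognition result, those that meet predetermined exclusion conditions, and only then are the minimum and maximum recognition distances calculated. Recognition distances are therefore never calculated for correction target characters that are unsuitable as learning characters, no computation is wasted, and the recognition distances are calculated quickly. In addition, learning characters that meet predetermined conditions can be excluded on the basis of the calculated minimum and maximum recognition distances, so learning characters whose recognition distance is so short that they would harm recognition of other character codes, and characters whose recognition distance is so long that they lie far from the features of other characters of the same code, can be excluded from learning.
[0051]
According to the dictionary learning method of claim 2, the character recognition device of claim 4, and the invention of claim 5, learning can additionally start from the learning characters nearest the learning target template, so the feature amount of the learning target template approaches theirs gradually until the learning characters are recognized, and the change to the learning target template is kept to a minimum.
[0052]
According to the dictionary learning method of claim 3, the character recognition device of claim 4, and the invention of claim 5, a fall in the recognition rate can additionally be suppressed even if the dictionary changes greatly from the original or comes too close to another character code.
Furthermore, a learned recognition dictionary is saved separately from the initial recognition dictionary (the standard dictionary) according to the type of source of the input characters. If the input characters were learned before, the dictionary saved at that time can be used for recognition, and if they are unlearned, the standard dictionary is used, so stable recognition conditions are always obtained.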
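[Brief Description of the Drawings]
[FIG. 1] A block diagram showing a configuration example of the character recognition device of the present invention.
[FIG. 2] A diagram showing the structure of the recognition dictionary.
[FIG. 3] A flowchart showing the dictionary learning operation of the dictionary learning unit.
[FIG. 4] An explanatory diagram showing the structure of the recognition distance data.
[FIG. 5] An explanatory diagram showing how the feature amount moves during learning by synthesis.
[FIG. 6] An explanatory diagram of the minimum recognition distance.
[FIG. 7] An explanatory diagram of the maximum recognition distance.
[FIG. 8] An explanatory diagram of the directionality of learning characters.
[Explanation of Reference Signs]
7 Learning character extraction unit (learning character extraction means)
9 Recognition dictionary
13 Learning character filter unit (first filter means, second filter means)
14 Learning character selection unit (learning character selection means)
15 Learning character synthesis unit (feature amount synthesis means)
16 Template addition unit (dictionary learning method determination means)
91 Template
92 Filter data (recognition dictionary data)
100 Character recognition device
[0001]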
BACKGROUND OF THE INVENTION
The present invention relates to a character recognition device, and more particularly to a dictionary learning method and a character recognition device that create a feature amount from image data of an input character and obtain a recognized character code by comparison with a feature amount of a recognition dictionary.
[0002]
[Prior art]
A character recognition device is a device that reads a character image, converts it into an electrical signal to obtain image data, cuts the image out character by character, and performs character recognition using a recognition dictionary to obtain character codes.
In the character recognition process, a feature amount is first extracted from each cut-out character and compared with the feature amount of each template in the recognition dictionary. The candidate character whose feature amount has the highest similarity is regarded as the correct character, and its character code is acquired.
However, even when the characters to be read are the same character, their shape, size, and line thickness differ, or their lines are faint, so they cannot always be recognized accurately.
In particular, when recognizing handwritten characters, in addition to the above factors, factors such as personal writing habits are added to lower the recognition rate.
[0003]
As a solution to the above-described problems, there is a method of improving the recognition rate by updating (correcting or adding to) the standard patterns registered in the recognition dictionary at the time of correction input of an unread or misrecognized character, so-called "learning of the recognition dictionary". Conventional technologies related to this learning include: (1) a method in which the operator judges and instructs, at each correction input, whether the character is to be learned, as in the techniques disclosed in "Character Recognition Device" of Japanese Patent Laid-Open No. 2-186484 and "Learning Method of Recognition Dictionary" of Japanese Patent Laid-Open No. 9-185682; (2) a method of learning simultaneously with correction, as in the technique disclosed in "Character recognition device and dictionary learning method and dictionary creation method thereof" of Japanese Patent Laid-Open No. 7-129724; and (3) a method of checking the validity of the recognition dictionary after learning by recognition simulation, as in the techniques disclosed in "Handwritten Character Recognition Dictionary Creation Method and Device" of Japanese Patent Laid-Open No. 7-271917 and "Optical Character Reading Device" of Japanese Patent Laid-Open No. 8-287191.
[0004]
[Problems to be solved by the invention]
However, technique (1) increases the operator's work load, and highly accurate learning of the recognition dictionary cannot be performed unless the operator has the specialized knowledge needed to determine whether a character should be learned. Technique (2) does not consider the validity of the learning characters, so the efficiency of learning and the stability of the recognition dictionary after learning are lacking. Technique (3) has the disadvantages that it takes time to simulate the validity of the recognition dictionary after learning and that storage means must be added for the data needed by the recognition simulation.
[0005]
The present invention has been made in view of the above-described problems and disadvantages, and its object is to provide a dictionary learning method and a character recognition device that can learn a recognition dictionary efficiently and give the recognition dictionary after learning high validity and stability.
[0006]
[Means for Solving the Problems]
In order to solve the above problem, the dictionary learning method according to the first aspect of the present invention is a dictionary learning method for a character recognition device that performs learning processing of a recognition dictionary, comprising, executed by the computer of the character recognition device: a feature amount extraction step of extracting the feature amount of a learning character c, obtained by excluding from the characters entered as corrections to a recognition result any character that meets any of the following exclusion conditions (1) to (5); a recognition distance calculation step of calculating a minimum recognition distance and a maximum recognition distance for each character code from the feature amount of the learning character c and the feature amounts of the templates of the recognition dictionary; and a filter step of obtaining learning characters C by excluding the learning character c when, among the minimum recognition distances of each character code calculated in the recognition distance calculation step, the normalized distance of the minimum recognition distance A of a character code different from that of the learning character c is smaller than a threshold ω, and excluding the learning character c when, among the calculated maximum recognition distances of each character code, the normalized distance of the maximum recognition distance B of the same character code as the learning character c is larger than the threshold ω.
Exclusion conditions: (1) a character inserted by correction input; (2) a correction target character deleted by correction input; (3) a correction target character that is in contact with the character before or after it; (4) a correction target character whose size is smaller than a threshold; (5) a correction target character that is faint.
[0007]
Further, the invention according to claim 2 is the dictionary learning method according to claim 1, further comprising: a selection step of classifying the learning characters C obtained in the filter step by learning target template and selecting from them a learning character D that has not yet been subjected to learning processing and has the smallest minimum recognition distance; and a feature amount synthesis step of synthesizing, as vector quantities based on the following formula G, the feature amount of the learning character D and the feature amount of the learning target template while changing the synthesis ratio R of the learning character D, thereby bringing the feature amount of the classified template close to the feature amount of the learning character D.
Formula G: $F'_{dic} = (1 - R)\,F_{dic} + R\,F_{unk}$
where $F'_{dic}$ is the feature amount after synthesis, $F_{dic}$ is the feature amount of the learning target template in the recognition dictionary, $F_{unk}$ is the feature amount of the character, and $R$ is the synthesis ratio ($0 < R < 1$).
[0008]
Further, the invention according to claim 3 is the dictionary learning method according to claim 2, further comprising: an update/add step of determining, at the end of learning in the dictionary learning step, whether the learning character D is correctly recognized using the minimum recognition distance of the learning character D, and updating or adding recognition dictionary data based on the determination result; and a filter data update step of changing the filter data used in the learning character filter means based on recognition distance data obtained by performing recognition processing of the learning character using the templates whose recognition dictionary data was updated or added in the update/add step.
[0009]
According to a fourth aspect of the present invention, there is provided a character recognition device that performs learning processing of a recognition dictionary, comprising: feature amount extraction means for extracting the feature amount of a learning character c, obtained by excluding from the characters entered as corrections to a recognition result any character that meets any of the following exclusion conditions (1) to (5); recognition distance calculation means for calculating a minimum recognition distance and a maximum recognition distance for each character code from the feature amount of the learning character c and the feature amounts of the templates of the recognition dictionary; filter means for obtaining learning characters C by excluding the learning character c when, among the minimum recognition distances of each character code calculated by the recognition distance calculation means, the normalized distance of the minimum recognition distance A of a character code different from that of the learning character c is smaller than a threshold ω, and excluding the learning character c when, among the calculated maximum recognition distances of each character code, the normalized distance of the maximum recognition distance B of the same character code as the learning character c is larger than the threshold ω; selection means for classifying the learning characters C obtained by the filter means by learning target template and selecting from them a learning character D that has not yet been subjected to learning processing and has the smallest minimum recognition distance; feature amount synthesis means for synthesizing, as vector quantities based on the following formula G, the feature amount of the learning character D and the feature amount of the learning target template while changing the synthesis ratio R of the learning character D, thereby bringing the feature amount of the classified template close to the feature amount of the learning character D; update/add means for determining, at the end of the synthesis processing by the feature amount synthesis means, whether the learning character D is correctly recognized using the minimum recognition distance of the learning character D, and updating or adding recognition dictionary data based on the determination result; and filter data update means for changing the filter data used in the learning character filter means based on recognition distance data obtained by performing recognition processing of the learning character using the templates whose recognition dictionary data was updated or added by the update/add means.
Exclusion conditions: (1) a character inserted by correction input; (2) a correction target character deleted by correction input; (3) a correction target character that is in contact with the character before or after it; (4) a correction target character whose size is smaller than a threshold; (5) a correction target character that is faint.
Formula G: $F'_{dic} = (1 - R)\,F_{dic} + R\,F_{unk}$
where $F'_{dic}$ is the feature amount after synthesis, $F_{dic}$ is the feature amount of the learning target template in the recognition dictionary, $F_{unk}$ is the feature amount of the character, and $R$ is the synthesis ratio ($0 < R < 1$).
[0010]
According to a fifth aspect of the present invention, there is provided a computer-readable recording medium recording a program that, in a character recognition device that performs learning processing of a recognition dictionary, causes a computer to function as: feature amount extraction means for extracting the feature amount of a learning character c, obtained by excluding from the characters entered as corrections to a recognition result any character that meets any of the following exclusion conditions (1) to (5); recognition distance calculation means for calculating a minimum recognition distance and a maximum recognition distance for each character code from the feature amount of the learning character c and the feature amounts of the templates of the recognition dictionary; filter means for obtaining learning characters C by excluding the learning character c when, among the minimum recognition distances of each character code calculated by the recognition distance calculation means, the normalized distance of the minimum recognition distance A of a character code different from that of the learning character c is smaller than a threshold ω, and excluding the learning character c when, among the calculated maximum recognition distances of each character code, the normalized distance of the maximum recognition distance B of the same character code as the learning character c is larger than the threshold ω; selection means for classifying the learning characters C obtained by the filter means by learning target template and selecting from them a learning character D that has not yet been subjected to learning processing and has the smallest minimum recognition distance; feature amount synthesis means for synthesizing, as vector quantities based on the following formula G, the feature amount of the learning character D and the feature amount of the learning target template while changing the synthesis ratio R of the learning character D, thereby bringing the feature amount of the classified template close to the feature amount of the learning character D; update/add means for determining, at the end of the synthesis processing by the feature amount synthesis means, whether the learning character D is correctly recognized using the minimum recognition distance of the learning character D, and updating or adding recognition dictionary data based on the determination result; and filter data update means for changing the filter data used when obtaining the learning characters C based on recognition distance data obtained by performing recognition processing of the learning character using the templates whose recognition dictionary data was updated or added by the update/add means.
Exclusion conditions: (1) a character inserted by correction input; (2) a correction target character deleted by correction input; (3) a correction target character that is in contact with the character before or after it; (4) a correction target character whose size is smaller than a threshold; (5) a correction target character that is faint.
Formula G: $F'_{dic} = (1 - R)\,F_{dic} + R\,F_{unk}$
where $F'_{dic}$ is the feature amount after synthesis, $F_{dic}$ is the feature amount of the learning target template in the recognition dictionary, $F_{unk}$ is the feature amount of the character, and $R$ is the synthesis ratio ($0 < R < 1$).
[0015]
DETAILED DESCRIPTION OF THE INVENTION
<Configuration example of character recognition device>
FIG. 1 is a block diagram showing a configuration example of a character recognition device according to the present invention: (a) is a block diagram showing a configuration example of the character recognition device, and (b) is a block diagram showing a configuration example of the dictionary learning unit, which is the main part of the present invention.
In FIG. 1A, the character recognition device 100 has: a character input unit 1 that includes an optical reading device, a handwritten character reading device, or the like, reads characters (or lines and dots), converts them into electrical signals, and digitizes them to obtain image data; a character cutout unit 2 that decomposes the image data from the character input unit 1 into character images in units of one character; a feature extraction unit 3 that extracts the features of each cut-out character image to obtain a feature amount; and a character recognition unit 4 that compares the feature amount of the input character with the feature amount of each template in the recognition dictionary 9 and outputs a recognition result (for example, a character code).
[0016]
Furthermore, the character recognition device 100 has: a recognition result storage unit 5 that temporarily stores the recognition result output from the character recognition unit 4 when the recognition dictionary 9 is to be learned; a recognition result correction unit 6 that displays the recognition result as characters on a monitor screen and lets the operator correct unread or misrecognized characters by keyboard operation; a learning character extraction unit 7; a dictionary learning unit 8; and the recognition dictionary 9. The learning character extraction unit 7, the dictionary learning unit 8, and the recognition dictionary 9 are described in detail below.
[0017]
Moreover, the character recognition apparatus 100 has a control unit (not shown). The control unit has a microprocessor configuration having a CPU, a RAM, a ROM, and the like, and controls the operation of the entire character recognition device.
Note that the character input unit 1 may have its own dedicated control unit as a reading device. The character cutout unit 2 through the dictionary learning unit 8 can be configured as hardware circuits, but some of these modules may be configured as hardware circuits and the others as programs. When part of the character cutout unit 2 through the dictionary learning unit 8 is configured as programs, each such module is recorded on a recording medium such as a ROM and executed by the CPU of the control unit under the control of the control program to realize its processing.
[0018]
In this embodiment, the character cutout unit 2 through the character recognition unit 4, a part of the recognition result correction unit 6, the learning character extraction unit 7, and the dictionary learning unit 8 are configured as programs. The recognition result storage unit 5 is an internal memory such as a RAM, and the hardware part of the recognition result correction unit 6 is a device including a display device with a monitor screen and a keyboard.
[0019]
<Learning character extraction unit>
The learning character extraction unit 7 automatically excludes characters that satisfy certain exclusion conditions from the characters corrected by the recognition result correction unit 6 (hereinafter, correction target characters), and stores and outputs the data of the remaining characters as learning characters that require learning.
In the embodiment, the exclusion conditions in the learning character extraction unit 7 are the following five conditions, and characters other than the characters corresponding to any of these five conditions are the learning characters.
[0020]
Exclusion conditions:
(a) a character inserted by correction input; (b) a correction target character deleted by correction input; (c) a correction target character that is in contact with the preceding and following characters; (d) a correction target character whose size is smaller than a threshold; (e) a correction target character that is faint (blurred).
Exclusion conditions (a), (b), and (c) are provided because character segmentation may have failed for these characters, so they are excluded from the learning characters. Conditions (a) and (b) can be determined from information obtained from the recognition result correction unit 6, and condition (c) from information obtained from the character extraction unit 2. Exclusion conditions (d) and (e) are provided because the shape of such characters can be expected to be poor, so they are excluded as unsuitable for learning. Condition (d) can be determined from information obtained from the character extraction unit 2, and condition (e) from the number of black pixels (dots) of the correction target character.
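The exclusion step above can be sketched as a simple predicate. This is only an illustration under stated assumptions: the record type `CorrectedChar`, its field names, the use of the shorter side as the "size", and the ink-ratio test for the last condition are all hypothetical and are not specified in the text.

```python
from dataclasses import dataclass

@dataclass
class CorrectedChar:
    """One correction target character; every field name is hypothetical."""
    code: str               # character code entered by the operator
    inserted: bool          # inserted by correction input
    deleted: bool           # deleted by correction input
    touches_neighbor: bool  # in contact with neighbouring characters
    width: int              # bounding-box width in pixels
    height: int             # bounding-box height in pixels
    black_pixels: int       # number of black dots in the character image

def is_excluded(ch, min_size=8, min_ink_ratio=0.05):
    """Return True if any of the five exclusion conditions applies.
    min_size and min_ink_ratio are assumed thresholds, not values from the text."""
    # Segmentation may have failed: inserted, deleted, or touching characters.
    if ch.inserted or ch.deleted or ch.touches_neighbor:
        return True
    # Character too small (assumption: size means the shorter side).
    if min(ch.width, ch.height) < min_size:
        return True
    # Too few black dots for the box size (assumed test for a degraded image).
    return ch.black_pixels / (ch.width * ch.height) < min_ink_ratio

def extract_learning_chars(chars, **thresholds):
    """Keep only the correction target characters that survive all conditions."""
    return [ch for ch in chars if not is_excluded(ch, **thresholds)]
```

Only the first three flags come from the correction and segmentation stages; the last two tests look at the character image itself, which mirrors how the text says each condition can be determined.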
[0021]
<Recognition dictionary>
The recognition dictionary is recorded on a recording medium such as a hard disk, floppy disk, or optical disk, and the corresponding read/write device (a magnetic disk device, a floppy disk device, an optical disk device, or the like) is a component of the character recognition device.
[0022]
FIG. 2 is an explanatory diagram showing the structure of the recognition dictionary 9. The recognition dictionary 9 has a template group 91 and a filter data group 92. Each template has a character code, the character feature amount corresponding to that character code, and a template number. The filter data consists, for each character code, of the average of the maximum recognition distance and its mean square value (used to obtain the variance), the average of the minimum recognition distance and its mean square value (used to obtain the variance), and the number of characters used to create these values. The filter data is obtained by applying the same recognition processing as the character recognition unit 4, using the recognition dictionary, to the character data used to create the recognition dictionary 9. The recognition distance here is the distance calculated by treating the feature amount of the character to be recognized and the feature amount of a recognition dictionary template having the same character code as vectors.
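As a rough illustration of the dictionary layout described for FIG. 2, the following sketch models the template group and the per-character-code filter data. All class and field names are hypothetical, and keeping one count per statistic is an assumption made for simplicity.

```python
from dataclasses import dataclass, field

@dataclass
class Template:
    number: int     # template number
    code: str       # character code
    features: list  # feature amount, treated as a vector

@dataclass
class DistanceStats:
    mean: float = 0.0         # average of the recognition distance
    mean_square: float = 0.0  # average of the squared recognition distance
    count: int = 0            # number of characters behind these averages

@dataclass
class FilterData:
    min_dist: DistanceStats = field(default_factory=DistanceStats)
    max_dist: DistanceStats = field(default_factory=DistanceStats)

@dataclass
class RecognitionDictionary:
    templates: list = field(default_factory=list)    # template group 91
    filter_data: dict = field(default_factory=dict)  # filter data group 92, keyed by character code

    def templates_for(self, code):
        """Return all templates registered for one character code."""
        return [t for t in self.templates if t.code == code]
```

Storing the mean and the mean square rather than the variance itself keeps the statistics cheap to update when new learning characters arrive.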
[0023]
In the dictionary learning method of the present invention, the learned recognition dictionary is stored separately from the initial recognition dictionary (standard dictionary), according to the type of the input character source. With this method, if the input characters have been learned before, the recognition dictionary learned at that time can be used for recognition, and if they are unlearned, the standard dictionary is used, so stable recognition conditions can always be obtained.
As for the type of input character source: for input characters obtained by reading a handwritten form or through a handwriting input device, the writer can be used as the source (because handwriting has individuality, much like a character font); for printed type, the book title or the font can be used as the source.
[0024]
<Dictionary learning unit>
[Configuration] In the drawing, the dictionary learning unit 8 includes a feature extraction unit 11, a recognition distance calculation unit 12, a learning character filter unit 13, a learning character selection unit 14, a learning character synthesis unit 15, a template addition unit 16, and a filter data change unit 17. The feature extraction unit 11 through the filter data change unit 17 can be configured by hardware circuits, but are configured by programs in this embodiment. Some of these modules may be configured by hardware circuits and the others by programs. When the feature extraction unit 11 through the filter data change unit 17 are configured by programs, each module is recorded on a recording medium such as a ROM and is executed by the CPU of the control unit under the control of the control program, thereby realizing dictionary learning.
[0025]
[Dictionary learning operation]
FIG. 3 is a flowchart showing the dictionary learning operation of the dictionary learning unit 8. The feature extraction unit 11 checks, for each of the learning characters output from the learning character extraction unit 7, whether the learning character data already includes a feature amount (S1). If it does, the process proceeds to S3; if not, a feature amount is extracted (created) from the learning character data (S2).
[0026]
Next, the recognition distance calculation unit 12 creates recognition distance data for each character code from the feature amount of the learning character and the template feature amounts in the recognition dictionary 9. The data consists of the minimum recognition distance and the template number that gives it, and the maximum recognition distance and the template number that gives it (S3). FIG. 4 shows the structure of the recognition distance data.
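Step S3 can be sketched as follows. The text does not name the distance metric, so the Euclidean distance used here is an assumption, and the function name and the tuple layout of the templates are hypothetical.

```python
import math

def recognition_distance_data(features, templates):
    """Build per-character-code recognition distance data for one learning character.

    templates: iterable of (template_number, char_code, feature_vector).
    Returns {code: {"min": (distance, template_number),
                    "max": (distance, template_number)}}.
    """
    data = {}
    for number, code, vec in templates:
        d = math.dist(features, vec)  # Euclidean distance (assumed metric)
        entry = data.setdefault(code, {"min": (d, number), "max": (d, number)})
        if d < entry["min"][0]:
            entry["min"] = (d, number)
        if d > entry["max"][0]:
            entry["max"] = (d, number)
    return data
```

Keeping the template number next to each distance is what later allows the learning target template to be looked up directly from this data.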
[0027]
The learning character filter unit 13 automatically excludes learning characters unsuitable for learning without requiring operator judgment.
Specifically, the minimum recognition distance for each character code is normalized by Formula 2 (the recognition distance normalization formula) described later, and the normalized distances for character codes different from that of the learning character are compared with a threshold. If any of them is smaller than the threshold, the learning character is excluded. Such a learning character lies within the normal distribution of the minimum recognition distance of a different character code, so adding it to the recognition dictionary could affect the recognition of that character code. Similarly, the maximum recognition distance for the character code of the learning character itself is normalized by Formula 2 and compared with a threshold. If the normalized distance is larger than the threshold, the learning character is excluded. Such a learning character lies outside the normal distribution of the maximum recognition distance of its character code and is therefore excluded as having low reliability (S4). If no learning character remains after this filtering based on the comparison between normalized distances and the threshold, the process ends (S5).
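A minimal sketch of the filter decision in S4 is shown below. It assumes that the recognition distances have already been normalized, that a single threshold is used for both tests, and that 1.96 (the 95% value quoted later in the text) is a reasonable default; the function and parameter names are hypothetical.

```python
def passes_filter(char_code, norm_min, norm_max, omega=1.96):
    """Decide whether one learning character survives the filter.

    norm_min / norm_max map each character code to the normalized minimum /
    maximum recognition distance of this learning character.
    """
    # Too close to a different character code: learning it could hurt that code.
    for code, distance in norm_min.items():
        if code != char_code and distance < omega:
            return False
    # Too far from its own character code: the sample is considered unreliable.
    return norm_max[char_code] <= omega
```

A character is kept only when it is far enough from every other character code and not an outlier within its own code.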
[0028]
The comparison with character codes different from that of the learning character, performed by the learning character filter unit 13 described above, checks whether the learning character is statistically sufficiently separated from the dictionary templates of those different character codes. That is, the minimum recognition distance of a learning character for a given character code is the distance between the learning character and the nearest feature vector among the recognition dictionary templates of that character code, and a short distance indicates that the learning character is close to that character code (see FIG. 6).
If the learning character is too close to another character code and learning is performed with that character, the recognition performance of the character code it is too close to will be adversely affected. Therefore, such a learning character is excluded.
[0029]
Further, the comparison for the character code of the learning character itself checks whether the learning character falls within the statistical distribution of that character code. In other words, since the maximum recognition distance for a character code approximates the radius of the feature space of that character code, a value far from the average means that the character is far from the feature space of the character code (see FIG. 7). Therefore, a character that is far from the other character features of the same character code is excluded from learning because its features have low reliability and are unsuitable for learning.
[0030]
Based on the above two exclusion conditions, characters that are inappropriate for learning, that is, characters that would affect the recognition of other character codes and characters that are far from the average features of the same character code, can be excluded automatically without operator judgment.
Characters that are close to the features of other character codes, or far from the average features of the same character code, appear because the character image is degraded by blurring, crushing, tilting, or deformation, or because the operator entered a wrong character code when correcting the recognition result.
[0031]
The learning character selection unit 14 automatically selects effective learning characters without requiring operator judgment.
Specifically, the learning characters that passed the filtering in the learning character filter unit 13 are classified by the recognition dictionary template to be learned (learning target template), and within each class the learning character that has not yet undergone learning processing and has the shortest correct recognition distance is selected (S6).
[0032]
Here, the learning target template is the template that, in the recognition distance data of the learning character, has the same character code as the learning character and gives the minimum recognition distance; that is, it is the template with the same character code that is closest to the learning character. The minimum recognition distance at that time is called the correct recognition distance.
Let the correct recognition distance be the minimum recognition distance for the same character code as the character to be recognized, and let the erroneous recognition distance be the smallest of the minimum recognition distances for character codes different from that of the character to be recognized. When correct recognition distance > erroneous recognition distance, the recognition result is a misrecognition.
The recognition result is a rejection when erroneous recognition distance ≧ correct recognition distance ≧ erroneous recognition distance − α (α: threshold for determining rejection). When erroneous recognition distance − α > correct recognition distance, the recognition result is a correct recognition.
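The three-way decision described above (correct recognition, rejection, misrecognition) can be written directly from the inequalities. The function name and the returned labels are hypothetical; the comparisons follow the text.

```python
def classify_result(correct_dist, error_dist, alpha):
    """Classify a recognition result from the two distances and the rejection threshold alpha."""
    if correct_dist > error_dist:
        return "misrecognized"  # a different character code is closer
    if correct_dist >= error_dist - alpha:
        return "rejected"       # error >= correct >= error - alpha: margin too small
    return "correct"            # error - alpha > correct
```

For example, with an erroneous recognition distance of 3 and α = 1, a correct recognition distance of 2.5 is rejected, while a correct recognition distance of 1 counts as a correct recognition.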
[0033]
The learning characters are classified by learning target template because learning characters that come from the same source, have the same character code, and share the same correct-recognition-distance dictionary template (learning target template) lie in the same direction when viewed from the feature vector of that recognition dictionary template, as shown in FIG. 8.
[0034]
Therefore, by learning first from the learning characters that are close to the learning target template, the feature amount of the learning target template is brought closer to those characters' feature amounts little by little until the learning characters are recognized, so the amount of change in the learning target template can be kept to a minimum. Since the recognition dictionary normally has well-balanced templates in its initial state (before any learning), this learning method allows learning without greatly upsetting that balance. Further, since no operator judgment is needed, the operator is not required to have specialized knowledge of character features or learning.
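The selection in S6 (group the filtered learning characters by learning target template, then pick the nearest one that has not been learned yet) might look like the following sketch. The dictionary keys used for each candidate are hypothetical.

```python
def select_learning_chars(candidates):
    """Pick one learning character per learning target template.

    candidates: iterable of dicts with the (hypothetical) keys
      'id', 'target_template' (template number),
      'correct_dist' (correct recognition distance), 'learned' (bool).
    Returns {template_number: chosen candidate}.
    """
    chosen = {}
    for cand in candidates:
        if cand["learned"]:
            continue  # already processed in an earlier pass
        best = chosen.get(cand["target_template"])
        if best is None or cand["correct_dist"] < best["correct_dist"]:
            chosen[cand["target_template"]] = cand
    return chosen
```

Choosing the nearest unlearned character first is what keeps each template's movement small, as the paragraph above explains.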
[0035]
The learning character synthesis unit 15 synthesizes, as vector quantities, the feature amount of the learning character selected by the learning character selection unit 14 and the feature amount of the learning target template using the feature amount synthesis method described later (Formula 3; see FIG. 5), gradually changing the synthesis ratio from its initial value so that the feature amount of the learning target template approaches the learning character. The feature amount synthesis processing ends when any of the three synthesis end determination conditions described later is satisfied during the repeated synthesis (S7).
[0036]
The template addition unit 16 checks whether the learning character is correctly recognized at the end of the repeated feature amount synthesis processing (S8). If it is correctly recognized, the process proceeds to S10. If it is not correctly recognized (described later), the feature amount of the learning character is added as a new template of the recognition dictionary (S9). Learning by feature amount synthesis alone does not increase the number of templates and therefore does not affect the recognition speed after learning; however, if the dictionary changes greatly from the original dictionary or a template comes too close to other character codes, the recognition rate may decrease. Learning by template addition alone does not lower the recognition rate as long as the added template is not too close to other character codes, but the growing number of templates lowers the recognition speed. Therefore, the present invention gives priority to learning by feature amount synthesis (S7); when the synthesis end determination condition (B) (Formula 5) or (C) (Formula 6) described later is satisfied, synthesis is judged inappropriate and the learning method is switched to template addition. As a result, the optimal learning method for each learning character can be selected without requiring operator judgment.
[0037]
The filter data changing unit 17 corrects the filter data of the recognition dictionary 9 once the learning character is correctly recognized as a result of S7 and S8 (learning character synthesis unit 15) or S9 (template addition unit 16).
Specifically, recognition processing is performed on the learning character using the recognition dictionary after learning, and the minimum and maximum recognition distances obtained at this time are added into the averages of the minimum and maximum recognition distances, their mean square values, and the number of characters used by the learning character filter unit 13. The averages, mean square values, and number of characters are thereby corrected to values suited to the correction target characters (S10).
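One way to read the update in S10 is as an incremental update of a running average, a running mean square, and a count. The text does not spell out the arithmetic, so the following is a sketch under that assumption, with a hypothetical function name.

```python
def add_sample(mean, mean_square, count, distance):
    """Fold one new recognition distance into (average, mean square, count)."""
    new_count = count + 1
    new_mean = (mean * count + distance) / new_count
    new_mean_square = (mean_square * count + distance ** 2) / new_count
    return new_mean, new_mean_square, new_count
```

The same function would be applied once to the minimum-distance statistics and once to the maximum-distance statistics of the learning character's character code.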
[0038]
After the learning processes (S7 to S9) have been completed for all the learning characters selected by the learning character selection unit 14, the recognition distance calculation unit 12 creates the recognition distance data again using the recognition dictionary after learning. Then, only for the learning characters that have not been learned and are still misrecognized, the learning process of S3 to S10 (recognition distance calculation unit 12 through filter data change unit 17) is repeated until all of the stored learning characters are recognized correctly or until no character passes the learning character filter unit 13 any longer (S11).
[0039]
[Normalization of recognition distance, etc.]
The normalized distance of the minimum / maximum recognition distance to be compared with the threshold value by the learning character filter unit 13 can be obtained as follows.
The variance var(dco) of the recognition distance dco for the character code co is obtained from the average value ave(dco) of the recognition distance stored in the recognition dictionary 9 and its mean square value ave(dco²) by Formula 1 below.
var(dco) = ave(dco²) − (ave(dco))²      (Formula 1)
Using the variance of the minimum recognition distance of each character code obtained by Formula 1, the minimum recognition distance for each character code in the recognition distance data of the learning character obtained by the recognition distance calculation unit 12 is normalized according to Formula 2 below.
d′co = (dco − ave(dco)) / (var(dco))^(1/2)      (Formula 2)
Here, dco is the minimum recognition distance of the learning character for the character code co, d′co is the normalized distance of that minimum recognition distance, ave(dco) is the average of the minimum recognition distances for the character code co stored in the recognition dictionary 9, and var(dco) is the variance of the minimum recognition distance for the character code co calculated by Formula 1 above.
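Formulas 1 and 2 translate into a few lines of code. The function names are hypothetical, and the guard against a non-positive variance is an added safety check that the text does not mention.

```python
import math

def variance(mean, mean_square):
    """Formula 1: var(d) = ave(d^2) - (ave(d))^2."""
    return mean_square - mean ** 2

def normalize(distance, mean, mean_square):
    """Formula 2: d' = (d - ave(d)) / sqrt(var(d))."""
    var = variance(mean, mean_square)
    if var <= 0.0:
        raise ValueError("variance must be positive to normalize a distance")
    return (distance - mean) / math.sqrt(var)
```

For instance, with an average of 10 and a mean square of 104 the variance is 4, so a recognition distance of 14 normalizes to 2.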
[0040]
Further, using the variance of the maximum recognition distance of each character code obtained by Formula 1, the maximum recognition distance for each character code in the recognition distance data of the learning character obtained by the recognition distance calculation unit 12 can be normalized by Formula 2 in the same way.
In this case, Formula 2 is applied with dco as the maximum recognition distance of the learning character for the character code co, d′co as the normalized distance of that maximum recognition distance, ave(dco) as the average of the maximum recognition distances for the character code co stored in the recognition dictionary 9, and var(dco) as the variance of the maximum recognition distance for the character code co calculated by Formula 1.
[0041]
Thus, the range of observed values in a normal distribution can be obtained as a range of values normalized by Formula 2 using the average and variance of the sample. For example, for a reliability of 95%, the standard normal distribution table gives −1.96 to 1.96 as the range of normalized observed values (confidence interval). This makes it possible to determine whether a recognition distance lies within the normal distribution of the recognition distance of a character code.
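The 95% figure quoted above can be reproduced from the standard normal distribution with Python's standard library. This snippet only checks the table value; the helper name is hypothetical.

```python
from statistics import NormalDist

def confidence_bounds(reliability):
    """Two-sided bounds on a standard normal variable for the given reliability."""
    z = NormalDist().inv_cdf(0.5 + reliability / 2.0)
    return -z, z

low, high = confidence_bounds(0.95)
print(round(low, 2), round(high, 2))  # prints: -1.96 1.96
```

Other reliabilities, such as 0.99, yield correspondingly wider bounds and could serve as a stricter or looser filter threshold.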
[0042]
[Feature synthesis method and synthesis end determination conditions]
(1) Feature synthesis method
The feature amount of the learning character and the feature amount of the learning target template are synthesized as vector quantities by the following formula (Formula 3) while gradually changing the synthesis ratio of the learning character from its initial value. FIG. 5 shows how the feature amount moves as learning by synthesis proceeds.
Fdic′ = (1 − R) * Fdic + R * Funk      (Formula 3)
Here, Funk is the feature amount of the learning character, Fdic is the feature amount of the learning target template in the recognition dictionary, Fdic′ is the feature amount of that template after synthesis, and R is the synthesis ratio (0 < R < 1).
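Formula 3 is a linear blend of two vectors. A minimal sketch, with a hypothetical function name and feature vectors represented as plain lists:

```python
def synthesize(f_dic, f_unk, r):
    """Formula 3: Fdic' = (1 - R) * Fdic + R * Funk, applied element by element."""
    if not 0.0 < r < 1.0:
        raise ValueError("synthesis ratio R must satisfy 0 < R < 1")
    return [(1.0 - r) * d + r * u for d, u in zip(f_dic, f_unk)]
```

With R = 0.25, a template at [0, 2] and a learning character at [4, 2] give the synthesized template [1, 2], a quarter of the way toward the learning character.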
[0043]
(2) Synthesis end determination conditions
As described above, the processing in the learning character synthesis unit 15 ends when any of the following three conditions is satisfied during the repeated synthesis according to Formula 3 above.
(A) When the learning character is recognized correctly and the difference between the correct recognition distance and the erroneous recognition distance is larger than a threshold, that is, when the following expression (Formula 4) holds:
|Fdicerr − Funk| − |Fdic′ − Funk| > α1      (Formula 4)
where Fdicerr is the feature vector of the erroneous-recognition-distance template, and α1 (> 0) is a threshold value.
(B) When the amount by which the learning target template moves between before and after synthesis is larger than a threshold, that is, when the following expression (Formula 5) holds:
|Fdic′ − Fdic| > α2      (Formula 5)
where α2 (> 0) is a threshold value.
(C) When the distance between the feature vector of the learning target template after learning and the feature vector of a template in the recognition dictionary whose character code differs from that of the learning character is smaller than a threshold, that is, when the following expression (Formula 6) holds:
|(feature vector of the different-code template) − Fdic′| < α3      (Formula 6)
where α3 (> 0) is a threshold value.
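The repeated synthesis with its three end conditions can be sketched as a loop. Several points are assumptions rather than statements from the text: the Euclidean distance, the schedule R = 1/steps, 2/steps, ..., always blending from the original template, and taking the nearest different-code template as the erroneous-recognition-distance template. All names are hypothetical.

```python
import math

def learn_by_synthesis(f_dic, f_unk, others, a1, a2, a3, steps=10):
    """Raise R step by step until condition (A), (B) or (C) ends the loop.

    others: non-empty list of feature vectors of templates whose character code
    differs from that of the learning character.
    Returns (synthesized_template, reason), reason being "A", "B", "C" or None.
    """
    f_new = list(f_dic)
    for k in range(1, steps):
        r = k / steps  # assumed schedule for the synthesis ratio, 0 < R < 1
        f_new = [(1.0 - r) * d + r * u for d, u in zip(f_dic, f_unk)]
        correct = math.dist(f_new, f_unk)                 # correct recognition distance
        error = min(math.dist(t, f_unk) for t in others)  # erroneous recognition distance
        if error - correct > a1:                          # (A): enough margin
            return f_new, "A"
        if math.dist(f_new, f_dic) > a2:                  # (B): template moved too far
            return f_new, "B"
        if min(math.dist(t, f_new) for t in others) < a3: # (C): too close to another code
            return f_new, "C"
    return f_new, None
```

A caller following the text would keep the synthesized template when the reason is "A", and fall back to adding the learning character as a new template when the reason is "B" or "C".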
[0044]
The synthesis end determination condition (A), which requires the difference between the erroneous recognition distance and the correct recognition distance to exceed the threshold α1, is provided because the feature amounts of characters vary to some extent; by leaving a margin in the correct recognition distance, learning a single learning character can absorb that variation.
[0045]
The synthesis end determination condition (B) is provided because, as described above, the recognition dictionary normally has well-balanced templates in its initial state; if an existing template in the recognition dictionary moves greatly, that balance is badly upset. The condition prevents the resulting decrease in the recognition rate for that character code.
[0046]
The synthesis end determination condition (C) is provided to prevent the learning target template from coming too close to a template of another character code and adversely affecting the recognition of that character code.
[0047]
[Minimum recognition distance]
FIG. 6 is an explanatory diagram of the minimum recognition distance. In FIG. 6, the feature vector of the learning character X is A (triangle mark), the distribution range of the feature vectors of character code 41 is the circle (broken line) a on the right, and the distribution range of the feature vectors of character code 42 is the large circle (dashed line) d on the left. For character code 41, the feature vectors of characters are b (cross marks), the feature vectors of dictionary templates are c (circle marks), and the dictionary template closest to the learning character is B (double circle mark). For character code 42, the feature vectors of characters are e (cross marks), the feature vectors of dictionary templates are f (circle marks), and the dictionary template closest to the learning character is D (double circle mark). In this case, the distance between the learning character X and character code 41 is the length of the line segment C, and the distance between the learning character X and character code 42 is the length of the line segment E. Since C < E, the minimum recognition distance of the learning character X is the length of the line segment C.
[0048]
[Maximum recognition distance]
FIG. 7 is an explanatory diagram of the maximum recognition distance. In FIG. 7, the learning character is A (triangle mark), the distribution range of the feature vectors of character code 41 is the circle (dashed line) a, the feature vectors of characters of character code 41 are b, the feature vectors of dictionary templates are c, C is the maximum recognition distance (≈ feature space radius), and B (double circle mark) is the dictionary template most distant from the learning character A. The maximum recognition distance of the learning character A for character code 41 is the length of the line segment BA.
[0049]
[Direction of learning characters]
FIG. 8 is an explanatory diagram of the directionality of learning characters. In FIG. 8, the distribution range of the feature vectors of character code 41 is the circle (dashed line) a on the right, the feature vectors of characters of character code 41 are b (cross marks), and the feature vectors of dictionary templates are c (circle marks). The learning character group A, whose minimum-recognition-distance template is a certain template (black circle), lies within a certain angular range centered on that template, and the learning character group B, whose minimum-recognition-distance template is another template (double circle), lies within a certain angular range centered on that other template (that is, the learning characters have directionality).
[0050]
[Effects of the invention]
As explained above, according to the dictionary learning method of the invention of claim 1, the character recognition method of claim 4, and the invention of claim 5, characters that meet predetermined exclusion conditions are excluded from the characters entered as corrections to the recognition result, and the minimum and maximum recognition distances are calculated after extracting feature amounts from the remaining learning characters. This prevents recognition distances from being calculated for correction target characters that are unsuitable as learning characters, and, because no unnecessary calculation is performed, the recognition distances are calculated quickly. In addition, since learning characters that meet predetermined conditions can be excluded based on the calculated minimum and maximum recognition distances, learning characters that are too close to other character codes and would adversely affect their recognition performance can be excluded, and characters that are too far from the other character features of the same character code can also be excluded from learning.
[0051]
According to the dictionary learning method of the invention of claim 2, the character recognition method of claim 4, and the invention of claim 5, learning furthermore starts from the learning characters that are close to the learning target template. The feature amount of the learning target template is brought closer to their feature amounts little by little until the learning characters are recognized, so the amount of change in the learning target template can be minimized.
[0052]
According to the dictionary learning method of the invention of claim 3, the character recognition method of claim 4, and the invention of claim 5, a decrease in the recognition rate can furthermore be suppressed even when the dictionary would otherwise change greatly from the original dictionary or come too close to other character codes.
In addition to the initial recognition dictionary (standard dictionary), the learned recognition dictionary is stored according to the type of the input character source, and if the input character has been learned before, the recognition dictionary at that time is used for recognition. If the character is unlearned, a standard dictionary is used, so that a stable recognition condition can always be obtained.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration example of a character recognition device of the present invention.
FIG. 2 is a structural diagram showing the structure of a recognition dictionary.
FIG. 3 is a flowchart showing the dictionary learning operation of the dictionary learning unit.
FIG. 4 is an explanatory diagram showing a structure of recognition distance data.
FIG. 5 is an explanatory diagram illustrating a transition state of a feature amount that is transitioned by learning by synthesis.
FIG. 6 is an explanatory diagram of a minimum recognition distance.
FIG. 7 is an explanatory diagram of a maximum recognition distance.
FIG. 8 is an explanatory diagram of the directionality of learning characters.
[Explanation of symbols]
7 Learning character extraction unit (learning character extraction means)
9 recognition dictionary
13 Learning character filter section (first filter means, second filter means)
14 Learning character selection part (learning character selection means)
15 Learning character synthesis unit (feature amount synthesis means)
16 Template addition part (dictionary learning method judging means)
91 templates
92 Filter data (recognition dictionary data)
100 character recognition device

Claims

A dictionary learning method of a character recognition device that performs learning processing of a recognition dictionary, the method comprising, by the computer of the character recognition device:
a recognition distance calculating step of calculating a minimum recognition distance and a maximum recognition distance for each character code from the feature amounts of the recognition dictionary templates and the feature amount of a learning character c, the learning character c being obtained by excluding, from the characters entered as corrections to the recognition result, any character that meets one of the following exclusion conditions (1) to (5); and
a filter step of excluding the learning character c when, among the minimum recognition distances of the character codes calculated in the recognition distance calculating step, the normalized distance of the minimum recognition distance A of a character code different from that of the learning character c is smaller than a threshold ω, and excluding the learning character c when, among the calculated maximum recognition distances of the character codes, the normalized distance of the maximum recognition distance B of the same character code as the learning character c is greater than the threshold ω, thereby obtaining learning characters C.
Exclusion conditions: (1) a character inserted by correction input; (2) a correction target character deleted by correction input; (3) a correction target character that is in contact with the characters before and after it; (4) a correction target character whose size is smaller than a threshold; (5) a correction target character that is faint (blurred).

The dictionary learning method according to claim 1, further comprising:
a selection step of classifying the learning characters C obtained in the filter step by learning target template and selecting, from among those not yet subjected to learning processing, the learning character D having the smallest minimum recognition distance; and
a feature amount synthesis step of synthesizing, as vector quantities, the feature amount of the learning character D and the feature amount of the learning target template based on the following Formula G while changing the synthesis ratio R of the learning character D, thereby bringing the feature amount of the learning target template closer to the feature amount of the learning character D.
Formula G: Fdic′ = (1 − R) Fdic + R Funk
where Fdic′ is the feature amount after synthesis, Fdic is the feature amount of the learning target template in the recognition dictionary, Funk is the feature amount of the learning character, and R is the synthesis ratio (0 < R < 1).

The dictionary learning method according to claim 2, further comprising:
an update/addition step of determining, at the end of learning in the dictionary learning step, whether the learning character D is correctly recognized using the minimum recognition distance of the learning character D, and updating or adding recognition dictionary data based on the determination result; and
a filter data update step of changing the filter data used in the learning character filter means based on recognition distance data obtained by performing recognition processing of the learning characters using the templates whose recognition dictionary data was updated or added in the update/addition step.

In a character recognition device that performs learning processing of a recognition dictionary,
recognition distance calculation means for calculating a minimum recognition distance and a maximum recognition distance for each character code from the feature values of the templates in the recognition dictionary and the feature value of a learning character c, the learning character c being obtained by excluding, from the characters entered as corrections to the recognition result, any character that meets one of exclusion conditions (1) to (5) below;
filter means for obtaining learning characters C by excluding the learning character c when, among the minimum recognition distances of the character codes calculated by the recognition distance calculation means, the normalized distance of the minimum recognition distance A of a character code different from that of the learning character c is smaller than a threshold ω, and by excluding the learning character c when, among the calculated maximum recognition distances of the character codes, the normalized distance of the maximum recognition distance B of the same character code as the learning character c is greater than the threshold ω;
selection means for classifying the learning characters C obtained by the filter means by learning target template and selecting, from among those not yet subjected to learning processing, the learning character D having the smallest minimum recognition distance;
feature value combining means for combining, as vector quantities, the feature value of the learning character D and the feature value of the learning target template according to Formula G below while varying the composition ratio R of the learning character D, thereby bringing the feature values closer together;
update/addition means for determining, at the end of the combining process by the feature value combining means, whether the learning character D is correctly recognized, using the minimum recognition distance of the learning character D, and updating or adding recognition dictionary data based on the determination result; and
filter data update means for changing the filter data used by the learning character filter means, based on recognition distance data obtained by performing recognition processing on the learning characters using the templates whose recognition dictionary data was updated or added by the update/addition means,
A character recognition device comprising:
Exclusion conditions: (1) a character inserted by correction input; (2) a correction target character deleted by correction input; (3) a correction target character that is in contact with the characters before and after it; (4) a correction target character whose size is smaller than a threshold; (5) a correction target character that is drowned
Formula G: F'dic = (1 - R) Fdic + R Funk
where F'dic is the feature value after composition, Fdic is the feature value of the learning target template in the recognition dictionary, Funk is the feature value of the character, and R is the composition ratio (0 < R < 1)
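The threshold test performed by the filter means can be sketched as follows. This is an illustration under assumptions rather than the device's actual logic: the claims do not say how recognition distances are normalized, so the `normalize` argument defaults to the identity, and the dictionary layout (`code`, `min_dist`, `max_dist`) is hypothetical.

```python
def passes_distance_filter(code, min_dist, max_dist, omega, normalize=lambda d: d):
    """Return True if a learning character c with character code `code` is kept.

    min_dist / max_dist map each character code to the minimum / maximum
    recognition distance between the character and that code's templates.
    """
    # A: smallest minimum recognition distance among codes other than the character's own.
    rivals = [d for other, d in min_dist.items() if other != code]
    if rivals and normalize(min(rivals)) < omega:
        return False  # too close to a different character code
    # B: maximum recognition distance for the character's own code.
    if code in max_dist and normalize(max_dist[code]) > omega:
        return False  # too far from its own character code
    return True


def filter_learning_chars(chars, omega):
    """Apply the filter to learning characters c and return learning characters C."""
    return [
        ch for ch in chars
        if passes_distance_filter(ch["code"], ch["min_dist"], ch["max_dist"], omega)
    ]


candidates = [
    {"code": "A", "min_dist": {"A": 0.1, "B": 0.8}, "max_dist": {"A": 0.3, "B": 0.9}},
    {"code": "A", "min_dist": {"A": 0.2, "B": 0.4}, "max_dist": {"A": 0.3, "B": 0.9}},
    {"code": "A", "min_dist": {"A": 0.2, "B": 0.8}, "max_dist": {"A": 0.7, "B": 0.9}},
]
kept_c = filter_learning_chars(candidates, omega=0.5)
```

With ω = 0.5, the second candidate is dropped because its distance to the other code "B" (0.4) falls below the threshold, and the third because its maximum distance to its own code (0.7) exceeds it.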

In a character recognition device that performs learning processing of a recognition dictionary,
a computer functioning as:
recognition distance calculation means for calculating a minimum recognition distance and a maximum recognition distance for each character code from the feature values of the templates in the recognition dictionary and the feature value of a learning character c, the learning character c being obtained by excluding, from the characters entered as corrections to the recognition result, any character that meets one of exclusion conditions (1) to (5) below;
filter means for obtaining learning characters C by excluding the learning character c when, among the minimum recognition distances of the character codes calculated by the recognition distance calculation means, the normalized distance of the minimum recognition distance A of a character code different from that of the learning character c is smaller than a threshold ω, and by excluding the learning character c when, among the calculated maximum recognition distances of the character codes, the normalized distance of the maximum recognition distance B of the same character code as the learning character c is greater than the threshold ω;
selection means for classifying the learning characters C obtained by the filter means by learning target template and selecting, from among those not yet subjected to learning processing, the learning character D having the smallest minimum recognition distance;
feature value combining means for combining, as vector quantities, the feature value of the learning character D and the feature value of the learning target template according to Formula G below while varying the composition ratio R of the learning character D, thereby bringing the feature values closer together;
update/addition means for determining, at the end of the combining process by the feature value combining means, whether the learning character D is correctly recognized, using the minimum recognition distance of the learning character D, and updating or adding recognition dictionary data based on the determination result; and
filter data update means for changing the filter data used when obtaining the learning characters C, based on recognition distance data obtained by performing recognition processing on the learning characters using the templates whose recognition dictionary data was updated or added by the update/addition means,
A computer-readable recording medium storing a program that causes the computer to function as each of the above means.
Exclusion conditions: (1) a character inserted by correction input; (2) a correction target character deleted by correction input; (3) a correction target character that is in contact with the characters before and after it; (4) a correction target character whose size is smaller than a threshold; (5) a correction target character that is drowned
Formula G: F'dic = (1 - R) Fdic + R Funk
where F'dic is the feature value after composition, Fdic is the feature value of the learning target template in the recognition dictionary, Funk is the feature value of the character, and R is the composition ratio (0 < R < 1)