JP3970075B2 - Character recognition apparatus, character recognition method, execution program thereof, and recording medium recording the same

Info

Publication number: JP3970075B2
Application number: JP2002095511A
Authority: JP
Inventors: 明中村; 博光川尻
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 2002-03-29
Filing date: 2002-03-29
Publication date: 2007-09-05
Anticipated expiration: 2022-03-29
Also published as: JP2003296661A

Description

【０００１】
【発明の属する技術分野】
本発明は、文字認識結果の信頼度（確からしさ）を判定することにより、手書き入力された文字を認識する文字認識装置、文字認識方法、その実行プログラムおよびそれを記録した記録媒体に関する。
【０００２】
【従来の技術】
従来の文字認識方法においては、たとえば筆記入力された文字の特徴量を抽出し、これを認識辞書中の特徴量と比較して、両者の類似度が高い、もしくは両者の距離値が小さい（これらをまとめて、便宜上、「確信度」が高いと称する）認識候補文字を出力するようにしていた。しかしながら、かかる一文字毎の文字認識では、筆記文字が認識辞書の特徴量に近接している場合には比較的精度の良い認識結果が得られるが、認識辞書の特徴量から離れた文字を筆記した場合には、適正な認識結果を簡単に得ることができない。
【０００３】
そこで、かかる一文字毎の文字認識に加え、前後の文字あるいは単語間・文節間の連接確率ないし共起確率を検出し、前記文字毎の確信度とこれらの確率とから文字列の整合度を算出し、かかる整合度に従って文字列全体の認識文字列候補を出力する、いわゆる後処理が実行されている。
ところが、かかる後処理の際に、あまりに多くの認識候補文字を対象とすると、後処理の計算処理時間が増大してしまう。また、確信度の低い認識候補文字を対象とすると、後処理の結果、却って誤った文字列候補を出力する恐れもある。
【０００４】
そこで、計算時間の増大を抑えながら後処理の精度を高めるために、後処理の対象とする候補文字を制限する種々の方式が提案されている。代表的なものとして例えば、（ａ）文字認識により得られる各候補文字の確信度を直接的に用いる方式、（ｂ）各候補文字の確信度から認識結果の信頼度を推定し、信頼度に応じて候補文字数を制御する方式、（ｃ）隣接する認識候補文字群相互の間の言語的な連接関係から認識結果の信頼度を推定する方式、などが挙げられる。
【０００５】
（ａ）はもっとも単純な方式である。即ち、各候補文字の確信度を所定のしきい値と比較し、このしきい値より確信度が高い候補文字のみを後処理の対象とする方式である。
また、（ｂ）の方式において認識結果の信頼度を求める方法の一例としては、特開平０９−２５９２２６号公報「認識結果の評価方法および認識装置」が挙げられる。これは１位候補文字の確信度と２位候補文字の確信度の差分値を求め、この差分値と１位候補文字の確信度の線形和を認識結果の正解らしさの尺度とする方法である。この方法は、認識結果が正解の場合１位候補文字の確信度が比較的高く、かつ、１位候補文字の確信度と２位候補文字の確信度の差が比較的大きい傾向に着目したものである。また、この方法以外にも、１位候補文字の確信度と２位以下の各候補文字の確信度との比を用いる方法、各候補文字の確信度を多次元の確率分布ととらえ、統計的に信頼度を求める方法など、種々の方法が提案されている。
【０００６】
更に（ｃ）の方式としては、判定対象の認識候補文字群とその直前または直後の認識候補文字群に含まれる各文字間の連接確率に着目して、認識結果の信頼度を統計的に推定する方式である。
【０００７】
【発明が解決しようとする課題】
しかしながら、一般に文字認識の結果得られる各候補文字の確信度は、必ずしもその候補文字の正解らしさを適切に反映していない。例えば、比較的乱雑に書かれた文字の場合、たとえその候補文字が正解であったとしても確信度は低い傾向がある。一方、候補文字と正解文字が類似している場合は、不正解であってもその確信度は高い場合がある。したがって、前記（ａ）の方式で精度良く候補文字を制限することは困難である。
【０００８】
また、１位候補文字と２位候補文字の組み合わせが「つ」−「フ」、「之」−「え」など類似文字の場合には、認識結果が正解か否かに拘らず、これらの確信度の差分値は小さい傾向がある。ひらがな、カタカナ、漢字、英数字のすべてを認識対象とする場合には、このような類似文字の組み合わせが頻繁に発生するため、前記（ｂ）の方式でもやはり認識結果の信頼度を精度良く推定することは難しい。
【０００９】
一方、前記（ｃ）の方式では、方式（ａ）および（ｂ）に見られるような確信度の特性に起因する問題は回避される。しかし、この方式では文字間の連接確率が低い文字列（例えば使用頻度の低い専門用語や固有名詞等）の場合、認識結果の信頼度を精度良く求められない場合がある。
したがって、前記従来技術によれば、本来後処理の対象から除外すべき認識候補文字を信頼度の高いものとして後処理の対象として出力し、逆に、本来後処理の対象とすべき認識候補文字を信頼度の低いものとして後処理の対象から除外する結果が生じ、信頼度の判定によって却って認識精度を低下させる結果を招いていた。これら従来技術に共通する点は、信頼度判定の算出に用いる特徴量を文字認識結果からのみ抽出しているところにある。
【００１０】
ところで、スタイラス等により手書き入力された文字を認識するいわゆるオンライン手書き文字認識においては、入力データである時系列の座標点列が、認識結果の信頼度推定に有用な情報を含んでいる場合がある。例えば、乱雑に書かれた文字は一般に筆記速度が速く、丁寧に書かれた文字では筆記速度が遅い傾向がある。乱雑に書かれた文字では誤認識が起こりやすいため、認識結果の信頼度は低下すると考えられる。また、一般に筆記画数が少ない場合には類似文字が多いためやはり誤認識が起こりやすくなる。
【００１１】
さらに、筆記画数が認識処理の結果得られる候補文字の正規画数と比較して著しく小さいような場合、例えば筆記画数が１画であるのに対して１位候補文字の正規画数が１０画であるような場合は、画数の多い文字を１画で筆記したような場合、すなわち極端なつづけ字である可能性が高い。このような場合もやはり誤認識が起こりやすくなる。したがって、筆記画数と候補文字の正規画数相互の関係もまた文字認識結果の信頼度を反映している。
【００１２】
そこで、本発明は、手書き入力データから得られる信頼度推定に有用な特徴に着目することにより従来技術が抱える問題を解消し、候補文字の確信度や候補文字間の連接確率など、文字認識結果から得られる特徴のみでは認識結果の信頼度を適切に推定できない場合でも比較的精度よく信頼度を算出でき、もって、認識精度を向上させ得る文字認識装置、文字認識方法、その実行プログラムおよびそれを記憶した記録媒体を提供することを目的とするものである。
【００１３】
【課題を解決するための手段】
本願請求項１に係る発明は、手書き入力された文字の座標点列を認識して認識候補文字群を出力する文字認識手段と、前記文字認識手段より出力される判定対象認識候補文字群の信頼度を算出するための特徴量として、前記手書き入力された文字の座標点列の平均筆記速度及び、前記判定対象認識候補文字群中の上位Ｎ文字（Ｎ≧１）の正規画数を算出する特徴抽出手段と、前記特徴抽出手段からの特徴量と、サンプルデータの統計的傾向とに基づいて、前記判定対象認識候補文字群の信頼度を算出する信頼度算出手段と、前記信頼度算出手段からの信頼度に基づいて前記判定対象認識候補文字群の後処理を制御する後処理制御手段とを有することを特徴とする
【００１４】
請求項２に係る発明は、請求項１記載の発明において、前記特徴抽出手段は、前記平均筆記速度と、前記手書き入力された文字の座標点列の筆記画数とを、前記判定対象認識候補文字群の信頼度を算出するための特徴量として抽出することを特徴とする。
【００１５】
請求項３に係る発明は、請求項１ないし２の何れかに記載の発明において、前記特徴抽出手段は、さらに前記判定対象認識候補文字群中の上位Ｍ文字（Ｍ≧１）の確信度を、当該判定対象認識候補文字群の信頼度を算出するための特徴量として抽出することを特徴とする。
請求項４に係る発明は、請求項１ないし３の何れかに記載の発明において、前記特徴抽出手段は、さらに前記判定対象認識候補文字群中の各認識候補文字とその直前の手書き入力に対する直前認識候補文字群との間の連接確率の値もしくはその直後の手書き入力に対する直後認識候補文字群との間の連接確率の値を、当該判定対象認識候補文字群の信頼度を算出するための特徴量として抽出することを特徴とする
【００１６】
請求項５に係る発明は、請求項４記載の発明において、前記特徴抽出手段は、前記判定対象認識候補文字群中の各認識候補文字と前記直前認識候補文字群中の最上位確信度の認識候補文字との間の連接確率の値もしくは前記直後認識候補文字群中の最上位確信度の認識候補文字との間の連接確率の値を当該判定対象認識候補文字群の特徴量として抽出することを特徴とする
【００１７】
請求項６に係る発明は、請求項５記載の発明において、前記特徴抽出手段は、前記判定対象認識候補文字群中の一の認識候補文字とその直前または直後認識候補文字群中の各認識候補文字との間の連接確率の内、最高の連接確率を当該一の認識候補文字と前記直前または直後認識候補文字群との間の連接確率とすることを特徴とする
【００１８】
請求項７に係る発明は、請求項１ないし６の何れかに記載の発明において、前記信頼度算出手段は、前記特徴量から前記判定対象認識候補文字群中の一の認識候補文字の確からしさを判別得点として算出する判別得点算出手段を含み、当該判別得点に基づいて前記信頼度を算出することを特徴とする。
請求項８に係る発明は、請求項１ないし７の何れかに記載の発明において、前記後処理制御手段は、前記信頼度算出手段から算出された信頼度に基づいて、後処理の対象とする認識候補文字を制限することを特徴とする。
【００１９】
請求項９に係る発明は、手書き入力された文字を認識する装置で実行される文字認識方法であって、文字認識手段が、手書き入力された文字の座標点列を認識して認識候補文字群を出力する文字認識ステップと、特徴抽出手段が、前記文字認識ステップより出力される判定対象認識候補文字群の信頼度を算出するための特徴量として、前記手書き入力された文字の座標点列の平均筆記速度及び、前記判定対象認識候補文字群中の上位Ｎ文字（Ｎ≧１）の正規画数を算出する特徴抽出ステップと、信頼度算出手段が、前記特徴抽出ステップからの特徴量と、サンプルデータの統計的傾向とに基づいて、前記判定対象認識候補文字群の信頼度を算出する信頼度算出ステップと、後処理制御手段が、前記信頼度算出ステップからの信頼度に基づいて前記判定対象認識候補文字群の後処理を制御する後処理制御ステップとを有することを特徴とする。
【００２０】
請求項１０に係る発明は、請求項９記載の発明において、前記特徴抽出ステップは、前記特徴抽出手段が、前記平均筆記速度と、前記手書き入力された文字の座標点列の筆記画数とを、前記判定対象認識候補文字群の信頼度を算出するための特徴量として抽出することを特徴とする。
【００２１】
請求項１１に係る発明は、請求項９ないし１０の何れかに記載の発明において、前記特徴抽出ステップは、前記特徴抽出手段が、さらに前記判定対象認識候補文字群中の上位Ｍ文字（Ｍ≧１）の確信度を、当該判定対象認識候補文字群の信頼度を算出するための特徴量として抽出することを特徴とする。
請求項１２に係る発明は、請求項９ないし１１の何れかに記載の発明において、前記特徴抽出ステップは、前記特徴抽出手段が、さらに前記判定対象認識候補文字群中の各認識候補文字とその直前の手書き入力に対する直前認識候補文字群との間の連接確率の値もしくはその直後の手書き入力に対する直後認識候補文字群との間の連接確率の値を、当該判定対象認識候補文字群の信頼度を算出するための特徴量として抽出することを特徴とする。
【００２２】
請求項１３に係る発明は、請求項１２記載の発明において、前記特徴抽出ステップは、前記特徴抽出手段が、前記判定対象認識候補文字群中の各認識候補文字と前記直前認識候補文字群中の最上位確信度の認識候補文字との間の連接確率の値もしくは前記直後認識候補文字群中の最上位確信度の認識候補文字との間の連接確率の値を当該判定対象認識候補文字群の特徴量として抽出することを特徴とする。
【００２３】
請求項１４に係る発明は、請求項１２記載の発明において、前記特徴抽出ステップは、前記特徴抽出手段が、前記判定対象認識候補文字群中の一の認識候補文字とその直前または直後認識候補文字群中の各認識候補文字との間の連接確率の内、最高の連接確率を当該一の認識候補文字と前記直前または直後認識候補文字群との間の連接確率とすることを特徴とする。
【００２４】
請求項１５に係る発明は、請求項９ないし１４の何れかに記載の発明において、前記信頼度算出ステップは、前記信頼度算出手段が、前記特徴量から前記判定対象認識候補文字群中の一の認識候補文字の確からしさを判別得点として算出する判別得点算出ステップを含み、当該判別得点に基づいて前記信頼度を算出することを特徴とする。
請求項１６に係る発明は、請求項９ないし１５の何れかに記載の発明において、前記後処理制御ステップは、前記後処理制御手段が、前記信頼度算出ステップから算出された信頼度に基づいて、後処理の対象とする認識候補文字を制限することを特徴とする。
【００２５】
請求項１７に係る発明は、コンピュータに請求項９ないし１６の何れかに記載の文字認識方法における各処理ステップを実行させるためのプログラムである。
請求項１８に係る発明は、コンピュータに請求項９ないし１６の何れかに記載の文字認識方法における各処理ステップを実行させるためのプログラムを記録したコンピュータ読み取り可能な記録媒体である。
【００２６】
【発明の実施の形態】
＜第１の実施の形態＞
以下、本発明の第１の実施の形態につき図面を参照して説明する。
まず、図１は、第１の実施の形態に係る手書き文字認識装置の回路ブロック図である。
【００２７】
図１において、１は入力部で、タブレット等に手書き入力された筆跡から筆跡文字情報を生成し出力する。２は文字認識部で、入力部１から供給された筆跡文字情報を文字認識辞書３の文字特徴量と比較し、両者の近接度（確信度）が１位からＮ位までの認識辞書中の文字を当該筆跡文字の認識候補文字として出力する。３は文字認識辞書で、候補文字がその文字特徴量とともに記憶されている。
【００２８】
４は特徴抽出部で、入力部１で得られた筆跡文字情報から平均筆記速度を算出するとともに筆記画数を求める。さらに正規画数テーブル５を参照して、文字認識部２から供給される判定対象の認識候補文字群の内、上位Ｎ文字の正規画数を抽出する。５は正規画数テーブルで、認識対象文字の正規画数を各認識対象文字に対応づけて記憶している。
【００２９】
６は判別得点算出部で、特徴抽出部４で得られた特徴量を処理して、当該判定対象の認識候補文字群の正誤判別得点を算出する。
７は認識信頼度算出部で、判別得点算出部６からの判別得点と、判別得点−信頼度変換テーブル８とを比較して、当該判定対象の認識候補文字群の信頼度を出力する。８は判別得点−信頼度変換テーブルで、判別得点と信頼度の関係をテーブルとして記憶しておくものである。
【００３０】
９は認識候補数制御部で、認識信頼度算出部７からの信頼度と、信頼度−累積正読率テーブル１０とを比較して、当該判定対象文字の認識候補数を制限するものである。１０は信頼度−累積正読率テーブルで、信頼度と正読率の関係をテーブルとして記憶しておくものである。
２１は言語処理部で、認識候補数制御部９によって設定された個数の認識候補文字を対象として後処理を行い、文字列候補を出力する。
【００３１】
なお、請求項における「文字認識手段」は実施の形態における図１の文字認識部２および文字認識辞書３が対応する。請求項における「特徴抽出手段」は実施の形態における図１の特徴抽出部４および正規画数テーブル５が対応する。請求項における「信頼度算出手段」は実施の形態における図１の判別得点算出部６、認識信頼度算出部７および判別得点―信頼度変換テーブル８が対応する。請求項における「後処理制御手段」は実施の形態における図１の認識候補数制御部９および信頼度―累積正読率テーブル１０が対応する。
【００３２】
次に、前記回路ブロック図に示された各部の処理の詳細について説明する。
まず、図２を参照して、特徴抽出部４の処理について説明する。
図２は入力部１より文字「い」が筆記された時の座標点列を表している。かかる特徴抽出部４では、入力部１で得られた座標点列（ｘｉ，ｙｉ，ｔｉ，ｐｉ）（ｉ＝１〜Ｋ）から平均筆記速度を算出するとともに筆記画数を求める。ここでＫは座標点数であり、ｘｉおよびｙｉはそれぞれｉ番目の座標点のｘ座標およびｙ座標、ｔｉはｉ番目の座標点が発生した時刻である。またｐｉは時刻ｔｉにおける入力ペンの状態を示し、ペンがタブレットに接している時はｐｉ＝１，ペンがタブレットから離れている時はｐｉ＝０の値を持つ。
【００３３】
ｉ番目の座標点からｉ＋１番目の座標点へペンが移動した時の筆記速度ｖｉは数１で表される。
【００３４】
【数１】
$$v_i = \frac{\sqrt{(x_{i+1}-x_i)^2 + (y_{i+1}-y_i)^2}}{t_{i+1}-t_i}$$
【００３５】
したがって、座標点列の平均筆記速度Ｖは数２により算出される。
【００３６】
【数２】
$$V = \frac{1}{K-1}\sum_{i=1}^{K-1} v_i$$
【００３７】
また、入力ペン状態を示すｐｉの値が１から０に変化する回数を計数することにより、筆記画数Ｓｉｎｐが得られる。さらに、認識候補文字群の上位Ｎ文字に対して、正規画数テーブル５を参照することにより正規画数Ｓｎ１，Ｓｎ２，…，ＳｎＮが得られる。
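The feature extraction described in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names are hypothetical, the segment speed is assumed to be the Euclidean distance between consecutive points divided by the elapsed time, and the average speed V is assumed to be the arithmetic mean of those segment speeds.

```python
import math

def segment_speeds(points):
    """Speeds v_i between consecutive points.

    points: list of (x, y, t, p) tuples in time order, where p is 1 while
    the pen touches the tablet and 0 while it is lifted.
    """
    speeds = []
    for (x0, y0, t0, _), (x1, y1, t1, _) in zip(points, points[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # assumption: ignore samples with no elapsed time
        speeds.append(math.hypot(x1 - x0, y1 - y0) / dt)
    return speeds

def average_writing_speed(points):
    """Average writing speed V, assumed here to be the mean of all v_i."""
    speeds = segment_speeds(points)
    return sum(speeds) / len(speeds) if speeds else 0.0

def count_strokes(points):
    """Written stroke count: number of times p changes from 1 to 0."""
    return sum(1 for a, b in zip(points, points[1:]) if a[3] == 1 and b[3] == 0)
```

Note that counting only the 1 to 0 transitions, as the text specifies, relies on the sequence ending with a pen-up sample; a stroke still in progress at the last point would not be counted.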
次に、図３を参照して、判別得点算出部６における処理について説明する。
【００３８】
前記特徴抽出部４で抽出した平均筆記速度Ｖ、筆記画数Ｓｉｎｐ、正規画数Ｓｎ１，Ｓｎ２，…，ＳｎＮの組は、（２＋Ｎ）次元のベクトル空間において所定のベクトル（特徴ベクトル）として表現できる。判別得点算出部６では、予め第１位の認識候補文字が正読または誤読であるサンプルについて同様に特徴ベクトルを学習データとして抽出しておき、これと判定対象の認識候補文字群の特徴ベクトルとを比較して、当該認識候補文字群の判別得点を算出する。
【００３９】
たとえば、正読と誤読の特徴ベクトルの学習データと判定対象の認識候補文字群の特徴ベクトルが図３に示すような状態にあるとする。図３は簡略化のためにＮ＝１すなわち特徴ベクトルの次元数が３の場合を図示したものである。判別得点算出部６は、正読、誤読のそれぞれの集合（クラス）の特徴ベクトルの分布から、予め、両クラスの特徴ベクトルの平均値（重心ベクトル）および共分散行列を求め、これを記憶している。そして、これら各クラスの重心と判定対象文字群の特徴ベクトルとの間のマハラノビス距離ＤＭｃ、ＤＭｅを求め、これらの値の比、比の対数または差を判別得点とする。
【００４０】
ここで、前記マハラノビス距離ＤＭは次のようにして算出される。すなわち、クラスＣ１の重心ベクトルをｍ１、クラスＣ１の共分散行列をΣ１とすると、所定の特徴ベクトルｘからｍ１へのマハラノビス２乗距離ＤＭ１は、数３で定義される。
【００４１】
【数３】
$$D_{M1} = (x - m_1)^{\mathsf{T}}\,\Sigma_1^{-1}\,(x - m_1)$$
【００４２】
ここで、共分散行列Σ１はｎ×ｎの正方行列であり（ｎ：特徴空間の次元数）、その（ｉ,ｊ）要素はｉ番目の特徴量とｊ番目の特徴量の共分散、すなわちΣ１（ｉ,ｊ）＝σｉｊである。
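As a hedged sketch of the discrimination score described above, the following computes the squared Mahalanobis distance from a feature vector to the centroid of each class (correctly read, misread) and combines the two distances. The function names are hypothetical. The text allows the ratio, the logarithm of the ratio, or the difference; this sketch picks the logarithm of D_Me / D_Mc so that a larger score means "closer to the correctly read class", which is an assumption about the sign convention. The small `eps` term is also an assumption, added only to avoid division by zero.

```python
import math

def solve(a, b):
    """Solve a*z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - s) / m[i][i]
    return z

def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance (x - m)^T cov^-1 (x - m)."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    z = solve(cov, d)  # z = cov^-1 d, without forming the inverse
    return sum(di * zi for di, zi in zip(d, z))

def discrimination_score(x, correct_class, error_class, eps=1e-12):
    """Log ratio of the distances; each class is a (mean, covariance) pair."""
    dmc = mahalanobis_sq(x, *correct_class)
    dme = mahalanobis_sq(x, *error_class)
    return math.log((dme + eps) / (dmc + eps))
```

Solving the linear system instead of inverting the covariance matrix keeps the sketch short and free of external dependencies; a numerical library would normally be used for higher-dimensional feature spaces.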
なお、前記では、判別得点をマハラノビス距離ＤＭを用いて算出したが、これに替えて、正読、誤読の各クラスの特徴ベクトルの分布から線形判別分析により線形判別関数を求めておき、判定対象の認識候補文字群の特徴ベクトルに対してこの線形判別関数を当てはめて判別得点を求めるようにしても良い。
【００４３】
また、正読、誤読の学習サンプルから抽出した特徴ベクトルを学習データとして、対象の特徴ベクトルが正読か誤読かを判定できるように学習させたニューラルネットを用い、判定対象の特徴ベクトルに対する当該ニューラルネットの出力値を判別得点とするようにしてもよい。
次に、図４を参照して認識信頼度算出部７の処理について説明する。
【００４４】
たとえば、前記判別得点の算出において、マハラノビス距離ＤＭｃ、ＤＭｅの距離の比または比の対数を判別得点とした場合、正読および誤読の各学習サンプルから得られる判別得点と信頼度の関係は図４に示すようになる。ここで、信頼度は、学習サンプルからベイズの定理によって算出される。
すなわち、判別得点ｙを有する正読サンプル個数の全正読サンプル個数に対する比率をｐ（ｙ｜Ｘ１＝Ｃ）、判別得点ｙを有する誤読サンプル個数の全誤読サンプル個数に対する比率をｐ（ｙ｜Ｘ１＝Ｅ）、全サンプル数に対する正読サンプル数の総数の比率をＰ（Ｘ１＝Ｃ）、全サンプル数に対する誤読サンプル数の総数の比率をＰ（Ｘ１＝Ｅ）とすると、判別得点ｙを有する認識候補文字群の確信度１位の認識候補文字の信頼度は、次式によって算出できる。
【００４５】
Ｐ（Ｘ１＝Ｃ｜ｙ）＝ｐ（ｙ｜Ｘ１＝Ｃ）・Ｐ（Ｘ１＝Ｃ）／[ｐ（ｙ｜Ｘ１＝Ｃ）・Ｐ（Ｘ１＝Ｃ）＋ｐ（ｙ｜Ｘ１＝Ｅ）・Ｐ（Ｘ１＝Ｅ）]
ここで、Ｘ１は１位の認識候補文字を表し、Ｘ１＝Ｃ、Ｘ１＝Ｅはそれぞれ、１位認識候補文字が正解、不正解である事象を意味する。
かかる式から判別得点と信頼度の関係を示す判別得点―信頼度変換テーブルを予め作成しておき、これを判別得点―信頼度変換テーブル８に記憶させておく。認識信頼度算出部７は判別得点算出部６からの判別得点と、当該判別得点―信頼度変換テーブル８の得点を比較し、該当する信頼度を、当該認識候補文字群の第１位の認識候補文字の信頼度として出力する。
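The construction of the discrimination score to reliability conversion table can be sketched as below. This is an illustration only: the function names, the fixed-width binning of scores, and the fallback value for unseen bins are all assumptions, since the text does not say how the score axis is discretized. Within each bin, the code applies Bayes' theorem exactly as in the formula above.

```python
import math
from collections import Counter

def build_reliability_table(correct_scores, error_scores, bin_width=1.0):
    """Return {bin index: P(X1=C | y)} estimated from training samples.

    correct_scores / error_scores: discrimination scores of samples whose
    first candidate was correctly read / misread (both must be non-empty).
    """
    n_c, n_e = len(correct_scores), len(error_scores)
    prior_c = n_c / (n_c + n_e)  # P(X1=C)
    prior_e = n_e / (n_c + n_e)  # P(X1=E)
    hist_c = Counter(math.floor(s / bin_width) for s in correct_scores)
    hist_e = Counter(math.floor(s / bin_width) for s in error_scores)
    table = {}
    for b in set(hist_c) | set(hist_e):
        like_c = hist_c[b] / n_c  # p(y | X1=C)
        like_e = hist_e[b] / n_e  # p(y | X1=E)
        table[b] = like_c * prior_c / (like_c * prior_c + like_e * prior_e)
    return table

def lookup_reliability(table, score, bin_width=1.0, default=0.5):
    """Reliability for a score; 'default' (an assumption) covers empty bins."""
    return table.get(math.floor(score / bin_width), default)
```

Because the priors are themselves sample proportions, each table entry reduces to the fraction of training samples in that bin that were correctly read.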
【００４６】
次に、図５を参照して、認識候補数制御部９の処理について説明する。
図５の上部に示す表は、判定対象の確信度１位の認識候補文字に対する信頼度と、当該判定対象のＮ位までの認識候補文字の中に正読の文字が含まれる累積確率との関係を示すものである。かかる表中の確率は、前記正読、誤読の学習サンプルを基に予め算出しておく。
【００４７】
信頼度―累積正読率テーブル１０には、かかる表を記憶させておく。そして、認識候補数制御部９は、判定対象の認識候補文字群の信頼度と当該テーブル中の信頼度レベルとを比較し、該当する信頼度レベルの累積確率を参照しながら何位までの認識候補文字を言語処理部２１に出力するかを決定する。ここで、何位までを出力するかは、例えば、該当する信頼度レベルの累積確率が所定のしきい値に達したか否かで決定する。この際、設定されるしきい値は、全ての信頼度レベルに対して一律としても良いし、あるいは、信頼度レベル毎に個別に設定するようにしても良い。
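The threshold-based selection described in this paragraph might look like the following sketch. The table values are invented for illustration (the contents of FIG. 5 are not reproduced in this text), and the names `CUMULATIVE_CORRECT_RATE` and `candidates_to_output` are hypothetical. A single threshold is applied to every reliability level here; the per-level thresholds the text also allows would only require storing a threshold in each row.

```python
# Each row: (lower bound of the reliability level,
#            cumulative correct-reading rate for ranks 1..5).
# All numbers are hypothetical placeholders, not data from the patent.
CUMULATIVE_CORRECT_RATE = [
    (0.9, [0.98, 0.99, 0.995, 0.997, 0.998]),
    (0.6, [0.85, 0.93, 0.96, 0.975, 0.98]),
    (0.0, [0.50, 0.70, 0.82, 0.90, 0.94]),
]

def candidates_to_output(reliability, threshold=0.95,
                         table=CUMULATIVE_CORRECT_RATE):
    """Return how many top-ranked candidates to pass to post-processing."""
    for lower_bound, rates in table:  # rows are ordered from high to low
        if reliability >= lower_bound:
            for rank, rate in enumerate(rates, start=1):
                if rate >= threshold:
                    return rank  # smallest rank whose cumulative rate suffices
            return len(rates)    # threshold never reached: pass all candidates
    return len(table[-1][1])     # reliability below every level (defensive)
```

With these placeholder values, a highly reliable result passes only its first candidate, while a low-reliability result passes all five.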
【００４８】
あるいは、図５の上部の表を基に、信頼度レベル毎の出力候補数を予め設定し、これを信頼度−累積正読率テーブル１０に記憶させておいても良い。図５の下部に示す表は、信頼度レベルと出力候補数とを予め設定した場合の一例である。信頼度―累積正読率テーブル１０に予めかかる表を記憶させた場合には、認識候補数制御部９は、該当する出力候補数を表から読み出し、それに従って、言語処理部２１に出力される認識候補文字を制限する。
【００４９】
以上の実施の形態においては、認識結果の信頼度推定に有用な特徴量を手書き入力データから抽出しているため、候補文字の確信度からは認識結果の信頼度を適切に推定できない場合にも比較的精度よく信頼度を算出でき、もって、正読率の高い認識候補文字を言語処理部に出力することができるようになる。
＜第２の実施の形態＞
次に、本発明に係る第２の実施形態について以下に説明する。
【００５０】
本実施の形態は前記特徴抽出部４における特徴抽出処理を変更するものである。
まず、図６に本実施の形態に係る手書き文字認識装置の回路ブロック図を示す。第１の実施の形態において示した図１との相違は、「文字間連接確率辞書１１」が追加されている点である。
【００５１】
本実施の形態においては、特徴抽出部４における特徴抽出処理として、第１の実施の形態において示した処理内容に加えて、文字間連接確率辞書１１を参照することにより判定対象認識候補文字群中の認識候補文字とその直前および直後の認識候補文字群との間の連接確率を抽出する処理が追加される。
すなわち、第１の実施の形態では信頼度算出に用いる特徴量として、平均筆記速度、筆記画数、判定対象認識候補文字群の上位Ｎ文字の正規画数を採用したが、第２の実施の形態では、これらに加えて判定対象認識候補文字群中の上位Ｌ位までの認識候補文字とその直前の認識候補文字群との間の連接確率の値Ｐｂｋ（ｋ＝１〜Ｌ）、および判定対象認識候補文字群中の上位Ｌ位までの認識候補文字とその直後の認識候補文字群との間の連接確率の値Ｐｆｋ（ｋ＝１〜Ｌ）を信頼度算出に用いる特徴量として採用する。
【００５２】
ここで、判定対象認識候補文字群中の第ｋ位候補文字とその直前の認識候補文字群との間の連接確率の値Ｐｂｋは、本実施の形態では、第ｋ位候補文字と直前の１位からＪ位までの候補文字との間の連接確率の最大値とする。Ｐｆｋも同様に、第ｋ位候補文字と直後の１位からＪ位までの候補文字との間の連接確率の最大値とする。
【００５３】
たとえば図７の例においては、判定対象の認識候補文字の１位文字「日」に対するＰｂ１は、当該「日」と直前の１位文字「朋」からＪ位文字「胡」までのそれぞれの連接確率Ｐ（Ｃ１｜Ｃｂｋ）の内、最大の連接確率を採用する。また、１位文字「日」に対するＰｆ１は、当該「日」と直後の１位文字「も」からＪ位文字「亡」までのそれぞれの連接確率Ｐ（Ｃｆｋ｜Ｃ１）の内、最大の連接確率を採用する。同様に、判定対象の認識候補文字の２位文字「月」に対するＰｂ２、Ｐｆ２は、直前、直後の文字群に対する連接確率の最大値をそれぞれ採用する。
【００５４】
ここで、Ｃ１は判定対象の認識候補１位の文字を表し、Ｃｂｋ、Ｃｆｋはそれぞれ、直前、直後の認識候補ｋ位の文字を表す。そして、Ｐ（Ｃｊ｜Ｃｉ）は、文字Ｃｉに続いて文字Ｃｊが現れる連接確率を表す。
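The extraction of Pbk and Pfk can be sketched as follows, under stated assumptions. The function name and the dictionary layout are hypothetical: `bigram[(ci, cj)]` is taken to hold P(Cj｜Ci), the probability that character Cj follows character Ci, and pairs missing from the dictionary are assumed to have probability 0. The probability values in the usage example are invented; only the characters come from the FIG. 7 description.

```python
def connection_features(targets, preceding, following, bigram, top_l, top_j):
    """Return (Pb, Pf) for the top-L candidates of the group under judgment.

    targets, preceding, following: candidate characters in rank order for the
    current, immediately preceding, and immediately following handwritten input.
    """
    pb, pf = [], []
    for c in targets[:top_l]:
        # Pb_k: best P(c | b) over the top-J preceding candidates b
        pb.append(max((bigram.get((b, c), 0.0) for b in preceding[:top_j]),
                      default=0.0))
        # Pf_k: best P(f | c) over the top-J following candidates f
        pf.append(max((bigram.get((c, f), 0.0) for f in following[:top_j]),
                      default=0.0))
    return pb, pf

# Usage with made-up probabilities for the characters mentioned above.
example_bigram = {
    ("朋", "日"): 0.010, ("胡", "日"): 0.002,
    ("日", "も"): 0.050, ("日", "亡"): 0.001,
    ("朋", "月"): 0.003, ("月", "も"): 0.040,
}
example_pb, example_pf = connection_features(
    ["日", "月"], ["朋", "胡"], ["も", "亡"], example_bigram, top_l=2, top_j=2)
```

The `default=0.0` argument makes the first or last character of a string, which has no neighbouring candidate group, yield a zero feature rather than an error; that handling is a choice of this sketch.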
第２の実施形態においては、図３に示す判別空間は（２＋Ｎ＋２Ｌ）次元となる。また、正読・誤読のサンプルも、平均筆記速度、筆記画数、当該サンプルの認識候補文字群の上位Ｎ文字の正規画数の他、当該サンプルの認識候補文字群の上位Ｌ文字に対する連接確率Ｐｂｋ、Ｐｆｋが特徴抽出要素とされ、かかるサンプルデータに従って判別得点―信頼度変換テーブル８と信頼度―累積正読率テーブル１０に記憶されるテーブルが設定される。
【００５５】
第２の実施形態においては、第１の実施形態で採用した特徴量に加えて、隣接する認識候補文字群に含まれる文字間の連接確率を信頼度判定の特徴量として採用するものであるから、前記第１の実施形態よりもさらに高精度の信頼度判定を行えるものである。
さらに他の実施形態として、前記連接確率Ｐｂｋ、Ｐｆｋの他、第Ｍ位までの認識候補文字の確信度（類似度もしくは距離値）を特徴要素として加え、（２＋Ｎ＋２Ｌ＋Ｍ）次元のベクトル空間にて当該認識候補文字群の特徴ベクトルを抽出するようにしてもよい。かかる場合には図３に示す判別空間も（２＋Ｎ＋２Ｌ＋Ｍ）次元となる。かかる第２の実施の形態では、連接関係のみならず確信度が加味されるものであるから、より高精度の信頼度判定が可能となる。
【００５６】
ところで、前記実施の形態では、図１におけるブロック毎に処理を分けて一連の処理フローを説明したが、制御プログラムに従ってＣＰＵによってかかる処理フローを実行することも可能である。かかる場合、前記処理フローは、ＲＯＭまたはＲＡＭに制御プログラムとして記憶される。また、文字認識辞書３、正規画数テーブル５、判別得点―信頼度変換テーブル８、信頼度―累積正読率テーブル１０および文字間連接確率辞書１１の参照データもＲＯＭまたはＲＡＭに記憶される。ＣＰＵは、かかる制御プログラムに従って、参照データを参照しながら、前記の処理を実行する。
【００５７】
図８に、かかる制御プログラムによるフローを示す。ここで、ステップＳ１０１は入力部１における処理、ステップＳ１０２は文字認識部２における処理、ステップＳ１０３は特徴抽出部４における処理、ステップＳ１０４は判別得点算出部６における処理、ステップＳ１０５は認識信頼度算出部７における処理、ステップＳ１０６は認識候補数制御部９における処理である。
【００５８】
なお、請求項における「文字認識ステップ」は実施の形態における図８のステップＳ１０２が対応する。請求項における「特徴抽出ステップ」は実施の形態における図８のステップＳ１０３が対応する。請求項における「信頼度算出ステップ」は実施の形態における図８のステップＳ１０４およびＳ１０５が対応する。
かかる制御プログラムおよび各種参照データは、フレキシブルディスク等の記録媒体またはインターネット等の伝送媒体を介して取引され得る。記録媒体または伝送媒体を介して取引されるデータのファイル構造の一例を図９に示す。記録媒体には、かかるファイル構造のデータが記録される。また、伝送媒体を介した取引では、かかるファイル構造のデータが伝送媒体を介して供給される。
【００５９】
以上、本発明に係る実施の形態について説明したが、本発明はかかる実施の形態に制限されるものではなく、他に種々の変更が可能である。たとえば、前記実施の形態では、平均筆記速度、筆記画数、候補文字の正規画数、隣接する認識候補文字群相互間の連接確率、候補文字の確信度を特徴量として信頼度を算出する例を示したが、これら種々の特徴量の内、個々の実施装置において特に有用な特徴量のみを選択して採用することもできる。
【００６０】
また、図１の認識候補数制御部９における処理内容を、認識候補数の制限ではなく、信頼度に応じて当該認識結果をリジェクト（無効）とするようにしても良い。
更に、手書き入力の対象は、文字を一例として挙げたが、これには限られず、図形でも構わないことはいうまでもない。
【００６１】
その他、図１の判別得点算出部６における判別得点の算出方法や、図１の認識信頼度算出部７における認識信頼度の算出方法も、前記実施の形態にて示したマハラノビス距離ＤＭを用いる方法や、ベイズの定理を用いる方法以外の方法を採用することもできる。
本発明の実施形態は、本発明の技術的思想の範囲内において、適宜、様々な変更が可能である。
【００６２】
また、前述の実施の形態は、あくまでも、本発明の一つの実施形態であって、本発明ないし各構成要件の用語の意義は、実施の形態に記載されたものに制限されるものではない。
【００６３】
【発明の効果】
以上、本発明によれば、手書き入力データから得られる信頼度推定に有用な特徴を用いることにより、候補文字の確信度や候補文字間の連接確率など文字認識結果から得られる情報のみでは当該認識結果の信頼度を適切に推定できない場合でも比較的精度よく信頼度を算出でき、もって、これを文字認識装置に採用した場合には、文字認識の精度を向上させることができるようになる。
【図面の簡単な説明】
【図１】第１の実施の形態に係る回路ブロック図を示す図である。
【図２】第１の実施の形態に係る特徴抽出部の処理を説明するための図である。
【図３】第１の実施の形態に係る判別得点算出部の処理を説明するための図である。
【図４】第１の実施の形態に係る認識信頼度算出部の処理を説明するための図である。
【図５】第１の実施の形態に係る認識候補数制御部の処理を説明するための図である。
【図６】第２の実施の形態に係る回路ブロック図を示す図である。
【図７】第２の実施の形態に係る特徴抽出部の処理を説明するための図である。
【図８】第１および第２の実施の形態に係る実行フローチャートである。
【図９】第２の実施の形態に係る実行プログラムと参照データのファイル構造である。
【符号の説明】
１…入力部
２…文字認識部
３…文字認識辞書
４…特徴抽出部
５…正規画数テーブル
６…判別得点算出部
７…認識信頼度算出部
８…判別得点―信頼度変換テーブル
９…認識候補数制御部
１０…信頼度−累積正読率テーブル
１１…文字間連接確率辞書

[0001]
[Technical Field of the Invention]
The present invention relates to a character recognition apparatus and a character recognition method that recognize handwritten characters by determining the reliability (likelihood) of a character recognition result, and to a program for executing the method and a recording medium on which the program is recorded.
[0002]
[Prior art]
In conventional character recognition methods, for example, a feature amount is extracted from a handwritten character and compared with the feature amounts in a recognition dictionary, and the recognition candidate characters for which the similarity between the two is high or the distance between the two is small (for convenience, both cases are referred to collectively as having a high “certainty factor”) are output. With such character-by-character recognition, a relatively accurate recognition result is obtained when the written character is close to a feature amount in the recognition dictionary, but a proper recognition result is not easily obtained when the written character is far from the feature amounts in the recognition dictionary.
[0003]
Therefore, in addition to such character-by-character recognition, so-called post-processing is performed: the connection probability or co-occurrence probability between preceding and following characters, or between words or phrases, is obtained; a degree of consistency of the character string is calculated from the per-character certainty factors and these probabilities; and recognized character string candidates for the entire string are output according to that degree of consistency.
However, if too many recognition candidate characters are targeted during such post-processing, the post-processing calculation processing time increases. In addition, if a recognition candidate character with a low certainty factor is targeted, an erroneous character string candidate may be output instead as a result of post-processing.
[0004]
Therefore, various methods for limiting the candidate characters subjected to post-processing have been proposed in order to increase the accuracy of post-processing while suppressing the increase in calculation time. Typical examples include (a) a method that directly uses the certainty factor of each candidate character obtained by character recognition, (b) a method that estimates the reliability of the recognition result from the certainty factors of the candidate characters and controls the number of candidate characters according to that reliability, and (c) a method that estimates the reliability of the recognition result from the linguistic connection relationship between adjacent recognition candidate character groups.
[0005]
Method (a) is the simplest. The certainty factor of each candidate character is compared with a predetermined threshold value, and only candidate characters whose certainty factor exceeds the threshold are subjected to post-processing.
An example of a way to obtain the reliability of the recognition result in method (b) is given in Japanese Patent Application Laid-Open No. 09-259226, “Recognition Result Evaluation Method and Recognition Device”. In this method, the difference between the certainty factor of the first candidate character and that of the second candidate character is obtained, and a linear sum of this difference and the certainty factor of the first candidate character is used as a measure of how likely the recognition result is to be correct. The method exploits the tendency that, when the recognition result is correct, the certainty factor of the first candidate character is relatively high and the difference between the certainty factors of the first and second candidate characters is relatively large. Various other methods have also been proposed, such as a method that uses the ratio between the certainty factor of the first candidate character and that of each lower-ranked candidate character, and a method that treats the certainty factors of the candidate characters as a multidimensional probability distribution and obtains the reliability statistically.
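The linear-sum measure described for method (b) can be illustrated with a short sketch. It is not taken from the cited publication: the function name and the weights `w_top` and `w_gap` are hypothetical, and the certainty factor is assumed to be a similarity, so that a larger value is better.

```python
def linear_sum_score(certainties, w_top=1.0, w_gap=1.0):
    """Correctness measure: weighted sum of the first candidate's certainty
    factor and its margin over the second candidate."""
    ranked = sorted(certainties, reverse=True)
    top = ranked[0]
    gap = top - ranked[1] if len(ranked) > 1 else 0.0
    return w_top * top + w_gap * gap

# A clear winner scores higher than a near tie between similar characters.
clear_winner = linear_sum_score([0.9, 0.5, 0.3])
near_tie = linear_sum_score([0.9, 0.88, 0.3])
```

The second call shows why the measure struggles with similar character pairs: the margin is small whether or not the first candidate is actually correct.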
[0006]
Method (c) statistically estimates the reliability of the recognition result by focusing on the connection probabilities between the characters in the recognition candidate character group under determination and those in the immediately preceding or immediately following recognition candidate character group.
[0007]
[Problems to be solved by the invention]
However, the certainty factor of each candidate character obtained by character recognition does not necessarily reflect how likely that candidate is to be correct. For example, for a character written relatively messily, the certainty factor tends to be low even when the candidate character is correct. Conversely, when the candidate character resembles the correct character, its certainty factor may be high even though it is incorrect. It is therefore difficult to limit candidate characters accurately with method (a).
[0008]
In addition, when the first and second candidate characters are a pair of similar characters, such as 「つ」 and 「フ」 or 「之」 and 「え」, the difference between their certainty factors tends to be small regardless of whether the recognition result is correct. When hiragana, katakana, kanji, and alphanumeric characters are all recognition targets, such pairs of similar characters occur frequently, so it is also difficult to estimate the reliability of the recognition result accurately with method (b).
[0009]
On the other hand, in the method (c), problems due to the certainty characteristic as seen in the methods (a) and (b) are avoided. However, in this method, in the case of a character string having a low connection probability between characters (for example, a technical term or a proper noun that is not frequently used), the reliability of the recognition result may not be accurately obtained.
Consequently, with the prior art, recognition candidate characters that should have been excluded from post-processing are judged highly reliable and passed to post-processing, while recognition candidate characters that should have been subjected to post-processing are judged unreliable and excluded from it, so that the reliability determination actually lowers recognition accuracy. What these conventional techniques have in common is that the feature amounts used to calculate the reliability are extracted only from the character recognition result.
[0010]
In so-called online handwritten character recognition, which recognizes characters handwritten with a stylus or the like, the time-series coordinate point sequence that forms the input data may itself contain information useful for estimating the reliability of the recognition result. For example, messily written characters generally tend to be written quickly, whereas carefully written characters tend to be written slowly. Because messily written characters are prone to misrecognition, the reliability of the recognition result can be expected to be lower for them. In addition, characters written with few strokes generally have many similar characters, so misrecognition is again more likely.
[0011]
Furthermore, when the number of written strokes is markedly smaller than the normal stroke count of a candidate character obtained by recognition, for example when the character was written in one stroke while the normal stroke count of the first candidate character is ten, it is likely that a character with many strokes was written in a single stroke, that is, in an extremely cursive form. Misrecognition is likely in this case as well. The relationship between the number of written strokes and the normal stroke counts of the candidate characters therefore also reflects the reliability of the character recognition result.
[0012]
Accordingly, an object of the present invention is to solve the problems of the prior art by focusing on features, obtained from the handwritten input data, that are useful for reliability estimation, and thereby to provide a character recognition apparatus, a character recognition method, an execution program thereof, and a recording medium storing the program that can calculate the reliability with relatively high accuracy, and thus improve recognition accuracy, even when the reliability of the recognition result cannot be properly estimated from features obtained from the character recognition result alone, such as the certainty factors of candidate characters and the connection probabilities between candidate characters.
[0013]
[Means for Solving the Problems]
The invention according to claim 1 of the present application comprises: character recognition means for recognizing a coordinate point sequence of a handwritten character and outputting a recognition candidate character group; feature extraction means for calculating, as feature amounts for calculating the reliability of a determination target recognition candidate character group output from the character recognition means, the average writing speed of the coordinate point sequence of the handwritten character and the normal stroke counts of the top N characters (N ≧ 1) in the determination target recognition candidate character group; reliability calculation means for calculating the reliability of the determination target recognition candidate character group based on the feature amounts from the feature extraction means and on statistical tendencies of sample data; and post-processing control means for controlling post-processing of the determination target recognition candidate character group based on the reliability from the reliability calculation means.
[0014]
The invention according to claim 2 is the invention according to claim 1, wherein the feature extraction means extracts the average writing speed and the number of written strokes of the coordinate point sequence of the handwritten character as feature amounts for calculating the reliability of the determination target recognition candidate character group.
[0015]

The invention according to claim 3 is the invention according to claim 1 or 2, wherein the feature extraction means further extracts the certainty factors of the top M characters (M ≧ 1) in the determination target recognition candidate character group as feature amounts for calculating the reliability of that group.
The invention according to claim 4 is the invention according to any one of claims 1 to 3, wherein the feature extraction means further extracts, as a feature amount for calculating the reliability of the determination target recognition candidate character group, the value of the connection probability between each recognition candidate character in that group and the immediately preceding recognition candidate character group obtained for the immediately preceding handwritten input, or the value of the connection probability between each such character and the immediately following recognition candidate character group obtained for the immediately following handwritten input.
[0016]

The invention according to claim 5 is the invention according to claim 4, wherein the feature extraction means extracts, as a feature amount of the determination target recognition candidate character group, the value of the connection probability between each recognition candidate character in that group and the recognition candidate character with the highest certainty factor in the immediately preceding recognition candidate character group, or the value of the connection probability between each such character and the recognition candidate character with the highest certainty factor in the immediately following recognition candidate character group.
[0017]

The invention according to claim 6 is the invention according to claim 5, wherein the feature extraction means takes, among the connection probabilities between one recognition candidate character in the determination target recognition candidate character group and each recognition candidate character in the immediately preceding or immediately following recognition candidate character group, the highest connection probability as the connection probability between that one recognition candidate character and the immediately preceding or immediately following recognition candidate character group.
[0018]

The invention according to claim 7 is the invention according to any one of claims 1 to 6, wherein the reliability calculation means includes discrimination score calculation means for calculating, from the feature amounts, the likelihood of one recognition candidate character in the determination target recognition candidate character group as a discrimination score, and calculates the reliability based on the discrimination score.
The invention according to claim 8 is the invention according to any one of claims 1 to 7, wherein the post-processing control means limits the recognition candidate characters subjected to post-processing based on the reliability calculated by the reliability calculation means.
[0019]
The invention according to claim 9 is a character recognition method executed by an apparatus that recognizes handwritten characters, the method comprising: a character recognition step in which character recognition means recognizes a coordinate point sequence of a handwritten character and outputs a recognition candidate character group; a feature extraction step in which feature extraction means calculates, as feature amounts for calculating the reliability of a determination target recognition candidate character group output from the character recognition step, the average writing speed of the coordinate point sequence of the handwritten character and the normal stroke counts of the top N characters (N ≧ 1) in the determination target recognition candidate character group; a reliability calculation step in which reliability calculation means calculates the reliability of the determination target recognition candidate character group based on the feature amounts from the feature extraction step and on statistical tendencies of sample data; and a post-processing control step in which post-processing control means controls post-processing of the determination target recognition candidate character group based on the reliability from the reliability calculation step.
[0020]
The invention according to claim 10 is the invention according to claim 9, wherein, in the feature extraction step, the feature extraction means extracts the average writing speed and the number of written strokes of the coordinate point sequence of the handwritten character as feature amounts for calculating the reliability of the determination target recognition candidate character group.
[0021]
The invention according to claim 11 is the invention according to claim 9 or 10, wherein, in the feature extraction step, the feature extraction means further extracts the certainty factors of the top M characters (M ≧ 1) in the determination target recognition candidate character group as feature amounts for calculating the reliability of that group.
The invention according to claim 12 is the invention according to any one of claims 9 to 11, wherein, in the feature extraction step, the feature extraction means further extracts, as a feature amount for calculating the reliability of the determination target recognition candidate character group, the value of the connection probability between each recognition candidate character in that group and the immediately preceding recognition candidate character group obtained for the immediately preceding handwritten input, or the value of the connection probability between each such character and the immediately following recognition candidate character group obtained for the immediately following handwritten input.
[0022]
The invention according to claim 13 is the invention according to claim 12, wherein, in the feature extraction step, the feature extraction means extracts, as a feature amount of the determination target recognition candidate character group, the value of the connection probability between each recognition candidate character in that group and the recognition candidate character with the highest certainty factor in the immediately preceding recognition candidate character group, or the value of the connection probability between each such character and the recognition candidate character with the highest certainty factor in the immediately following recognition candidate character group.
[0023]
The invention according to claim 14 is the invention according to claim 12, wherein, in the feature extraction step, the feature extraction means takes, among the connection probabilities between one recognition candidate character in the determination target recognition candidate character group and each recognition candidate character in the immediately preceding or immediately following recognition candidate character group, the highest connection probability as the connection probability between that one recognition candidate character and the immediately preceding or immediately following recognition candidate character group.
[0024]
The invention according to claim 15 is the invention according to any one of claims 9 to 14, wherein the reliability calculation step includes a discrimination score calculation step in which the reliability calculation means calculates, from the feature amounts, the likelihood of one recognition candidate character in the determination target recognition candidate character group as a discrimination score, and the reliability is calculated based on the discrimination score.
The invention according to claim 16 is the invention according to any one of claims 9 to 15, wherein, in the post-processing control step, the post-processing control means limits the recognition candidate characters subjected to post-processing based on the reliability calculated in the reliability calculation step.
[0025]
The invention according to claim 17 is a program for causing a computer to execute each processing step in the character recognition method according to any one of claims 9 to 16.
The invention according to claim 18 is a computer-readable recording medium on which is recorded a program for causing a computer to execute each processing step in the character recognition method according to any one of claims 9 to 16.
[0026]
DETAILED DESCRIPTION OF THE INVENTION
<First Embodiment>
Hereinafter, a first embodiment of the present invention will be described with reference to the drawings.
First, FIG. 1 is a circuit block diagram of a handwritten character recognition apparatus according to the first embodiment.
[0027]
In FIG. 1, reference numeral 1 denotes an input unit, which generates and outputs handwritten character information from handwriting input on a tablet or the like. Reference numeral 2 denotes a character recognition unit, which compares the handwritten character information supplied from the input unit 1 with the character feature amounts in the character recognition dictionary 3 and outputs the characters in the recognition dictionary ranked first to N-th in degree of proximity (certainty factor) between the two as recognition candidate characters for the handwritten character. Reference numeral 3 denotes the character recognition dictionary, in which candidate characters are stored together with their character feature amounts.
[0028]
Reference numeral 4 denotes a feature extraction unit, which calculates the average writing speed and the number of written strokes from the handwritten character information obtained by the input unit 1. It also refers to the normal stroke number table 5 and extracts the normal stroke counts of the top N characters in the determination target recognition candidate character group supplied from the character recognition unit 2. Reference numeral 5 denotes the normal stroke number table, which stores the normal stroke count of each recognition target character in association with that character.
[0029]
Reference numeral 6 denotes a discrimination score calculation unit, which processes the feature amounts obtained by the feature extraction unit 4 and calculates a correct/incorrect discrimination score for the determination target recognition candidate character group.
Reference numeral 7 denotes a recognition reliability calculation unit, which compares the discrimination score from the discrimination score calculation unit 6 with the discrimination score-reliability conversion table 8 and outputs the reliability of the determination target recognition candidate character group. Reference numeral 8 denotes the discrimination score-reliability conversion table, which stores the relationship between the discrimination score and the reliability as a table.
[0030]
Reference numeral 9 denotes a recognition candidate number control unit, which compares the reliability from the recognition reliability calculation unit 7 with the reliability-cumulative correct reading rate table 10 and limits the number of recognition candidates for the determination target character. Reference numeral 10 denotes the reliability-cumulative correct reading rate table, which stores the relationship between the reliability and the correct reading rate as a table.
Reference numeral 21 denotes a language processing unit, which performs post-processing on the recognition candidate characters, limited to the number set by the recognition candidate number control unit 9, and outputs character string candidates.
[0031]
The “character recognition means” in the claims corresponds to the character recognition unit 2 and the character recognition dictionary 3 of FIG. 1 in the embodiment. The “feature extraction means” in the claims corresponds to the feature extraction unit 4 and the normal stroke number table 5 of FIG. 1 in the embodiment. The “reliability calculation means” in the claims corresponds to the discrimination score calculation unit 6, the recognition reliability calculation unit 7, and the discrimination score-reliability conversion table 8 of FIG. 1 in the embodiment. The “post-processing control means” in the claims corresponds to the recognition candidate number control unit 9 and the reliability-cumulative correct reading rate table 10 of FIG. 1 in the embodiment.
[0032]
Next, details of processing of each unit shown in the circuit block diagram will be described.
First, the process of the feature extraction unit 4 will be described with reference to FIG. 2.
FIG. 2 shows a coordinate point sequence when the character “I” is written from the input unit 1. The feature extraction unit 4 calculates the average writing speed from the coordinate point sequence (xi, yi, ti, pi) (i = 1 to K) obtained by the input unit 1 and obtains the number of written strokes. Here, K is the number of coordinate points, xi and yi are the x-coordinate and y-coordinate of the i-th coordinate point, respectively, and ti is the time at which the i-th coordinate point occurs. pi represents the state of the input pen at time ti, taking the value pi = 1 when the pen is in contact with the tablet and pi = 0 when the pen is away from the tablet.
[0033]
The writing speed vi when the pen moves from the i-th coordinate point to the (i + 1) -th coordinate point is expressed by Formula 1.
[0034]
[Expression 1]

[0035]
Therefore, the average writing speed V of the coordinate point sequence is calculated by Equation 2.
[0036]
[Expression 2]

[0037]
Further, the number of handwritten strokes Sinp is obtained by counting the number of times the value of pi, which indicates the input pen state, changes from 1 to 0. In addition, the normal stroke counts Sn1, Sn2,..., SnN are obtained by referring to the normal stroke number table 5 for the top N characters of the recognition candidate character group.
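The feature extraction described above can be sketched as follows. Because Expressions 1 and 2 are not reproduced in this text, the sketch assumes that the per-segment speed is the Euclidean distance between consecutive coordinate points divided by the elapsed time, and that the average writing speed V is the unweighted mean of those speeds. The function names, the sample point sequence, and the contents of the stroke table are hypothetical and serve only as an illustration.

```python
import math

def extract_handwriting_features(points):
    """Return (average writing speed V, written stroke count Sinp).

    points: list of (x, y, t, p) tuples in input order, where p is 1 while
    the pen touches the tablet and 0 while it is lifted.
    """
    speeds = []
    for (x0, y0, t0, _), (x1, y1, t1, _) in zip(points, points[1:]):
        dt = t1 - t0
        if dt > 0:  # assumption: segments with no elapsed time are skipped
            speeds.append(math.hypot(x1 - x0, y1 - y0) / dt)
    # Assumption: V is the unweighted mean of the per-segment speeds.
    avg_speed = sum(speeds) / len(speeds) if speeds else 0.0
    # Sinp: number of times the pen state changes from 1 to 0.
    strokes = sum(1 for a, b in zip(points, points[1:]) if a[3] == 1 and b[3] == 0)
    return avg_speed, strokes

def lookup_normal_strokes(candidates, stroke_table, n):
    """Normal stroke counts Sn1..SnN for the top n candidate characters."""
    return [stroke_table[c] for c in candidates[:n]]

# Hypothetical two-stroke input: each stroke ends with a pen-up sample (p = 0).
points = [
    (0, 0, 0.0, 1), (3, 4, 1.0, 1), (3, 4, 2.0, 0),
    (0, 0, 3.0, 1), (0, 5, 4.0, 1), (0, 5, 5.0, 0),
]
avg_speed, strokes = extract_handwriting_features(points)
normal = lookup_normal_strokes(["日", "目", "月"], {"日": 4, "目": 5, "月": 4}, 2)
```

Note that counting only 1-to-0 transitions means a final stroke is counted only if the input ends with a pen-up sample, which is how the sample data above is arranged.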
Next, with reference to FIG. 3, the process in the discrimination score calculation unit 6 will be described.
[0038]
The set of the average writing speed V, the number of written strokes Sinp, and the normal stroke counts Sn1, Sn2,..., SnN extracted by the feature extraction unit 4 can be expressed as a vector (feature vector) in a (2 + N)-dimensional vector space. In the discrimination score calculation unit 6, feature vectors are extracted in the same way, in advance, from samples in which the first-ranked recognition candidate character was correctly read or misread, and are held as learning data. The feature vector of the determination target recognition candidate character group is compared with this learning data, and the discrimination score of the recognition candidate character group is calculated.
[0039]
For example, assume that the learning data of the correct reading and misreading feature vectors and the feature vector of the determination target recognition candidate character group are distributed as shown in FIG. 3. For simplicity, FIG. 3 shows the case where N = 1, that is, where the feature vector has three dimensions. The discrimination score calculation unit 6 obtains in advance the mean (centroid vector) and covariance matrix of the feature vectors of each class from the distribution of the feature vectors of the correct reading and misreading sets (classes), and stores them. Then, the Mahalanobis distances DMc and DMe between the centroid of each class and the feature vector of the determination target character group are obtained, and the ratio of these values, the logarithm of the ratio, or the difference between the values is used as the discrimination score.
[0040]
Here, the Mahalanobis distance DM is calculated as follows. That is, when the center of gravity vector of class C1 is m1 and the covariance matrix of class C1 is Σ1, Mahalanobis square distance DM1 from a predetermined feature vector x to m1 is defined by Equation 3.
[0041]
[Equation 3]
$$ DM_1^{2} = (x - m_1)^{\mathsf{T}} \, \Sigma_1^{-1} \, (x - m_1) $$
[0042]
Here, the covariance matrix Σ1 is an n × n square matrix (n: dimension number of the feature space), and its (i, j) element is the covariance of the i-th feature quantity and the j-th feature quantity, that is, Σ1 (i, j) = σij.
In the above description, the discrimination score is calculated using the Mahalanobis distance DM. Instead, a linear discriminant function may be obtained by linear discriminant analysis from the distribution of the feature vectors of the correct reading and misreading classes, and the discrimination score may be obtained by applying this linear discriminant function to the feature vector of the determination target recognition candidate character group.
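As an illustration of the Mahalanobis-based scoring described above, the following sketch estimates the centroid and covariance matrix of each class from learning samples and scores a feature vector by the difference of the two squared Mahalanobis distances. The choice of the difference (rather than the ratio or its logarithm), its sign convention, the unbiased covariance estimate, and all function names and sample values are assumptions made for this example.

```python
def solve_linear(a, b):
    """Solve a*z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z

def class_stats(samples):
    """Centroid vector and (unbiased) covariance matrix of a class."""
    k, n = len(samples), len(samples[0])
    mean = [sum(s[i] for s in samples) / k for i in range(n)]
    cov = [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (k - 1)
            for j in range(n)] for i in range(n)]
    return mean, cov

def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance (x - m)^T cov^-1 (x - m)."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    z = solve_linear(cov, d)  # z = cov^-1 d, without forming the inverse
    return sum(di * zi for di, zi in zip(d, z))

def discrimination_score(x, correct_stats, error_stats):
    """Assumed convention: positive when x is closer to the correct class."""
    return mahalanobis_sq(x, *error_stats) - mahalanobis_sq(x, *correct_stats)

# Hypothetical 2-dimensional learning samples for the two classes.
correct = class_stats([[1, 0], [0, 1], [-1, 0], [0, -1]])
error = class_stats([[5, 4], [4, 5], [3, 4], [4, 3]])
score = discrimination_score([0.5, 0.5], correct, error)
```

Solving the linear system instead of inverting the covariance matrix keeps the sketch free of external libraries; with a numerical library available, the same computation is usually a single call.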
[0043]
Alternatively, a neural network may be trained, using the feature vectors extracted from the correct reading and misreading learning samples as learning data, to determine whether a target feature vector corresponds to a correct reading or a misreading, and the output value of this neural network for the determination target feature vector may be used as the discrimination score.
Next, processing of the recognition reliability calculation unit 7 will be described with reference to FIG. 4.
[0044]
For example, when the ratio of the Mahalanobis distances DMc and DMe, or the logarithm of that ratio, is used as the discrimination score, the relationship between the discrimination score obtained from the correctly read and misread learning samples and the reliability is as shown in FIG. 4. Here, the reliability is calculated from the learning samples by Bayes' theorem.
That is, let p (y | X1 = C) be the ratio of the number of correctly read samples having the discrimination score y to the total number of correctly read samples, p (y | X1 = E) be the ratio of the number of misread samples having the discrimination score y to the total number of misread samples, P (X1 = C) be the ratio of the total number of correctly read samples to the total number of samples, and P (X1 = E) be the ratio of the total number of misread samples to the total number of samples. Then the reliability of the recognition candidate character with the highest certainty factor in a recognition candidate character group having the discrimination score y can be calculated by the following equation.
[0045]
P (X1 = C | y) = p (y | X1 = C) · P (X1 = C) / [p (y | X1 = C) · P (X1 = C) + p (y | X1 = E) · P (X1 = E)]
Here, X1 represents the first recognition candidate character, and X1 = C and X1 = E mean events where the first recognition candidate character is correct and incorrect, respectively.
A table showing the relationship between the discrimination score and the reliability is created in advance from this equation and stored in the discrimination score-reliability conversion table 8. The recognition reliability calculation unit 7 compares the discrimination score from the discrimination score calculation unit 6 with the scores in the discrimination score-reliability conversion table 8 and outputs the corresponding reliability as the reliability of the first-ranked recognition candidate character of the recognition candidate character group.
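The construction and lookup of the conversion table can be sketched as follows. The source speaks of samples "having the discrimination score y"; this sketch assumes the scores are discretized into bins so that those ratios can be counted, which is a choice made for the example. The bin edges, sample scores, and function names are hypothetical.

```python
def build_reliability_table(correct_scores, error_scores, bin_edges):
    """Return rows (lower_edge, upper_edge, reliability) via Bayes' theorem.

    reliability = p(y|C)P(C) / [p(y|C)P(C) + p(y|E)P(E)], where the
    likelihoods are the fractions of each class falling in the score bin.
    A bin that contains no learning sample gets reliability None.
    """
    total = len(correct_scores) + len(error_scores)
    prior_c = len(correct_scores) / total
    prior_e = len(error_scores) / total
    table = []
    for lo, hi in zip(bin_edges, bin_edges[1:]):
        like_c = sum(lo <= y < hi for y in correct_scores) / len(correct_scores)
        like_e = sum(lo <= y < hi for y in error_scores) / len(error_scores)
        evidence = like_c * prior_c + like_e * prior_e
        reliability = like_c * prior_c / evidence if evidence > 0 else None
        table.append((lo, hi, reliability))
    return table

def lookup_reliability(table, score):
    """Reliability of the bin containing score, or None if out of range."""
    for lo, hi, reliability in table:
        if lo <= score < hi:
            return reliability
    return None

# Hypothetical discrimination scores of correctly read and misread samples.
table = build_reliability_table(
    correct_scores=[1.5, 2.5, 2.5, 3.5],
    error_scores=[0.5, 1.5],
    bin_edges=[0, 1, 2, 3, 4],
)
reliability = lookup_reliability(table, 1.7)
```

With counted ratios, the posterior in each bin reduces to the fraction of learning samples in that bin that were correctly read, so the bin [1, 2) above, which holds one correct and one misread sample, yields a reliability of one half.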
[0046]
Next, the processing of the recognition candidate number control unit 9 will be described with reference to FIG. 5.
The table shown in the upper part of FIG. 5 shows the relationship between the reliability of the first-ranked (highest certainty factor) recognition candidate character of the determination target and the cumulative probability that the correct character is included among the recognition candidate characters up to the N-th rank. The probabilities in the table are calculated in advance from the correct reading and misreading learning samples.
[0047]
The reliability-cumulative correct reading rate table 10 stores such a table. The recognition candidate number control unit 9 compares the reliability of the determination target recognition candidate character group with the reliability levels in the table and, referring to the cumulative probabilities of the corresponding reliability level, determines up to which rank the recognition candidate characters are output to the language processing unit 21. The rank up to which candidates are output is determined, for example, by whether the cumulative probability at the corresponding reliability level has reached a predetermined threshold value. The threshold value may be uniform for all reliability levels or may be set individually for each reliability level.
[0048]
Alternatively, the number of output candidates for each reliability level may be set in advance based on the upper table in FIG. 5 and stored in the reliability-cumulative correct reading rate table 10. The table shown in the lower part of FIG. 5 is an example in which the reliability levels and the numbers of output candidates are preset. When such a table is stored in the reliability-cumulative correct reading rate table 10 in advance, the recognition candidate number control unit 9 reads the corresponding number of output candidates from the table and limits the recognition candidate characters output to the language processing unit 21 accordingly.
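The threshold-based control of the number of candidates can be sketched as follows. The table values of FIG. 5 are not reproduced in this text, so the reliability levels, cumulative correct reading rates, and threshold below are invented for illustration, as are the function name and the fallback of outputting all listed ranks when no rank reaches the threshold.

```python
def candidates_to_output(reliability, cumulative_table, threshold):
    """Number of top-ranked candidates to pass to language processing.

    cumulative_table: list of (min_reliability, rates) rows sorted by
    descending min_reliability, where rates[r - 1] is the cumulative
    probability that the correct character is within the top r candidates.
    """
    for min_reliability, rates in cumulative_table:
        if reliability >= min_reliability:
            for rank, rate in enumerate(rates, start=1):
                if rate >= threshold:
                    return rank  # smallest rank reaching the threshold
            return len(rates)  # assumption: otherwise output every rank
    raise ValueError("reliability is below every level in the table")

# Hypothetical reliability levels and cumulative correct reading rates.
cumulative_table = [
    (0.9, [0.97, 0.99, 1.00]),
    (0.5, [0.80, 0.93, 0.97]),
    (0.0, [0.40, 0.65, 0.80]),
]
high = candidates_to_output(0.95, cumulative_table, threshold=0.95)
middle = candidates_to_output(0.60, cumulative_table, threshold=0.95)
low = candidates_to_output(0.20, cumulative_table, threshold=0.95)
```

Precomputing the returned rank for every reliability level gives the kind of preset table of output candidate counts described for the lower part of FIG. 5.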
[0049]
In the above embodiment, feature amounts useful for estimating the reliability of the recognition result are extracted from the handwritten input data. Therefore, even when the reliability of the recognition result cannot be properly estimated from the certainty factors of the candidate characters, the reliability can be calculated with relatively high accuracy, so that recognition candidate characters with a high correct reading rate can be output to the language processing unit.
<Second Embodiment>
Next, a second embodiment according to the present invention will be described below.
[0050]
In the present embodiment, the feature extraction processing in the feature extraction unit 4 is changed.
First, FIG. 6 shows a circuit block diagram of the handwritten character recognition apparatus according to the present embodiment. The difference from FIG. 1 of the first embodiment is that an “inter-character connection probability dictionary 11” is added.
[0051]
In the present embodiment, in addition to the processing described in the first embodiment, the feature extraction unit 4 refers to the inter-character connection probability dictionary 11 and extracts the connection probabilities between the recognition candidate characters in the determination target recognition candidate character group and the immediately preceding and immediately following recognition candidate character groups.
That is, in the first embodiment, the average writing speed, the number of written strokes, and the normal stroke counts of the top N characters of the determination target recognition candidate character group are used as the feature amounts for the reliability calculation. In the second embodiment, in addition to these, the connection probability values Pbk (k = 1 to L) between the top L recognition candidate characters in the determination target recognition candidate character group and the immediately preceding recognition candidate character group, and the connection probability values Pfk (k = 1 to L) between those top L recognition candidate characters and the immediately following recognition candidate character group, are used as feature amounts for the reliability calculation.
[0052]
Here, in the present embodiment, the connection probability value Pbk between the k-th candidate character in the determination target recognition candidate character group and the immediately preceding recognition candidate character group is the maximum of the connection probabilities between the k-th candidate character and the immediately preceding candidate characters ranked first to J-th. Similarly, Pfk is the maximum of the connection probabilities between the k-th candidate character and the immediately following candidate characters ranked first to J-th.
[0053]
For example, in the example of FIG. 7, Pb1 for the first-ranked character “day” among the determination target recognition candidate characters is the largest of the connection probabilities P (C1 | Cbk) between “day” and the immediately preceding candidates, from the first-ranked character “朋” to the J-th ranked character “hu”. Likewise, Pf1 for the first-ranked character “day” is the largest of the connection probabilities P (Cfk | C1) between “day” and the immediately following candidates, from the first-ranked character “mo” to the J-th ranked character “dead”. Similarly, Pb2 and Pf2 for the second-ranked character “Month” are the maximum connection probabilities with respect to the immediately preceding and immediately following character groups, respectively.
[0054]
Here, C1 represents the character at the first recognition candidate to be determined, and Cbk and Cfk represent the character at the kth recognition candidate immediately before and after, respectively. P (Cj | Ci) represents the connection probability that the character Cj appears after the character Ci.
In the second embodiment, the discrimination space shown in FIG. 3 has (2 + N + 2L) dimensions. For the correct reading and misreading samples, the connection probabilities Pbk and Pfk for the top L characters of each sample's recognition candidate character group are used as feature elements in addition to the average writing speed, the number of written strokes, and the normal stroke counts of the top N characters of that group, and the tables stored in the discrimination score-reliability conversion table 8 and the reliability-cumulative correct reading rate table 10 are set according to this sample data.
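The extraction of the connection probability features Pbk and Pfk can be sketched as follows, taking for each of the top L target candidates the maximum connection probability over the top J candidates of the adjacent group. The dictionary layout, the default of 0.0 for character pairs missing from the dictionary, the function name, and the placeholder characters and probabilities are assumptions for this example and are not taken from FIG. 7.

```python
def connection_features(target, previous, following, bigram, top_l, top_j):
    """Return (Pb, Pf) lists for the top_l candidates of the target group.

    target, previous, following: candidate characters ordered by rank.
    bigram[(a, b)] is P(b | a), the probability that b appears after a.
    """
    pb, pf = [], []
    for c in target[:top_l]:
        # Pbk: best connection from any of the top_j preceding candidates.
        pb.append(max((bigram.get((p, c), 0.0) for p in previous[:top_j]),
                      default=0.0))
        # Pfk: best connection to any of the top_j following candidates.
        pf.append(max((bigram.get((c, f), 0.0) for f in following[:top_j]),
                      default=0.0))
    return pb, pf

# Hypothetical dictionary with placeholder characters.
bigram = {
    ("a", "x"): 0.2, ("b", "x"): 0.5,
    ("x", "m"): 0.1, ("x", "n"): 0.3,
    ("a", "y"): 0.4,
}
pb, pf = connection_features(
    target=["x", "y"], previous=["a", "b"], following=["m", "n"],
    bigram=bigram, top_l=2, top_j=2,
)
```

The `default=0.0` argument also covers the first or last character of an input, where the preceding or following candidate group is empty.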
[0055]
In the second embodiment, in addition to the feature amounts employed in the first embodiment, the connection probabilities between characters included in adjacent recognition candidate character groups are employed as feature amounts for reliability determination, so that the reliability determination can be performed with higher accuracy than in the first embodiment.
As yet another embodiment, the certainty factors (similarities or distance values) of the recognition candidate characters up to the M-th rank may be added as feature elements in addition to the connection probabilities Pbk and Pfk, and the feature vector of the recognition candidate character group may be extracted in the corresponding (2 + N + 2L + M)-dimensional vector space. In this case, the discrimination space shown in FIG. 3 also has (2 + N + 2L + M) dimensions. In this embodiment, not only the connection relationship but also the certainty factors are taken into account, so that the reliability can be determined with still higher accuracy.
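As a small illustration of how the feature elements listed above combine into a single (2 + N + 2L + M)-dimensional vector, the following sketch concatenates them in a fixed order. The ordering of the elements, the function name, and the numeric values are assumptions for this example; the source specifies only which elements are included.

```python
def build_feature_vector(avg_speed, strokes, normal_strokes, pb, pf, certainties):
    """Concatenate all feature elements into one flat list of floats.

    normal_strokes has N entries, pb and pf have L entries each, and
    certainties has M entries, giving 2 + N + 2L + M dimensions in total.
    """
    if len(pb) != len(pf):
        raise ValueError("Pb and Pf must both cover the same top L candidates")
    return [float(v) for v in
            (avg_speed, strokes, *normal_strokes, *pb, *pf, *certainties)]

# Hypothetical values with N = 2, L = 2, M = 3.
vector = build_feature_vector(
    avg_speed=3.0, strokes=2, normal_strokes=[4, 5],
    pb=[0.5, 0.4], pf=[0.3, 0.0], certainties=[0.91, 0.62, 0.40],
)
```

Whatever order is chosen, it has to be identical for the learning samples and for the determination target, since the class centroids and covariance matrices are defined per dimension.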
[0056]
In the above embodiments, the series of processing has been described block by block with reference to FIG. 1, but this processing flow can also be executed by a CPU according to a control program. In such a case, the processing flow is stored as a control program in a ROM or RAM. The reference data of the character recognition dictionary 3, the normal stroke number table 5, the discrimination score-reliability conversion table 8, the reliability-cumulative correct reading rate table 10, and the inter-character connection probability dictionary 11 are also stored in the ROM or RAM. The CPU executes the above processing according to the control program while referring to the reference data.
[0057]
FIG. 8 shows a flow according to such a control program. Here, step S101 is the processing in the input unit 1, step S102 is the processing in the character recognition unit 2, step S103 is the processing in the feature extraction unit 4, step S104 is the processing in the discrimination score calculation unit 6, step S105 is the processing in the recognition reliability calculation unit 7, and step S106 is the processing in the recognition candidate number control unit 9.
[0058]
The “character recognition step” in the claims corresponds to step S102 of FIG. 8 in the embodiment. The “feature extraction step” in the claims corresponds to step S103 of FIG. 8 in the embodiment. The “reliability calculation step” in the claims corresponds to steps S104 and S105 of FIG. 8 in the embodiment.
Such a control program and the various reference data can be traded via a recording medium such as a flexible disk or via a transmission medium such as the Internet. An example of the file structure of data traded through a recording medium or a transmission medium is shown in FIG. 9. Data having such a file structure is recorded on the recording medium. In a transaction via a transmission medium, data having such a file structure is supplied via the transmission medium.
[0059]
Although embodiments of the present invention have been described above, the present invention is not limited to these embodiments, and various other modifications are possible. For example, the embodiments show an example in which the reliability is calculated using the average writing speed, the number of written strokes, the normal stroke counts of candidate characters, the connection probabilities between adjacent recognition candidate character groups, and the certainty factors of candidate characters as feature amounts. However, among these feature amounts, only those that are particularly useful in the individual implementation apparatus may be selected and employed.
[0060]
Further, the processing in the recognition candidate number control unit 9 in FIG. 1 may be rejection (invalidation) of the recognition result according to the reliability, instead of limitation of the number of recognition candidates.
Furthermore, although characters have been used as the example of handwriting input, the target is of course not limited to characters and may be a figure.
[0061]
In addition, the discrimination score calculation method in the discrimination score calculation unit 6 in FIG. 1 and the recognition reliability calculation method in the recognition reliability calculation unit 7 in FIG. 1 are not limited to the methods using the Mahalanobis distance DM and Bayes' theorem shown in the above embodiments, and other methods can be adopted.
The embodiment of the present invention can be variously modified as appropriate within the scope of the technical idea of the present invention.
[0062]
Further, the above-described embodiment is merely one embodiment of the present invention, and the meaning of the term of the present invention or each constituent element is not limited to that described in the embodiment.
[0063]
[Effect of the invention]
As described above, according to the present invention, by using features useful for reliability estimation obtained from the handwritten input data, the reliability can be calculated with relatively high accuracy even when the reliability of the recognition result cannot be properly estimated from only the information obtained from the character recognition result, such as the certainty factors of candidate characters and the connection probabilities between candidate characters. Adopting this in a character recognition device can improve the accuracy of character recognition.
[Brief description of the drawings]
FIG. 1 is a circuit block diagram according to a first embodiment.
FIG. 2 is a diagram for explaining processing of a feature extraction unit according to the first embodiment.
FIG. 3 is a diagram for explaining processing of a discrimination score calculation unit according to the first embodiment.
FIG. 4 is a diagram for explaining processing of a recognition reliability calculation unit according to the first embodiment.
FIG. 5 is a diagram for explaining processing of a recognition candidate number control unit according to the first embodiment.
FIG. 6 is a circuit block diagram according to a second embodiment.
FIG. 7 is a diagram for explaining processing of a feature extraction unit according to the second embodiment.
FIG. 8 is an execution flowchart according to the first and second embodiments.
FIG. 9 is a file structure of an execution program and reference data according to the second embodiment.
[Explanation of symbols]
1 ... Input unit
2 ... Character recognition unit
3 ... Character recognition dictionary
4 ... Feature extraction unit
5 ... Normal stroke number table
6 ... Discrimination score calculation unit
7 ... Recognition reliability calculation unit
8 ... Discrimination score-reliability conversion table
9 ... Recognition candidate number control unit
10 ... Reliability-cumulative correct reading rate table
11 ... Inter-character connection probability dictionary

Claims

1. A character recognition device comprising:
character recognition means for recognizing a coordinate point sequence of a handwritten character and outputting a recognition candidate character group;
feature extraction means for calculating, as feature amounts for calculating the reliability of a determination target recognition candidate character group output from the character recognition means, the average writing speed of the coordinate point sequence of the handwritten character and the number of normal strokes of the top N characters (N ≧ 1) of the determination target recognition candidate character group;
reliability calculation means for calculating the reliability of the determination target recognition candidate character group based on the feature amounts from the feature extraction means and a statistical tendency of sample data; and
post-processing control means for controlling post-processing of the determination target recognition candidate character group based on the reliability from the reliability calculation means.

2. The character recognition device according to claim 1, wherein the feature extraction means extracts the average writing speed and the number of written strokes of the coordinate point sequence of the handwritten character as feature amounts for calculating the reliability of the determination target recognition candidate character group.

3. The character recognition device according to claim 1, wherein the feature extraction means further extracts the certainty factors of the top M characters (M ≧ 1) in the determination target recognition candidate character group as feature amounts for calculating the reliability of the determination target recognition candidate character group.

4. The character recognition device according to any one of claims 1 to 3, wherein the feature extraction means further extracts, as a feature amount for calculating the reliability of the determination target recognition candidate character group, the value of the connection probability between each recognition candidate character in the determination target recognition candidate character group and the immediately preceding recognition candidate character group for the immediately preceding handwriting input, or the value of the connection probability with the immediately following recognition candidate character group for the immediately following handwriting input.

5. The character recognition device according to claim 4, wherein the feature extraction means extracts, as the feature amount of the determination target recognition candidate character group, the value of the connection probability between each recognition candidate character in the determination target recognition candidate character group and the recognition candidate character with the highest certainty factor in the immediately preceding recognition candidate character group, or the value of the connection probability with the recognition candidate character with the highest certainty factor in the immediately following recognition candidate character group.

6. The character recognition device according to claim 5, wherein the feature extraction means takes, among the connection probabilities between one recognition candidate character in the determination target recognition candidate character group and each recognition candidate character in the immediately preceding or immediately following recognition candidate character group, the highest connection probability as the connection probability between the one recognition candidate character and the immediately preceding or immediately following recognition candidate character group.

7. The character recognition device according to claim 1, wherein the reliability calculation means comprises discrimination score calculation means for calculating, from the feature amounts, the likelihood of correctness of one recognition candidate character in the determination target recognition candidate character group as a discrimination score, and calculates the reliability based on the discrimination score.

8. The character recognition device according to claim 1, wherein the post-processing control means limits the recognition candidate characters to be post-processed based on the reliability calculated by the reliability calculation means.

9. A character recognition method executed by a device for recognizing handwritten characters, the method comprising:
a character recognition step in which character recognition means recognizes a coordinate point sequence of a handwritten character and outputs a recognition candidate character group;
a feature extraction step in which feature extraction means calculates, as feature amounts for calculating the reliability of a determination target recognition candidate character group output from the character recognition step, the average writing speed of the coordinate point sequence of the handwritten character and the number of normal strokes of the top N characters (N ≧ 1) of the determination target recognition candidate character group;
a reliability calculation step in which reliability calculation means calculates the reliability of the determination target recognition candidate character group based on the feature amounts from the feature extraction step and a statistical tendency of sample data; and
a post-processing control step of controlling post-processing of the determination target recognition candidate character group based on the reliability from the reliability calculation step.

10. The character recognition method according to claim 9, wherein, in the feature extraction step, the feature extraction means extracts the average writing speed and the number of written strokes of the coordinate point sequence of the handwritten character as feature amounts for calculating the reliability of the determination target recognition candidate character group.

11. The character recognition method according to claim 9 or 10, wherein, in the feature extraction step, the feature extraction means further extracts the certainty factors of the top M characters (M ≧ 1) in the determination target recognition candidate character group as feature amounts for calculating the reliability of the determination target recognition candidate character group.

12. The character recognition method according to claim 9, wherein, in the feature extraction step, the feature extraction means further extracts, as a feature amount for calculating the reliability of the determination target recognition candidate character group, the value of the connection probability between each recognition candidate character in the determination target recognition candidate character group and the immediately preceding recognition candidate character group for the immediately preceding handwriting input, or the value of the connection probability with the immediately following recognition candidate character group for the immediately following handwriting input.

13. The character recognition method according to claim 12, wherein, in the feature extraction step, the feature extraction means extracts, as the feature amount of the determination target recognition candidate character group, the value of the connection probability between each recognition candidate character in the determination target recognition candidate character group and the recognition candidate character with the highest certainty factor in the immediately preceding recognition candidate character group, or the recognition candidate character with the highest certainty factor in the immediately following recognition candidate character group.

14. The character recognition method according to claim 12, wherein, in the feature extraction step, the feature extraction means takes, as the connection probability between one recognition candidate character in the determination target recognition candidate character group and the immediately preceding or immediately following recognition candidate character group, the highest of the connection probabilities between that one recognition candidate character and each recognition candidate character in the immediately preceding or immediately following recognition candidate character group.
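The two ways of reducing a candidate's connection probabilities to a single feature value, described above, can be contrasted in a short sketch. It handles only the immediately preceding group (the following group would be symmetric with the key order reversed). The bigram table as a dictionary keyed by `(previous_char, current_char)`, the default of 0.0 for unseen pairs, and both function names are assumptions for illustration.

```python
def connection_with_top(candidate, prev_group, bigram):
    """Connection probability with the highest-certainty character
    of the preceding candidate group.

    `prev_group` is assumed to be a non-empty list of (character, certainty)
    pairs; `bigram` maps (previous_char, current_char) to a probability.
    """
    top_char, _ = max(prev_group, key=lambda pair: pair[1])
    return bigram.get((top_char, candidate), 0.0)


def connection_max(candidate, prev_group, bigram):
    """Highest connection probability over every character
    in the preceding candidate group."""
    return max(
        (bigram.get((prev_char, candidate), 0.0) for prev_char, _ in prev_group),
        default=0.0,
    )
```

The first variant needs one table lookup per candidate, while the second still finds a strong linguistic link when the neighboring group's first-ranked character is itself a misrecognition.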

15. The character recognition method according to claim 9, wherein the reliability calculation step comprises a discrimination score calculation step in which the reliability calculation means calculates, from the feature amount, the likelihood of one recognition candidate character in the determination target recognition candidate character group as a discrimination score, and the reliability is calculated based on the discrimination score.
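One plausible reading of the discrimination score is a linear discriminant over the feature amounts, with weights fitted beforehand on sample data. The sketch below assumes exactly that and then maps the score to a value in (0, 1) with a logistic function. Both the linear form and the logistic mapping are stand-ins chosen for illustration; the claim does not specify the discriminant or the score-to-reliability mapping, and the names `discrimination_score` and `reliability` are hypothetical.

```python
import math


def discrimination_score(features, weights, bias):
    """Linear discriminant score (assumed form); weights and bias are
    presumed to have been learned from labeled sample data."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(w * f for w, f in zip(weights, features)) + bias


def reliability(score):
    """Map a score to (0, 1) with a numerically stable logistic function.

    This mapping is an assumption, not the method stated in the claims.
    """
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)
```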

16. The character recognition method according to claim 9, wherein, in the post-processing control step, the post-processing control means restricts the recognition candidate characters subject to post-processing based on the reliability calculated in the reliability calculation step.
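Restricting the candidates passed to post-processing according to the reliability could be sketched as below. The two thresholds, the candidate counts (1, 3, `max_keep`), and the function name `restrict_candidates` are hypothetical values chosen for illustration. The only idea taken from the text is that a more reliable recognition result lets fewer candidates through.

```python
def restrict_candidates(candidates, reliability, high=0.9, low=0.5, max_keep=5):
    """Keep fewer candidates for post-processing as reliability rises.

    `candidates` is assumed to be a list of (character, certainty) pairs.
    Thresholds and counts are illustrative assumptions only.
    """
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    if reliability >= high:
        keep = 1                 # result is trusted: pass only the top candidate
    elif reliability >= low:
        keep = min(3, max_keep)  # moderately trusted: pass a short list
    else:
        keep = max_keep          # doubtful: let post-processing see more options
    return ranked[:keep]
```

Passing only the first-ranked character for reliable results keeps the post-processing search small, while doubtful results retain enough alternatives for the linguistic context to correct them.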

17. A program for causing a computer to execute each processing step in the character recognition method according to any one of claims 9 to 16.

18. A computer-readable recording medium storing a program for causing a computer to execute each processing step in the character recognition method according to claim 9.