JP2004272396A - Character recognition device, character recognition method, character recognition program and recording medium

Info

Publication number: JP2004272396A
Application number: JP2003059214A
Authority: JP
Inventors: Yoshihisa Oguro; 慶久大黒
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2003-03-05
Filing date: 2003-03-05
Publication date: 2004-09-30
Anticipated expiration: 2023-03-05
Also published as: JP4263928B2

Abstract

PROBLEM TO BE SOLVED: To provide a character recognition device with improved dictionary search performance and more accurate correction of recognition errors, by presetting for each character the similar characters it is likely to be misrecognized as and its variant forms, and performing a dictionary search that considers those similar characters and variant forms (including old forms, popular forms, and kanji for personal names) in addition to the ordinary dictionary search.

SOLUTION: A document image is input with image input equipment such as a scanner (S1). Next, a circumscribing rectangle is calculated for each connected range of black pixels (S2). The circumscribing rectangles are then merged and grown into lines (S3). Each line is divided into ranges that each appear to be one character, based on black-pixel projection, line height, and the like; multiple candidate divisions may be extracted (S4). Each range obtained in step S4 is then treated as one character, its image features are matched against a recognition dictionary to calculate recognition scores, and solutions at or above a preset threshold are kept as recognition candidates (S5). The sequence of recognition candidates obtained in step S5 is matched against a language dictionary and grammar, and a plausible solution is selected taking the recognition scores into account (S6).

COPYRIGHT: (C)2004, JPO&NCIPI

Description

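[0001]
[Technical Field of the Invention]
The present invention relates to a character recognition device, and more particularly to a character recognition device that automatically corrects the reading results of an optical character reader using linguistic information.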
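[0002]
[Prior Art]
As prior art, Japanese Patent Application Laid-Open No. 05-046814 discloses a method of improving recognition accuracy by performing a dictionary search that takes similar characters into account. The device comprises a character reading unit that reads a character string; a character recognition unit that converts the read string into character codes defined for a computer; a phrase recognition unit that divides the converted string into words and phrases; a dictionary unit that stores character strings; and a dictionary search unit that checks whether each divided string is identical to a string stored in the dictionary unit. If the recognized string exists in the dictionary unit, the recognition is deemed correct and character recognition ends. If it does not, the character recognition unit replaces the string with another string of similar outline, and this is repeated until the recognized string matches a string in the dictionary unit.
Japanese Patent Application Laid-Open No. 07-036882 discloses a technique that lets a dictionary search device find words that previously could not be found because of input errors. The device consists of a conversion character definition that assigns a group ID to each subset of characters, a character-to-group-ID conversion unit that replaces characters with group IDs, an input string conversion unit that replaces a string entered from an input unit with a sequence of group IDs, a dictionary conversion unit that converts a word dictionary into a converted word dictionary whose entries are written as group IDs, and a dictionary search unit that searches the converted word dictionary by group ID sequence. By treating the elements of each character set defined in the conversion character definition as identical, words that previously could not be found can be found.
[Patent Document 1] Japanese Patent Application Laid-Open No. 05-046814
[Patent Document 2] Japanese Patent Application Laid-Open No. 07-036882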
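[0003]
[Problems to Be Solved by the Invention]
However, the method of Patent Document 1 raises the chance of a successful dictionary search by substituting characters of similar outline and retrying when a search fails. It covers only characters with similar outlines. Moreover, because the dictionary search itself is used to test whether the recognition result is correct, a result that happens to match some other registered word that is not the correct answer is finalized without any consideration of similar characters.
The method of Patent Document 2 realizes similar-character matching during dictionary search by using sets of mutually similar characters (similar character sets). In this method, all characters belonging to the same similar character set are confused with one another. As a result, the number of possible combinations with neighboring words grows, processing time increases, and the wrong choice may be made. Furthermore, the method cannot handle variant forms (澤, 沢), which differ in nature from similar characters, as in "黒澤" and "黒沢". That is, when only one spelling is registered as a dictionary headword, the other spelling cannot be matched.
In view of these problems, an object of the present invention is to provide a character recognition device that presets, for each character, the similar characters it is easily misrecognized as and its variant forms, and performs a dictionary search that considers those similar characters and variant forms (including old forms, popular forms, and kanji for personal names) in addition to the ordinary dictionary search, thereby improving dictionary search performance and the accuracy of correcting recognition errors.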
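[0004]
[Means for Solving the Problems]
To solve these problems, claim 1 is characterized by comprising: an image input unit that inputs a document image from an image input device; a rectangle extraction unit that, for each connected range of black pixels in the captured image, obtains the rectangle circumscribing those black pixels; a line segmentation unit that merges adjacent extracted rectangles and grows them into lines; a character segmentation unit that divides each segmented line into single-character ranges; a recognition unit that treats each divided range as one character, matches the image features of the character against a recognition dictionary, and keeps solutions at or above a preset threshold as recognition candidates; and a language processing unit that matches the sequence of recognition candidates against a language dictionary and grammar and selects a plausible solution.
The invention obtains the rectangles circumscribing the black pixels of the input image and merges adjacent rectangles to form lines. Each line is divided into single-character ranges, each range is treated as one character, and its features are matched against the recognition dictionary to select recognition candidates. A grammatically plausible solution is then found among the selected candidates and recognized as the text.
According to this invention, a plausible solution is selected by matching the sequence of recognition candidates against the language dictionary and grammar, so the character recognition rate can be raised and processing can be made faster.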
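[0005]
Claim 2 is characterized by further comprising related character storage means that stores, for each character divided by the character segmentation unit, the characters related to that character, wherein the language processing unit, during dictionary search, switches between the character recognized by the recognition unit and the related characters read from the related character storage means.
Some characters contained in the words of the language dictionary are prone to change (misrecognized characters, variant forms, old forms, popular forms, and so on). By storing these as related characters and also considering the changed strings of registered words when matching words during dictionary search, the dictionary search can be performed with high accuracy.
According to this invention, the characters related to each character are stored, so the probability of a failed search falls and the accuracy of the dictionary search improves.
Claim 3 is characterized in that the operator can change the contents of the related character storage means.
When the stored contents do not reflect the tendencies of the user, problems arise; misrecognition tendencies in particular vary greatly with the documents used. Running a dictionary search with misrecognition tendencies that do not fit the document adds unnecessary word candidates and lowers recognition accuracy. Likewise, considering old character forms for a document known not to use them not only increases processing time but also lowers recognition accuracy. It is therefore preferable that the contents of the related character storage means be changeable so that they reflect the actual misrecognition and usage tendencies.
According to this invention, the contents of the related character storage means can be changed, so the actual misrecognition and usage tendencies can be reflected.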
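[0006]
Claim 4 is characterized by further comprising string measuring means that measures the length of the character string divided by the character segmentation unit, wherein the language processing unit uses the string measuring means to measure the length of the word being looked up during dictionary search and changes the upper limit on the number of recognition candidates used according to the measurement.
For a short word, a larger number of candidate characters raises the risk of matching a similar character, whereas for a long word a larger number of candidate characters rarely produces a false match with the recognition result. Rather, when the correct character lies below the valid-candidate rank, continuing word matching with it as the key may let the correct word match. Changing the number of valid candidates according to the number of characters in the word being matched therefore makes it possible to select as correct a character whose recognition score alone is extremely low, for example because of a smudge.
According to this invention, the upper limit on the number of recognition candidates used is changed according to the length of the string, so a single character with an extremely low recognition score can be selected as correct and the character recognition rate can be raised.
Claim 5 is characterized in that the language processing unit decides, according to the length of the dictionary search target word measured by the string measuring means, whether to use the related characters stored in the related character storage means.
When the word being searched has many characters, a large number of recognition candidates rarely causes problems, but when it has few characters a false match may occur. Performing a dictionary search with related characters only when the target word is long therefore avoids false dictionary matches.
According to this invention, whether to use related characters is decided according to the length of the dictionary search target word, so the dictionary search with related characters is performed only when the target word is long and false dictionary matches can be avoided.
Claim 6 is characterized in that the language processing unit determines, according to the length of the dictionary search target word measured by the string measuring means, the usage level of the related characters stored in the related character storage means.
Similar characters that stem from the recognition process are used or not depending on word length, while characters that reflect spelling variation, such as variant, old, and popular forms, are always used in matching regardless of word length. This is easy to realize if the characters associated with each character are classified into several types. The types may also be graded in steps according to the degree of similarity.
According to this invention, the usage level of related characters is determined according to the length of the dictionary search target word, so the most suitable usage level can be chosen for each word length.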
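[0007]
Claim 7 is characterized by further comprising match ratio setting means that sets the proportion of characters that must match exactly according to the length of the search target word measured by the string measuring means, wherein a retrieved word is not accepted when the number of exactly matching characters does not reach the proportion set by the match ratio setting means.
When matching considers similar characters, there is a danger of accepting an entirely different word because most of the string matched through similar characters. To prevent such over-matching, a ratio of similar-character matches to string length is set, and a word that exceeds the ratio is discarded.
According to this invention, a retrieved word is not accepted when the number of exactly matching characters does not reach the set proportion, so the risk of accepting an entirely different word can be reduced.
Claim 8 is characterized in that the evaluation score of a word accepted by dictionary search through applying related characters stored in the related character storage means is made lower than the evaluation score of an exactly matching word and higher than the evaluation score of an unknown word.
A word that matches the recognition result exactly without considering similar characters and a word that matches by considering similar characters may compete in language processing. In that case the exactly matching word should take priority, so the formula for the word candidate score is preferably adjusted so that, for the same length, a word candidate that matches the recognition result exactly scores higher than one that matches through similar characters.
According to this invention, for the same length, a word candidate that matches the recognition result exactly is adjusted to score higher than one that matches through similar characters, so the exactly matching word candidate can be selected preferentially.
Claim 9 is characterized in that the search result for a word is accepted only when the characters at both ends of the word being searched match exactly.
The two ends of a word are where it meets adjacent characters. Contextual constraints are weak there and character segmentation errors are hard to correct, so confirming a dictionary match through inexact matching at these positions induces misrecognition. Taking this risk into account, a restriction is imposed that both ends of the word must match exactly within the valid candidate range of the recognition result.
According to this invention, the search result for a word is accepted only when the characters at both ends of the word match exactly, so misrecognition at the ends of words can be reduced.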
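[0008]
Claim 10 is characterized by comprising: an image input step of inputting a document image from an image input device; a rectangle extraction step of obtaining, for each connected range of black pixels in the captured image, the rectangle circumscribing those black pixels; a line segmentation step of merging adjacent extracted rectangles and growing them into lines; a character segmentation step of dividing each segmented line into single-character ranges; a recognition step of treating each divided range as one character, matching the image features of the character against a recognition dictionary, and keeping solutions at or above a preset threshold as recognition candidates; and a language processing step of matching the sequence of recognition candidates against a language dictionary and grammar and selecting a plausible solution.
This invention provides the same operation and effects as claim 1.
Claim 11 is characterized by further comprising a related character storage step of storing, for each character divided in the character segmentation step, the characters related to that character, wherein the language processing step, during dictionary search, switches between the character recognized in the recognition step and the related characters read from the related character storage step.
This invention provides the same operation and effects as claim 2.
Claim 12 is characterized in that the operator can change the contents stored in the related character storage step.
This invention provides the same operation and effects as claim 3.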
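[0009]
Claim 13 is characterized by further comprising a string measuring step of measuring the length of the character string divided in the character segmentation step, wherein the language processing step uses the string measuring step to measure the length of the word being looked up during dictionary search and changes the upper limit on the number of recognition candidates used according to the measurement.
This invention provides the same operation and effects as claim 4.
Claim 14 is characterized in that the language processing step decides, according to the length of the dictionary search target word measured in the string measuring step, whether to use the related characters stored in the related character storage step.
This invention provides the same operation and effects as claim 5.
Claim 15 is characterized in that the language processing step determines, according to the length of the dictionary search target word measured in the string measuring step, the usage level of the related characters stored in the related character storage step.
This invention provides the same operation and effects as claim 6.
Claim 16 is characterized by further comprising a match ratio setting step of setting the proportion of characters that must match exactly according to the length of the search target word measured in the string measuring step, wherein a retrieved word is not accepted when the number of exactly matching characters does not reach the proportion set in the match ratio setting step.
This invention provides the same operation and effects as claim 7.
Claim 17 is characterized in that the evaluation score of a word accepted by dictionary search through applying related characters stored in the related character storage step is made lower than the evaluation score of an exactly matching word and higher than the evaluation score of an unknown word.
This invention provides the same operation and effects as claim 8.
Claim 18 is characterized in that the search result for a word is accepted only when the characters at both ends of the word being searched match exactly.
This invention provides the same operation and effects as claim 9.
Claim 19 is characterized in that the character recognition method according to any one of claims 10 to 18 is programmed so that a computer can control it.
According to this invention, the character recognition method of the invention is programmed for an OS that a computer can control, so any computer equipped with that OS can be controlled by the same character recognition method.
Claim 20 is characterized in that the character recognition program according to claim 19 is recorded in a computer-readable form.
According to this invention, the program is recorded on a recording medium in a computer-readable form, so the program can be run anywhere by carrying the recording medium.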
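[0010]
[Embodiments of the Invention]
The present invention is described in detail below using the embodiments shown in the drawings. Unless specifically stated otherwise, the components, types, combinations, shapes, relative arrangements, and the like described in these embodiments are merely illustrative examples and are not intended to limit the scope of the invention.
Fig. 1 shows an example of a document used to explain the invention. The embodiments use horizontally written Japanese text as the example, but unless otherwise noted the invention is not limited to the embodiments. What is shown is that a dictionary search can be performed with high accuracy by storing the characters to which characters in dictionary words are prone to change (misrecognized characters, variant forms, old forms, popular forms, and so on) and also considering the changed strings of registered words during word matching. This is not limited to any particular language, type of character image (handwritten, printed, and so on), or layout (vertical or horizontal).
Fig. 2 shows the circumscribing rectangles of the black pixels in the document of Fig. 1. Each is the rectangle obtained by joining the outermost points touched by the black pixels of a character.
Fig. 3 shows the result of connecting adjacent rectangles of Fig. 2 and growing them into lines. This is line segmentation; its details are not the subject of the invention and are omitted. A line contains multiple black-pixel circumscribing rectangles. Among the various ways of combining them, the character segmentation positions are finally decided, and the recognition result obtained, by considering recognition scores and linguistic certainty.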
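[0011]
Fig. 4 shows a recognition result in an embodiment of the invention. For the input "怒鳴る。", for example, several recognition candidates are obtained for each character, and those with low recognition scores are cut off and fall outside the valid candidates. The figure finds up to eight candidates per character, but candidates below the valid-candidate count (separated by a solid line in the figure) are not considered in language processing (dictionary search). The candidate characters are combined and matched against the words registered in the dictionary. Suppose the dictionary contains the words "怒", "怒鳴る", "鳴る", "努", and "。". The valid candidates in the recognition result of Fig. 4 then yield the dictionary search results "怒", "怒鳴る", "鳴る", "鴨", "努", and "。".
Language processing then finds the word combinations that cover the whole line: "怒鳴る／。", "怒／鳴る／。", "努／鳴る／。", and "努／鴨／る／。". How they are found is not the subject of the invention and is omitted.
The most plausible combination is selected using the fewest-phrases-first heuristic used in kana-kanji conversion and the like, together with recognition rank and other factors. In the example above, "怒鳴る／。", with the minimum of two phrases, is selected. However, when recognition scores are poor and the correct characters are not within the valid candidates, as in Fig. 5, the dictionary search cannot produce the correct word, so the correct recognition result cannot be obtained. Looking at the candidates that are present, though, they include "怨", which resembles the correct "怒", and "鴫", which resembles the correct "鳴". Matching words with these as keys may let the dictionary search succeed. To do so, a set of characters regarded as equivalent is stored for each character. Table 1 shows an example.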
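The fewest-phrases selection described in this paragraph can be sketched as follows. This is a minimal illustration, not the method of the embodiment: the candidate lists are hypothetical stand-ins for Fig. 4 (which is not reproduced here), the dictionary is assumed to also contain "鴨" and "る", and only dictionary coverage is used, with no grammar or recognition-rank weighting.

```python
def fewest_words(candidates, dictionary):
    """Return the segmentation of the line that uses the fewest dictionary words.

    candidates[i] is the list of valid recognition candidates at position i.
    A word matches at position p if each of its characters appears among
    the candidates at the corresponding position. Returns None if no
    combination of dictionary words covers the whole line.
    """
    n = len(candidates)
    best = [None] * (n + 1)  # best[p] = shortest word list covering positions p..n-1
    best[n] = []
    for p in range(n - 1, -1, -1):
        for word in dictionary:
            end = p + len(word)
            if end > n or best[end] is None:
                continue
            if all(ch in candidates[p + k] for k, ch in enumerate(word)):
                if best[p] is None or 1 + len(best[end]) < len(best[p]):
                    best[p] = [word] + best[end]
    return best[0]


# Hypothetical valid candidates for the input "怒鳴る。".
candidates = [["怒", "努"], ["鳴", "鴨"], ["る"], ["。"]]
# Hypothetical dictionary; "鴨" and "る" are assumed entries.
dictionary = ["怒", "怒鳴る", "鳴る", "鴨", "努", "る", "。"]

result = fewest_words(candidates, dictionary)
print("／".join(result))
```

With these assumed inputs, the only two-word cover is "怒鳴る" followed by "。", which is the combination the text selects.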

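There are also cases where multiple spellings exist depending on the use of variant, popular, and old forms, as in "龍ヶ崎市" and "竜ヶ崎市"; "丹沢大学" and "丹澤大学"; "日学院大学", "日學院大学", and "日學院大學"; "Ａ鐵工所" and "Ａ鉄工所"; and "江の島", "江ノ島", and "江乃島". This is especially common in proper nouns such as place names and corporate names. If only one of the spellings is registered in the language dictionary, the dictionary search fails even when the correct characters are among the recognition candidates. In such cases too, just as with the misrecognition-prone characters shown in Table 1, a correspondence table of variant, popular, and old forms such as Table 2 can be stored and its entries treated as equivalent during word matching.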

With Tables 1 and 2, the correct word "怒鳴る" can be matched against the recognition result even when the ordinary dictionary search fails.
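[0012]
Fig. 6 is a flowchart of processing in the character recognition device of the invention, which applies the method above. First, a document image is input with an image input device such as a scanner (S1). Next, for each connected range of black pixels, the circumscribing rectangle is obtained (S2). Adjacent rectangles are then merged and grown into lines (S3). Each line is divided into ranges each thought to be one character, based on black-pixel projection, line height, and so on; multiple candidate divisions are allowed (S4). Each range obtained in step S4 is then treated as one character, its image features are matched against the recognition dictionary, a recognition score is calculated, and solutions at or above a preset threshold are kept as recognition candidates (S5). The sequence of recognition candidates obtained in step S5 is matched against the language dictionary and grammar, and a plausible solution is selected taking the recognition scores into account (S6).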
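As an illustration of step S2 only, the sketch below finds the circumscribing rectangle of each connected range of black pixels with a breadth-first flood fill. The image representation (a list of rows with 1 for black), the use of 8-connectivity, and the function name are assumptions made for this example; the source does not specify them.

```python
from collections import deque


def bounding_rectangles(image):
    """Return (top, left, bottom, right) for each connected range of black pixels.

    image is a list of equal-length rows; 1 means black, 0 means white.
    Pixels are treated as connected if they touch horizontally, vertically,
    or diagonally (8-connectivity, an assumption of this sketch).
    """
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    rects = []
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                # Grow the rectangle to include the current pixel.
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            rects.append((top, left, bottom, right))
    return rects


# Tiny hypothetical image with two separate black regions.
image = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 1, 1],
]
rects = bounding_rectangles(image)
print(rects)
```

Steps S3 and S4 would then merge these rectangles into lines and split the lines into character ranges; those steps are not sketched here because the source omits their details.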
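[0013]
Fig. 7 is a flowchart of a dictionary search that considers similar characters in the character recognition device of the invention. First, the string to be searched is set (S11), and the matched-character counter and the recognized-character position counter are cleared (S12). Next, it is checked whether the matched-character counter has exceeded the number of characters (S13). If it has (YES at S13), the process ends at end 1 (exact match, dictionary search succeeded). If it has not (NO at S13), it is checked whether the character indicated by the matched-character counter is among the recognition candidates indicated by the recognized-character position counter (S14). If it is (YES at S14), both counters are incremented and processing returns to step S13. If it is not (NO at S14), both counters are cleared (S16). Next, it is checked whether the matched-character counter has exceeded the number of characters (S17). If it has (YES at S17), the process ends at end 2 (dictionary search succeeded using similar characters). If it has not (NO at S17), it is checked whether the character indicated by the matched-character counter is among the recognition candidates indicated by the recognized-character position counter (S18). If it is (YES at S18), both counters are cleared (S19) and processing returns to step S17. If it is not (NO at S18), it is checked whether a character corresponding to the character indicated by the matched-character counter (as listed in Tables 1 and 2) is among the recognition candidates at the recognized-character position counter (S20). If it is (YES at S20), processing proceeds to step S19. If it is not (NO at S20), the process ends at end 3 (dictionary search failed).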
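The two-pass flow of Fig. 7 can be sketched as below, assuming that the counters advance after each matched character in the second pass (the text says they are cleared at S19, which would never terminate, so advancing is assumed here). The function name, the return labels, and the sample data are hypothetical; only the pairs 怒/怨 and 鳴/鴫 come from the description, since Tables 1 and 2 are not reproduced.

```python
def match_word(word, candidates, related):
    """Return 'exact', 'related', or None for one dictionary word.

    candidates[i] holds the valid recognition candidates at position i.
    related maps a character to the characters treated as equivalent to it
    (the role played by Tables 1 and 2).
    """
    if len(word) > len(candidates):
        return None
    # First pass (S12 to S14): every character must appear among the candidates.
    if all(ch in candidates[i] for i, ch in enumerate(word)):
        return "exact"      # end 1
    # Second pass (S16 to S20): fall back to related characters where needed.
    for i, ch in enumerate(word):
        if ch in candidates[i]:
            continue        # S18: the character itself is a candidate
        if not any(r in candidates[i] for r in related.get(ch, ())):
            return None     # end 3: no related character is a candidate either
    return "related"        # end 2


# Related-character pairs mentioned in the description.
related = {"怒": ["怨"], "鳴": ["鴫"]}
# Hypothetical candidates in which the correct characters are missing (cf. Fig. 5).
candidates = [["怨", "恕"], ["鴫", "嶋"], ["る"]]

print(match_word("怒鳴る", candidates, related))
```

Returning the kind of match, rather than a plain yes or no, lets later stages treat exact and related-character matches differently.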
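[0014]
The contents of Tables 1 and 2 are general information set in advance and do not reflect the tendencies of the user. Misrecognition tendencies in particular vary greatly with the documents used. Running a dictionary search with misrecognition tendencies that do not fit the document adds unnecessary word candidates and lowers recognition accuracy. Likewise, considering old character forms for a document known not to use them not only increases processing time but also lowers recognition accuracy. Tables 1 and 2 are therefore made changeable so that they reflect the actual misrecognition and usage tendencies.
For words with few characters, the presence of similar characters leads to matches with entirely different words. For two-character words, for example, the similar pairs (部, 都), (体, 休), (車, 重), (科, 料), and (企, 全) cause confusion between 部内 and 都内, 全体 and 全休, 車力 and 重力, 電気科 and 電気料, and 企業種 and 全業種. In such cases unnecessary word candidates are generated. In particular, when a part that is really an unknown word succeeds in the dictionary search because similar characters were considered, the unknown-word candidate is rejected, a wrong word is selected, and the final recognition result is wrong. In Japanese the average word is a little over two characters long, so confusable combinations are not rare. When a word is long, all of its characters are unlikely to match even if similar characters exist, so few word combinations are confused. Meanwhile, the difficulty of recognizing a character depends on its shape, so the number of valid recognition candidates cannot be fixed at a single value and is often made variable by setting a valid-candidate count, as shown in Figs. 4 and 5. The recognition unit tries all characters, keeps a fixed number of the top candidates, and then cuts candidates off according to their recognition scores. This is to avoid matching a wrong word to the recognition result during dictionary search.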
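[0015]
In the example of Fig. 5, the correct characters lie below the valid-candidate rank and the dictionary search fails. As noted above, for a short word a larger number of candidate characters raises the risk of matching a similar character, whereas for a long word a larger number of candidate characters rarely produces a false match with the recognition result. Rather, when the correct character lies below the valid-candidate rank, continuing word matching with it as the key may let the correct word match. Changing the number of valid candidates according to the number of characters in the word being matched therefore makes it possible to select as correct a character whose recognition score alone is extremely low, for example because of a smudge. Fig. 8 shows a flowchart of this process: a dictionary search that changes the upper limit on the number of valid candidates according to the length of the search word. First, the string to be searched is set (S21), and its length is counted (S22). The upper limit on the number of valid candidates is then set according to the count from step S22 (S23). Next, the matched-character counter and the recognized-character position counter are cleared (S24), and it is checked whether the matched-character counter has exceeded the number of characters (S25). If it has (YES at S25), the process ends at end 1 (search succeeded). If it has not (NO at S25), it is checked whether the character indicated by the matched-character counter is among the recognition candidates, within the upper limit on valid candidates, indicated by the recognized-character position counter (S26). If it is (YES at S26), both counters are incremented (S27) and processing returns to step S25. If it is not (NO at S26), the process ends at end 2 (search failed).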
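The flow of Fig. 8 can be sketched as follows. The length thresholds and limits in `candidate_limit` are illustrative assumptions, since the source gives no concrete values beyond the maximum of eight candidates per character shown in Fig. 4; the function names are also hypothetical.

```python
def candidate_limit(word_length):
    """Upper limit on the candidate rank used for a word of this length (S23).

    The thresholds are assumptions made for illustration only.
    """
    if word_length <= 2:
        return 2
    if word_length <= 4:
        return 4
    return 8  # Fig. 4 keeps at most eight candidates per character


def lookup(word, ranked_candidates):
    """Return True if every character of word lies within the length-dependent limit.

    ranked_candidates[i] lists the candidates at position i, best first.
    """
    if len(word) > len(ranked_candidates):
        return False
    limit = candidate_limit(len(word))            # S22, S23
    for i, ch in enumerate(word):                 # S24 to S27
        if ch not in ranked_candidates[i][:limit]:
            return False                          # end 2: search failed
    return True                                   # end 1: search succeeded


# Placeholder candidates: the third-ranked character at each position
# is only reachable when the word is long enough.
ranked = [["a", "b", "c"], ["x", "y", "z"], ["p", "q", "r"]]
print(lookup("cz", ranked), lookup("czr", ranked))
```

In this example the two-character word is rejected because its characters are ranked third, while the three-character word is allowed to reach down to those lower-ranked candidates.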
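Thus, when the word being searched has many characters, a large number of recognition candidates rarely causes problems, but when it has few characters a false match may occur. The same holds when similar characters are considered. Performing the dictionary search that considers similar characters only when the target word is long therefore avoids false dictionary matches.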
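[0016]
Among the related characters, those that are spelling variations independent of the recognition process, such as variant, old, and popular forms (Table 2), do not need to be used or withheld according to word length as described above. Accordingly, similar characters that stem from the recognition process (Table 1) are used or not depending on word length, while characters that reflect spelling variation, such as variant, old, and popular forms, are always used in matching regardless of word length. This is easy to realize if the characters associated with each character are classified into several types, as in Tables 1 and 2. The number of types need not be two; the characters may also be graded in steps according to the degree of similarity.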
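A minimal sketch of this two-level use of related characters is shown below. The function name and the length threshold `min_length_for_similar` are hypothetical; the sample pairs (部, 都) and (沢, 澤) are taken from examples elsewhere in the description, not from the actual tables.

```python
def related_for(ch, word_length, similar, variants, min_length_for_similar=4):
    """Characters treated as equivalent to ch when matching a word of this length.

    variants plays the role of Table 2 and is always used.
    similar plays the role of Table 1 and is used only for long words.
    The threshold of 4 is an assumption made for illustration.
    """
    allowed = set(variants.get(ch, ()))           # spelling variants: always
    if word_length >= min_length_for_similar:     # misrecognition pairs: long words only
        allowed |= set(similar.get(ch, ()))
    return allowed


similar = {"部": ["都"]}     # misrecognition-prone pair from the description
variants = {"沢": ["澤"]}    # variant-form pair from the description

print(related_for("部", 2, similar, variants))
print(related_for("部", 5, similar, variants))
print(related_for("沢", 2, similar, variants))
```

Grading the characters into more than two levels, as the text allows, would amount to one such table and one length threshold per level.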
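When matching considers similar characters, there is a danger of accepting an entirely different word because most of the string matched through similar characters. To prevent such over-matching, a ratio of similar-character matches to string length is set, and a word that exceeds the ratio is discarded. For example, the setting may be as follows:
1 to 5 characters: at least 50% of the characters must match exactly
6 characters: at least 30% of the characters must match exactly
7 to 9 characters: at least 10% of the characters must match exactly
10 characters: at least 0% of the characters must match exactly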
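The example thresholds above can be expressed as a small acceptance check. The function names are hypothetical, and applying the 0% rule to strings longer than 10 characters is an assumption, since the text only states the value for 10 characters.

```python
def required_exact_percent(length):
    """Minimum percentage of characters that must match exactly, per the example in the text."""
    if length <= 5:
        return 50
    if length == 6:
        return 30
    if length <= 9:
        return 10
    return 0  # stated for 10 characters; extending it to longer strings is an assumption


def accept(length, exact_matches):
    """Accept a word only if enough of its characters matched without related characters."""
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return exact_matches * 100 >= required_exact_percent(length) * length


print(accept(4, 2), accept(4, 1), accept(6, 2), accept(6, 1))
```

For a four-character word, two exact matches are enough and one is not; for a six-character word the requirement of 30% works out to at least two exact matches.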
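A word that matches the recognition result exactly without considering similar characters and a word that matches by considering similar characters may compete in language processing. In that case the exactly matching word should take priority, so the formula for the word candidate score is adjusted so that, for the same length, a word candidate that matches the recognition result exactly scores higher than one that matches through similar characters.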
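One hypothetical way to obtain this ordering is sketched below. The source does not give a scoring formula; the length weighting and the numeric ranks here are assumptions chosen only so that, for equal length, an exact match outranks a related-character match, which in turn outranks an unknown word (as claim 8 requires).

```python
# Hypothetical ranks for the kind of match; only their order matters.
EXACT, RELATED, UNKNOWN = 2, 1, 0


def word_score(length, match_kind):
    """Illustrative score: for equal length, exact > related > unknown.

    Multiplying the length by 3 keeps the match kind from outweighing a
    difference in length; that preference for longer words is an assumption.
    """
    return length * 3 + match_kind


scores = [word_score(3, kind) for kind in (EXACT, RELATED, UNKNOWN)]
print(scores)
```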
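[0017]
Considering similar characters can turn a result into the correct one only in the case of substitution errors. It does not rescue character segmentation failures in which characters have been inserted or dropped. To keep similar-character matching from being applied where character segmentation has gone wrong, it suffices to confirm that the characters at both ends of the word being matched were segmented correctly. In practice, the character recognition device cannot know whether characters were truly segmented correctly. A restriction is therefore imposed that the two ends of the word being matched must match exactly, without considering similar characters. Where character segmentation has failed, the output character bears no resemblance to the correct one, so using its similar characters only moves further from the correct character. The two ends of a word are where it meets adjacent characters. Contextual constraints are weak there and character segmentation errors are hard to correct, so confirming a dictionary match through inexact matching at these positions induces misrecognition. Taking this risk into account, a restriction is imposed that both ends of the word must match exactly within the valid candidate range of the recognition result.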
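The end-character restriction can be checked before any related-character matching is attempted, as in the sketch below. The function name and the sample candidates are hypothetical.

```python
def ends_match_exactly(word, candidates):
    """True if the first and last characters of word are themselves valid candidates.

    candidates[i] holds the valid recognition candidates at position i,
    counted from the position where the word is being matched.
    Related characters are deliberately not consulted here.
    """
    if not word or len(word) > len(candidates):
        return False
    return word[0] in candidates[0] and word[-1] in candidates[len(word) - 1]


# Hypothetical candidates: in the first case the leading character is wrong,
# in the second only the middle character needs a related-character match.
print(ends_match_exactly("怒鳴る", [["怨"], ["鴫"], ["る"]]))
print(ends_match_exactly("怒鳴る", [["怒"], ["鴫"], ["る"]]))
```

Under this restriction the first case is rejected outright, while the second remains eligible for matching its middle character through related characters.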
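[0018]
Fig. 9(a) is a configuration diagram for the case where the character recognition device of the invention is implemented in software. The device comprises a CPU 1 that controls the whole device according to programs stored on a hard disk or in ROM; a memory 2 that stores data temporarily and serves as working memory; a communication device 3 that connects to external equipment over a communication line; a display device 4, such as a liquid crystal display, that displays data; a hard disk 5 that stores programs and data; a keyboard 6 for entering data; a CD-ROM drive 7 that reads data recorded on CDs; an FD drive 8 that reads and writes data on floppy (registered trademark) disks; and a bus 9 that connects these components. This configuration is equivalent to an ordinary PC, so a description of its operation is omitted.
Fig. 9(b) is a configuration diagram for the case where the character recognition device of the invention is implemented over a network. In this case, character recognition devices 20, 21, and 22, each as in Fig. 9(a), are interconnected through a network 23. Clearly, the effects of the invention are the same whether it is implemented in software with a configuration like Fig. 9(a) or, as in Fig. 9(b), with part of its functions placed on a network and accessed over a communication line.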
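[0019]
[Effects of the Invention]
As described above, according to the inventions of claims 1 and 10, a plausible solution is selected by matching the sequence of recognition candidates against the language dictionary and grammar, so the character recognition rate can be raised and processing can be made faster.
In claims 2 and 11, the characters related to each character are stored, so the probability of a failed search falls and the accuracy of the dictionary search improves.
In claims 3 and 12, the contents of the related character storage means can be changed, so the actual misrecognition and usage tendencies can be reflected.
In claims 4 and 13, the upper limit on the number of recognition candidates used is changed according to the length of the string, so a single character with an extremely low recognition score can be selected as correct and the character recognition rate can be raised.
In claims 5 and 14, whether to use related characters is decided according to the length of the dictionary search target word, so the dictionary search with related characters is performed only when the target word is long and false dictionary matches can be avoided.
In claims 6 and 15, the usage level of related characters is determined according to the length of the dictionary search target word, so the most suitable usage level can be chosen for each word length.
In claims 7 and 16, a retrieved word is not accepted when the number of exactly matching characters does not reach the set proportion, so the risk of accepting an entirely different word can be reduced.
In claims 8 and 17, for the same length, a word candidate that matches the recognition result exactly is adjusted to score higher than one that matches through similar characters, so the exactly matching word candidate can be selected preferentially.
In claims 9 and 18, the search result for a word is accepted only when the characters at both ends of the word match exactly, so misrecognition at the ends of words can be reduced.
In claim 19, the character recognition method of the invention is programmed for an OS that a computer can control, so any computer equipped with that OS can be controlled by the same character recognition method.
In claim 20, the program is recorded on a recording medium in a computer-readable form, so the program can be run anywhere by carrying the recording medium.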
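[Brief Description of the Drawings]
[Fig. 1] An example of a document used to explain the invention.
[Fig. 2] The circumscribing rectangles of the black pixels in the document of Fig. 1.
[Fig. 3] The result of connecting adjacent circumscribing rectangles of Fig. 2 and growing them into lines.
[Fig. 4] A recognition result in an embodiment of the invention.
[Fig. 5] A diagram explaining the case where the correct characters are not within the valid candidates.
[Fig. 6] A flowchart of processing in the character recognition device of the invention.
[Fig. 7] A flowchart of a dictionary search that considers similar characters in the character recognition device of the invention.
[Fig. 8] A flowchart of a dictionary search that changes the upper limit on the number of valid candidates according to the length of the search word.
[Fig. 9] (a) A configuration diagram for implementing the character recognition device of the invention in software; (b) a configuration diagram for implementing it over a network.
[Explanation of Reference Numerals]
1 CPU, 2 memory, 3 communication device, 4 display device, 5 hard disk, 6 keyboard, 7 CD-ROM drive, 8 FD drive, 9 bus
[0001]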
[Technical Field of the Invention]
The present invention relates to a character recognition device, and more particularly, to a character recognition device that performs automatic correction using linguistic information based on a reading result of an optical character reading device.
[0002]
[Prior Art]
As prior art, Japanese Patent Application Laid-Open No. 05-046814 discloses a method of improving recognition accuracy by performing a dictionary search that takes similar characters into account. The device comprises a character reading unit that reads a character string; a character recognition unit that converts the read string into character codes defined for a computer; a phrase recognition unit that divides the converted string into words and phrases; a dictionary unit that stores character strings; and a dictionary search unit that checks whether each divided string is identical to a string stored in the dictionary unit. If the recognized string exists in the dictionary unit, the recognition is deemed correct and character recognition ends. If it does not, the character recognition unit replaces the string with another string of similar outline, and this is repeated until the recognized string matches a string in the dictionary unit.
Japanese Patent Application Laid-Open No. 07-036882 discloses a technique that lets a dictionary search device find words that previously could not be found because of input errors. The device consists of a conversion character definition that assigns a group ID to each subset of characters, a character-to-group-ID conversion unit that replaces characters with group IDs, an input string conversion unit that replaces a string entered from an input unit with a sequence of group IDs, a dictionary conversion unit that converts a word dictionary into a converted word dictionary whose entries are written as group IDs, and a dictionary search unit that searches the converted word dictionary by group ID sequence. By treating the elements of each character set defined in the conversion character definition as identical, words that previously could not be found can be found.
[Patent Document 1] Japanese Patent Application Laid-Open No. 05-046814
[Patent Document 2] Japanese Patent Application Laid-Open No. 07-036882
[0003]
[Problems to be solved by the invention]
However, the method disclosed in Patent Document 1 raises the chance of a successful dictionary search by substituting characters of similar outline and retrying when a search fails. It covers only characters with similar outlines. Moreover, because the dictionary search itself is used to test whether the recognition result is correct, a result that happens to match some other registered word that is not the correct answer is finalized without any consideration of similar characters.
The method disclosed in Patent Document 2 realizes similar-character matching during dictionary search by using sets of mutually similar characters (similar character sets). In this method, all characters belonging to the same similar character set are confused with one another. As a result, the number of possible combinations with neighboring words grows, processing time increases, and the wrong choice may be made. Furthermore, the method cannot handle variant forms (澤, 沢), which differ in nature from similar characters, as in "黒澤" and "黒沢". That is, when only one spelling is registered as a dictionary headword, the other spelling cannot be matched.
In view of these problems, an object of the present invention is to provide a character recognition device that presets, for each character, the similar characters it is easily misrecognized as and its variant forms, and performs a dictionary search that considers those similar characters and variant forms (including old forms, popular forms, and kanji for personal names) in addition to the ordinary dictionary search, thereby improving dictionary search performance and the accuracy of correcting recognition errors.
[0004]
[Means for Solving the Problems]
To solve these problems, claim 1 is characterized by comprising: an image input unit that inputs a document image from an image input device; a rectangle extraction unit that, for each connected range of black pixels in the captured image, obtains the rectangle circumscribing those black pixels; a line segmentation unit that merges adjacent extracted rectangles and grows them into lines; a character segmentation unit that divides each segmented line into single-character ranges; a recognition unit that treats each divided range as one character, matches the image features of the character against a recognition dictionary, and keeps solutions at or above a preset threshold as recognition candidates; and a language processing unit that matches the sequence of recognition candidates against a language dictionary and grammar and selects a plausible solution.
The invention obtains the rectangles circumscribing the black pixels of the input image and merges adjacent rectangles to form lines. Each line is divided into single-character ranges, each range is treated as one character, and its features are matched against the recognition dictionary to select recognition candidates. A grammatically plausible solution is then found among the selected candidates and recognized as the text.
According to the invention, since a proper solution is selected by comparing the arrangement of the recognition candidates with the language dictionary and the grammar, it is possible to increase the character recognition rate and to increase the processing speed.
[0005]
Claim 2 is characterized by further comprising related character storage means that stores, for each character divided by the character segmentation unit, the characters related to that character, wherein the language processing unit, during dictionary search, switches between the character recognized by the recognition unit and the related characters read from the related character storage means.
Some characters contained in the words of the language dictionary are prone to change (misrecognized characters, variant forms, old forms, popular forms, and so on). By storing these as related characters and also considering the changed strings of registered words when matching words during dictionary search, the dictionary search can be performed with high accuracy.
According to this invention, since the characters related to the characters are stored, the probability of a search failure is reduced, and the accuracy of the dictionary search can be improved.
Claim 3 is characterized in that the operator can change the contents of the related character storage means.
When the stored contents do not reflect the tendencies of the user, problems arise; misrecognition tendencies in particular vary greatly with the documents used. Running a dictionary search with misrecognition tendencies that do not fit the document adds unnecessary word candidates and lowers recognition accuracy. Likewise, considering old character forms for a document known not to use them not only increases processing time but also lowers recognition accuracy. It is therefore preferable that the contents of the related character storage means be changeable so that they reflect the actual misrecognition and usage tendencies.
According to this invention, since the storage contents of the related character storage means can be changed, the tendency of erroneous recognition and the tendency of actual use can be reflected.
[0006]
Claim 4 is characterized by further comprising string measuring means that measures the length of the character string divided by the character segmentation unit, wherein the language processing unit uses the string measuring means to measure the length of the word being looked up during dictionary search and changes the upper limit on the number of recognition candidates used according to the measurement.
For a short word, a larger number of candidate characters raises the risk of matching a similar character, whereas for a long word a larger number of candidate characters rarely produces a false match with the recognition result. Rather, when the correct character lies below the valid-candidate rank, continuing word matching with it as the key may let the correct word match. Changing the number of valid candidates according to the number of characters in the word being matched therefore makes it possible to select as correct a character whose recognition score alone is extremely low, for example because of a smudge.
According to this invention, the upper limit on the number of recognition candidates used is changed according to the length of the string, so a single character with an extremely low recognition score can be selected as correct and the character recognition rate can be raised.
Claim 5 is characterized in that the language processing unit decides, according to the length of the dictionary search target word measured by the string measuring means, whether to use the related characters stored in the related character storage means.
When the word being searched has many characters, a large number of recognition candidates rarely causes problems, but when it has few characters a false match may occur. Performing a dictionary search with related characters only when the target word is long therefore avoids false dictionary matches.
According to this invention, whether to use related characters is decided according to the length of the dictionary search target word, so the dictionary search with related characters is performed only when the target word is long and false dictionary matches can be avoided.
Claim 6 is characterized in that the language processing unit determines, according to the length of the dictionary search target word measured by the string measuring means, the usage level of the related characters stored in the related character storage means.
Similar characters that stem from the recognition process are used or not depending on word length, while characters that reflect spelling variation, such as variant, old, and popular forms, are always used in matching regardless of word length. This is easy to realize if the characters associated with each character are classified into several types. The types may also be graded in steps according to the degree of similarity.
According to this invention, since the use level of the related character is determined according to the length of the word to be searched for the dictionary, the optimum use level can be determined according to the word length.
[0007]
Claim 7 is characterized by further comprising match ratio setting means that sets the proportion of characters that must match exactly according to the length of the search target word measured by the string measuring means, wherein a retrieved word is not accepted when the number of exactly matching characters does not reach the proportion set by the match ratio setting means.
When matching considers similar characters, there is a danger of accepting an entirely different word because most of the string matched through similar characters. To prevent such over-matching, a ratio of similar-character matches to string length is set, and a word that exceeds the ratio is discarded.
According to this invention, when the number of characters that completely matches the set ratio does not reach, the searched word is not accepted, so that the risk of accepting a completely different word can be reduced.
Claim 8 is characterized in that the evaluation score of a word accepted by dictionary search through applying related characters stored in the related character storage means is made lower than the evaluation score of an exactly matching word and higher than the evaluation score of an unknown word.
A word that matches the recognition result exactly without considering similar characters and a word that matches by considering similar characters may compete in language processing. In that case the exactly matching word should take priority, so the formula for the word candidate score is preferably adjusted so that, for the same length, a word candidate that matches the recognition result exactly scores higher than one that matches through similar characters.
According to this invention, for the same length, a word candidate that matches the recognition result exactly is adjusted to score higher than one that matches through similar characters, so the exactly matching word candidate can be selected preferentially.
Claim 9 is characterized in that the search result for a word is accepted only when the characters at both ends of the word being searched match exactly.
The two ends of a word are where it meets adjacent characters. Contextual constraints are weak there and character segmentation errors are hard to correct, so confirming a dictionary match through inexact matching at these positions induces misrecognition. Taking this risk into account, a restriction is imposed that both ends of the word must match exactly within the valid candidate range of the recognition result.
According to this invention, the search result of the word is accepted only when the characters at both ends of the word completely match, so that erroneous recognition at both ends of the word can be reduced.
[0008]
11. An image input step of inputting a document image by an image input device, and a rectangle extraction processing step of obtaining a rectangle circumscribing the black pixels for each continuous range of black pixels of the image captured by the image input step. A line cutout processing step of integrating adjacent rectangles extracted by the rectangle extraction processing step to grow into a line, and a character for dividing the line cut out by the line cutout processing step into one character range A cutout processing step, and a recognition processing step in which the range divided by the character cutout processing step is regarded as one character, and the image feature of the character is compared with a recognition dictionary, and a solution having a predetermined threshold or more is set as a recognition candidate. Language processing for selecting a proper solution by comparing the arrangement of recognition candidates obtained in the recognition processing step with a language dictionary and a grammar Characterized in that a step.
According to this invention, the same operation and effect as those of the first aspect can be obtained.
The eleventh aspect is characterized by further comprising a related character storing step of storing, for each character divided by the character cutout processing step, characters related to that character, wherein, during dictionary search, the language processing step performs the search by switching between the recognized character obtained in the recognition processing step and the related characters read from the related character storing step.
According to this invention, the same operation and effect as those of the second aspect can be obtained.
The twelfth aspect is characterized in that the storage content of the related character storage step can be changed by an operator.
According to this invention, the same operation and effect as those of the third aspect can be obtained.
[0009]
The thirteenth aspect is characterized by further comprising a character string measuring step of measuring the length of the character string divided by the character cutout processing step, wherein, during dictionary search, the language processing step measures the length of the word to be searched in the character string measuring step and changes the upper limit of the number of recognition candidates to be used according to the measurement result.
According to this invention, the same operation and effect as those of the fourth aspect can be obtained.
The fourteenth aspect is characterized in that the language processing step determines whether or not to use a related character stored in the related character storage step according to the length of the dictionary search target word measured in the character string measuring step.
According to this invention, the same operation and effect as those of the fifth aspect can be obtained.
The fifteenth aspect is characterized in that the language processing step determines a use level of a related character stored in the related character storage step according to the length of the dictionary search target word measured in the character string measuring step.
According to this invention, the same operation and effect as those of the sixth aspect can be obtained.
The sixteenth aspect is characterized by further comprising a matching ratio setting step of setting a ratio of completely matching characters according to the length of the search target word measured in the character string measuring step, wherein the searched word is not accepted if the number of completely matching characters does not reach the ratio set in the matching ratio setting step.
According to this invention, the same operation and effect as those of the seventh aspect can be obtained.
The seventeenth aspect is characterized in that the evaluation score of a word accepted by dictionary search through applying a related character stored in the related character storage step is made lower than the evaluation score of a completely matched word and higher than the evaluation score of an unknown word.
According to this invention, the same operation and effect as those of the eighth aspect can be obtained.
The eighteenth aspect is characterized in that the search result of the word is accepted only when the characters at both ends of the word to be searched completely match.
According to this invention, the same operation and effect as those of the ninth aspect can be obtained.
A nineteenth aspect is characterized in that the character recognition method according to any one of the tenth to eighteenth aspects is programmed to be controllable by a computer.
According to this invention, by programming the character recognition method of the present invention in accordance with an OS controllable by a computer, a computer having the OS can be controlled by the same character recognition method.
According to a twentieth aspect, the character recognition program according to the nineteenth aspect is recorded in a computer-readable format.
According to the invention, by recording the program on the recording medium in a computer readable format, the program can be operated anywhere by carrying the recording medium.
[0010]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, the present invention will be described in detail using embodiments shown in the drawings. However, unless otherwise specified, the components, types, combinations, shapes, relative arrangements, and the like described in this embodiment are not intended to limit the scope of the present invention to them alone, and are merely illustrative examples.
FIG. 1 is a view showing an example of a document for explaining the present invention. In the embodiment, a horizontally written manuscript of Japanese text is used as an example. However, unless otherwise specified, the present invention is not limited to this embodiment. Its point is that, for the characters included in words in a language dictionary, the characters they are liable to change into (misrecognized characters, allographs, old characters, slang characters, etc.) are stored, and that, when word matching is performed in dictionary search, the character strings obtained by changing the registered word in this way are also taken into account, so that dictionary search is performed with high accuracy. The invention is not limited to a specific language, character image type (handwritten / printed characters, etc.), or format (vertical / horizontal).
FIG. 2 is a diagram illustrating a case where a circumscribed rectangle of black pixels in the document of FIG. 1 is obtained. This is a circumscribed rectangle obtained by connecting the points that contact the outermost black pixels of each character.
FIG. 3 is a diagram in which adjacent circumscribed rectangles in FIG. 2 are connected to each other and grown into lines. This is the line cutout process, but its details are not the gist of the present invention, and therefore description thereof is omitted. A plurality of black pixel circumscribed rectangles exist in one line; among the many possible combinations of these rectangles, the character cutout positions are finally determined in consideration of the recognition scores and linguistic plausibility, and a recognition result is obtained.
[0011]
FIG. 4 is a diagram illustrating a recognition result according to the embodiment of the present invention. In this figure, for the input sentence “screaming.”, for example, a plurality of recognition candidates are obtained for each character, and among them those with a low recognition score are cut off and are not treated as valid candidates. In the figure, recognition candidates are shown down to the eighth rank for each character, but candidates ranked below the number of valid candidates (separated by solid lines in the figure) are not considered in language processing (dictionary search). The recognition candidate characters are combined and matched against the words registered in the dictionary. It is assumed that the words “anger”, “scream”, “ring”, “do” and “.” are registered in the dictionary. As a result, the dictionary search results “anger”, “ringing”, “ringing”, “duck”, “tsuto”, and “.” are obtained from the valid candidates of the recognition result in the figure.
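The matching of valid candidates against dictionary words described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name `find_dictionary_words`, the list-of-lists candidate layout, and the Latin placeholder characters are hypothetical and are not taken from the embodiment, which uses Japanese text.

```python
def find_dictionary_words(candidates, dictionary):
    """Return (start, word) pairs whose characters all appear among the valid candidates.

    candidates[i] holds the valid recognition candidates for character
    position i (low-scoring candidates are assumed to be cut off already).
    dictionary is a set of registered words.
    """
    hits = []
    for start in range(len(candidates)):
        for word in sorted(dictionary):
            end = start + len(word)
            if end > len(candidates):
                continue  # the word would run past the end of the line
            # Every character of the word must be a valid candidate at its position.
            if all(ch in candidates[start + k] for k, ch in enumerate(word)):
                hits.append((start, word))
    return hits


# Placeholder example: three character positions, two valid candidates each.
candidates = [["c", "e"], ["a", "o"], ["t", "r"]]
dictionary = {"cat", "at", "or", "ear", "dog"}
print(find_dictionary_words(candidates, dictionary))
```

Each hit records where the word starts, so a later stage can combine hits into word sequences that cover the whole line.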
In addition, the combinations of words covering the entire line, namely “scream /.”, “Scream / sound /.”, “Stress / sound /.”, and “stress / duck / sound /.”, are obtained by language processing. The details of how they are obtained are not the gist of the present invention, and thus the description thereof is omitted.
A plausible combination is selected by taking into account heuristics such as the minimum-phrase-count method used in kana-kanji conversion and priority by recognition rank. In the above example, “scream /.”, which has the minimum of two phrases, is selected. However, if the recognition scores are poor and the correct characters are not within the valid candidates, as shown in FIG. 5, the correct word cannot be obtained even if a dictionary search is performed, so that a correct recognition result cannot be obtained. Looking at the characters that are included among the recognition candidates, however, they include "grudge", which is similar to the correct character "anger", and "shigi", which is similar to the correct character "ringing". If these are used as keys for word matching, the dictionary search may succeed. To realize this, a set of characters regarded as equivalent to each character may be stored for that character. Table 1 shows an example.

In addition, names such as "Ryugasaki City" and "Ryugasaki City", "Tanzawa University" and "Tanzawa University", "Nigakuin University", "Nigakuin University" and "Nigakuin University", "A Iron Works" and "A Iron Works", and "Enoshima", "Enoshima" and "Enoshima" may have multiple notations depending on the use of allographs, slang characters, and old characters (the variants differ in the characters used in the original Japanese and become identical when romanized). This is noticeable in proper nouns such as place names and corporate names. When only one notation is registered in the language dictionary, the dictionary search fails even if the correct answer exists among the recognition candidates. In such a case as well, just as for the characters prone to misrecognition shown in Table 1, a correspondence table of allographs, slang characters, and old characters should be stored as shown in Table 2, and the equivalent characters should be taken into account in word matching.

By using Tables 1 and 2, the correct word "scream" can be matched against the recognition result even in cases where the ordinary dictionary search would fail.
[0012]
FIG. 6 is a flowchart of a process in the character recognition device of the present invention to which the above method is applied. First, a document image is input by an image input device such as a scanner (S1). Next, for each continuous range of black pixels, a rectangle circumscribing it is determined (S2). Then, adjacent rectangles are integrated and grown into rows (S3). The line is divided into ranges considered to be one character based on black pixel projection, line height, and the like. In this case, there may be a plurality of candidates (S4). Next, the range obtained in step S4 is regarded as one character, the image feature is compared with the recognition dictionary, a recognition score is calculated, and a solution having a threshold value or more set in advance is left as a recognition candidate (S5). The arrangement of the recognition candidates obtained in step S5 is compared with the language dictionary and the grammar, and a proper solution is selected after considering the recognition score (S6).
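Step S2 above can be illustrated with a small sketch. It assumes a binary image given as a list of rows (1 for black, 0 for white) and treats diagonally touching pixels as connected; the connectivity rule, the function name `circumscribed_rectangles`, and the data layout are assumptions made for illustration and are not specified in this description.

```python
from collections import deque


def circumscribed_rectangles(image):
    """Return (top, left, bottom, right) for each continuous range of black pixels.

    image is a list of rows; 1 means black and 0 means white.
    8-connectivity is assumed here.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    rects = []
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first search over one connected group of black pixels,
            # widening the rectangle as each pixel is visited.
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            rects.append((top, left, bottom, right))
    return rects


image = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
print(circumscribed_rectangles(image))
```

The rectangles produced this way would then be the input to the line growing of step S3.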
[0013]
FIG. 7 is a flowchart of a dictionary search considering similar characters in the character recognition device of the present invention. First, a character string to be searched is set (S11), and a collation character counter and a recognized character position counter are cleared (S12). Next, it is checked whether the collation character counter has exceeded the number of characters (S13). If it has (YES route in S13), the process reaches end 1 (complete match; dictionary search succeeded). If it has not (NO route in S13), it is checked whether the character indicated by the collation character counter is among the recognition candidates indicated by the recognized character position counter (S14). If it is (YES route in S14), the collation character counter and the recognized character position counter are incremented (S15), and the process returns to step S13. If it is not (NO route in S14), the collation character counter and the recognized character position counter are cleared (S16). Next, it is checked whether the collation character counter has exceeded the number of characters (S17). If it has (YES route in S17), the process reaches end 2 (dictionary search succeeded using similar characters). If it has not (NO route in S17), it is checked whether the character indicated by the collation character counter is among the recognition candidates indicated by the recognized character position counter (S18). If it is (YES route in S18), the collation character counter and the recognized character position counter are incremented (S19), and the process returns to step S17. If it is not (NO route in S18), it is checked whether a character corresponding to the character indicated by the collation character counter (as listed in Tables 1 and 2) is among the recognition candidates indicated by the recognized character position counter (S20). If it is (YES route in S20), the process proceeds to step S19; otherwise (NO route in S20), the search fails.
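The flow of FIG. 7 can be sketched roughly as below. The sketch is an assumption-laden illustration rather than the implementation of the embodiment: the names `search_with_related`, `EXACT`, `SIMILAR` and `FAILED`, the dictionary-of-lists stand-in for Tables 1 and 2, and the length guard at the top are all hypothetical, and the counters of the flowchart are replaced by ordinary iteration.

```python
EXACT, SIMILAR, FAILED = "exact", "similar", "failed"


def search_with_related(word, candidates, related):
    """Match word against the recognition candidates, position by position.

    candidates[i]: valid recognition candidates at character position i.
    related: maps a character to the characters treated as equivalent to it
             (a hypothetical stand-in for Tables 1 and 2).
    """
    if len(word) > len(candidates):
        return FAILED  # guard added for this sketch only
    # First pass (S12 to S15): every character must itself be a candidate.
    if all(ch in candidates[i] for i, ch in enumerate(word)):
        return EXACT
    # Second pass (S16 to S20): restart and allow related characters.
    for i, ch in enumerate(word):
        if ch in candidates[i]:
            continue
        if any(alt in candidates[i] for alt in related.get(ch, ())):
            continue
        return FAILED
    return SIMILAR


candidates = [["c"], ["o", "0"], ["t"]]
related = {"a": ["o"]}  # placeholder: "a" tends to be misread as "o"
print(search_with_related("cot", candidates, related))
print(search_with_related("cat", candidates, related))
print(search_with_related("cut", candidates, related))
```

Returning which kind of match occurred, instead of a plain success flag, lets the caller treat complete matches and similar-character matches differently later on.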
[0014]
Here, the contents of Tables 1 and 2 are general information set in advance and do not reflect the tendencies of a particular user. In particular, the tendency of erroneous recognition differs greatly depending on the documents used. Performing a dictionary search on a document that does not match the assumed misrecognition tendency increases unnecessary word candidates and lowers recognition accuracy. Similarly, for a document known not to use old characters, considering old characters not only increases processing time but also lowers recognition accuracy. Therefore, Tables 1 and 2 are made changeable so that the actual tendency of misrecognition and the actual usage can be reflected.
Further, in the case of a word having a small number of characters, considering similar characters causes it to be collated with completely different words. For example, among two-character words, confusion arises between words such as (between departments) inside Tokyo, (whole rest) all rest, (vehicle weight) vehicle power, gravity, (scientific) electrical department, electricity bill, and (corporate) company type, all industries. In such cases, unnecessary word candidates are generated. In particular, if a part that is in fact an unknown word is matched in consideration of similar characters and the dictionary search succeeds, the unknown word candidate is rejected, the wrong word is selected, and the final result is a misrecognition. In the case of Japanese, since the average number of characters in one word is slightly more than two characters, there are few confusing combinations. When the number of characters is large, even if similar characters exist, it is unlikely that all characters are collated, so that there are few combinations of confused words. On the other hand, the difficulty of recognizing a character varies depending on its shape. Therefore, the number of valid recognition candidates is not fixed, and is often made variable by providing a number of valid candidates for each character as shown in the figures. The recognition processing unit tests all characters as possibilities, keeps a fixed number of the top-ranked characters, and further performs a cut-off process based on the quality of the recognition score. This is to avoid matching an erroneous word with the recognition result during dictionary search.
[0015]
In the example of FIG. 5, the correct answer is ranked below the valid candidates, and the dictionary search fails. As described above, in the case of a short word, there is a risk of collating with similar characters if the number of candidate characters is large. In the case of a long word, however, even if the number of candidate characters is large, it is unlikely that a wrong word matches the recognition result. On the other hand, when the correct answer is ranked below the valid candidates, word matching may be continued by using it as a key, and the correct word may be matched successfully. Therefore, if the number of valid candidates is changed according to the number of characters of the word to be collated, it becomes possible to select as the correct answer a word in which just one character has an extremely low recognition score due to dirt or the like. FIG. 8 is a flowchart of this processing, that is, of a dictionary search that changes the upper limit of the number of valid candidates according to the length of the search word. First, a character string to be searched is set (S21), and then the length of the character string is counted (S22). Then, the upper limit of the number of valid candidates is set according to the counting result of step S22 (S23). Next, the collation character counter and the recognized character position counter are cleared (S24), and it is checked whether the collation character counter has exceeded the number of characters (S25). If it has (YES route in S25), the process reaches end 1 (search successful). If it has not (NO route in S25), it is checked whether the character indicated by the collation character counter is within the upper limit of the number of valid candidates among the recognition candidates indicated by the recognized character position counter (S26).
If it is (YES route in S26), the collation character counter and the recognized character position counter are incremented (S27), and the process returns to step S25. If it is not (NO route in S26), the process reaches end 2 (search failure).
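The flow of FIG. 8 can be sketched as follows. The concrete limits in `candidate_limit` (2, 4 and 8 candidates) are hypothetical values chosen only for illustration, since no specific numbers are given here; the function names and the best-first candidate lists are likewise assumptions.

```python
def candidate_limit(word_length):
    """Hypothetical schedule: longer words may use lower-ranked candidates."""
    if word_length <= 2:
        return 2
    if word_length <= 4:
        return 4
    return 8


def search_with_limit(word, ranked_candidates):
    """ranked_candidates[i] lists the candidates at position i, best first."""
    if len(word) > len(ranked_candidates):
        return False  # guard added for this sketch only
    limit = candidate_limit(len(word))  # S22 and S23
    for i, ch in enumerate(word):  # S24 to S27
        if ch not in ranked_candidates[i][:limit]:
            return False  # end 2: search failure
    return True  # end 1: search successful


ranked = [["a", "b", "c"], ["x", "y", "z"], ["p", "q", "r"]]
print(search_with_limit("ay", ranked))   # two characters: only the top 2 ranks are used
print(search_with_limit("cz", ranked))   # "c" and "z" are ranked third, beyond the limit
print(search_with_limit("czr", ranked))  # three characters: the top 4 ranks are used
```

With such a schedule, a single badly scored character blocks a short word but not a longer one, which is the trade-off the paragraph above describes.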
As described above, when the word to be searched has many characters, there is little risk of erroneous matching even if the number of recognition candidates is large, but when the word has few characters, there is a risk of erroneous matching. The same holds when similar characters are considered. Therefore, if the dictionary search using similar characters is performed only when the word to be searched is long, an erroneous dictionary search can be avoided.
[0016]
Also, among the related characters, those that are variations in word notation unrelated to the recognition process, such as allographs, old characters, and slang characters (Table 2), need not be switched on or off at matching time according to the word length mentioned above. Therefore, whether to use the similar characters related to the recognition process (Table 1) is determined according to the word length, whereas the characters related to variations in word notation, such as allographs, old characters, and slang characters, are always used for matching regardless of the word length. This can be easily realized by classifying the characters associated with each character into a plurality of types as shown in Tables 1 and 2. The number of types is not limited to two, and the characters may be classified in stages according to the degree of similarity.
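The classification of related characters into types can be sketched as below. The threshold `min_length_for_similar`, the type labels, and the sample entries are assumptions made for illustration; the pair 竜 / 龍 is used only as a familiar example of a new and an old character form and is not taken from Table 2.

```python
NOTATION_VARIANT = "notation"  # allographs, old characters, slang characters (as in Table 2)
SIMILAR_SHAPE = "similar"      # characters prone to misrecognition (as in Table 1)


def usable_related(ch, word_length, related, min_length_for_similar=3):
    """Return the related characters of ch that may be used for a word of this length.

    related maps a character to a list of (related_char, kind) pairs.
    min_length_for_similar is an assumed threshold, not a value from the source.
    """
    allowed = []
    for alt, kind in related.get(ch, ()):
        # Notation variants are always usable; shape-similar characters
        # are usable only when the word is long enough.
        if kind == NOTATION_VARIANT or word_length >= min_length_for_similar:
            allowed.append(alt)
    return allowed


related = {
    "竜": [("龍", NOTATION_VARIANT)],
    "O": [("0", SIMILAR_SHAPE)],
}
print(usable_related("竜", 2, related))
print(usable_related("O", 2, related))
print(usable_related("O", 3, related))
```

Grading the similar characters into more than two levels would amount to replacing the single threshold with one threshold per level.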
Also, when matching is performed in consideration of similar characters, there is a risk that a completely different word is accepted with almost the entire character string matched only through similar characters. To prevent such excessive collation, a limit on the proportion of similar-character matches is set according to the length of the character string, and a word that exceeds it is discarded. For example, if the character string is 1 to 5 characters long, 50% or more of the characters must match exactly; if it is 6 characters long, 30% or more; if it is 7 to 9 characters long, 10% or more; and if it is 10 characters long, 0% or more (that is, no exact match is required).
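The example ratios above can be written down as a small check. The function names are hypothetical, and treating every length of 10 or more as the 0% case is an assumption, since the text only mentions 10 characters; integer percentages are used to avoid rounding effects.

```python
def required_exact_percent(length):
    """Minimum percentage of exactly matching characters for a word of this length."""
    if length <= 5:
        return 50
    if length == 6:
        return 30
    if length <= 9:
        return 10
    return 0  # assumption: lengths above 10 are treated like 10


def accept_by_ratio(length, exact_matches):
    """True if enough characters matched exactly for the word to be accepted."""
    return exact_matches * 100 >= required_exact_percent(length) * length


print(accept_by_ratio(4, 2))   # 2 of 4 is exactly 50%
print(accept_by_ratio(4, 1))   # 1 of 4 is below 50%
print(accept_by_ratio(8, 1))   # 1 of 8 is above 10%
```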
Further, a word that completely matches the recognition result without considering similar characters and a word that matches only when similar characters are considered may compete in the language processing. In this case, the completely matching word should be given priority. Therefore, the calculation formula for the word candidate score is adjusted so that, for the same word length, the word candidate that completely matches the recognition result scores higher than the word candidate matched in consideration of similar characters.
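One way to obtain this ordering is sketched below. The actual score formula is not given in this description, so the formula here (word length multiplied by 10 plus a small bonus) is purely an assumption; it also places unknown words below words accepted by dictionary search, as stated for the eighth aspect.

```python
# Assumed tie-break bonuses; only their relative order matters.
MATCH_BONUS = {"exact": 2, "similar": 1, "unknown": 0}


def word_score(length, match_kind):
    """Hypothetical score: length dominates, the kind of match breaks ties."""
    return length * 10 + MATCH_BONUS[match_kind]


# For the same length, exact match > similar-character match > unknown word.
print(word_score(3, "exact"), word_score(3, "similar"), word_score(3, "unknown"))
```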
[0017]
In addition, considering similar characters can yield the correct answer in the case of a substitution error, but it does not remedy a character segmentation failure in which an insertion or dropout has occurred. In order not to apply similar character collation to a portion containing a character segmentation error, it suffices to confirm that the characters at both ends of the collation word are correctly segmented. In practice, however, the character recognition device cannot know whether a character has been correctly cut out. Therefore, a restriction is imposed that the two ends of the collation word must match completely, without considering similar characters. In a portion where character extraction has failed, a character entirely dissimilar to the correct character is output, so that even if similar characters are used, the result is far from the correct character. Both ends of the word are the contact surfaces with adjacent characters, where the contextual constraints are weak and character extraction errors are difficult to correct. Accepting a dictionary search result on the basis of inexact matching at these parts would end up inducing misrecognition. Therefore, taking this danger into account, a restriction is set that both ends of the word completely match the valid character range of the recognition result.
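The restriction on both ends can be expressed as a simple check applied before a similar-character match is accepted. The function name `ends_match_exactly` and the candidate layout are hypothetical; this is a sketch of the idea, not the implementation of the embodiment.

```python
def ends_match_exactly(word, candidates):
    """True only if the first and last characters of word are themselves among
    the valid candidates at their positions, without using related characters.

    candidates[i]: valid recognition candidates at character position i.
    """
    if not word or len(word) > len(candidates):
        return False
    last = len(word) - 1
    return word[0] in candidates[0] and word[last] in candidates[last]


candidates = [["c"], ["o", "0"], ["t"]]
print(ends_match_exactly("cat", candidates))  # both ends are valid candidates
print(ends_match_exactly("bat", candidates))  # the first character is not
```

Only the interior characters of a word would then be eligible for matching through related characters.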
[0018]
FIG. 9A is a configuration diagram when the character recognition device of the present invention is realized by software. This character recognition device comprises a CPU 1 that controls the entire device according to a program stored on a hard disk or in a ROM, a memory 2 that temporarily stores data and is used as a work memory, a communication device 3 connected to external devices by a communication line, a display device 4 including a liquid crystal display or the like for displaying data, a hard disk 5 for storing programs and data, a keyboard 6 for inputting data, a CD-ROM drive 7 for reading data recorded on a CD, an FD drive 8 for reading and writing data from / to a floppy (registered trademark) disk, and a bus 9 connecting these components. Since this configuration is equivalent to a general PC, the description of its operation is omitted.
FIG. 9B is a configuration diagram when the character recognition device of the present invention is realized via a network. In this case, character recognition devices 20, 21, and 22 of the kind shown in FIG. 9A are mutually connected via a network 23. Thus, the device may be realized by software with the configuration shown in FIG. 9A, or, as shown in FIG. 9B, a part of the functions may be placed on a network and realized through a communication line or the like. Obviously, the effect of the present invention does not change in either case.
[0019]
[Effect of the invention]
As described above, according to the first and tenth aspects of the present invention, an appropriate solution is selected by comparing the arrangement of recognition candidates with the language dictionary and the grammar, so that the character recognition rate and the processing speed can be increased.
In the second and eleventh aspects, since characters related to the characters are stored, the probability of a search failure is reduced and the accuracy of dictionary search can be improved.
According to the third and twelfth aspects, since the storage contents of the related character storage means can be changed, the tendency of misrecognition and the tendency of actual use can be reflected.
In the fourth and thirteenth aspects, since the upper limit of the number of recognition candidates to be used is changed according to the length of the character string, a word in which just one character has an extremely low recognition score can still be selected as the correct answer, so that the recognition rate can be increased.
According to the fifth and fourteenth aspects, whether or not to use a related character is determined according to the length of the word to be searched in the dictionary. Therefore, by performing the dictionary search using related characters only when the word to be searched is long, an erroneous dictionary search can be avoided.
According to the sixth and fifteenth aspects, since the use level of the related character is determined according to the length of the word to be searched for the dictionary, the optimum use level can be determined according to the length of the word.
In the seventh and sixteenth aspects, the searched word is not accepted if the number of completely matching characters does not reach the set ratio, so that the risk of accepting a completely different word can be reduced.
In the eighth and seventeenth aspects, if the length is the same, the word candidate that completely matches the recognition result is adjusted to score higher than the word candidate that matches in consideration of similar characters, so that completely matching word candidates can be preferentially selected.
According to the ninth and eighteenth aspects, since the search result of the word is accepted only when the characters at both ends of the word completely match, erroneous recognition at both ends of the word can be reduced.
In the nineteenth aspect, by programming the character recognition method of the present invention in accordance with an OS controllable by a computer, a computer having the OS can be controlled by the same character recognition method.
In the twentieth aspect, the program is recorded on a recording medium in a computer-readable format, so that the program can be operated anywhere by carrying the recording medium.
[Brief description of the drawings]
FIG. 1 is a diagram showing an example of a document for explaining the present invention.
FIG. 2 is a diagram illustrating a case where a circumscribed rectangle of black pixels in the document of FIG. 1 according to the present invention is obtained.
FIG. 3 is a diagram in which adjacent rectangles of the circumscribed rectangle in FIG. 2 of the present invention are connected to each other and grown in rows.
FIG. 4 is a diagram illustrating a recognition result according to the embodiment of the present invention.
FIG. 5 is a diagram illustrating a case where a correct character does not exist in the number of valid candidates according to the present invention.
FIG. 6 is a flowchart of processing in the character recognition device of the present invention.
FIG. 7 is a flowchart of a dictionary search considering similar characters in the character recognition device of the present invention.
FIG. 8 is a flowchart of a dictionary search for changing the upper limit of the number of valid candidates according to the length of a search word according to the present invention.
FIG. 9A is a configuration diagram when the character recognition device of the present invention is implemented by software, and FIG. 9B is a configuration diagram when the character recognition device is implemented via a network.
[Explanation of symbols]
1 CPU, 2 memory, 3 communication device, 4 display device, 5 hard disk, 6 keyboard, 7 CD-ROM drive, 8 FD drive, 9 bus

Claims

1. A character recognition device comprising: an image input unit that inputs a document image by an image input device; a rectangle extraction processing unit that obtains, for each continuous range of black pixels in the image captured by the image input unit, a rectangle circumscribing those black pixels; a line cutout processing unit that integrates adjacent rectangles extracted by the rectangle extraction processing unit and grows them into lines; a character cutout processing unit that divides a line cut out by the line cutout processing unit into ranges of one character each; a recognition processing unit that regards each range divided by the character cutout processing unit as one character, compares an image feature of the character with a recognition dictionary, and sets solutions that are equal to or greater than a predetermined threshold as recognition candidates; and a language processing unit that selects a proper solution by comparing the arrangement of recognition candidates obtained by the recognition processing unit with a language dictionary and a grammar.

2. The character recognition device according to claim 1, further comprising a related character storage means that stores, for each character divided by the character cutout processing unit, characters related to that character, wherein, during dictionary search, the language processing unit performs the search by switching between the recognized character recognized by the recognition processing unit and the related characters read from the related character storage means.

3. The character recognition device according to claim 1, wherein the storage contents of the related character storage means can be changed by an operator.

4. The character recognition device according to claim 1, further comprising a character string measuring means that measures the length of the character string divided by the character cutout processing unit, wherein, during dictionary search, the language processing unit measures the length of the word to be searched by the character string measuring means and changes the upper limit of the number of recognition candidates to be used according to the measurement result.

5. The character recognition device according to claim 1, wherein the language processing unit determines whether or not to use a related character stored in the related character storage means according to the length of the dictionary search target word measured by the character string measuring means.

6. The character recognition device according to claim 1, wherein the language processing unit determines a use level of the related character stored in the related character storage means according to the length of the dictionary search target word measured by the character string measuring means.

7. The character recognition device according to claim 1, further comprising a matching ratio setting means that sets a ratio of completely matching characters according to the length of the search target word measured by the character string measuring means, wherein the searched word is not accepted if the number of completely matching characters does not reach the ratio set by the matching ratio setting means.

8. The character recognition device according to claim 1, wherein the evaluation score of a word accepted by dictionary search through applying a related character stored in the related character storage means is made lower than the evaluation score of a completely matched word and higher than the evaluation score of an unknown word.

9. The character recognition device according to claim 5, wherein the search result of the word is accepted only when the characters at both ends of the word to be searched completely match.

10. A character recognition method comprising: an image input step of inputting a document image by an image input device; a rectangle extraction processing step of obtaining, for each continuous range of black pixels in the image captured in the image input step, a rectangle circumscribing those black pixels; a line cutout processing step of integrating adjacent rectangles extracted in the rectangle extraction processing step and growing them into lines; a character cutout processing step of dividing a line cut out in the line cutout processing step into ranges of one character each; a recognition processing step of regarding each range divided in the character cutout processing step as one character, comparing an image feature of the character with a recognition dictionary, and setting solutions that are equal to or greater than a preset threshold as recognition candidates; and a language processing step of selecting a proper solution by comparing the arrangement of recognition candidates obtained in the recognition processing step with a language dictionary and a grammar.

11. The character recognition method according to claim 10, further comprising a related character storing step of storing, for each character divided by the character cutout processing step, characters related to that character, wherein, during dictionary search, the language processing step performs the search by switching between the recognized character recognized in the recognition processing step and the related characters read from the related character storing step.

The character recognition method according to claim 10, wherein the stored content of the related character storage step can be changed by an operator.

The character recognition method according to claim 10, further comprising a character string measuring step of measuring the length of the character string divided by the character cutout processing step, wherein the language processing step measures, by the character string measuring step, the length of the word to be searched at the time of dictionary search, and changes the upper limit of the number of recognition candidates to be used according to the measurement result.

The character recognition method according to claim 10, wherein the language processing step determines whether or not to use a related character stored in the related character storage step according to the length of the word to be searched in the dictionary, as measured in the character string measuring step.

The character recognition method according to claim 10, wherein the language processing step determines a use level of the related characters stored in the related character storage step according to the length of the word to be searched in the dictionary, as measured in the character string measuring step.

The character recognition method according to claim 10, further comprising a matching ratio setting step of setting a ratio of characters that completely match according to the length of the search target word measured by the character string measuring step, wherein the searched word is not accepted if the number of completely matching characters does not reach the set ratio.
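A length-dependent matching ratio of this kind might be sketched as follows. The thresholds in `RATIO_BY_LENGTH` and the function names are assumptions chosen for illustration; the claims state only that the ratio is set according to word length, not what the values are.

```python
import math

# Hypothetical policy (assumed values): (maximum word length, required ratio of
# exactly matching characters). Shorter words demand a stricter match here.
RATIO_BY_LENGTH = [(2, 1.0), (4, 0.75), (float("inf"), 0.5)]

def required_ratio(length):
    """Look up the required exact-match ratio for a word of the given length."""
    for max_len, ratio in RATIO_BY_LENGTH:
        if length <= max_len:
            return ratio

def accept(recognized, dictionary_word):
    """Accept the dictionary word only if enough characters match exactly."""
    if len(recognized) != len(dictionary_word):
        return False
    exact = sum(a == b for a, b in zip(recognized, dictionary_word))
    needed = math.ceil(required_ratio(len(dictionary_word)) * len(dictionary_word))
    return exact >= needed

print(accept("he11o", "hello"))  # True: 3 of 5 characters match, 3 are needed
print(accept("1o", "lo"))        # False: a 2-character word must match completely
```

Under these assumed thresholds, a short word reached only through related-character substitution is rejected, while a longer word tolerates a few substituted positions.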

The character recognition method according to claim 10, wherein the evaluation score of a word whose dictionary search was accepted by applying a character related to a character stored in the related character storage step is set lower than the evaluation score of a completely matched word and higher than the evaluation score of an unknown word.

The character recognition method according to claim 14, wherein the search result for the word is accepted only when the characters at both ends of the word to be searched completely match.
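The score ordering (complete match above related-character match above unknown word) and the both-ends condition can be sketched together in Python. The numeric scores and the function names are hypothetical, and combining the two conditions in one function is a choice made for this illustration; the claims state them separately.

```python
# Assumed score values; only their ordering reflects the claims.
SCORE_EXACT = 1.0
SCORE_RELATED = 0.6
SCORE_UNKNOWN = 0.2

def ends_match(recognized, word):
    """True when the first and last characters agree exactly."""
    return (
        len(word) > 0
        and len(recognized) == len(word)
        and recognized[0] == word[0]
        and recognized[-1] == word[-1]
    )

def evaluate(recognized, dictionary, related_matches):
    """Return (chosen word, score).

    `related_matches` is a list of dictionary words found by
    related-character substitution (supplied by the caller).
    """
    if recognized in dictionary:
        return recognized, SCORE_EXACT
    for word in related_matches:
        if ends_match(recognized, word):
            return word, SCORE_RELATED
    # No acceptable dictionary word: treat the string as an unknown word.
    return recognized, SCORE_UNKNOWN

print(evaluate("hello", {"hello"}, []))         # ('hello', 1.0)
print(evaluate("he11o", {"hello"}, ["hello"]))  # ('hello', 0.6)
print(evaluate("1ello", {"hello"}, ["hello"]))  # ('1ello', 0.2)
```

In the last call the first character differs from the dictionary word, so the related-character match is rejected and the string is scored as an unknown word.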

A character recognition program, wherein the character recognition method according to claim 10 is programmed to be controllable by a computer.

A recording medium characterized by recording the character recognition program according to claim 19 in a computer-readable format.