JP3578247B2

JP3578247B2 - Character recognition method and device

Info

Publication number: JP3578247B2
Application number: JP05078697A
Authority: JP
Inventors: 美奈子澤木; 紀博萩田; 健一郎石井
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1997-03-05
Filing date: 1997-03-05
Publication date: 2004-10-20
Anticipated expiration: 2017-03-05
Also published as: JPH10247219A

Description

【０００１】
【発明の属する技術分野】
本発明は入力画像が多値画像、特に、かすれ、つぶれなどの雑音が加わった文書画像、縞模様などのデザイン処理を施してある文書画像、さらには白抜き文字画像などが混在する文書画像に対する文字認識方法および装置に関する。
【０００２】
【従来の技術】
従来、漢字ＯＣＲ（ＯｐｔｉｃａｌＣｈａｒａｃｔｅｒＲｅａｄｅｒ）などの文字認識処理装置では、入力された多値の文書画像パターンに対して、（１）しきい値処理により２値化し、２値画像として扱う手法、（２）多値画像から直接文字の特徴量を抽出する手法、の２種類が知られている。
【０００３】
（１）の手法では、２値化された画像に対し、水平または垂直方向に走査して、各走査線ごとに黒画素または白画素の数を射影して、これらの量の投影分布を求め、文字列領域および個別文字領域を抽出する。その後に、切り出された文字ごとに認識のための特徴をベクトルの形で抽出し、予め作成してある標準辞書内の各カテゴリの標準パターンベクトルとの間で類似度または相違度などの識別関数を求めて、最も類似した文字または図形カデゴリを認識結果とする。この方法では、文字パターンが黒文字で、背景部分が白画素からなる文書画像の場合には、投影分布により画像中から個別文字の領域を判定することが可能であるが、広告中の文字に見られるようにデザイン処理された文字や写真の上に書かれた文字などに対しては、背景領域と文字領域の投影分布が類似しているため、うまく個別文字の領域を抽出できない問題点があった。
【０００４】
また、２値画像を対象に、デザイン処理などにロバストな射影尺度により、デザイン処理された画像から文字領域を抽出する方法も知られているが（例えば、澤木、萩田：特願平７−２４８１４７）、この手法では、入力画像として２値画像を対象としているため、多値画像に適用できない問題があった。さらに、多値画像を２値化したのちにこの手法を用いると、しきい値処理により画像が劣化し、文字領域がうまく切り出されない問題があった。
【０００５】
一方、（２）の手法では、多値画像の濃度分布を観測し、山線にあたるところを文字の心線として抽出し、辞書内に予め作成してある各カテゴリの標準心線形状と比較する手法が知られている。しかし、デザイン処理などの雑音が存在する場合には、デザイン部分を誤って文字線として抽出してしまったり、文字線部分を抽出し損ねてしまうという問題点がある。
【０００６】
【発明が解決しようとする課題】
上記のように、従来のＯＣＲをはじめとする文字認識技術は、背景にデザイン処理を行った文字列画像や、かすれ、つぶれのある文字列画像、白抜き文字混在などからなる文字列画像、特に多値画像中のそれらの画像から、文字カテゴリを正しく認識できる手段が十分に確立していなかった。
【０００７】
本発明の目的はこのような問題点に鑑み、多値画像中の、かすれやよごれなどの雑音やデザイン処理された文字列に対しても、文字列領域と背景領域の違いを反映する計数値を用いて文字列の領域および個別文字の候補画像を抽出する方法と、個別文字の候補画像に対してこの計数値を用いて雑音の程度を推定し、認識、棄却の判定に用いる識別関数のしきい値を雑音の程度に応じて変化させる方法とを提供することで、従来法に比べて精度の高い文字認識を実現することである。
【０００８】
【課題を解決するための手段】
上記目的を達成するために、本発明では、新聞見出しのようなデザイン処理された文字列画像パターンから文字領域と背景領域を分離するために、入力される多値画像パターンに対して、複数方向に走査して、走査線上で、ある画素の濃度と隣接画素の濃度の積、およびある画素の濃度と隣接画素の濃度をある定数から引いた値との積、およびある画素の濃度をある定数から引いた値と隣接画素の濃度との積、およびある画素の濃度をある定数から引いた値と隣接画素の濃度をある定数から引いた値との積を計数し、これらの計数値を利用して文字列領域を抽出することを特徴とする。
【０００９】
また、本発明では、抽出した文字列領域の画像パターンの情報をもとにして、複数方向に走査して、走査線上で、ある画素の濃度と隣接画素の濃度の積、およびある画素の濃度と隣接画素の濃度をある定数から引いた値との積、およびある画素の濃度をある定数から引いた値と隣接画素の濃度との積、およびある画素の濃度をある定数から引いた値と隣接画素の濃度をある定数から引いた値との積を計数し、これらの計数値を利用して、個別文字の候補領域を抽出する方法や、文字列中で個別文字カテゴリが照合により確定した文字カテゴリの位置と認識情報をもとに、残りの未処理の個別文字の候補領域を抽出する方法を備えたことを特徴とする。
【００１０】
さらに、個別候補画像パターンと辞書の参照パターンとの照合において、公知の識別関数を用いて識別関数値（類似度）を求め、この値と個別候補画像パターンの雑音の程度によって定まる閾値とをもとに、認識または棄却の判定処理を行うことで、多くの候補画像パターンの中から、正しい文字カテゴリとその位置を選択することを特徴とする。
また、入力された多値画像パターンの情報をもとに、辞書の参照パターンと認識・棄却判定の閾値を新規作成・追加・更新できる特徴をもつ。
これにより、本発明では、入力された多値画像パターンに対して、文書画像全体の傾き補正処理やシェーディング補正処理などの前処理が行われた後の文書等の画像パターンを対象として、文書中の文字列領域を抽出し、各文字列内に含まれる文字カテゴリを辞書の参照パターンとの照合により、認識する。また、誤認識や未学習の文字の画像パターンを辞書の参照パターンに追加作成または更新する。
【００１１】
【発明の実施の形態】
以下、本発明の一実施例を図面により説明する。
図１は本発明の方法を適用する文字認識装置の構成を示すブロック図で、１は多値画像パターン記憶回路、２は積算回路、３は文字列抽出回路、４は個別文字抽出回路、５は個別文字照合回路、６は参照パターン学習回路、７は画像入出力回路である。
図２は上記文字認識装置の動作例を示すフローチャートである。
図３は積算回路２の動作例を説明する説明図。図４は文字列抽出回路３の動作例を説明する説明図。図５は個別文字抽出回路４の動作例を説明する説明図である。
【００１２】
多値画像パターン記憶回路１は、認識対象となる多値画像パターンを記憶している。多値画像パターンはＮ画素からなり、例えば画素の濃度は０．０から１．０の間の小数値で示され、「０．０」は白を、「１．０」は黒を示している。
積算回路２は、多値画像パターン記憶回路１から取り出されたＮ画素からなる多値画像パターンを入力し、複数方向から走査して、文字列領域と背景領域を区別するために、各走査線上にある、ある画素の濃度と隣接する画素の濃度の積、および、ある画素の濃度と隣接する画素の濃度をある定数から引いた値との積、および、ある画素の濃度をある定数から引いた値と隣接する画素の濃度との積、および、ある画素の濃度をある定数から引いた値と隣接する画素の濃度をある定数から引いた値との積を計数し、出力する。
【００１３】
文字列抽出回路３は、積算回路２によって得られた各走査線上の画素の濃度の積和の計数値をもとに文字列領域と背景領域を区別し、入力画像中から文字列領域の画像パターンを出力する。
個別文字抽出回路４は、文字列抽出回路３で得られた文字列領域の画像パターンを入力し、個別文字照合回路５で照合すべき個別文字が含まれる候補画像を選択し、出力する。
個別文字照合回路５は、個別文字抽出回路４で選ばれた個別文字の候補画像パターンを入力し、所定の識別関数を用いて辞書の参照パターンと照合し、各文字カテゴリの識別関数値と候補画像パターンの雑音の程度によって決まるしきい値と比較して、この候補画像パターンの中に認識文字カテゴリがあるかどうかを判定し、ある場合にはこれらの結果を出力する。
【００１４】
参照パターン学習回路６は、画像入出力回路７から出力される画像情報、および、多値画像パターン記憶回路１の多値画像パターン、および、個別文字照合回路５内の辞書の参照パターンを入力し、参照パターンの新規作成・更新を行い、個別文字照合回路５の辞書の参照パターンに追加・更新する。
画像入出力回路７は、個別文字照合回路５から出力される文字カテゴリ番号、識別関数値、多値画像パターン等の認識結果の情報を入力し、これらの情報を表示し、また、マウスやキーボードなどによって画像情報を編集する。さらに、画像入出力回路７は、参照パターン学習回路６に参照パターンの新規作成または更新に必要な情報を送出する。
【００１５】
図１において、入力された多値画像パターンから文字列領域を抽出する処理について説明する。
まず、ステップＳ１（図２参照）では、多値画像パターン記憶回路１から認識対象となる多値画像パターンが読み出され、積算回路２に入力される。横Ｎｘ画素×縦Ｎｙ画素からなる多値画像パターンに対して、水平と垂直方向の２方向に文字を走査し、各走査線上の各画素で、画素の濃度と隣接する画素の濃度の積ａ、および、画素の濃度と隣接する画素の濃度をある定数から引いた値との積ｂ、および、画素の濃度をある定数から引いた値と隣接する画素の濃度との積ｃ、および、画素の濃度をある定数から引いた値と隣接する画素の濃度をある定数から引いた値との積ｅ、および、走査画素数ｎを計数する。
【００１６】
さらに、走査開始画素の１つ前の画素から走査終了画素の１つ前の画素までの画素値の和Ｔおよび二乗和Ｔ２、および、走査開始画素から走査終了画素までの画素値の和Ｘおよび二乗和Ｘ２を求める。ここでは例えば、画素値は、０．０から１．０の間の値をとり、黒点は画素値＝１．０に、白点は画素値＝０．０とする。また、定数としては、例えば、１．０を用いる。各走査線上の全画素で得られたａ，ｂ，ｃ，ｅの値を総和した値を、それぞれ、ａｐ，ｂｐ，ｃｐ，ｅｐとして求める。
【００１７】
各走査線上で、ａｐ，ｂｐ，ｃｐ，ｅｐの各計数値を用いて、
〔式１〕
【数１】
$$p = \frac{a_p e_p - b_p c_p}{\sqrt{n T_2 - T^2}\,\sqrt{n X_2 - X^2}}$$
なるｐの値を求める。ｐの値は、文字領域が背景領域に比べ画素値の変化が小さい場合（例えば文字が無地）でも、大きい場合（例えばテクスチャがかかった文字）でも、文字領域で背景領域よりも大きい値をとる。また、ｐ値は分母の正規化項により、−１≦ｐ≦１の範囲の値をとる。
【００１８】
このｐ値を、各走査方向に対応した投影軸に投影する。位置ｙの水平方向走査で得られるｐ値をｈｙとおくと、水平方向走査で投影分布Ｐｙ＝（ｈ１，ｈ２，…，ｈｙ，…，ｈＮｙ）が得られる。位置ｘの水平方向走査で得られるｐ値をｖｘとおくと、水平方向走査で投影分布Ｐｘ＝（ｖ１，ｖ２，…，ｖｘ，…，ｖＮｘ）が得られる。これ以外の走査方向についても計数可能である。
【００１９】
次に、本装置の処理は、ステップＳ２に移り、文字列抽出回路３は、積算回路２から得られた投影分布Ｐｙと投影分布Ｐｘを入力し、もし、Ｎｘ＞Ｎｙならば、水平方向の走査で得られた投影分布Ｐｙを、予め設定したＮ区間になるように、Ｎ等分して、新しい投影分布Ｑ＝（ｑ１，ｑ２，…，ｑｊ，…，ｑＮ）を求める。さもなければ、垂直方向の走査で得られた投影分布Ｐｘを、予め設定したＮ区間になるように、Ｎ等分して、新しい投影分布Ｑ＝（ｑ１，ｑ２，…，ｑｊ，…，ｑＮ）を求める。
【００２０】
今、Ｎx ＞Ｎy の場合について説明する。
新しい投影分布Ｑに対して、予め設定したしきい値ＴQ と比較し、ｑj ＞ＴQとなるｊの連続区間〔ｊs ，ｊe 〕（ｊs ≦ｊ≦ｊe ）から文字列の高さＷy （＝ｊe −ｊs ＋１）を求める。
この高さＷy 画素と幅Ｎx 画素によって囲まれる画像領域を文字列画像パターンＷとして出力する。
ここで、文字列が複数行からなる場合は、連続区間が複数個出力されるので、複数個の文字列画像パターンＷを出力する。
【００２１】
次に、本装置の処理はステップＳ３に移り、個別文字抽出回路４は、文字列抽出回路３で得られたＮｘ ×Ｗｙ画素の文字列画像パターンＷを、高さがＮ画素なるように正規化処理を行い、正規化された文字列画像パターンＷＮを求める。
この正規化文字列画像パターンＷＮに対して、積算回路２の処理手順でのべた垂直方向走査によるｐ値の投影分布Ｐｘを求め、この投影分布Ｐｘに対して、予め設定したしきい値Ｔｘと比較し、ｖｊ＞Ｔｘとなるｘの連続区間〔ｘｓ，ｘｃ〕（ｘｓ ≦ｘ≦ｘｅ）を求める。
【００２２】
ｋ個の連続区間［ｘｓ１，ｘｅ１］，［ｘｓ２，ｘｅ２］，…，［ｘｓｋ，ｘｅｋ］に対して、参照パターンに記憶されている文字幅の存在範囲であるｚ１画素からｚ２画素の中にはいる、ｚ１ ≦ｘｅｊ−ｘｓｉ＋１≦ｚ２（ｉ≦ｊ）を満たす区間ｉと区間ｊを求め、個別文字の候補画像パターンＧの文字幅Ｗｘ（＝Ｘｅｊ−Ｘｓｉ＋１）を求める。これによりＷｘ ×Ｎ画素からなる個別文字の候補画像パターンＧが複数個得られる。
区間ｉでｘｅｉ−ｘｓｉ＋１＞ｚ２となる場合は、候補画像パターンＧの左枠の位置ｘｓｉを１画素単位にずらして、その右枠が位置ｘｅｉになるまで個別文字の候補画像パターンＧを複数個作成する。
【００２３】
次に個別文字の候補画像を求める第２の方法では、文字列画像パターンＷに対して、個別文字照合回路５で「ある文字カテゴリに認識した」と判定された個別文字の候補画像パターンＧ以外の未処理区間〔ｘｔ，ｘｕ〕に対して、ｚ１ ≦ｘｕ −ｘｔ ≦ｚ２を満たす区間を求め、幅Ｗｘ（＝ｘｕ一ｘｔ＋１）×Ｎ画素からなる個別文字候補画像パターンを作成する。この条件を満たす区間が複数個あれば、複数個の個別文字候補画像パターンを出力する。
また、個別文字照合回路５で「ある文字カテゴリに認識した」と判定された個別文字の候補画像パターンＧの区間から文字のピッチ（間隔）を推定し、残りの領域から個別文字候補画像パターンＧを抽出する方法も適用可能である。
【００２４】
これまでは横に長い多値画像パターンが入力された場合に関して主に説明したが、縦に長い多値画像パターンの場合も、ｘ方向とｙ方向の処理を逆にすれば同様に処理できる。
【００２５】
次に、個別文字照合回路５の処理について説明する。
本装置の処理はステップＳ４に移り、まず、Ｗｘ ×Ｎ画素からなる個別文字候補画像パターンＧを入力し、ｎ（＝Ｎ×Ｎ）画素の正規化多値文字パターンＸ＝（ｘ１，ｘ２，…，ｘｉ，…，ｘｎ）を求める。このとき、個別文字候補画像パターンＧが正規化多値文字パターンＸの中心にくるように平行移動しておく。また、次の式を用いて、この正規化多値文字パターンＸに対するｘｉの総和Ｈを求める。
【数２】
$$H = \sum_{i=1}^{n} x_i$$
【００２６】
次に、この正規化多値文字パターンＸを辞書の参照パターンＴ＝（ｔ１，ｔ２，…，ｔｉ，…，ｔｎ）と比較するためには、公知の識別関数を用いる。例えば、複合類似度（飯島「パターン認識理論」森北出版株式会社）などが挙げられる。
【００２７】
各文字カテゴリｍの参照パターンに対して、Ｘとの類似度Ｓ（Ｘ，Ｔｍ）と上記Ｈの値によって決まる参照パターンの閾値Ｓｍを用いて、次のように認識または棄却の判走を行う。このＨと閾値Ｓｍの関係テーブルは、参照パターン学習回路６によって予め学習されている。
（条件１）もし、Ｓ（Ｘ，Ｔｍ）＞Ｓｍならば、この文字カテゴリｍとして認識できたと判定する。
（条件２）さもなければ、この文字カテゴリｍではないと判定する。
【００２８】
すべての参照パターンに対して、複数のカテゴリに対して条件１を満たした場合は、類似度が最も大きいカテゴリを認識カテゴリとして出力する。また、条件１を満たすカテゴリが１つもない場合は、この個別文字候補画像パターンＧに対しては認識カテゴリがなかったとして棄却する。
【００２９】
最後に参照パターン学習回路６と画像入出力回路７を用いて、個別文字照合回路５の辞書の参照パターンを追加・更新する処理を説明する。
画像入出力装置７は、個別文字照合回路５から出力される認識文字カテゴリ番号、識別関数値、多値画像パターン等の分類結果の情報を入力し、これらの情報を表示し、また、マウスやキーボードなどによって画像情報を編集する。
【００３０】
認識結果に誤認識がある場合や、新規の文字カテゴリの参照パターンを作成する場合は、画像入出力装置７を用いて、入力の多値画像パターンから、追加、更新すべき認識カテゴリｍの多値参照パターンＴｍを編集・作成する。
参照パターン学習回路６では、作成された多値参照パターンと個別文字照合回路５の辞書の参照パターンを用いて、この新規のｎ（＝Ｎ×Ｎ）画素からなる正規化多値参照パターンＴｍに対する認識棄却判定の閾値Ｓｍを学習する。
【００３１】
今、ｎ（＝Ｎ×Ｎ）画素からなる多値のランダム雑音パターンＲ＝（ｒ１，ｒ２，…，ｒｉ，…，ｒｎ）を作成し、辞書の各多値参照パターンＴに重畳して、学習のための入力画像パターンＸを作成する。すなわち、位置ｉの画素値ｘｉ＝ｔｉ＋ｒｉとなる。
新規のカテゴリｍの多値参照パターンＴｍに対する入力画像パターンＸの類似度Ｓ（Ｘ，Ｔｍ）を計算し、すべての参照パターンの中で、ｍ以外のカテゴリで最大の類似度をもとめ、その値をＳｍとして、個別文字照合回路５の辞書に、新規カテゴリｍの多値参照パターンＴｍとその閾値Ｓｍを格納する。
【００３２】
本発明の主要部分である積算回路２の具体的な動作例として、多値画像パターン記憶回路１に図３に示すような横２０画素×縦１４画素からなる多値画像パターンの場合について説明する。
この図は「文字」とかかれた文字列画像パターンの背景に網点のデザイン処理が施された入力画像パターンを表わしている。
この多値画像パターンが読み出され、積算回路２に入力される。この入力多値画像パターンに対して、水平方向と垂直方向の２方向に文字を走査し、式１で示したｐ値をそれぞれ、投影軸２−１および投影軸２−２に投影する。
【００３３】
走査の例として図３で、ｙ＝２における水平方向走査２−３では、ａｐ＝３．３４，ｂｐ＝４．８６，ｃｐ＝５．３６，ｅｐ＝５．４４，Ｔ＝８．７０，Ｔ２＝５．３１，Ｘ＝８．２０，Ｘ２＝４．７６，ｎ＝１９が計数され、ｙ＝２におけるｐ値は、
〔式２〕
ｐ＝（３．３４＊５．４４−４．８６＊５．３６）／｛ｓｑｒｔ［１９＊５．３１−８．７０＊８．７０］＊ｓｑｒｔ［１９＊４．７６−８．２０＊８．２０］｝＝−０．３２６
となる。
【００３４】
また、ｙ＝６における水平方向走査２−４では、ａｐ＝４．８６，ｂｐ＝４．２４，ｃｐ＝４．７４，ｅｐ＝５．１６，Ｔ＝９．６０，Ｔ２＝６．３６，Ｘ＝９．１０，Ｘ２＝５．８１，ｎ＝１９が計数され、ｙ＝６におけるｐ値は、
〔式３〕
ｐ＝（４．８６＊５．１６−４．２４＊４．７４）／｛ｓｑｒｔ［１９＊６．３６−９．６０＊９．６０］＊ｓｑｒｔ［１９＊５．８１−９．１０＊９．１０］｝＝０．１７７
となる。
【００３５】
この結果から、網点からなる背景部分のｐ値は、文字線部分の走査で得られるｐ値に比べて小さい値をとることを示し、式１で示すｐ値が文字部と背景部分とをわけるのに有効な尺度であることがわかる。
ｙ＝１からｙ＝１４までの水平方向走査により、投影分布Ｐｙが求められる。
【００３６】
文字列抽出回路３では、ｙ＝１からｙ＝１４までの水平方向走査で投影分布Ｐｙを入力し、Ｎｘ＞Ｎｙを満たすので、水平方向の走査で得られた投影分布Ｐｙを、予め設定したＮ区間になるようにＮ等分して、新しい投影分布Ｑ＝（ｑ１，ｑ２，…，ｑｊ，…，ｑＮ）を求める。ここで、Ｎ＝１１画素とすると、投影分布Ｑの中で、予め設定してあるｐ値のしきい値ＴＱ＝−０．５以上となる区間は、ｙ＝２からｙ＝１２までとなり、この結果、横２０画素×縦１１画素の文字列画像パターンＷが抽出される（図４参照）。
【００３７】
個別文字抽出回路４では、文字列抽出回路３で得られた、横２０画素×縦１１画素の文字列画像パターンＷを、高さがＮ＝１１画素なるように正規化処理を行い、２０×１１画素の文字列画像パターンＷＮを求める。この例では、Ｗｘ＝Ｎなので、文字列画像パターンＷと文字列画像パターンＷＮは同じ画像パターンになる。
この正規化された文字列画像パターンＷＮに対して、積算回路２の処理手順でのべた、ｘ＝１からｘ＝２０までの垂直方向走査によるｐ値の投影分布Ｐｘを求め、この投影分布Ｐｘに対して、予め設定したしきい値Ｔｘ＝−０．８とした場合、ｖｊ＞Ｔｘとなるｘの連続区間は〔ｘｓ＝３，ｘｅ＝２０〕となる。
【００３８】
今、参照パターン中で、文字幅がｚ１＝２画素とｚ２＝１１画素の範囲に決められているとすると、ｘｅ −ｘｓ＋１＞ｚ２を満たすので、１１×１１画素の個別文字の候補画像パターンＧとして、Ｇの左枠の位置ｘｓ＝３から１画素単位にずらして、Ｇの右枠の位置ｘｅ＝２０になるまで、８個の個別文字候補画像パターンＧが作成される。図５の４−１１は、Ｇの左枠がｘ＝３である場合を表わし、４−２は、Ｇの左枠がｘ＝４である場合を表わす。
【００３９】
以上、この発明の実施形態を図面を参照して詳述してきたが、具体的な構成はこの実施形態に限られるものではなく、この発明の要旨を逸脱しない範囲の設計の変更等があってもこの発明に含まれる。
【００４０】
【発明の効果】
この発明によれば、従来技術では抽出が難しい、かすれ、よごれのような雑音や背景に網点などのテクスチャが重畳した多値画像パターンに対して、画像パターンを走査して新しい計数値を用いて、文字列領域を簡易にかつ正しく抽出することができる。
また、この発明によれば、従来の文字認識法が行っていた雑音やデザイン除去の前処理を行わなくてすむだけでなく、雑音やデザイン除去に誤って文字ストロークを削除してしまうことがないため、後段に行われる文字認識を高精度に行うことができるという利点がある。
【００４１】
また、この発明によれば、雑音を含む文字列画像パターンに対して、雑音背景下で認識が難しい、文字幅が小さいかまたは文字の高さが低い句読点や記号などの文字カテゴリに対しても、正しく認識できる利点がある。
また、この発明によれば、雑音の程度に応じて識別関数値のしきい値を文字カテゴリごとに設定できるので、雑音のない文字から雑音のある文字に対しても認識判定は正しくできる利点がある。
また、この発明によれば、雑音を含む文字列画像パターンに対して、新規に文字カテゴリを追加したり、参照パターンを更新する場合に、これら新規作成・追加・更新の処理が容易である。
【図面の簡単な説明】
【図１】本発明による文字認識装置の構成例を示すブロック図である。
【図２】本装置の動作例を示すフローチャートである。
【図３】本発明による入力の多値画像パターンに対する文字列抽出処理の一例を示す説明図である。
【図４】本発明による個別文字抽出処理の一例を示す説明図である。
【図５】本発明による個別文字候補画像パターンの一例を示す説明図である。
【符号の説明】
１……多値画像パターン記憶回路、２……積算回路、
３……文字列抽出回路、４……個別文字抽出回路、
５……個別文字照合回路、６……参照パターン学習回路、
７……画像入出力回路、
２−１……水平方向走査の投影軸、
２−２，２−３……水平方向走査の例、
３−１……垂直方向走査の投影軸、
３−２，３−３……垂直方向走査の例、
４−１，４−２……個別文字候補画像パターンの例
[0001]
[Technical field of the invention]
The present invention relates to a character recognition method and apparatus for multi-valued input images, in particular document images degraded by noise such as faint or crushed strokes, document images with design treatments such as striped patterns, and document images in which reversed-out (white-on-dark) characters are mixed.
[0002]
[Prior art]
Conventionally, character recognition devices such as kanji OCR (Optical Character Reader) systems handle an input multi-valued document image pattern in one of two known ways: (1) binarizing it by thresholding and treating it as a binary image, or (2) extracting character features directly from the multi-valued image.
[0003]
In method (1), the binarized image is scanned horizontally or vertically, the number of black or white pixels on each scan line is projected, and the resulting projection distribution is used to extract character string regions and individual character regions. Then, for each segmented character, recognition features are extracted as a vector, a discriminant function such as similarity or dissimilarity is computed against the standard pattern vector of each category in a standard dictionary prepared in advance, and the most similar character or graphic category is taken as the recognition result. For a document image with black characters on a background of white pixels, this method can determine the regions of individual characters from the projection distribution. However, for design-processed characters such as those seen in advertisements, or characters written over photographs, the projection distributions of the background and character regions are similar, so individual character regions cannot be extracted reliably.
[0004]
A method is also known that, for binary images, extracts character regions from design-processed images using a projection measure that is robust to design processing (for example, Sawaki and Hagita, Japanese Patent Application No. 7-248147). Because this method takes a binary image as input, it cannot be applied to multi-valued images. Moreover, if the multi-valued image is binarized before this method is applied, the thresholding degrades the image and the character regions are not segmented properly.
[0005]
In method (2), on the other hand, a known technique observes the density distribution of the multi-valued image, extracts its ridge lines as the core lines of the characters, and compares them with the standard core-line shapes of each category prepared in advance in the dictionary. However, when noise such as design processing is present, design elements may be mistakenly extracted as character strokes, or character strokes may be missed.
[0006]
[Problems to be solved by the invention]
As described above, conventional character recognition techniques, including OCR, have lacked well-established means of correctly recognizing character categories in character string images with design-processed backgrounds, with faint or crushed strokes, or with reversed-out characters mixed in, particularly when such images are multi-valued.
[0007]
In view of these problems, an object of the present invention is to achieve more accurate character recognition than conventional methods by providing two methods. The first extracts character string regions and candidate images of individual characters using count values that reflect the difference between character string regions and background regions, even for character strings in multi-valued images that are affected by noise such as faintness or dirt or that have been design-processed. The second uses these count values to estimate the degree of noise in each individual character candidate image and varies the threshold of the discriminant function used for the recognition or rejection decision according to that degree of noise.
[0008]
[Means for Solving the Problems]
To achieve the above object, the present invention separates character regions from background regions in design-processed character string image patterns, such as newspaper headlines, as follows. The input multi-valued image pattern is scanned in a plurality of directions, and on each scan line four products are accumulated: the density of a pixel times the density of the adjacent pixel; the density of a pixel times the value obtained by subtracting the density of the adjacent pixel from a constant; the value obtained by subtracting the density of a pixel from a constant times the density of the adjacent pixel; and the value obtained by subtracting the density of a pixel from a constant times the value obtained by subtracting the density of the adjacent pixel from a constant. The invention is characterized in that character string regions are extracted using these count values.
[0009]
The present invention is further characterized by two methods of extracting individual character candidate regions. In the first, the image pattern of the extracted character string region is scanned in a plurality of directions, and on each scan line four products are accumulated: the density of a pixel times the density of the adjacent pixel; the density of a pixel times the value obtained by subtracting the density of the adjacent pixel from a constant; the value obtained by subtracting the density of a pixel from a constant times the density of the adjacent pixel; and the value obtained by subtracting the density of a pixel from a constant times the value obtained by subtracting the density of the adjacent pixel from a constant. Candidate regions of individual characters are extracted using these count values. In the second, the remaining unprocessed individual character candidate regions are extracted based on the positions and recognition information of the character categories in the string that have already been determined by matching.
[0010]
Further, when an individual candidate image pattern is matched against the reference patterns in the dictionary, a discriminant function value (similarity) is computed with a known discriminant function, and the recognition or rejection decision is made from this value and a threshold determined by the degree of noise in the candidate image pattern. In this way the correct character categories and their positions are selected from among many candidate image patterns.
In addition, the reference patterns in the dictionary and the thresholds for the recognition or rejection decision can be newly created, added, or updated based on the input multi-valued image pattern.
Thus, the present invention operates on image patterns of documents and the like after preprocessing of the input multi-valued image pattern, such as skew correction and shading correction of the whole document image. It extracts the character string regions in the document and recognizes the character categories contained in each string by matching against the reference patterns in the dictionary. Image patterns of misrecognized or unlearned characters are added to the reference patterns in the dictionary, or used to update them.
[0011]
[Embodiments of the invention]
Hereinafter, an embodiment of the present invention will be described with reference to the drawings.
FIG. 1 is a block diagram showing the configuration of a character recognition apparatus to which the method of the present invention is applied. 1 is a multi-valued image pattern storage circuit, 2 is an integration circuit, 3 is a character string extraction circuit, 4 is an individual character extraction circuit, 5 is an individual character matching circuit, 6 is a reference pattern learning circuit, and 7 is an image input/output circuit.
FIG. 2 is a flowchart showing an operation example of the character recognition device.
FIG. 3 is an explanatory diagram illustrating an operation example of the integrating circuit 2. FIG. 4 is an explanatory diagram illustrating an operation example of the character string extraction circuit 3. FIG. 5 is an explanatory diagram illustrating an operation example of the individual character extraction circuit 4.
[0012]
The multi-valued image pattern storage circuit 1 stores the multi-valued image pattern to be recognized. The multi-valued image pattern consists of N pixels; for example, each pixel density is a decimal value between 0.0 and 1.0, where 0.0 represents white and 1.0 represents black.
The integration circuit 2 receives the multi-valued image pattern of N pixels read from the multi-valued image pattern storage circuit 1 and scans it from a plurality of directions. To distinguish character string regions from background regions, it accumulates and outputs four products on each scan line: the density of a pixel times the density of the adjacent pixel; the density of a pixel times the value obtained by subtracting the density of the adjacent pixel from a constant; the value obtained by subtracting the density of a pixel from a constant times the density of the adjacent pixel; and the value obtained by subtracting the density of a pixel from a constant times the value obtained by subtracting the density of the adjacent pixel from a constant.
[0013]
The character string extraction circuit 3 distinguishes character string regions from background regions based on the accumulated sums of products of pixel densities on each scan line obtained by the integration circuit 2, and outputs the image pattern of each character string region in the input image.
The individual character extraction circuit 4 receives the image pattern of the character string region obtained by the character string extraction circuit 3, selects candidate images containing individual characters to be matched by the individual character matching circuit 5, and outputs them.
The individual character matching circuit 5 receives the individual character candidate image pattern selected by the individual character extraction circuit 4 and matches it against the reference patterns in the dictionary using a predetermined discriminant function. It compares the discriminant function value of each character category with a threshold determined by the degree of noise in the candidate image pattern, decides whether a recognized character category is present in the candidate image pattern, and, if so, outputs these results.
[0014]
The reference pattern learning circuit 6 receives the image information output from the image input/output circuit 7, the multi-valued image pattern in the multi-valued image pattern storage circuit 1, and the dictionary reference patterns in the individual character matching circuit 5. It creates new reference patterns or updates existing ones, and adds them to or updates the dictionary of the individual character matching circuit 5.
The image input/output circuit 7 receives recognition result information output from the individual character matching circuit 5, such as character category numbers, discriminant function values, and multi-valued image patterns, displays this information, and allows the image information to be edited with a mouse, keyboard, or the like. The image input/output circuit 7 also sends the reference pattern learning circuit 6 the information needed to create or update reference patterns.
[0015]
With reference to FIG. 1, the process of extracting a character string region from an input multi-valued image pattern will now be described.
First, in step S1 (see FIG. 2), the multi-valued image pattern to be recognized is read from the multi-valued image pattern storage circuit 1 and input to the integration circuit 2. The multi-valued image pattern of Nx horizontal pixels × Ny vertical pixels is scanned in two directions, horizontal and vertical. At each pixel on each scan line, the following are counted: the product a of the pixel's density and the adjacent pixel's density; the product b of the pixel's density and the value obtained by subtracting the adjacent pixel's density from a constant; the product c of the value obtained by subtracting the pixel's density from a constant and the adjacent pixel's density; the product e of the value obtained by subtracting the pixel's density from a constant and the value obtained by subtracting the adjacent pixel's density from a constant; and the number of scanned pixels n.
[0016]
In addition, the sum T and sum of squares T2 of the pixel values from the pixel one before the scan start pixel to the pixel one before the scan end pixel are computed, as are the sum X and sum of squares X2 of the pixel values from the scan start pixel to the scan end pixel. Here, for example, pixel values lie between 0.0 and 1.0, with a black point having pixel value 1.0 and a white point having pixel value 0.0. The constant is, for example, 1.0. The values of a, b, c, and e obtained at all pixels on each scan line are summed to give ap, bp, cp, and ep, respectively.
[0017]
On each scan line, the count values ap, bp, cp, and ep are used to compute the value p given by
[Equation 1]
$$p = \frac{a_p e_p - b_p c_p}{\sqrt{n T_2 - T^2}\,\sqrt{n X_2 - X^2}}$$
The value of p is larger in character regions than in background regions, whether the pixel values in the character region vary less than in the background (for example, plain characters) or more (for example, textured characters). Owing to the normalizing term in the denominator, p lies in the range −1 ≤ p ≤ 1.
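The counts and the p value for one scan line can be sketched as follows. This is a minimal illustration, not the circuit itself: the function names are hypothetical, "pixel" is read as the current pixel and "adjacent pixel" as the one before it (the reading that agrees with the sums X and T), and returning 0.0 for a perfectly flat line, where the denominator vanishes, is an assumption the text does not address.

```python
import math

def scan_line_counts(line, const=1.0):
    """Accumulate the counts described above for one scan line.

    `line` holds densities in [0, 1]. Each pair is (pixel, adjacent pixel),
    with the adjacent pixel taken as the preceding one (an assumption).
    """
    prev = line[:-1]   # one before scan start .. one before scan end
    cur = line[1:]     # scan start .. scan end
    pairs = list(zip(cur, prev))
    return {
        "n": len(pairs),
        "ap": sum(x * t for x, t in pairs),
        "bp": sum(x * (const - t) for x, t in pairs),
        "cp": sum((const - x) * t for x, t in pairs),
        "ep": sum((const - x) * (const - t) for x, t in pairs),
        "T": sum(prev), "T2": sum(t * t for t in prev),
        "X": sum(cur), "X2": sum(x * x for x in cur),
    }

def p_value(k):
    """p = (ap*ep - bp*cp) / (sqrt(n*T2 - T^2) * sqrt(n*X2 - X^2))."""
    va = k["n"] * k["T2"] - k["T"] ** 2
    vb = k["n"] * k["X2"] - k["X"] ** 2
    if va <= 1e-12 or vb <= 1e-12:
        return 0.0  # flat line: denominator is zero (assumed convention)
    return (k["ap"] * k["ep"] - k["bp"] * k["cp"]) / (math.sqrt(va) * math.sqrt(vb))
```

With the constant set to 1.0, ap + bp + cp + ep equals n, so p is the correlation coefficient between each pixel and its neighbor, which is why it stays within −1 and 1.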
[0018]
This p value is projected onto the projection axis corresponding to each scanning direction. If hy denotes the p value obtained by the horizontal scan at position y, horizontal scanning yields the projection distribution Py = (h1, h2, ..., hy, ..., hNy). If vx denotes the p value obtained by the vertical scan at position x, vertical scanning yields the projection distribution Px = (v1, v2, ..., vx, ..., vNx). Counts can also be taken in other scanning directions.
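A minimal sketch of building the two projection distributions from an image stored as a list of rows. The names `p_value` and `projections` are hypothetical, the adjacent pixel is assumed to be the preceding one on the scan line, and flat lines are assigned p = 0.0 by assumption.

```python
import math

def p_value(line, const=1.0):
    """p for one scan line, from the four accumulated products."""
    prev, cur = line[:-1], line[1:]
    n = len(cur)
    ap = sum(x * t for x, t in zip(cur, prev))
    bp = sum(x * (const - t) for x, t in zip(cur, prev))
    cp = sum((const - x) * t for x, t in zip(cur, prev))
    ep = sum((const - x) * (const - t) for x, t in zip(cur, prev))
    va = n * sum(t * t for t in prev) - sum(prev) ** 2
    vb = n * sum(x * x for x in cur) - sum(cur) ** 2
    if va <= 1e-12 or vb <= 1e-12:
        return 0.0  # assumed convention for a flat line
    return (ap * ep - bp * cp) / math.sqrt(va * vb)

def projections(image):
    """Return (Py, Px): one p value per row and one per column."""
    Py = [p_value(list(row)) for row in image]        # horizontal scans
    Px = [p_value(list(col)) for col in zip(*image)]  # vertical scans
    return Py, Px
```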
[0019]
Next, processing proceeds to step S2. The character string extraction circuit 3 receives the projection distributions Py and Px obtained from the integration circuit 2. If Nx > Ny, the projection distribution Py obtained by horizontal scanning is divided into N equal sections, where N is preset, to give a new projection distribution Q = (q1, q2, ..., qj, ..., qN). Otherwise, the projection distribution Px obtained by vertical scanning is divided into N equal sections in the same way to give a new projection distribution Q = (q1, q2, ..., qj, ..., qN).
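The text does not say how the values inside each of the N sections are combined. The sketch below assumes a simple mean per section, and repeats the nearest value when the distribution is shorter than N; both choices, and the function name `resample`, are illustrative assumptions.

```python
def resample(P, N):
    """Divide projection distribution P into N equal sections (mean per section)."""
    L = len(P)
    Q = []
    for j in range(N):
        lo = j * L // N
        hi = max(lo + 1, (j + 1) * L // N)  # keep every section non-empty
        section = P[lo:hi]
        Q.append(sum(section) / len(section))
    return Q
```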
[0020]
Now, the case of Nx> Ny will be described.
The new projection distribution Q is compared with a preset threshold TQ, and the height Wy (= je − js + 1) of the character string is obtained from the continuous interval [js, je] (js ≤ j ≤ je) of j for which qj > TQ.
The image region bounded by this height of Wy pixels and the width of Nx pixels is output as the character string image pattern W.
If the text consists of multiple lines, multiple continuous intervals are produced, so multiple character string image patterns W are output.
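The interval search above amounts to finding maximal runs of positions whose value exceeds the threshold. A minimal sketch, with a hypothetical function name and 1-based positions to match the text:

```python
def runs_above(Q, threshold):
    """Return 1-based inclusive intervals (js, je) where q_j > threshold."""
    runs, start = [], None
    for j, q in enumerate(Q, start=1):
        if q > threshold and start is None:
            start = j                      # a run begins
        elif q <= threshold and start is not None:
            runs.append((start, j - 1))    # the run ended at the previous position
            start = None
    if start is not None:
        runs.append((start, len(Q)))       # run reaches the end of Q
    return runs

def heights(runs):
    """Character string height Wy = je - js + 1 for each interval."""
    return [je - js + 1 for js, je in runs]
```

Each returned interval corresponds to one line of text, so a multi-line input yields several intervals.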
[0021]
Next, processing proceeds to step S3. The individual character extraction circuit 4 normalizes the character string image pattern W of Nx × Wy pixels obtained by the character string extraction circuit 3 so that its height becomes N pixels, yielding the normalized character string image pattern WN.
For this normalized character string image pattern WN, the projection distribution Px of p values is obtained by the vertical scanning described in the processing procedure of the integration circuit 2. Px is compared with a preset threshold Tx, and the continuous intervals [xs, xe] of x (xs ≤ x ≤ xe) for which vx > Tx are obtained.
[0022]
For the k continuous intervals [xs1, xe1], [xs2, xe2], ..., [xsk, xek], intervals i and j are found that satisfy z1 ≤ xej − xsi + 1 ≤ z2 (i ≤ j), where z1 to z2 pixels is the range of character widths stored with the reference patterns, and the character width Wx (= xej − xsi + 1) of the individual character candidate image pattern G is obtained. This yields multiple individual character candidate image patterns G of Wx × N pixels.
If xei − xsi + 1 > z2 for an interval i, the left edge of the candidate image pattern G is shifted one pixel at a time from position xsi until its right edge reaches position xei, creating multiple individual character candidate image patterns G.
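The two rules above can be sketched as a function over the list of intervals. The function name is hypothetical, and using z2 as the width of the sliding window for over-wide intervals is an assumption taken from the worked example in paragraph 0038 of the original, where the window is 11 pixels wide with z2 = 11.

```python
def candidate_spans(intervals, z1, z2):
    """Return inclusive (left, right) spans of individual character candidates.

    `intervals` is a list of (xs, xe) pairs in left-to-right order.
    """
    spans = []
    k = len(intervals)
    for i in range(k):
        xs_i, xe_i = intervals[i]
        if xe_i - xs_i + 1 > z2:
            # Interval wider than any stored character: slide a window of
            # width z2 (assumed) one pixel at a time until it reaches xe_i.
            for left in range(xs_i, xe_i - z2 + 2):
                spans.append((left, left + z2 - 1))
            continue
        for j in range(i, k):
            xe_j = intervals[j][1]
            if z1 <= xe_j - xs_i + 1 <= z2:
                spans.append((xs_i, xe_j))  # merge intervals i..j
    return spans
```

For the single interval [3, 20] with z1 = 2 and z2 = 11, this produces eight spans, from (3, 13) to (10, 20).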
[0023]
In a second method of obtaining individual character candidate images, the unprocessed intervals [xt, xu] of the character string image pattern W are considered, that is, the intervals outside the individual character candidate image patterns G that the individual character matching circuit 5 has judged to be "recognized as some character category". Intervals satisfying z1 ≤ xu − xt ≤ z2 are found, and individual character candidate image patterns of width Wx (= xu − xt + 1) × N pixels are created. If several intervals satisfy this condition, several individual character candidate image patterns are output.
It is also possible to estimate the character pitch (spacing) from the intervals of the candidate image patterns G that the individual character matching circuit 5 has judged to be "recognized as some character category", and to extract individual character candidate image patterns G from the remaining regions.
[0024]
Until now, a case where a horizontally long multi-valued image pattern is input has been mainly described, but a vertically long multi-valued image pattern can be similarly processed by reversing the processing in the x and y directions.
[0025]
Next, the processing of the individual character matching circuit 5 will be described.
Processing proceeds to step S4. First, the individual character candidate image pattern G of Wx × N pixels is input, and a normalized multi-valued character pattern X = (x1, x2, ..., xi, ..., xn) of n (= N × N) pixels is obtained. In doing so, the individual character candidate image pattern G is translated so that it lies at the center of the normalized multi-valued character pattern X. The sum H of the xi of this normalized multi-valued character pattern X is then obtained from the following equation.
[Math. 2]
$$H = \sum_{i=1}^{n} x_i$$
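A minimal sketch of this step for a candidate that is no wider than N. The function name is hypothetical, and both padding with 0.0 (white) and restricting the sketch to Wx ≤ N are assumptions; the text does not say how the pattern is padded or how wider candidates are scaled.

```python
def normalize_candidate(G, N, pad=0.0):
    """Center a candidate G (N rows of Wx <= N densities) in an N x N pattern.

    Returns (X, H): the flattened pattern of n = N * N values and their sum.
    """
    Wx = len(G[0])
    if len(G) != N or Wx > N:
        raise ValueError("sketch assumes N rows and Wx <= N")
    left = (N - Wx) // 2                  # horizontal shift that centers G
    right = N - Wx - left
    rows = [[pad] * left + list(row) + [pad] * right for row in G]
    X = [v for row in rows for v in row]  # x_1 .. x_n in row-major order
    H = sum(X)
    return X, H
```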
[0026]
Next, a known discriminant function is used to compare this normalized multi-valued character pattern X with a dictionary reference pattern T = (t1, t2, ..., ti, ..., tn). One example is the multiple similarity measure (Iijima, "Pattern Recognition Theory", Morikita Publishing Co., Ltd.).
[0027]
For the reference pattern of each character category m, the recognition or rejection decision is made as follows, using the similarity S(X, Tm) to X and the reference pattern's threshold Sm, which is determined by the value of H above. The table relating H to the threshold Sm is learned in advance by the reference pattern learning circuit 6.
(Condition 1) If S(X, Tm) > Sm, the pattern is judged to have been recognized as character category m.
(Condition 2) Otherwise, the pattern is judged not to be character category m.
[0028]
If, over all reference patterns, condition 1 is satisfied for more than one category, the category with the highest similarity is output as the recognized category. If no category satisfies condition 1, the individual character candidate image pattern G is rejected as having no recognized category.
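The decision rule can be sketched as below. Two things here are stand-ins rather than anything the text specifies: cosine similarity is used in place of the multiple similarity measure, and the H-to-Sm relation is stored as a hypothetical per-category list of (upper bound on H, Sm) pairs.

```python
import math

def cosine(x, t):
    """Stand-in similarity; the text names multiple similarity instead."""
    den = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in t))
    return sum(a * b for a, b in zip(x, t)) / den if den > 0 else 0.0

def threshold_for(table, H):
    """Look up Sm for a given H in a list of (h_upper, Sm) sorted by h_upper."""
    for h_upper, sm in table:
        if H <= h_upper:
            return sm
    return table[-1][1]

def recognize(X, dictionary):
    """Return (category, similarity), or None when the candidate is rejected.

    `dictionary` maps category -> (reference pattern Tm, threshold table).
    """
    H = sum(X)
    best = None
    for m, (Tm, table) in dictionary.items():
        s = cosine(X, Tm)
        # Condition 1: S(X, Tm) > Sm; keep the most similar such category.
        if s > threshold_for(table, H) and (best is None or s > best[1]):
            best = (m, s)
    return best
```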
[0029]
Finally, a process of adding / updating a reference pattern of the dictionary of the individual character matching circuit 5 using the reference pattern learning circuit 6 and the image input / output circuit 7 will be described.
The image input/output device 7 receives classification result information output from the individual character matching circuit 5, such as recognized character category numbers, discriminant function values, and multi-valued image patterns, displays this information, and allows the image information to be edited with a mouse, keyboard, or the like.
[0030]
When the recognition result contains a misrecognition, or when a reference pattern for a new character category is to be created, the image input/output device 7 is used to edit and create, from the input multi-valued image pattern, the multi-valued reference pattern Tm of the recognition category m to be added or updated.
The reference pattern learning circuit 6 uses the created multi-valued reference pattern and the dictionary reference patterns of the individual character matching circuit 5 to learn the recognition or rejection threshold Sm for this new normalized multi-valued reference pattern Tm of n (= N × N) pixels.
[0031]
First, a multi-valued random noise pattern R = (r1, r2, ..., ri, ..., rn) composed of n (= N × N) pixels is created and superimposed on each multi-valued reference pattern T in the dictionary to create an input image pattern X for learning. That is, the pixel value at position i is xi = ti + ri.
Next, the similarity S(X, Tm) of each input image pattern X to the multi-valued reference pattern Tm of the new category m is calculated, the maximum similarity obtained for the categories other than m among all the reference patterns is set as Sm, and the multi-valued reference pattern Tm of the new category m and its threshold Sm are stored in the dictionary of the individual character matching circuit 5.
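One possible reading of this threshold learning is sketched below in Python. Several points are assumptions made only for illustration: the similarity is taken to be the cosine between the two pixel vectors (the similarity actually used is defined elsewhere in the description), the noise is drawn uniformly from [0, noise_amp], and the names `similarity` and `learn_threshold` are hypothetical.

```python
import math
import random
from typing import Dict, List


def similarity(x: List[float], t: List[float]) -> float:
    """Assumed similarity: cosine between x and t (an illustrative stand-in)."""
    dot = sum(a * b for a, b in zip(x, t))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in t))
    return dot / norm if norm > 0 else 0.0


def learn_threshold(m: str, dictionary: Dict[str, List[float]],
                    noise_amp: float = 0.2, seed: int = 0) -> float:
    """Return Sm: the largest S(X, Tm) over noisy patterns X of categories other than m."""
    rng = random.Random(seed)
    t_m = dictionary[m]
    best = -math.inf
    for k, t_k in dictionary.items():
        if k == m:
            continue
        # Learning pattern: x_i = t_i + r_i, with r_i a random noise value.
        x = [t + rng.uniform(0.0, noise_amp) for t in t_k]
        best = max(best, similarity(x, t_m))
    return best


# Hypothetical 2 x 2 reference patterns, flattened to n = 4 pixels.
refs = {"a": [1.0, 0.0, 0.0, 1.0], "b": [0.0, 1.0, 1.0, 0.0], "c": [1.0, 1.0, 0.0, 0.0]}
print(learn_threshold("a", refs))
```

With this choice, an input must score higher against Tm than any noisy pattern of another category did during learning before Condition 1 accepts it as category m.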
[0032]
As a specific operation example of the integrating circuit 2, which is the main part of the present invention, the case of a multi-valued image pattern composed of 20 horizontal pixels × 14 vertical pixels, as shown in the figure, will be described.
This figure shows an input image pattern in which halftone dot design processing has been applied to the background of a character string image pattern in which "character" is written.
This multi-valued image pattern is read and input to the integrating circuit 2. The input multi-valued image pattern is scanned in two directions, horizontal and vertical, and the p-value given by Expression 1 is projected onto the projection axis 2-1 and the projection axis 2-2, respectively.
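As a rough illustration of the counting performed along one scanning line, the following Python sketch accumulates the four products and the sums from a list of pixel densities and then forms the p-value. It assumes densities scaled to [0, 1], a constant of 1, the previously scanned pixel as the "adjacent pixel", the number of pixel pairs as the multiplier in the denominator, and a result of 0 when the denominator vanishes. None of these choices is fixed by the passage above, and the function names are hypothetical.

```python
import math
from typing import Dict, List


def scan_line_counts(line: List[float], c: float = 1.0) -> Dict[str, float]:
    """Accumulate ap, bp, cp, ep, T, T2, X, X2 over one scanning line."""
    s = {k: 0.0 for k in ("ap", "bp", "cp", "ep", "T", "T2", "X", "X2")}
    for prev, cur in zip(line, line[1:]):  # (adjacent pixel, current pixel)
        s["ap"] += cur * prev
        s["bp"] += cur * (c - prev)
        s["cp"] += (c - cur) * prev
        s["ep"] += (c - cur) * (c - prev)
        s["T"] += prev
        s["T2"] += prev * prev
        s["X"] += cur
        s["X2"] += cur * cur
    s["pairs"] = float(len(line) - 1)  # e.g. 19 pairs for a 20-pixel line
    return s


def p_value(line: List[float], c: float = 1.0) -> float:
    """p = (ap*ep - bp*cp) / (sqrt(k*T2 - T^2) * sqrt(k*X2 - X^2)), k = number of pairs."""
    s = scan_line_counts(line, c)
    k = s["pairs"]
    var_t = max(k * s["T2"] - s["T"] ** 2, 0.0)
    var_x = max(k * s["X2"] - s["X"] ** 2, 0.0)
    den = math.sqrt(var_t) * math.sqrt(var_x)
    if den == 0.0:
        return 0.0  # assumption: a flat line carries no evidence either way
    return (s["ap"] * s["ep"] - s["bp"] * s["cp"]) / den


print(p_value([0, 1, 0, 1, 0, 1]))  # alternating, dot-like line
print(p_value([0, 0, 0, 1, 1, 1]))  # one long run, stroke-like line
```

Under these assumptions an alternating line gives a p-value near −1, while a line with one long run gives a positive value.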
[0033]
As an example of the scanning in FIG. 3, the horizontal scan 2-3 at y = 2 counts ap = 3.34, bp = 4.86, cp = 5.36, ep = 5.44, T = 8.70, T2 = 5.31, X = 8.20, X2 = 4.76, and n = 19, so the p-value at y = 2 is
[Equation 2]
$$p = \frac{3.34 \times 5.44 - 4.86 \times 5.36}{\sqrt{19 \times 5.31 - 8.70^2}\,\sqrt{19 \times 4.76 - 8.20^2}} = -0.326$$
[0034]
The horizontal scan 2-4 at y = 6 counts ap = 4.86, bp = 4.24, cp = 4.74, ep = 5.16, T = 9.60, T2 = 6.36, X = 9.10, X2 = 5.81, and n = 19, so the p-value at y = 6 is
[Equation 3]
$$p = \frac{4.86 \times 5.16 - 4.24 \times 4.74}{\sqrt{19 \times 6.36 - 9.60^2}\,\sqrt{19 \times 5.81 - 9.10^2}} = 0.177$$
[0035]
This result shows that the p-value obtained by scanning the background portion composed of halftone dots is smaller than the p-value obtained by scanning a character line portion, so the p-value is an effective measure for separating the two.
By horizontal scanning from y = 1 to y = 14, a projection distribution Py is obtained.
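The two p-values worked out above for y = 2 and y = 6 can be checked numerically with a few lines of Python. The helper name `p_from_counts` is hypothetical; the inputs are the counts quoted in the text.

```python
import math


def p_from_counts(ap: float, bp: float, cp: float, ep: float,
                  T: float, T2: float, X: float, X2: float, n: int) -> float:
    """Evaluate p from the counted sums, using n as the multiplier as in the worked example."""
    num = ap * ep - bp * cp
    den = math.sqrt(n * T2 - T * T) * math.sqrt(n * X2 - X * X)
    return num / den


p_y2 = p_from_counts(3.34, 4.86, 5.36, 5.44, 8.70, 5.31, 8.20, 4.76, 19)
p_y6 = p_from_counts(4.86, 4.24, 4.74, 5.16, 9.60, 6.36, 9.10, 5.81, 19)
print(round(p_y2, 3), round(p_y6, 3))  # -0.326 0.177
```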
[0036]
The character string extraction circuit 3 receives the projection distribution Py obtained by horizontal scanning from y = 1 to y = 14. Since Nx > Ny is satisfied, the projection distribution Py is divided equally into the preset number N of sections to obtain a new projection distribution Q = (q1, q2, ..., qj, ..., qN). Here, assuming N = 11 pixels, the section of the projection distribution Q whose values are at or above the preset p-value threshold TQ = −0.5 runs from y = 2 to y = 12, and as a result a character string image pattern W of 20 horizontal pixels × 11 vertical pixels is extracted (see FIG. 4).
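The search for a continuous section at or above the threshold can be sketched as follows. For simplicity the sketch is applied directly to a 14-value distribution and omits the division into N sections. Apart from the values at y = 2 and y = 6, which are taken from the worked example, the distribution values are hypothetical, as is the function name.

```python
from typing import List, Tuple


def continuous_sections(q: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) runs where q[j] >= threshold."""
    sections = []
    start = None
    for j, v in enumerate(q, start=1):
        if v >= threshold:
            if start is None:
                start = j  # a new run begins here
        elif start is not None:
            sections.append((start, j - 1))  # the run ended at the previous index
            start = None
    if start is not None:
        sections.append((start, len(q)))  # run reaches the end of the distribution
    return sections


# Hypothetical projection distribution for y = 1 .. 14.
py = [-0.9, -0.326, 0.1, 0.2, 0.15, 0.177, 0.1, 0.2, 0.1, 0.05, 0.0, -0.2, -0.7, -0.8]
print(continuous_sections(py, -0.5))  # [(2, 12)]
```

The sketch uses "at or above" to match the description; the claims word the same test as "larger than" the threshold.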
[0037]
The individual character extraction circuit 4 normalizes the character string image pattern W of 20 horizontal pixels × 11 vertical pixels obtained by the character string extraction circuit 3 so that its height becomes N = 11 pixels, and obtains a character string image pattern WN with a height of 11 pixels. In this example, since Wy = N, the character string image pattern W and the character string image pattern WN are the same image pattern.
For the normalized character string image pattern WN, a projection distribution Px of p-values is obtained by vertical scanning from x = 1 to x = 20, following the processing procedure of the integrating circuit 2. For this projection distribution Px, with the preset threshold Tx = −0.8, the continuous section of x satisfying vj > Tx is [xs = 3, xe = 20].
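The height normalization in this step can be sketched as below. The target size follows the claims (horizontal Nx·N/Wy pixels × vertical N pixels), but the nearest-neighbour resampling and the function name `normalize_height` are assumptions made for illustration; the resampling method is not specified in this passage.

```python
from typing import List


def normalize_height(w: List[List[float]], n: int) -> List[List[float]]:
    """Resample a Wy-row x Nx-column pattern to n rows, keeping the aspect ratio."""
    wy = len(w)
    nx = len(w[0])
    out_w = max(1, round(nx * n / wy))  # horizontal size Nx * N / Wy
    out = []
    for y in range(n):
        src_y = min(wy - 1, y * wy // n)  # nearest source row (assumed scheme)
        row = [w[src_y][min(nx - 1, x * nx // out_w)] for x in range(out_w)]
        out.append(row)
    return out


# A 20 x 11 pattern with N = 11 is returned unchanged, as in the example above.
w = [[float(x + 20 * y) for x in range(20)] for y in range(11)]
print(normalize_height(w, 11) == w)  # True
```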
[0038]
Now, assuming that the character width in the reference patterns is specified to lie in the range from z1 = 2 pixels to z2 = 11 pixels, xe − xs + 1 > z2 is satisfied. Therefore, eight individual character candidate image patterns G are created by shifting the left frame of G one pixel at a time from xs = 3 until the right frame of G reaches xe = 20. In FIG. 5, 4-1 shows the case where the left frame of G is at x = 3, and 4-2 shows the case where the left frame of G is at x = 4.
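The generation of candidate frames by shifting one pixel at a time can be sketched as follows. The sketch assumes that each candidate frame is z2 = 11 pixels wide, which reproduces the eight candidates of this example; the function name is hypothetical and the lower bound z1 is not used.

```python
from typing import List, Tuple


def candidate_frames(xs: int, xe: int, width: int) -> List[Tuple[int, int]]:
    """Return inclusive (left, right) frames of the given width inside [xs, xe]."""
    if xe - xs + 1 <= width:
        # The section already fits within one frame, so it is the only candidate.
        return [(xs, xe)]
    # Shift the left frame by one pixel until the right frame reaches xe.
    return [(left, left + width - 1) for left in range(xs, xe - width + 2)]


frames = candidate_frames(3, 20, 11)
print(len(frames), frames[0], frames[-1])  # 8 (3, 13) (10, 20)
```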
[0039]
The embodiment of the present invention has been described above in detail with reference to the drawings. However, the specific configuration is not limited to this embodiment, and design changes and the like within a range not departing from the gist of the present invention are also included in the present invention.
[0040]
[Effect of the invention]
According to the present invention, even for multi-valued image patterns from which extraction is difficult with the conventional technology, such as patterns with noise or blur and patterns in which a texture such as halftone dots is superimposed on the background, the character string region can be extracted easily and correctly by scanning the image pattern and using the new count values.
Further, according to the present invention, the preprocessing for noise and design removal performed in conventional character recognition methods is unnecessary, and there is no risk that character strokes are accidentally deleted during noise or design removal. Therefore, there is an advantage that the character recognition performed in the subsequent stage can be carried out with high accuracy.
[0041]
Further, according to the present invention, for character string image patterns containing noise, there is an advantage that even character categories that are difficult to recognize against a noisy background, such as punctuation marks and symbols with a small character width or a low character height, can be recognized correctly.
Further, according to the present invention, since the threshold of the discrimination function value can be set for each character category according to the degree of noise, there is an advantage that the recognition decision can be made correctly for characters ranging from those without noise to those with noise.
Further, according to the present invention, for character string image patterns containing noise, reference patterns can easily be newly created, added, or updated when a new character category is added or a reference pattern is updated.
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating a configuration example of a character recognition device according to the present invention.
FIG. 2 is a flowchart illustrating an operation example of the present apparatus.
FIG. 3 is an explanatory diagram showing an example of a character string extraction process for an input multi-valued image pattern according to the present invention.
FIG. 4 is an explanatory diagram showing an example of an individual character extraction process according to the present invention.
FIG. 5 is an explanatory diagram showing an example of an individual character candidate image pattern according to the present invention.
[Explanation of symbols]
1 ... Multi-valued image pattern storage circuit
2 ... Integration circuit
3 ... Character string extraction circuit
4 ... Individual character extraction circuit
5 ... Individual character matching circuit
6 ... Reference pattern learning circuit
7 ... Image input/output circuit
2-1 ... Projection axis for horizontal scanning
2-2, 2-3 ... Examples of horizontal scanning
3-1 ... Projection axis for vertical scanning
3-2, 3-3 ... Examples of vertical scanning
4-1, 4-2 ... Examples of individual character candidate image patterns

Claims

An integration step of scanning, in a plurality of directions, a multi-valued image pattern stored in predetermined multi-valued image pattern storage means, calculating, for each pixel on each scanning line, the product of the density of the pixel and the density of an adjacent pixel, the product of the density of the pixel and the value obtained by subtracting the density of the adjacent pixel from a certain constant, the product of the value obtained by subtracting the density of the pixel from a certain constant and the density of the adjacent pixel, and the product of the value obtained by subtracting the density of the pixel from a certain constant and the value obtained by subtracting the density of the adjacent pixel from a certain constant, and obtaining the sum of each of these products;
A character string extraction step of extracting an image pattern of a character string region from the multi-valued image pattern based on the sum of products obtained in the integration step;
From the character string image pattern extracted in the character string extraction step, an individual character extraction step of extracting a candidate image pattern that is an area of each individual character constituting the character string image pattern,
For a candidate image pattern of each individual character extracted in the individual character extraction step, from a predetermined dictionary, an individual character matching step of searching for a recognition character pattern corresponding to the candidate image pattern,
An image output step of outputting a recognized character pattern found in the individual character matching step.

The character recognition method according to claim 1, further comprising:
An image input step of inputting an arbitrary multi-valued image pattern and editing and generating a multi-valued reference pattern based on the multi-valued image pattern; and
A reference pattern learning step of updating a reference pattern in the dictionary based on the multi-valued reference pattern generated in the image input step and the reference pattern stored in the dictionary.

The character recognition method according to claim 1 or 2,
The integration step includes:
Reading a multi-level image pattern from the multi-level image pattern storage means,
Scan the read multi-value image pattern in the horizontal and vertical directions,
For each pixel on each scanning line, the product a of the density of the pixel and the density of the adjacent pixel, the product b of the density of the pixel and the value obtained by subtracting the density of the adjacent pixel from a certain constant, the product c of the value obtained by subtracting the density of the pixel from a certain constant and the density of the adjacent pixel, the product e of the value obtained by subtracting the density of the pixel from a certain constant and the value obtained by subtracting the density of the adjacent pixel from a certain constant, and the number of pixels n are obtained,
The respective sums ap, bp, cp, ep of a, b, c, e obtained for all pixels on each scanning line are obtained,
For each scanning line, the sum T and the sum of squares T2 of the pixel values from the pixel immediately before the scan start pixel to the pixel immediately before the scan end pixel, and the sum X and the sum of squares X2 of the pixel values from the scan start pixel to the scan end pixel, are obtained,
Based on these values,
$$p = \frac{a_p e_p - b_p c_p}{\sqrt{(n-1)\,T_2 - T^2}\,\sqrt{(n-1)\,X_2 - X^2}}$$
is calculated.

The character recognition method according to claim 3,
The multi-valued image pattern is composed of horizontal Nx pixels × vertical Ny pixels,
The character string extraction process includes:
If Nx> Ny,
The value of p calculated in the horizontal scanning in the integration process is projected on a vertical projection axis to obtain a projection distribution Py,
By dividing the projection distribution Py into N equal parts, a projection distribution Q is generated,
Regarding the projection distribution Q, a continuous section whose component is larger than a predetermined threshold value TQ is obtained,
A multi-valued image pattern in the continuous section is defined as a character string image pattern W,
On the other hand, if Ny> Nx,
The value of p calculated in the vertical scanning in the integration process is projected on a horizontal projection axis to obtain a projection distribution Px,
By dividing the projection distribution Px into N equal parts, a projection distribution Q is generated,
Regarding the projection distribution Q, a continuous section whose component is larger than a predetermined threshold value TQ is obtained,
A multi-valued image pattern in the continuous section is a character string image pattern W.

The character recognition method according to claim 4,
The individual character extraction process includes:
If Nx> Ny,
The character string image pattern W of Nx × Wy pixels is normalized to a normalized character string image pattern WN of horizontal (Nx·N/Wy) pixels × vertical N pixels,
The normalized character string image pattern WN is scanned in the vertical direction,
For each pixel on the scanning line, the product a of the density of the pixel and the density of the adjacent pixel, the product b of the density of the pixel and the value obtained by subtracting the density of the adjacent pixel from a certain constant, the product c of the value obtained by subtracting the density of the pixel from a certain constant and the density of the adjacent pixel, the product e of the value obtained by subtracting the density of the pixel from a certain constant and the value obtained by subtracting the density of the adjacent pixel from a certain constant, and the number of pixels n are obtained,
The respective sums ap, bp, cp, ep of a, b, c, e obtained for all the pixels on the scanning line are obtained,
For the scanning line, the sum T and the sum of squares T2 of the pixel values from the pixel immediately before the scan start pixel to the pixel immediately before the scan end pixel, and the sum X and the sum of squares X2 of the pixel values from the scan start pixel to the scan end pixel, are obtained,
Based on these values,
$$p = \frac{a_p e_p - b_p c_p}{\sqrt{(n-1)\,T_2 - T^2}\,\sqrt{(n-1)\,X_2 - X^2}}$$
is calculated,
The value of p is projected onto a horizontal projection axis to obtain a projection distribution Px,
A continuous section in which the element of the projection distribution Px is larger than a predetermined threshold value Tx is obtained,
If the length of the continuous section is within a predetermined value, the character string image pattern W in the continuous section is set as a candidate image pattern G,
On the other hand, if Ny> Nx,
The character string image pattern W of Ny × Wx pixels is normalized to a normalized character string image pattern WN of horizontal N pixels × vertical (Ny·N/Wx) pixels,
The normalized character string image pattern WN is scanned in the horizontal direction,
For each pixel on the scanning line, the product a of the density of the pixel and the density of the adjacent pixel, the product b of the density of the pixel and the value obtained by subtracting the density of the adjacent pixel from a certain constant, the product c of the value obtained by subtracting the density of the pixel from a certain constant and the density of the adjacent pixel, the product e of the value obtained by subtracting the density of the pixel from a certain constant and the value obtained by subtracting the density of the adjacent pixel from a certain constant, and the number of pixels n are obtained,
The respective sums ap, bp, cp, ep of a, b, c, e obtained for all the pixels on the scanning line are obtained,
For the scanning line, the sum T and the sum of squares T2 of the pixel values from the pixel immediately before the scan start pixel to the pixel immediately before the scan end pixel, and the sum X and the sum of squares X2 of the pixel values from the scan start pixel to the scan end pixel, are obtained,
Based on these values,
$$p = \frac{a_p e_p - b_p c_p}{\sqrt{(n-1)\,T_2 - T^2}\,\sqrt{(n-1)\,X_2 - X^2}}$$
is calculated,
The value of p is projected onto a vertical projection axis to obtain a projection distribution Py,
A continuous section in which the element of the projection distribution Py is larger than a predetermined threshold Ty is obtained,
When the length of the continuous section is within a predetermined value, a character string image pattern W in the continuous section is set as a candidate image pattern G.

The character recognition method according to claim 5,
The individual character matching step includes:
Normalizing the candidate image pattern G into a normalized multi-valued character pattern X of N vertical pixels × N horizontal pixels;
A similarity S (X, Tm) between the normalized multi-valued character pattern X and a reference pattern T stored in a predetermined dictionary is obtained for a plurality of reference patterns T in the dictionary,
Among the plurality of reference patterns T, the reference pattern T whose similarity S(X, Tm) is larger than a predetermined threshold Sm and is the largest is taken as the recognized character pattern for the candidate image pattern G.

Multi-valued image pattern storage means for storing a multi-valued image pattern,
Integrating means for scanning, in a plurality of directions, the multi-valued image pattern stored in the multi-valued image pattern storage means, calculating, for each pixel on each scanning line, the product of the density of the pixel and the density of an adjacent pixel, the product of the density of the pixel and the value obtained by subtracting the density of the adjacent pixel from a certain constant, the product of the value obtained by subtracting the density of the pixel from a certain constant and the density of the adjacent pixel, and the product of the value obtained by subtracting the density of the pixel from a certain constant and the value obtained by subtracting the density of the adjacent pixel from a certain constant, and obtaining the sum of each of these products;
Character string extraction means for extracting an image pattern of a character string area from the multi-valued image pattern based on the product sum determined by the integration means,
Individual character extraction means for extracting, from the character string image pattern extracted by the character string extraction means, a candidate image pattern that is an area of each individual character constituting the character string image pattern,
For a candidate image pattern of each individual character extracted by the individual character extraction unit, from a predetermined dictionary, an individual character matching unit that searches for a recognition character pattern corresponding to the candidate image pattern,
An image output unit for outputting a recognized character pattern found by the individual character matching unit.

The character recognition device according to claim 7, further comprising:
Image input means for inputting an arbitrary multi-valued image pattern and editing and generating a multi-valued reference pattern based on the multi-valued image pattern; and
Reference pattern learning means for updating a reference pattern in the dictionary based on the multi-valued reference pattern generated by the image input means and the reference pattern stored in the dictionary of the individual character matching means.

The character recognition device according to any one of claims 7 and 8,
The integrating means,
Reading a multi-level image pattern from the multi-level image pattern storage means,
Scan the read multi-value image pattern in the horizontal and vertical directions,
For each pixel on each scanning line, the product a of the density of the pixel and the density of the adjacent pixel, the product b of the density of the pixel and the value obtained by subtracting the density of the adjacent pixel from a certain constant, the product c of the value obtained by subtracting the density of the pixel from a certain constant and the density of the adjacent pixel, the product e of the value obtained by subtracting the density of the pixel from a certain constant and the value obtained by subtracting the density of the adjacent pixel from a certain constant, and the number of pixels n are obtained,
The respective sums ap, bp, cp, ep of a, b, c, e obtained for all pixels on each scanning line are obtained,
For each scanning line, the sum T and the sum of squares T2 of the pixel values from the pixel immediately before the scan start pixel to the pixel immediately before the scan end pixel, and the sum X and the sum of squares X2 of the pixel values from the scan start pixel to the scan end pixel, are obtained,
Based on these values,
$$p = \frac{a_p e_p - b_p c_p}{\sqrt{(n-1)\,T_2 - T^2}\,\sqrt{(n-1)\,X_2 - X^2}}$$
is calculated.

The character recognition device according to claim 9,
The multi-valued image pattern is composed of horizontal Nx pixels × vertical Ny pixels,
The character string extracting means,
If Nx> Ny,
Projecting the value of p calculated in the horizontal scanning by the integrating means on a vertical projection axis to obtain a projection distribution Py;
By dividing the projection distribution Py into N equal parts, a projection distribution Q is generated,
Regarding the projection distribution Q, a continuous section whose component is larger than a predetermined threshold value TQ is obtained,
A multi-valued image pattern in the continuous section is defined as a character string image pattern W,
On the other hand, if Ny> Nx,
Projecting the value of p calculated by the integrating means in the vertical scanning onto a horizontal projection axis to obtain a projection distribution Px;
By dividing the projection distribution Px into N equal parts, a projection distribution Q is generated,
Regarding the projection distribution Q, a continuous section whose component is larger than a predetermined threshold value TQ is obtained,
A multi-valued image pattern in the continuous section is a character string image pattern W.

The character recognition device according to claim 10,
The individual character extracting means,
If Nx> Ny,
The character string image pattern W of Nx × Wy pixels is normalized to a normalized character string image pattern WN of horizontal (Nx·N/Wy) pixels × vertical N pixels,
The normalized character string image pattern WN is scanned in the vertical direction,
For each pixel on the scanning line, the product a of the density of the pixel and the density of the adjacent pixel, the product b of the density of the pixel and the value obtained by subtracting the density of the adjacent pixel from a certain constant, the product c of the value obtained by subtracting the density of the pixel from a certain constant and the density of the adjacent pixel, the product e of the value obtained by subtracting the density of the pixel from a certain constant and the value obtained by subtracting the density of the adjacent pixel from a certain constant, and the number of pixels n are obtained,
The respective sums ap, bp, cp, ep of a, b, c, e obtained for all the pixels on the scanning line are obtained,
For the scanning line, the sum T and the sum of squares T2 of the pixel values from the pixel immediately before the scan start pixel to the pixel immediately before the scan end pixel, and the sum X and the sum of squares X2 of the pixel values from the scan start pixel to the scan end pixel, are obtained,
Based on these values,
$$p = \frac{a_p e_p - b_p c_p}{\sqrt{(n-1)\,T_2 - T^2}\,\sqrt{(n-1)\,X_2 - X^2}}$$
is calculated,
The value of p is projected onto a horizontal projection axis to obtain a projection distribution Px,
A continuous section in which the element of the projection distribution Px is larger than a predetermined threshold value Tx is obtained,
If the length of the continuous section is within a predetermined value, the character string image pattern W in the continuous section is set as a candidate image pattern G,
On the other hand, if Ny> Nx,
The character string image pattern W of Ny × Wx pixels is normalized to a normalized character string image pattern WN of horizontal N pixels × vertical (Ny·N/Wx) pixels,
The normalized character string image pattern WN is scanned in the horizontal direction,
For each pixel on the scanning line, the product a of the density of the pixel and the density of the adjacent pixel, the product b of the density of the pixel and the value obtained by subtracting the density of the adjacent pixel from a certain constant, the product c of the value obtained by subtracting the density of the pixel from a certain constant and the density of the adjacent pixel, the product e of the value obtained by subtracting the density of the pixel from a certain constant and the value obtained by subtracting the density of the adjacent pixel from a certain constant, and the number of pixels n are obtained,
The respective sums ap, bp, cp, ep of a, b, c, e obtained for all the pixels on the scanning line are obtained,
For the scanning line, the sum T and the sum of squares T2 of the pixel values from the pixel immediately before the scan start pixel to the pixel immediately before the scan end pixel, and the sum X and the sum of squares X2 of the pixel values from the scan start pixel to the scan end pixel, are obtained,
Based on these values,
$$p = \frac{a_p e_p - b_p c_p}{\sqrt{(n-1)\,T_2 - T^2}\,\sqrt{(n-1)\,X_2 - X^2}}$$
is calculated,
The value of p is projected onto a vertical projection axis to obtain a projection distribution Py,
A continuous section in which the element of the projection distribution Py is larger than a predetermined threshold Ty is obtained,
When the length of the continuous section is within a predetermined value, a character string image pattern W in the continuous section is set as a candidate image pattern G.

The character recognition device according to claim 11,
The individual character collating means,
Normalizing the candidate image pattern G into a normalized multi-valued character pattern X of N vertical pixels × N horizontal pixels;
A similarity S (X, Tm) between the normalized multi-valued character pattern X and a reference pattern T stored in a predetermined dictionary is obtained for a plurality of reference patterns T in the dictionary,
Among the plurality of reference patterns T, the reference pattern T whose similarity S(X, Tm) is larger than a predetermined threshold Sm and is the largest is taken as the recognized character pattern for the candidate image pattern G.