JP4094240B2

JP4094240B2 - Image characteristic determination processing apparatus, image characteristic determination processing method, program for executing the method, and computer-readable storage medium storing the program

Info

Publication number: JP4094240B2
Application number: JP2001043145A
Authority: JP
Inventors: 裕子杉浦
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2001-02-20
Filing date: 2001-02-20
Publication date: 2008-06-04
Anticipated expiration: 2021-02-20
Also published as: JP2002245405A

Description

【０００１】
【発明の属する技術分野】
本発明は画像特性判別処理装置、画像特性判別処理方法、該方法を実行させるためのプログラム及び該プログラムを格納したコンピュータ読み取り可能な記憶媒体に関し、詳細には文字認識装置の前処理として罫線や文字を含む認識対象画像の特性を判別する処理方法に関する。
【０００２】
【従来の技術】
画像の罫線や文字の識別処理は、濃い又はかすれている等の処理対象画像の特性によって識別処理に使用されているアルゴリズムやしきい値が対応できずに識別精度が低下することがある。画像の特性は、スキャナ等の入力装置で入力する時の２値化スレッシュを変更したりすることで、入力画像の濃さなどを補正し識別アルゴリズムに対応した画像特性を得ることは可能である。
【０００３】
【発明が解決しようとする課題】
しかし、一度入力されてしまった２値データに対して、アルゴリズムに合わない特性だからと言って再度入力し直すなどは現実的ではない上に、能率が悪い。罫線や文字識別処理の前処理として、処理画像の特性を把握しその特性の情報を提供することができれば、その情報に対応したアルゴリズムやしきい値などに変更することが可能となり、識別精度を向上させることが可能となる。また、画像が濃く、つぶれているような場合をつぶれ画像と定義すると、つぶれ画像の特徴は文字がつぶれて画像部が太り罫線に接触している場合が多い。更に、つぶれ画像における実線識別処理においては画像が密で結合しているためにつぶれた文字部に短い実線が誤認識されることが多い。
【０００４】
本発明は入力画像の特徴のひとつである画像特性判別部を罫線や文字認識処理の前処理として設けることにより、罫線や文字の識別精度を向上させるための有益な情報を提供することを目的とする。
【０００５】
【課題を解決するための手段】
前記問題点を解決するために、本発明に係る画像特性判別処理装置は、光学的に読み取った画像データの主走査方向及び副走査方向の１ライン毎に黒画素連結を調べ、黒画素の連結が途切れた時に黒画素一塊をランとするラン抽出手段と、該ラン抽出手段により抽出したランの長さが所定値以上であるランを罫線ランとする罫線ラン判別手段と、該罫線ラン判別手段により判別された罫線ランの数を計数して罫線ラン同士が所定値以内に存在している罫線ラン同士を罫線とする罫線抽出手段と、該罫線抽出手段により抽出された罫線の長さが所定値より小さい罫線を短罫線とする短罫線判別手段と、該短罫線判別手段により判別された短罫線同士が交差している短罫線を含む領域を抽出する領域抽出手段と、該領域抽出手段により抽出した領域の面積が所定値以上である領域をつぶれ領域とし、当該つぶれ領域の数が所定値以上であるつぶれ領域をつぶれ画像とするつぶれ画像判別手段と、該つぶれ画像判別手段により判別されたつぶれ画像が占める割合に基づいて認識対象の画像の特性を判別する画像特性判別手段とを有する。よって、接触しあう短実線を利用して画像特性を判別し、罫線や文字の識別精度を向上させるための有益な情報を提供できる。
【０００６】
また、つぶれ画像判別手段は、領域抽出手段により抽出した領域の面積が所定値以上である領域をつぶれ領域とし、当該つぶれ領域の面積の分散値を求め、求めた分散値が所定値以上であれば当該領域をつぶれ画像とすることにより、つぶれ画像特性判別の精度を向上させ、罫線や文字の認識精度を向上させるための有益な情報を提供できる。
【０００７】
更に、別の発明としての画像特性判別処理方法は、光学的に読み取った画像データの主走査方向及び副走査方向の１ライン毎に黒画素連結を調べ、黒画素の連結が途切れた時に黒画素一塊をランとするラン抽出工程と、該ラン抽出工程により抽出したランの長さが所定値以上であるランを罫線ランとする罫線ラン判別工程と、該罫線ラン判別工程により判別された罫線ランの数を計数して罫線ラン同士が所定値以内に存在している罫線ラン同士を罫線とする罫線抽出工程と、該罫線抽出工程により抽出された罫線の長さが所定値より小さい罫線を短罫線とする短罫線判別工程と、該短罫線判別工程により判別された短罫線同士が交差している短罫線を含む領域を抽出する領域抽出工程と、該領域抽出工程により抽出した領域の面積が所定値以上である領域をつぶれ領域とし、当該つぶれ領域の数が所定値以上であるつぶれ領域をつぶれ画像とするつぶれ画像判別工程と、該つぶれ画像判別工程により判別されたつぶれ画像が占める割合に基づいて認識対象の画像の特性を判別する画像特性判別工程とを有することに特徴がある。よって、接触しあう短実線を利用して画像特性を判別し、罫線や文字の識別精度を向上させるための有益な情報を提供できる。
【０００８】
また、つぶれ画像判別工程では、領域抽出工程により抽出した領域の面積が所定値以上である領域をつぶれ領域とし、当該つぶれ領域の面積の分散値を求め、求めた分散値が所定値以上であれば当該領域をつぶれ画像とすることにより、つぶれ画像特性判別の精度を向上させ、罫線や文字の認識精度を向上させるための有益な情報を提供できる。
【０００９】
更に、別の発明として、コンピュータに、光学的に読み取った画像データの主走査方向及び副走査方向の１ライン毎に黒画素連結を調べ、黒画素の連結が途切れた時に黒画素一塊をランとするラン抽出手順と、該ラン抽出手順により抽出したランの長さが所定値以上であるランを罫線ランとする罫線ラン判別手順と、該罫線ラン判別手順により判別された罫線ランの数を計数して罫線ラン同士が所定値以内に存在している罫線ラン同士を罫線とする罫線抽出手順と、該罫線抽出手順により抽出された罫線の長さが所定値より小さい罫線を短罫線とする短罫線判別手順と、該短罫線判別手順により判別された短罫線同士が交差している短罫線を含む領域を抽出する領域抽出手順と、該領域抽出手順により抽出した領域の面積が所定値以上である領域をつぶれ領域とし、当該つぶれ領域の数が所定値以上であるつぶれ領域をつぶれ画像とするつぶれ画像判別手順と、該つぶれ画像判別手順により判別されたつぶれ画像が占める割合に基づいて認識対象の画像の特性を判別する画像特性判別手順とを実行させるプログラムに特徴がある。よって、接触しあう短実線を利用して画像特性を判別し、罫線や文字の識別精度を向上させるための有益な情報を提供できる。
【００１０】
また、別の発明として、上記記載の画像特性判別処理方法を実行させるためのプログラムを格納したコンピュータ読み取り可能な記憶媒体に特徴がある。よって、既存のシステムを変えることなく、かつ画像特性判別処理システムを構築する装置を汎用的に使用することができる。
【００１１】
【発明の実施の形態】
本発明に係る画像特性判別処理方法によれば、光学的に読み取った画像データの主走査方向及び副走査方向の１ライン毎に黒画素連結を調べ、黒画素の連結が途切れた時に黒画素一塊をランとし、抽出したランの長さが所定値以上であるランを罫線ランとする。そして、判別された罫線ランの数を計数して罫線ラン同士が所定値以内に存在している罫線ラン同士を罫線とし、更には抽出された罫線の長さが所定値より小さい罫線を短罫線とする。そして、判別された短罫線同士が交差している短罫線を含む領域を抽出し、抽出した領域の面積が所定値以上である領域をつぶれ領域とし、当該つぶれ領域の数が所定値以上であるつぶれ領域をつぶれ画像とする。次に、判別されたつぶれ画像が占める割合に基づいて認識対象の画像の特性を判別する。
【００１２】
【実施例】
はじめに、本発明における構成及び動作の説明の中で使用する語句を以下に定義しておく。「矩形」とは２値画像データの画像部（例えば黒画素部）を一塊として、それらが接触包含される外接四角形で囲んだ範囲を矩形と定義する。よって、連結する黒画素を全て包含する外接範囲を矩形とする。「矩形抽出」とは矩形の位置座標を抽出することを矩形抽出と定義する。「処理方向」とは画像をＸ，Ｙの２次元とすれば、Ｘ方向（スキャン時の主走査方向）の処理とＹ方向（スキャン時の副走査方向）の処理を２方向の処理がある。点線抽出の場合には、点線がＸ方向に延びている場合とＹ方向に延びている場合があり、Ｘ方向とＹ方向の２方向でそれぞれ処理を行うものとする。「実線識別処理」の例として、矩形抽出後、着目する矩形を処理開始とし、同ベクトル上でかつ所定の距離内に存在する矩形どうしを結合してゆく処理とする。
【００１３】
以下、図を用いて本発明の実施例の動作を説明する。
図１は本発明の第１の実施例に係る画像特性処理方法の全体動作を示すフローチャートである。同図において、先ず、処理のスキャナなどで画像を２値画像入力部で２値化データに変換し、高速化のために圧縮画像を作成する（ステップＳ１０１，Ｓ１０２）。ここで、圧縮画像とは１／４圧縮の場合４画素のうち全て白画素であった場合にのみ白画素ひとつに置き換え、４画素中１つでも黒画素が含まれていれば、黒画素ひとつに置き換えるといった圧縮処理をさす。次に、罫線認識処理で罫線を認識する（ステップＳ１０３）。抽出された罫線を使ってつぶれ領域候補抽出処理を実行する（ステップＳ１０４）。抽出されたつぶれ領域候補の特徴より画像の特性を判別する（ステップＳ１０５）。
【００１４】
次に、図１のステップ１０３の処理で共通に用いられる処理、ラン抽出の一例を図２に従って説明する。ラン抽出処理はいずれの方法でもよしとする。ただし、ランの位置情報を提供する処理であることが必要である。また、入力された２値データをそのまま使用するか、処理時間短縮のために圧縮画像データを用いるかのどちらでも自由とする。ただし、本発明では、罫線認識処理の場合には圧縮画像(画素)データを矩形抽出処理の場合にはそのままの２値(画素)データを用いる。図２において、全ライン数を計数し、更に１ライン毎の画素数を計数する（ステップＳ２０１，Ｓ２０２）。そして、主走査方向の１ライン毎に黒画素連結を調べてゆき、黒画素の連結が途切れた時に黒画素一塊をランとして登録する（ステップＳ２０３；ＹＥＳ、ステップＳ２０４）。同様の処理を全ラインに対して繰返す（ステップＳ２０５）。
【００１５】
また、図１のステップＳ１０３での罫線認識処理の一例を図３に従って説明する。罫線認識方法はいずれの方法でもよいが、罫線認識情報として、認識した罫線の位置を示すアドレス情報を提供する処理であることが必要である。ここでは一方向のみの罫線認識の場合であるが、Ｘ、Ｙ方向と両方向の罫線認識するには、図３の処理を繰り返せばよい。図３において、図２で抽出されたランの全ての数を計数し、そのランの長さが所定値のＡ値以上であるかを判定する（ステップＳ３０１，Ｓ３０２）。所定値のＡ値以上であるものを罫線ランとして登録する（ステップＳ３０２；ＹＥＳ、ステップＳ３０３）。以上の処理を全てのランに対して行う（ステップＳ３０４）。次に、その罫線ラン数を計数し、罫線ラン同士が所定値以内に存在しているかを調べて、所定値内であれば罫線として登録する（ステップＳ３０５、ステップＳ３０６；ＹＥＳ、ステップＳ３０７）。同様の処理を全罫線ランの数だけ繰り返す（ステップＳ３０８）。
【００１６】
更に、図１のステップＳ１０４でのつぶれ領域候補抽出処理の一例を図４に従って説明する。図１のステップＳ１０３で抽出された全罫線において、全罫線数を計数し、その罫線の長さが所定値のＢ値以下であるかを判定する（ステップＳ４０１，Ｓ４０２）。所定値のＢ値以下であるのものを短罫線として登録する（ステップＳ４０２；ＹＥＳ、ステップＳ４０３）。以上の処理を全ての罫線に対して行う（ステップＳ４０４）。次に、全短罫線数を計数し、短罫線同士で完全に交差しているものがないかを探索する（ステップＳ４０５，Ｓ４０６）。図５の（ａ）を例にすると、短罫線ＡとＢは接触しているだけでなく完全に交差している。例えば、図５の（ｃ）に示すように交差接触しているだけでは、枠の場合であることがあるために完全に突き抜けて交差していることが条件である。図５の（ａ）で示すような位置関係にあれば、それを包含する領域を登録する（ステップＳ４０６；ＹＥＳ、ステップＳ４０７）。次に、図５の（ｂ）で示すように短罫線Ｃも完全交差していれば、図４のステップＳ４０７で登録した領域を成長させる。以上同様の処理を全短罫線に対して実施して領域を抽出する（ステップＳ４０８）。
【００１７】
次に、図１のステップ１０５での画像特性判別処理を図６に従って説明する。図１のステップＳ１０４で抽出されたつぶれ領域候補は大体文字上で抽出される(図５の（ｄ）参照)。そこで、通常の文字サイズを予測し、あるいは予め、サイズをいずれかの手段を用いて情報として与えておき、それよりも大きい領域の場合には「文字がつぶれて隣どうし接触しあいつながっている状態」であると判別する。従って、図６の抽出された全領域のうち領域面積が所定値のＣ値以上のものをつぶれ領域として登録する（ステップＳ５０１、ステップＳ５０２；ＹＥＳ、ステップＳ５０３）。以上の処理を全ての領域に対して行う（ステップＳ５０４）。ステップＳ５０３でつぶれ領域として登録されたものが所定値のＤ個以上あればつぶれ画像と判別する（ステップＳ５０５；ＹＥＳ、ステップＳ５０６）。
【００１８】
また、図１のステップ１０５での画像特性判別処理の別の例を図７に従って説明する。図１のステップＳ１０４で抽出されたつぶれ領域候補に対して、各領域に包含されている罫線の信頼性を評価し、つぶれ領域であるか否かの判別をする（ステップＳ６０１〜Ｓ６０５）ことで、さらに画像特性判別の精度を向上させるものである。例えば図５の（ａ）を例にとると、罫線ＡとＢが誤認識された罫線であるかの判定を行う。その判定方法はいずれの方式でも可とする。判定方法の一例をあげると、罫線Ａの近傍に黒画素を利用し、例えば黒画素密度が高い場合には文字上で誤認識された罫線と判別する。このようにして、つぶれ領域候補を構成する罫線のうち誤認識された罫線の占める割合が所定値Ｚ値以上であればつぶれ領域と判定する（ステップＳ６０３；ＹＥＳ）。ステップＳ６０４でつぶれ領域として登録されたものが所定値のＤ個以上あればつぶれ画像と判別する（ステップＳ６０６；ＹＥＳ、ステップＳ６０７）。
【００１９】
また、上記実施例ではつぶれ領域候補の面積が文字サイズ以上という条件を判定理由に用いているが、文字サイズが不明であったり、想定した文字サイズが実際の認識対象の文字サイズと異なっていた場合には画像特性の判定精度が悪くなる可能性がある。そこで、全つぶれ領域候補の面積のバラツキを算出（分散値）を用いることでこれを解決する。つまり、抽出されたつぶれ領域候補がバラツキなく一定であれば、つぶれていない一文字一文字上で抽出された領域であると判別できるのである。図８のステップＳ７０１でつぶれ領域候補数が所定値のＹ個以上の場合（ステップＳ７０１；ＹＥＳ）、分散値を算出し、所定値のＥ値以上であれば、つぶれ画像と判別する（ステップＳ７０２、ステップＳ７０３；ＹＥＳ、ステップＳ７０４）。
【００２０】
図９は本発明の第２の実施例に係る画像特性処理方法の全体動作を示すフローチャートである。同図において、先ず、処理のスキャナなどで画像を２値画像入力部で２値化データに変換し、高速化のために圧縮画像を作成する（ステップＳ８０１，Ｓ８０２）。次に、罫線認識処理で罫線を認識する（ステップＳ８０３）。抽出された罫線を使って罫線特徴判別処理を実行する（ステップＳ８０４）。判別された罫線特徴情報を用いて画像の特性を判別する（ステップＳ８０５）。
【００２１】
図９のステップＳ８０３での罫線認識処理の一例を図１０に従って説明する。図９のステップＳ８０３で抽出された全罫線を計数し、他の罫線と接触せずに単独に存在しているか否かを判別する（ステップＳ９０１，Ｓ９０２）。単独存在していれば単独罫線として登録する（ステップＳ９０２；ＹＥＳ、ステップＳ９０３）。以上の処理を全ての罫線に対して行う（ステップＳ９０４）。次に、単独罫線を計数し、単独罫線を判別された単独罫線周囲近傍の黒画素密度を求めてその値が所定値のＧ値以上であれば（ステップＳ９０５、ステップＳ９０６；ＹＥＳ）、つぶれ文字上に誤認識された罫線と判別して罫線として登録していたものを無効化する（ステップＳ９０７）。全単独罫線に対して判別処理が終了すれば（ステップＳ９０８；ＹＥＳ）、無効化罫線数が所定値のＨ値以上であれば、つぶれ画像であると判定する（ステップＳ９０９；ＹＥＳ、ステップＳ９１０）。
【００２２】
また、図９のステップＳ８０３での罫線認識処理の別の例を図１１に従って説明する。図９のステップＳ８０３で抽出された全罫線を計数し、他の罫線と接触せずに単独に存在しているか否かを判別する（ステップＳ１００１，Ｓ１００２）。単独存在していれば単独罫線として登録する（ステップＳ１００２；ＹＥＳ、ステップＳ１００３）。以上の処理を全ての罫線に対して行う（ステップＳ１００４）。次に、単独罫線と判別された罫線に対して、その罫線範囲内を処理対象としてラン抽出処理を行う（ステップＳ１００５，Ｓ１００６）。その結果、抽出されたランの数が所定値のＦ値以上であれば、かすれ罫線であると判定して登録する（ステップＳ１０７；ＹＥＳ、ステップＳ１００８）。以上の処理を全ての単独罫線に対して行う（ステップＳ１００９）。次に、かすれ罫線と判別された罫線の数が所定値のＹ値以上であれば、かすれ画像であると判定する（ステップＳ１０１０；ＹＥＳ、ステップＳ１０１１）。
【００２３】
次に、図１２は本発明のシステム構成を示すブロック図である。つまり、同図は上記各実施例における画像特性判別処理方法によるソフトウェアを実行するマイクロプロセッサ等から構築されるハードウェアを示すものである。同図において、画像特性判別処理システムはインターフェース（以下Ｉ／Ｆと略す）１２１、ＣＰＵ１２２、ＲＯＭ１２３、ＲＡＭ１２４、表示装置１２５、ハードディスク１２６、キーボード１２７及びＣＤ−ＲＯＭドライブ１２８を含んで構成されている。また、汎用の処理装置を用意し、ＣＤ−ＲＯＭなどの読取可能な記憶媒体１２９には、本発明の画像特性判別処理方法を実行するプログラムが記録されている。更に、Ｉ／Ｆ１２１を介して外部装置から制御信号が入力され、キーボード１２７によって操作者による指令又は自動的に本発明のプログラムが起動される。そして、ＣＰＵ１２２は当該プログラムに従って上述の画像特性判別処理方法を施し、その処理結果をＲＡＭ１２４やハードディスク１２６等の記録装置に格納し、必要により表示装置１２５などに出力する。以上のように、本発明の画像特性判別処理方法を実行するプログラムが記憶した媒体を用いることにより、既存のシステムを変えることなく、かつ画像特性判別処理システムを構築する装置を汎用的に使用することができる。
【００２４】
なお、本発明は上記実施例に限定されるものではなく、特許請求の範囲内の記載であれば多種の変形や置換可能であることは言うまでもない。
【００２５】
【発明の効果】
以上説明したように、本発明に係る画像特性判別処理装置は、光学的に読み取った画像データの主走査方向及び副走査方向の１ライン毎に黒画素連結を調べ、黒画素の連結が途切れた時に黒画素一塊をランとするラン抽出手段と、該ラン抽出手段により抽出したランの長さが所定値以上であるランを罫線ランとする罫線ラン判別手段と、該罫線ラン判別手段により判別された罫線ランの数を計数して罫線ラン同士が所定値以内に存在している罫線ラン同士を罫線とする罫線抽出手段と、該罫線抽出手段により抽出された罫線の長さが所定値より小さい罫線を短罫線とする短罫線判別手段と、該短罫線判別手段により判別された短罫線同士が交差している短罫線を含む領域を抽出する領域抽出手段と、該領域抽出手段により抽出した領域の面積が所定値以上である領域をつぶれ領域とし、当該つぶれ領域の数が所定値以上であるつぶれ領域をつぶれ画像とするつぶれ画像判別手段と、該つぶれ画像判別手段により判別されたつぶれ画像が占める割合に基づいて認識対象の画像の特性を判別する画像特性判別手段とを有する。よって、接触しあう短実線を利用して画像特性を判別し、罫線や文字の識別精度を向上させるための有益な情報を提供できる。
【００２６】
また、つぶれ画像判別手段は、領域抽出手段により抽出した領域の面積が所定値以上である領域をつぶれ領域とし、当該つぶれ領域の面積の分散値を求め、求めた分散値が所定値以上であれば当該領域をつぶれ画像とすることにより、つぶれ画像特性判別の精度を向上させ、罫線や文字の認識精度を向上させるための有益な情報を提供できる。
【００２７】
更に、別の発明としての画像特性判別処理方法は、光学的に読み取った画像データの主走査方向及び副走査方向の１ライン毎に黒画素連結を調べ、黒画素の連結が途切れた時に黒画素一塊をランとするラン抽出工程と、該ラン抽出工程により抽出したランの長さが所定値以上であるランを罫線ランとする罫線ラン判別工程と、該罫線ラン判別工程により判別された罫線ランの数を計数して罫線ラン同士が所定値以内に存在している罫線ラン同士を罫線とする罫線抽出工程と、該罫線抽出工程により抽出された罫線の長さが所定値より小さい罫線を短罫線とする短罫線判別工程と、該短罫線判別工程により判別された短罫線同士が交差している短罫線を含む領域を抽出する領域抽出工程と、該領域抽出工程により抽出した領域の面積が所定値以上である領域をつぶれ領域とし、当該つぶれ領域の数が所定値以上であるつぶれ領域をつぶれ画像とするつぶれ画像判別工程と、該つぶれ画像判別工程により判別されたつぶれ画像が占める割合に基づいて認識対象の画像の特性を判別する画像特性判別工程とを有することに特徴がある。よって、接触しあう短実線を利用して画像特性を判別し、罫線や文字の識別精度を向上させるための有益な情報を提供できる。
【００２８】
また、つぶれ画像判別工程では、領域抽出工程により抽出した領域の面積が所定値以上である領域をつぶれ領域とし、当該つぶれ領域の面積の分散値を求め、求めた分散値が所定値以上であれば当該領域をつぶれ画像とすることにより、つぶれ画像特性判別の精度を向上させ、罫線や文字の認識精度を向上させるための有益な情報を提供できる。
【００２９】
更に、別の発明として、コンピュータに、光学的に読み取った画像データの主走査方向及び副走査方向の１ライン毎に黒画素連結を調べ、黒画素の連結が途切れた時に黒画素一塊をランとするラン抽出手順と、該ラン抽出手順により抽出したランの長さが所定値以上であるランを罫線ランとする罫線ラン判別手順と、該罫線ラン判別手順により判別された罫線ランの数を計数して罫線ラン同士が所定値以内に存在している罫線ラン同士を罫線とする罫線抽出手順と、該罫線抽出手順により抽出された罫線の長さが所定値より小さい罫線を短罫線とする短罫線判別手順と、該短罫線判別手順により判別された短罫線同士が交差している短罫線を含む領域を抽出する領域抽出手順と、該領域抽出手順により抽出した領域の面積が所定値以上である領域をつぶれ領域とし、当該つぶれ領域の数が所定値以上であるつぶれ領域をつぶれ画像とするつぶれ画像判別手順と、該つぶれ画像判別手順により判別されたつぶれ画像が占める割合に基づいて認識対象の画像の特性を判別する画像特性判別手順とを実行させるプログラムに特徴がある。よって、接触しあう短実線を利用して画像特性を判別し、罫線や文字の識別精度を向上させるための有益な情報を提供できる。
【００３０】
また、別の発明として、上記記載の画像特性判別処理方法を実行させるためのプログラムを格納したコンピュータ読み取り可能な記憶媒体に特徴がある。よって、既存のシステムを変えることなく、かつ画像特性判別処理システムを構築する装置を汎用的に使用することができる。
【図面の簡単な説明】
【図１】本発明の第１の実施例に係る画像特性処理方法の全体動作を示すフローチャートである。
【図２】第１の実施例におけるラン抽出動作の一例を示すフローチャートである。
【図３】第１の実施例における罫線認識処理動作の一例を示すフローチャートである。
【図４】第１の実施例におけるつぶれ領域候補抽出処理動作の一例を示すフローチャートである。
【図５】第１の実施例における画像認識対象の一例を示す図である。
【図６】第１の実施例における画像特性判別処理動作の一例を示すフローチャートである。
【図７】第１の実施例における画像特性判別処理動作の別の例を示すフローチャートである。
【図８】第１の実施例におけるつぶれ領域候補抽出処理動作の別の例を示すフローチャートである。
【図９】本発明の第２の実施例に係る画像特性処理方法の全体動作を示すフローチャートである。
【図１０】第２の実施例における罫線認識処理動作の一例を示すフローチャートである。
【図１１】第２の実施例における罫線認識処理動作の別の例を示すフローチャートである。
【図１２】本発明のシステム構成を示すブロック図である。
【符号の説明】
１２１；Ｉ／Ｆ、１２２；ＣＰＵ、１２３；ＲＯＭ、１２４；ＲＡＭ、
１２５；表示装置、１２６；ハードディスク、１２７；キーボード、
１２８；ＣＤ−ＲＯＭドライブ、１２９；記憶媒体。
[0001]
[Technical Field of the Invention]
The present invention relates to an image characteristic determination processing apparatus, an image characteristic determination processing method, a program for executing the method, and a computer-readable storage medium storing the program. More specifically, it relates to a processing method that, as preprocessing for a character recognition apparatus, determines the characteristics of a recognition target image containing ruled lines and characters.
[0002]
[Prior art]
In processing that identifies ruled lines and characters in an image, the algorithms and thresholds used may be unable to cope with characteristics of the target image, such as the image being dark or faint, and identification accuracy may fall. It is possible to correct the density of the input image, and so obtain image characteristics suited to the identification algorithm, by changing the binarization threshold when the image is input with an input device such as a scanner.
[0003]
[Problems to be solved by the invention]
However, re-inputting binary data that has already been input, merely because its characteristics do not suit the algorithm, is impractical and inefficient. If the characteristics of the image to be processed could be grasped as preprocessing for ruled-line and character identification, and that information provided, the algorithm or thresholds could be changed to ones suited to the information, and identification accuracy could be improved. If an image that is dark and crushed is defined as a collapsed image, a typical feature of a collapsed image is that characters are crushed, the image portion thickens, and it comes into contact with ruled lines. Furthermore, in solid-line identification on a collapsed image, because the image is dense and connected, short solid lines are often erroneously recognized in the collapsed character portions.
[0004]
An object of the present invention is to provide useful information for improving the accuracy of ruled-line and character identification by providing, as preprocessing for ruled-line and character recognition, a unit that determines the image characteristic, which is one of the features of the input image.
[0005]
[Means for Solving the Problems]
To solve the above problems, the image characteristic determination processing apparatus according to the present invention comprises: run extraction means that examines black-pixel connectivity line by line in the main scanning direction and the sub-scanning direction of optically read image data and, when a connected sequence of black pixels is interrupted, treats that block of black pixels as a run; ruled-line run determination means that treats any run extracted by the run extraction means whose length is at least a predetermined value as a ruled-line run; ruled-line extraction means that counts the ruled-line runs determined by the ruled-line run determination means and treats ruled-line runs lying within a predetermined value of one another as a ruled line; short ruled-line determination means that treats any ruled line extracted by the ruled-line extraction means whose length is less than a predetermined value as a short ruled line; region extraction means that extracts a region containing short ruled lines, determined by the short ruled-line determination means, that intersect one another; collapsed-image determination means that treats any region extracted by the region extraction means whose area is at least a predetermined value as a collapsed region and, when the number of such collapsed regions is at least a predetermined value, treats them as a collapsed image; and image characteristic determination means that determines the characteristics of the recognition target image based on the proportion occupied by the collapsed image determined by the collapsed-image determination means. The apparatus can therefore determine image characteristics using short solid lines that touch one another, and can provide useful information for improving the accuracy of ruled-line and character identification.
[0006]
Further, the collapsed-image determination means treats any region extracted by the region extraction means whose area is at least a predetermined value as a collapsed region, obtains the variance of the areas of the collapsed regions, and, if the variance is at least a predetermined value, treats the regions as a collapsed image. This improves the accuracy of collapsed-image characteristic determination and provides useful information for improving the recognition accuracy of ruled lines and characters.
[0007]
As another aspect of the invention, the image characteristic determination processing method comprises: a run extraction step of examining black-pixel connectivity line by line in the main scanning direction and the sub-scanning direction of optically read image data and, when a connected sequence of black pixels is interrupted, treating that block of black pixels as a run; a ruled-line run determination step of treating any run extracted in the run extraction step whose length is at least a predetermined value as a ruled-line run; a ruled-line extraction step of counting the ruled-line runs determined in the ruled-line run determination step and treating ruled-line runs lying within a predetermined value of one another as a ruled line; a short ruled-line determination step of treating any ruled line extracted in the ruled-line extraction step whose length is less than a predetermined value as a short ruled line; a region extraction step of extracting a region containing short ruled lines, determined in the short ruled-line determination step, that intersect one another; a collapsed-image determination step of treating any region extracted in the region extraction step whose area is at least a predetermined value as a collapsed region and, when the number of such collapsed regions is at least a predetermined value, treating them as a collapsed image; and an image characteristic determination step of determining the characteristics of the recognition target image based on the proportion occupied by the collapsed image determined in the collapsed-image determination step. The method can therefore determine image characteristics using short solid lines that touch one another, and can provide useful information for improving the accuracy of ruled-line and character identification.
[0008]
Further, in the collapsed-image determination step, any region extracted in the region extraction step whose area is at least a predetermined value is treated as a collapsed region, the variance of the areas of the collapsed regions is obtained, and, if the variance is at least a predetermined value, the regions are treated as a collapsed image. This improves the accuracy of collapsed-image characteristic determination and provides useful information for improving the recognition accuracy of ruled lines and characters.
[0009]
As a further aspect of the invention, a program causes a computer to execute: a run extraction procedure of examining black-pixel connectivity line by line in the main scanning direction and the sub-scanning direction of optically read image data and, when a connected sequence of black pixels is interrupted, treating that block of black pixels as a run; a ruled-line run determination procedure of treating any run extracted by the run extraction procedure whose length is at least a predetermined value as a ruled-line run; a ruled-line extraction procedure of counting the ruled-line runs determined by the ruled-line run determination procedure and treating ruled-line runs lying within a predetermined value of one another as a ruled line; a short ruled-line determination procedure of treating any ruled line extracted by the ruled-line extraction procedure whose length is less than a predetermined value as a short ruled line; a region extraction procedure of extracting a region containing short ruled lines, determined by the short ruled-line determination procedure, that intersect one another; a collapsed-image determination procedure of treating any region extracted by the region extraction procedure whose area is at least a predetermined value as a collapsed region and, when the number of such collapsed regions is at least a predetermined value, treating them as a collapsed image; and an image characteristic determination procedure of determining the characteristics of the recognition target image based on the proportion occupied by the collapsed image determined by the collapsed-image determination procedure. The program can therefore determine image characteristics using short solid lines that touch one another, and can provide useful information for improving the accuracy of ruled-line and character identification.
[0010]
As yet another aspect of the invention, a computer-readable storage medium stores a program for executing the image characteristic determination processing method described above. The apparatus making up the image characteristic determination processing system can therefore be used for general purposes without changing the existing system.
[0011]
[Embodiments of the Invention]
In the image characteristic determination processing method according to the present invention, black-pixel connectivity is examined line by line in the main scanning direction and the sub-scanning direction of optically read image data; when a connected sequence of black pixels is interrupted, that block of black pixels is treated as a run, and any extracted run whose length is at least a predetermined value is treated as a ruled-line run. The ruled-line runs so determined are counted, ruled-line runs lying within a predetermined value of one another are treated as a ruled line, and any extracted ruled line whose length is less than a predetermined value is treated as a short ruled line. A region containing short ruled lines that intersect one another is then extracted, any extracted region whose area is at least a predetermined value is treated as a collapsed region, and, when the number of such collapsed regions is at least a predetermined value, they are treated as a collapsed image. Finally, the characteristics of the recognition target image are determined based on the proportion occupied by the collapsed image.
[0012]
[Example]
First, the terms used in describing the configuration and operation of the present invention are defined. A "rectangle" is the range enclosed by the circumscribing rectangle of a block of image pixels (for example, black pixels) in binary image data, such that the pixels touch it or lie within it. In other words, the circumscribed range that contains all of a set of connected black pixels is a rectangle. "Rectangle extraction" is the extraction of the position coordinates of rectangles. As for the "processing direction", if the image is regarded as two-dimensional in X and Y, there are two directions of processing: processing in the X direction (the main scanning direction during scanning) and processing in the Y direction (the sub-scanning direction during scanning). In dotted-line extraction, a dotted line may extend in the X direction or in the Y direction, so processing is performed in each of the two directions. As an example of "solid-line identification processing", after rectangle extraction, processing starts from a rectangle of interest and successively merges rectangles that lie on the same vector and within a predetermined distance.
[0013]
The operation of the embodiment of the present invention will be described below with reference to the drawings.
FIG. 1 is a flowchart showing the overall operation of the image characteristic processing method according to the first embodiment of the present invention. First, an image from a scanner or the like is converted into binarized data by the binary image input unit, and a compressed image is created to speed up processing (steps S101 and S102). Here, taking 1/4 compression as an example, the compressed image is produced by replacing each group of four pixels with a single white pixel only when all four are white, and with a single black pixel if even one of the four is black. Next, ruled lines are recognized by the ruled-line recognition process (step S103). Collapsed-region candidate extraction is then executed using the extracted ruled lines (step S104). Finally, the characteristics of the image are determined from the features of the extracted collapsed-region candidates (step S105).
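The 1/4 compression rule can be sketched as follows. This is only an illustration: the text does not say how the four pixels are grouped, so the sketch assumes 2x2 blocks, and the function name `compress_quarter` and the list-of-rows image format (1 = black, 0 = white) are hypothetical.

```python
def compress_quarter(image):
    """Reduce a binary image (list of rows; 1 = black, 0 = white).

    Assumption: each group of four pixels is a 2x2 block.
    A block becomes white only if all of its pixels are white.
    """
    height, width = len(image), len(image[0])
    out = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            block = [image[yy][xx]
                     for yy in range(y, min(y + 2, height))
                     for xx in range(x, min(x + 2, width))]
            # One black pixel is enough to make the output pixel black.
            row.append(1 if any(block) else 0)
        out.append(row)
    return out
```

Because a single black pixel keeps the whole block black, thin ruled lines survive the reduction, which is presumably why this rule suits the ruled-line recognition that follows.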
[0014]
Next, an example of run extraction, a process used in common in the processing of step S103 in FIG. 1, will be described with reference to FIG. 2. Any run extraction method may be used, provided that it supplies the position information of each run. Either the input binary data may be used as is, or compressed image data may be used to shorten processing time. In the present invention, however, compressed image (pixel) data is used for ruled-line recognition, and the uncompressed binary (pixel) data is used for rectangle extraction. In FIG. 2, the total number of lines is counted, and then the number of pixels in each line is counted (steps S201 and S202). Black-pixel connectivity is then examined line by line in the main scanning direction, and when a connected sequence of black pixels is interrupted, that block of black pixels is registered as a run (step S203; YES, step S204). The same processing is repeated for all lines (step S205).
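A minimal sketch of run extraction in the main scanning direction, under the assumption that the image is a list of rows with 1 for black and 0 for white. The function name and the `(row, start, end)` tuple format are illustrative choices, not taken from the patent; the only requirement stated in the text is that each run carries its position.

```python
def extract_runs(image):
    """Return every horizontal run as (row, start_col, end_col), end inclusive."""
    runs = []
    for y, line in enumerate(image):
        start = None
        for x, pixel in enumerate(line):
            if pixel and start is None:
                start = x                      # a run of black pixels begins
            elif not pixel and start is not None:
                runs.append((y, start, x - 1))  # connectivity was interrupted
                start = None
        if start is not None:                   # run reaches the end of the line
            runs.append((y, start, len(line) - 1))
    return runs
```

Runs in the sub-scanning direction could be obtained the same way by passing the transposed image.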
[0015]
An example of the ruled-line recognition process in step S103 of FIG. 1 will now be described with reference to FIG. 3. Any ruled-line recognition method may be used, provided that it supplies, as ruled-line recognition information, address information indicating the position of each recognized ruled line. The description here covers ruled-line recognition in one direction only; to recognize ruled lines in both the X and Y directions, the process of FIG. 3 is simply repeated. In FIG. 3, all the runs extracted in FIG. 2 are counted, and for each run it is determined whether its length is equal to or greater than a predetermined value A (steps S301 and S302). Runs whose length is equal to or greater than A are registered as ruled-line runs (step S302; YES, step S303). This is done for all runs (step S304). Next, the ruled-line runs are counted, and it is checked whether ruled-line runs lie within a predetermined value of one another; those that do are registered as a ruled line (step S305, step S306; YES, step S307). The same processing is repeated for every ruled-line run (step S308).
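The two passes of FIG. 3 might look like the sketch below. The text leaves open what "within a predetermined value" measures, so this sketch assumes it means a row gap of at most `max_gap` between runs whose column ranges overlap; `min_length_a` stands for the value A, and all names are hypothetical.

```python
def find_ruled_lines(runs, min_length_a, max_gap):
    """Group long runs into ruled lines.

    runs: (row, start_col, end_col) tuples, end inclusive.
    Returns bounding boxes (top, left, bottom, right).
    """
    # First pass: keep only runs at least A pixels long.
    ruled_runs = [r for r in runs if r[2] - r[1] + 1 >= min_length_a]
    lines = []
    # Second pass: merge each ruled-line run into a nearby line if one exists.
    for row, start, end in sorted(ruled_runs):
        for i, (top, left, bottom, right) in enumerate(lines):
            near = row - bottom <= max_gap
            overlaps = start <= right and end >= left
            if near and overlaps:
                lines[i] = (top, min(left, start), max(bottom, row), max(right, end))
                break
        else:
            lines.append((row, start, row, end))
    return lines
```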
[0016]
Next, an example of the collapsed-region candidate extraction process in step S104 of FIG. 1 will be described with reference to FIG. 4. All the ruled lines extracted in step S103 of FIG. 1 are counted, and for each it is determined whether its length is equal to or less than a predetermined value B (steps S401 and S402). Ruled lines whose length is equal to or less than B are registered as short ruled lines (step S402; YES, step S403). This is done for all ruled lines (step S404). Next, all the short ruled lines are counted, and a search is made for short ruled lines that completely cross one another (steps S405 and S406). In the example of FIG. 5(a), short ruled lines A and B do not merely touch but cross completely. Lines that merely meet, as shown in FIG. 5(c), may be part of a frame, so the condition is that the lines pass completely through one another. If short ruled lines are in the positional relationship shown in FIG. 5(a), a region containing them is registered (step S406; YES, step S407). If, as shown in FIG. 5(b), short ruled line C also crosses completely, the region registered in step S407 of FIG. 4 is grown. The same processing is applied to all short ruled lines to extract the regions (step S408).
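The complete-crossing test and the region growing can be illustrated as below. The line representation `(kind, position, lo, hi)`, with kind `"h"` or `"v"`, is an assumption made for this sketch, as are the function names; `max_length_b` stands for the value B. Strict inequalities are used so that lines meeting only at an end point, as at the corner of a frame, do not count as crossing.

```python
def crosses_completely(a, b):
    """True if one line is horizontal, the other vertical, and each passes through the other."""
    if a[0] == b[0]:
        return False
    h, v = (a, b) if a[0] == "h" else (b, a)
    _, y, x0, x1 = h
    _, x, y0, y1 = v
    return x0 < x < x1 and y0 < y < y1


def bounding_box(line):
    """Return (left, top, right, bottom) for a line."""
    kind, pos, lo, hi = line
    return (lo, pos, hi, pos) if kind == "h" else (pos, lo, pos, hi)


def extract_candidate_regions(lines, max_length_b):
    """Return bounding boxes of groups of short lines linked by complete crossings."""
    short = [l for l in lines if l[3] - l[2] + 1 <= max_length_b]
    groups = []
    for line in short:
        hit, rest = [], []
        for g in groups:
            (hit if any(crosses_completely(line, m) for m in g) else rest).append(g)
        # Growing: the new line joins, and fuses, every group it crosses.
        groups = rest + [[line] + [m for g in hit for m in g]]
    regions = []
    for g in groups:
        if len(g) < 2:
            continue  # a line with no crossing forms no region
        boxes = [bounding_box(l) for l in g]
        regions.append((min(b[0] for b in boxes), min(b[1] for b in boxes),
                        max(b[2] for b in boxes), max(b[3] for b in boxes)))
    return regions
```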
[0017]
Next, the image characteristic determination process in step S105 of FIG. 1 will be described with reference to FIG. 6. The collapsed-region candidates extracted in step S104 of FIG. 1 are mostly extracted on characters (see FIG. 5(d)). The normal character size is therefore estimated, or is supplied in advance as information by some means, and a region larger than that size is judged to be in a state where "characters are crushed and neighboring characters touch and join one another". Accordingly, in FIG. 6, among all the extracted regions, those whose area is equal to or greater than a predetermined value C are registered as collapsed regions (step S501, step S502; YES, step S503). This is done for all regions (step S504). If the number of regions registered as collapsed regions in step S503 is equal to or greater than a predetermined number D, the image is determined to be a collapsed image (step S505; YES, step S506).
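The decision of FIG. 6 reduces to two thresholds, which might be written as follows. The box format `(left, top, right, bottom)` with inclusive coordinates and the names `min_area_c` and `min_count_d` (standing for C and D) are assumptions of this sketch.

```python
def is_collapsed_image(regions, min_area_c, min_count_d):
    """Return (decision, collapsed_regions) for a list of candidate boxes."""
    collapsed = [r for r in regions
                 if (r[2] - r[0] + 1) * (r[3] - r[1] + 1) >= min_area_c]
    # The image is judged collapsed when enough large regions were found.
    return len(collapsed) >= min_count_d, collapsed
```

In practice C would be derived from the expected character size, as the paragraph above describes.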
[0018]
Another example of the image characteristic determination process in step S105 of FIG. 1 will be described with reference to FIG. 7. For the collapsed-region candidates extracted in step S104 of FIG. 1, the reliability of the ruled lines contained in each region is evaluated to decide whether the region is a collapsed region (steps S601 to S605), which further improves the accuracy of image characteristic determination. Taking FIG. 5(a) as an example, it is determined whether ruled lines A and B are misrecognized ruled lines. Any determination method may be used. As one example, the black pixels in the vicinity of ruled line A are used: if the black-pixel density is high, the ruled line is judged to have been misrecognized on a character. If the proportion of misrecognized ruled lines among the ruled lines making up a collapsed-region candidate is equal to or greater than a predetermined value Z, the candidate is determined to be a collapsed region (step S603; YES). If the number of regions registered as collapsed regions in step S604 is equal to or greater than the predetermined number D, the image is determined to be a collapsed image (step S606; YES, step S607).
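One way to realize the density-based reliability check is sketched below. The patent allows any determination method and names only the ratio threshold Z, so the window `margin` and the `density_threshold` parameter are assumptions introduced for illustration, as are the function names.

```python
def black_density(image, box, margin):
    """Fraction of black pixels in the box (left, top, right, bottom) widened by margin."""
    left, top, right, bottom = box
    rows = range(max(0, top - margin), min(len(image), bottom + margin + 1))
    cols = range(max(0, left - margin), min(len(image[0]), right + margin + 1))
    pixels = [image[y][x] for y in rows for x in cols]
    return sum(pixels) / len(pixels)


def is_collapsed_region(image, line_boxes, density_threshold, ratio_z, margin=1):
    """True if the share of lines judged misrecognized is at least Z."""
    misrecognized = sum(
        1 for box in line_boxes
        if black_density(image, box, margin) >= density_threshold)
    return misrecognized / len(line_boxes) >= ratio_z
```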
[0019]
In the above embodiment, the condition that the area of a collapsed-region candidate is at least the character size is used as the basis for the determination. However, if the character size is unknown, or the assumed character size differs from that of the actual recognition target, the accuracy of the image characteristic determination may deteriorate. This is addressed by using the variation (variance) of the areas of all the collapsed-region candidates. That is, if the extracted collapsed-region candidates are uniform in area, with no variation, they can be judged to be regions extracted on individual characters that are not collapsed. In step S701 of FIG. 8, if the number of collapsed-region candidates is equal to or greater than a predetermined number Y (step S701; YES), the variance is calculated, and if it is equal to or greater than a predetermined value E, the image is determined to be a collapsed image (step S702, step S703; YES, step S704).
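The variance test of FIG. 8 can be sketched with the standard library. The text does not say whether a population or a sample variance is meant; this sketch assumes the population variance, and `min_count_y` and `min_variance_e` stand for Y and E.

```python
from statistics import pvariance


def is_collapsed_by_variance(areas, min_count_y, min_variance_e):
    """Judge the image from the spread of the candidate-region areas."""
    if len(areas) < max(min_count_y, 1):
        return False  # too few candidates to judge
    # Uniform areas suggest one region per intact character.
    return pvariance(areas) >= min_variance_e
```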
[0020]
FIG. 9 is a flowchart showing the overall operation of the image characteristic processing method according to the second embodiment of the present invention. First, an image from a scanner or the like is converted into binarized data by the binary image input unit, and a compressed image is created to speed up processing (steps S801 and S802). Next, ruled lines are recognized by the ruled-line recognition process (step S803). A ruled-line feature determination process is then executed using the extracted ruled lines (step S804). Finally, the characteristics of the image are determined using the resulting ruled-line feature information (step S805).
[0021]
An example of the ruled-line recognition process in step S803 of FIG. 9 will be described with reference to FIG. 10. All the ruled lines extracted in step S803 of FIG. 9 are counted, and for each it is determined whether it stands alone, without touching any other ruled line (steps S901 and S902). A ruled line that stands alone is registered as an isolated ruled line (step S902; YES, step S903). This is done for all ruled lines (step S904). Next, the isolated ruled lines are counted, and for each the black-pixel density in its immediate surroundings is obtained; if that density is equal to or greater than a predetermined value G (step S905, step S906; YES), the line is judged to be a ruled line misrecognized on a collapsed character, and its registration as a ruled line is invalidated (step S907). Once this determination has been completed for all isolated ruled lines (step S908; YES), if the number of invalidated ruled lines is equal to or greater than a predetermined value H, the image is determined to be a collapsed image (step S909; YES, step S910).
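The flow of FIG. 10 might be sketched as follows. Several details are assumptions made for illustration: ruled lines are given as boxes `(left, top, right, bottom)`, adjacent boxes count as touching, and the surrounding density is measured in a window of `margin` pixels with the line's own pixels excluded. `density_g` stands for G; comparing the returned count with H gives the final decision.

```python
def boxes_touch(a, b):
    """True if two boxes overlap or are directly adjacent."""
    return (a[0] <= b[2] + 1 and b[0] <= a[2] + 1 and
            a[1] <= b[3] + 1 and b[1] <= a[3] + 1)


def surrounding_density(image, box, margin):
    """Black-pixel density around a line, ignoring the line itself."""
    left, top, right, bottom = box
    total = black = 0
    for y in range(max(0, top - margin), min(len(image), bottom + margin + 1)):
        for x in range(max(0, left - margin), min(len(image[0]), right + margin + 1)):
            if left <= x <= right and top <= y <= bottom:
                continue  # skip the ruled line's own pixels
            total += 1
            black += image[y][x]
    return black / total if total else 0.0


def count_invalidated(image, line_boxes, density_g, margin=2):
    """Count isolated ruled lines whose surroundings are dense with black pixels."""
    invalid = 0
    for i, box in enumerate(line_boxes):
        if any(boxes_touch(box, other)
               for j, other in enumerate(line_boxes) if j != i):
            continue  # not isolated, so not examined
        if surrounding_density(image, box, margin) >= density_g:
            invalid += 1
    return invalid
```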
[0022]
Further, another example of the ruled line recognition process in step S803 in FIG. 9 will be described with reference to FIG. 11. All the ruled lines extracted in step S803 in FIG. 9 are counted, and for each ruled line it is determined whether it exists alone without touching any other ruled line (steps S1001 and S1002). If it exists alone, it is registered as a single ruled line (step S1002; YES, step S1003). This processing is performed for all ruled lines (step S1004). Next, a run extraction process is performed on each ruled line determined to be a single ruled line, with the range of that ruled line as the processing target (steps S1005 and S1006). If the number of extracted runs is equal to or greater than a predetermined value F, the ruled line is determined to be a blurred ruled line (step S1007; YES, step S1008). This processing is performed for all the single ruled lines (step S1009). Next, if the number of ruled lines determined to be blurred ruled lines is equal to or greater than a predetermined value Y, the image is determined to be a blurred image (step S1010; YES, step S1011).
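The blurred-line test can be illustrated with a short sketch. It assumes that each single ruled line is represented by one scan line of 0/1 pixels taken along the ruled line's range; that representation and the names `count_runs`, `is_blurred_line`, `is_blurred_image`, `run_threshold_f` (for the value F) and `line_threshold` (for the value Y) are hypothetical and do not come from the embodiment.

```python
def count_runs(pixels):
    """Count maximal runs of black (1) pixels in a 1-D sequence."""
    runs = 0
    previous = 0
    for value in pixels:
        if value and not previous:
            runs += 1  # a white-to-black transition starts a new run
        previous = value
    return runs


def is_blurred_line(pixels, run_threshold_f):
    """A ruled line that breaks into many runs is treated as blurred."""
    return count_runs(pixels) >= run_threshold_f


def is_blurred_image(lines, run_threshold_f, line_threshold):
    """Apply the per-line test, then compare the blurred-line count to the threshold."""
    blurred = sum(1 for pixels in lines if is_blurred_line(pixels, run_threshold_f))
    return blurred >= line_threshold
```

A cleanly scanned ruled line yields a single run over its range, while a faint one is interrupted by white gaps and yields several, which is why the run count serves as the blur indicator.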
[0023]
Next, FIG. 12 is a block diagram showing a system configuration of the present invention. That is, this figure shows hardware built around a microprocessor or the like that executes software implementing the image characteristic determination processing method of each of the above embodiments. In the figure, the image characteristic determination processing system includes an interface (hereinafter abbreviated as I/F) 121, a CPU 122, a ROM 123, a RAM 124, a display device 125, a hard disk 126, a keyboard 127, and a CD-ROM drive 128. A general-purpose processing device is prepared, and a program for executing the image characteristic determination processing method of the present invention is recorded in a readable storage medium 129 such as a CD-ROM. A control signal is input from an external device via the I/F 121, and the program of the present invention is started either by an operator's instruction from the keyboard 127 or automatically. The CPU 122 then performs the above-described image characteristic determination processing according to the program, stores the processing result in a storage device such as the RAM 124 or the hard disk 126, and outputs it to the display device 125 or the like as necessary. As described above, by using a medium storing a program for executing the image characteristic determination processing method of the present invention, an image characteristic determination processing system can be constructed on general-purpose equipment without changing an existing system.
[0024]
The present invention is not limited to the above embodiments, and it goes without saying that various modifications and substitutions are possible within the scope of the claims.
[0025]
[Effect of the invention]
As described above, the image characteristic determination processing apparatus according to the present invention comprises: run extraction means that checks black pixel connectivity line by line in the main scanning direction and the sub-scanning direction of optically read image data and, when the connection of black pixels is interrupted, takes the block of black pixels as a run; ruled line run determination means that takes a run extracted by the run extraction means whose length is equal to or greater than a predetermined value as a ruled line run; ruled line extraction means that counts the ruled line runs determined by the ruled line run determination means and takes ruled line runs lying within a predetermined value of one another as a ruled line; short ruled line determination means that takes a ruled line extracted by the ruled line extraction means whose length is smaller than a predetermined value as a short ruled line; region extraction means that extracts a region containing short ruled lines, determined by the short ruled line determination means, that intersect one another; collapsed image determination means that takes a region extracted by the region extraction means whose area is equal to or greater than a predetermined value as a collapsed region and, when the number of collapsed regions is equal to or greater than a predetermined value, takes those collapsed regions as a collapsed image; and image characteristic determination means that determines the characteristics of the image to be recognized based on the proportion occupied by the collapsed image determined by the collapsed image determination means. Accordingly, the image characteristics are determined using short solid lines that touch one another, and useful information for improving the identification accuracy of ruled lines and characters can be provided.
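The first two means in this summary, run extraction and ruled line run determination, can be sketched for a single scan line as follows. This is an illustrative sketch only: the representation of a line as a sequence of 0/1 pixels, the `(start, length)` tuple format, and the names `extract_runs`, `ruled_line_runs` and `min_length` are assumptions made for this example.

```python
def extract_runs(line):
    """Return (start, length) for each maximal run of black (1) pixels in one line."""
    runs = []
    start = None
    for index, value in enumerate(line):
        if value and start is None:
            start = index  # a run begins at the first black pixel
        elif not value and start is not None:
            # The black pixel connection is interrupted: close the run.
            runs.append((start, index - start))
            start = None
    if start is not None:
        runs.append((start, len(line) - start))  # run reaches the end of the line
    return runs


def ruled_line_runs(line, min_length):
    """Keep only runs at least `min_length` long (the 'predetermined value')."""
    return [run for run in extract_runs(line) if run[1] >= min_length]
```

Applying `extract_runs` to each row of a binary image covers one scanning direction; the other direction can be handled the same way on the image's columns, for example by iterating over `zip(*image)`.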
[0026]
Further, the collapsed image determination means takes a region extracted by the region extraction means whose area is equal to or greater than a predetermined value as a collapsed region, obtains the variance value of the areas of the collapsed regions, and, if the obtained variance value is equal to or greater than a predetermined value, takes the regions as a collapsed image. This improves the accuracy of the collapsed image characteristic determination and provides useful information for improving the recognition accuracy of ruled lines and characters.
[0027]
Further, the image characteristic determination processing method according to another invention comprises: a run extraction step of checking black pixel connectivity line by line in the main scanning direction and the sub-scanning direction of optically read image data and, when the connection of black pixels is interrupted, taking the block of black pixels as a run; a ruled line run determination step of taking a run extracted in the run extraction step whose length is equal to or greater than a predetermined value as a ruled line run; a ruled line extraction step of counting the ruled line runs determined in the ruled line run determination step and taking ruled line runs lying within a predetermined value of one another as a ruled line; a short ruled line determination step of taking a ruled line extracted in the ruled line extraction step whose length is smaller than a predetermined value as a short ruled line; a region extraction step of extracting a region containing short ruled lines, determined in the short ruled line determination step, that intersect one another; a collapsed image determination step of taking a region extracted in the region extraction step whose area is equal to or greater than a predetermined value as a collapsed region and, when the number of collapsed regions is equal to or greater than a predetermined value, taking those collapsed regions as a collapsed image; and an image characteristic determination step of determining the characteristics of the image to be recognized based on the proportion occupied by the collapsed image determined in the collapsed image determination step. Accordingly, the image characteristics are determined using short solid lines that touch one another, and useful information for improving the identification accuracy of ruled lines and characters can be provided.
[0028]
In the collapsed image determination step, a region extracted in the region extraction step whose area is equal to or greater than a predetermined value is taken as a collapsed region, the variance value of the areas of the collapsed regions is obtained, and, if the obtained variance value is equal to or greater than a predetermined value, the regions are taken as a collapsed image. This improves the accuracy of the collapsed image characteristic determination and provides useful information for improving the recognition accuracy of ruled lines and characters.
[0029]
Furthermore, another invention is a program that causes a computer to execute: a run extraction procedure of checking black pixel connectivity line by line in the main scanning direction and the sub-scanning direction of optically read image data and, when the connection of black pixels is interrupted, taking the block of black pixels as a run; a ruled line run determination procedure of taking a run extracted by the run extraction procedure whose length is equal to or greater than a predetermined value as a ruled line run; a ruled line extraction procedure of counting the ruled line runs determined by the ruled line run determination procedure and taking ruled line runs lying within a predetermined value of one another as a ruled line; a short ruled line determination procedure of taking a ruled line extracted by the ruled line extraction procedure whose length is smaller than a predetermined value as a short ruled line; a region extraction procedure of extracting a region containing short ruled lines, determined by the short ruled line determination procedure, that intersect one another; a collapsed image determination procedure of taking a region extracted by the region extraction procedure whose area is equal to or greater than a predetermined value as a collapsed region and, when the number of collapsed regions is equal to or greater than a predetermined value, taking those collapsed regions as a collapsed image; and an image characteristic determination procedure of determining the characteristics of the image to be recognized based on the proportion occupied by the collapsed image determined by the collapsed image determination procedure. Accordingly, the image characteristics are determined using short solid lines that touch one another, and useful information for improving the identification accuracy of ruled lines and characters can be provided.
[0030]
Further, another invention is a computer-readable storage medium storing a program for causing a computer to execute the image characteristic determination processing method described above. Accordingly, an image characteristic determination processing system can be constructed on general-purpose equipment without changing an existing system.
[Brief description of the drawings]
FIG. 1 is a flowchart showing an overall operation of an image characteristic processing method according to a first embodiment of the present invention.
FIG. 2 is a flowchart showing an example of a run extraction operation in the first embodiment.
FIG. 3 is a flowchart showing an example of ruled line recognition processing operation in the first embodiment.
FIG. 4 is a flowchart showing an example of a collapsed area candidate extraction processing operation in the first embodiment.
FIG. 5 is a diagram illustrating an example of an image recognition target in the first embodiment.
FIG. 6 is a flowchart illustrating an example of an image characteristic determination processing operation in the first embodiment.
FIG. 7 is a flowchart illustrating another example of the image characteristic determination processing operation in the first embodiment.
FIG. 8 is a flowchart illustrating another example of a collapsed area candidate extraction processing operation according to the first embodiment.
FIG. 9 is a flowchart showing an overall operation of an image characteristic processing method according to a second embodiment of the present invention.
FIG. 10 is a flowchart showing an example of ruled line recognition processing operation in the second embodiment.
FIG. 11 is a flowchart showing another example of ruled line recognition processing operation in the second embodiment.
FIG. 12 is a block diagram showing a system configuration of the present invention.
[Explanation of symbols]
121; I/F, 122; CPU, 123; ROM, 124; RAM,
125; display device, 126; hard disk, 127; keyboard,
128; CD-ROM drive, 129; storage medium.

Claims

An image characteristic determination processing apparatus comprising:
run extraction means for checking black pixel connectivity line by line in the main scanning direction and the sub-scanning direction of optically read image data and, when the connection of black pixels is interrupted, taking the block of black pixels as a run;
ruled line run determination means for taking a run extracted by the run extraction means whose length is equal to or greater than a predetermined value as a ruled line run;
ruled line extraction means for counting the ruled line runs determined by the ruled line run determination means and taking ruled line runs lying within a predetermined value of one another as a ruled line;
short ruled line determination means for taking a ruled line extracted by the ruled line extraction means whose length is smaller than a predetermined value as a short ruled line;
region extraction means for extracting a region containing short ruled lines, determined by the short ruled line determination means, that intersect one another;
collapsed image determination means for taking a region extracted by the region extraction means whose area is equal to or greater than a predetermined value as a collapsed region and, when the number of collapsed regions is equal to or greater than a predetermined value, taking those collapsed regions as a collapsed image; and
image characteristic determination means for determining the characteristics of the image to be recognized based on the proportion occupied by the collapsed image determined by the collapsed image determination means.

The image characteristic determination processing apparatus according to claim 1, wherein the collapsed image determination means takes a region extracted by the region extraction means whose area is equal to or greater than a predetermined value as a collapsed region, obtains the variance value of the areas of the collapsed regions, and, if the obtained variance value is equal to or greater than a predetermined value, takes the regions as a collapsed image.

An image characteristic determination processing method comprising:
a run extraction step of checking black pixel connectivity line by line in the main scanning direction and the sub-scanning direction of optically read image data and, when the connection of black pixels is interrupted, taking the block of black pixels as a run;
a ruled line run determination step of taking a run extracted in the run extraction step whose length is equal to or greater than a predetermined value as a ruled line run;
a ruled line extraction step of counting the ruled line runs determined in the ruled line run determination step and taking ruled line runs lying within a predetermined value of one another as a ruled line;
a short ruled line determination step of taking a ruled line extracted in the ruled line extraction step whose length is smaller than a predetermined value as a short ruled line;
a region extraction step of extracting a region containing short ruled lines, determined in the short ruled line determination step, that intersect one another;
a collapsed image determination step of taking a region extracted in the region extraction step whose area is equal to or greater than a predetermined value as a collapsed region and, when the number of collapsed regions is equal to or greater than a predetermined value, taking those collapsed regions as a collapsed image; and
an image characteristic determination step of determining the characteristics of the image to be recognized based on the proportion occupied by the collapsed image determined in the collapsed image determination step.

The image characteristic determination processing method according to claim 3, wherein, in the collapsed image determination step, a region extracted in the region extraction step whose area is equal to or greater than a predetermined value is taken as a collapsed region, the variance value of the areas of the collapsed regions is obtained, and, if the obtained variance value is equal to or greater than a predetermined value, the regions are taken as a collapsed image.

A program for causing a computer to execute:
a run extraction procedure of checking black pixel connectivity line by line in the main scanning direction and the sub-scanning direction of optically read image data and, when the connection of black pixels is interrupted, taking the block of black pixels as a run;
a ruled line run determination procedure of taking a run extracted by the run extraction procedure whose length is equal to or greater than a predetermined value as a ruled line run;
a ruled line extraction procedure of counting the ruled line runs determined by the ruled line run determination procedure and taking ruled line runs lying within a predetermined value of one another as a ruled line;
a short ruled line determination procedure of taking a ruled line extracted by the ruled line extraction procedure whose length is smaller than a predetermined value as a short ruled line;
a region extraction procedure of extracting a region containing short ruled lines, determined by the short ruled line determination procedure, that intersect one another;
a collapsed image determination procedure of taking a region extracted by the region extraction procedure whose area is equal to or greater than a predetermined value as a collapsed region and, when the number of collapsed regions is equal to or greater than a predetermined value, taking those collapsed regions as a collapsed image; and
an image characteristic determination procedure of determining the characteristics of the image to be recognized based on the proportion occupied by the collapsed image determined by the collapsed image determination procedure.

A computer-readable storage medium storing the program according to claim 5.