JP3671682B2

JP3671682B2 - Image recognition device

Info

Publication number: JP3671682B2
Application number: JP20491298A
Authority: JP
Inventors: 和弘上田; 祥二今泉
Original assignee: Konica Minolta Business Technologies Inc
Current assignee: Konica Minolta Business Technologies Inc
Priority date: 1998-07-21
Filing date: 1998-07-21
Publication date: 2005-07-13
Anticipated expiration: 2018-07-21
Also published as: JP2000036908A

Description

【０００１】
【発明の属する技術分野】
本発明は、デジタル複写機、ファクシミリなどにおいて画像入力装置から読み取られた原稿画像を認識する画像認識装置に関する。
【０００２】
【従来の技術】
従来、例えば、自動原稿搬送装置（Auto Document Feeder、以下、「ＡＤＦ」という。）を備えたデジタル複写機において、複数枚の原稿を連続して読み取ってコピーする場合、読み取られた原稿の向きに応じてコピーが排出されるので、原稿をＡＤＦにセットする際に、各原稿が一律に同じ方向を向いているか否かを確認する必要があった。しかし、このような確認作業は結構手間がかかるものであり、原稿枚数が多ければ、それだけ反対方向の原稿を見落とすおそれも大きくなる。
【０００３】
このような不都合を避けるため、原稿を読み取り、生成した画像データからまず原稿の方向を判別し（このような原稿の方向の判別を、以下、「天地認識」という。）、その画像出力が紙面に対して適切な方向になるように画像データを回転処理して出力する方法が考えられている。天地認識の方法は種々考案されており、例えば、特開平９−９０４０号公報には、原稿の中で文書画像と認識される部分に関し、主走査方向及び副走査方向についてのヒストグラムを取得することにより、天地認識を行う技術が開示されている。
【０００４】
図１０は、上記従来の天地認識の方法について説明するための図である。同図に示されるように、画像入力装置に入力された文書画像データから、主走査方向及び副走査方向に関して取得されたヒストグラムを用いて、文字行の向きや、行の先頭の文字位置を認識することにより、天地認識を行うことができる。
【０００５】
より具体的には、取得されたヒストグラムの立ち上がりエッジの数と立ち下がりエッジの数、両者の和及び差を、主走査方向のヒストグラム及び副走査方向のヒストグラムについてそれぞれ算出する。図１０に示したような横書きの文書画像であれば、主走査方向のヒストグラム（ａ）の立ち上がりエッジの数と立ち下がりエッジの数がほぼ等しくなる他、両者の和は、副走査方向のヒストグラム（ｂ）と比較して多くなる。主走査方向のヒストグラム（ａ）には、行を表すピークと、行間に相当するピークのない部分が交互に検出されるからである。
【０００６】
また、図１０の例における副走査方向のヒストグラム（ｂ）からは、行の先頭の文字位置を認識することができる。即ち、行の先頭の文字位置は比較的一定しているため、ヒストグラムの立ち上がりエッジが数箇所に集中するのに対し、行の最後の文字位置は、文章によってまちまちであり、従って行の最後の文字位置によるヒストグラムの立ち下がりエッジは、比較的分散して発生するので、その立ち上がりエッジの数と立ち下がりエッジの数との差が大きければ当該原稿の上下を認識できる。
【０００７】
以上のような認識により、文字行の向きと、行の先頭の文字位置を検出することができ、かかる検出結果に基づいて天地認識を行うことができる。これらの認識は、縦書きの原稿についても適用することが可能である。
【０００８】
【発明が解決しようとする課題】
しかしながら、上記従来の技術は、例えば原稿がＡＤＦを介してセットされる際に傾いてしまったような場合に、正確な天地認識ができない場合があるという問題を有していた。
【０００９】
即ち、原稿が傾いてしまったような場合には、図１１に示すように行間や、先頭の文字位置を正確に認識することが可能なヒストグラムを取得することができず、取得されたヒストグラムを用いて天地認識を行っても正確な認識を行うことができないからである。このような問題は、原稿に記載された文字行自体が傾いているような場合にも起こりうる。
【００１０】
本発明は、上記の問題点に鑑み、原稿が傾いている場合においても、天地認識を行うことができる画像認識装置を提供することを目的とする。
【００１１】
本発明の第２の目的は、原稿に記載された文字行自体が傾いているような場合においても、より正確な天地認識を行うことができる画像認識装置を提供することである。
【００１２】
【課題を解決するための手段】
上記課題を解決するために、本発明の第１の画像認識装置は、原稿を読み取り第１の画像データを生成する画像読み取り手段と、前記原稿の傾きを検出する原稿傾き検出手段と、前記原稿傾き検出手段により検出された原稿の傾きに基づいて、前記画像読み取り手段により生成された第１の画像データの傾きを補正して、第２の画像データを生成する傾き補正手段と、前記画像読み取り手段により生成された第１の画像データ、及び、前記傾き補正手段により生成された第２の画像データのうち、何れの画像データに基づいて原稿の方向を認識するかを判断する判断手段と、前記判断手段により判断された第１の画像データ、及び第２の画像データのうちの何れかの画像データに基づいて、原稿の方向を認識する原稿方向認識手段とを備えることを特徴とする。
【００１３】
また、本発明の第２の画像認識装置は、原稿を読み取り画像データを生成する画像読み取り手段と、前記原稿の傾きを検出する原稿傾き検出手段と、前記原稿傾き検出手段により検出された原稿の傾きに基づいて、前記画像読み取り手段により生成された第１の画像データの傾きを補正して、第２の画像データを生成する傾き補正手段と、前記画像読み取り手段により生成された第１の画像データ、及び前記傾き補正手段により生成された第２の画像データのそれぞれに基づいて、原稿の方向の認識を行う場合の、認識の信頼度に関する情報を取得する信頼度取得手段と、前記信頼度取得手段により第１の画像データに基づいて取得された信頼度に関する情報と、前記信頼度取得手段により第２の画像データに基づいて取得された信頼度に関する情報とを比較する比較手段と、前記比較手段による比較の結果、信頼度が高い方の画像データに基づいて、原稿の方向を認識する原稿方向認識手段とを備えることを特徴とする。
【００１４】
【発明の実施の形態】
本発明に係る画像認識装置の一適用例であるデジタル複写機（以下、単に「複写機」という。）について、以下に添付の図面を参照しながら説明する。
【００１５】
（１）複写機全体の構成
まず、本発明に係る画像認識装置が適用される複写機の全体の構成を図１により説明する。
同図に示すように、この複写機は、ＡＤＦ１０と、画像読取部３０と、プリンタ部５０と、給紙部７０とからなる。
【００１６】
ＡＤＦ１０は、原稿を自動的に画像読取部３０に搬送する装置であって、原稿給紙トレイ１１に載置された原稿は、給紙ローラ１２、捌きローラ１３により１枚ずつ分離されて下方に送られ、搬送ベルト１４によって、プラテンガラス３１上の原稿読取位置まで搬送される。
原稿読取位置に搬送された原稿は、画像読取部３０のスキャナ３２によりスキャンされた後、再び、搬送ベルト１４により図の右方向に送られ、排紙ローラ１５を経て原稿排紙トレイ１６上に排出される。
【００１７】
画像読取部３０は、上記プラテンガラス３１の原稿読取位置に搬送された原稿の画像を光学的に読み取るものであって、スキャナ３２、ＣＣＤカラーイメージセンサ（以下、単に「ＣＣＤセンサ」という。）３８などから構成される。
【００１８】
スキャナ３２には、露光ランプ３３と、この露光ランプ３３の照射による原稿からの反射光をプラテンガラス３１に平行な方向に光路変更するミラー３４が設置され、図の矢印方向に移動することによりプラテンガラス３１上の原稿をスキャンする。原稿からの反射光はミラー３４に反射された後、さらにミラー３５、３６および集光レンズ３７を介してＣＣＤセンサ３８まで導かれ、ここで電気信号に変換されて画像データが生成される。
【００１９】
当該画像データは、制御部１００内の画像信号処理部１１０（図２参照）においてＡ／Ｄ変換されてデジタル信号となり、シェーディング補正や濃度変換処理を加えられ、さらに公知の誤差拡散処理等を加えられた後、高解像度画像メモリ１２０（同図２）に格納される。高解像度画像メモリ１２０に格納された画像データは、後述するようにＣＰＵ１９０でなされた天地認識の結果に応じて回転処理され、プリンタ部５０のレーザダイオード（以下、「ＬＤ」とも表記する。）５１の駆動信号となる。
【００２０】
プリンタ部５０は、公知の電子写真方式により記録シート上に画像を形成するものであって、上記駆動信号を受信するとレーザダイオード５１を駆動してレーザ光を出射させる。レーザ光は、所定の角速度で回転するポリゴンミラー５２側面のミラー面で反射され、ｆθレンズ５３、ミラー５４、５５を介して、感光体ドラム５６の表面を露光走査する。
【００２１】
この感光体ドラム５６は、上記露光を受ける前にクリーニング部５７で感光体表面の残留トナーを除去され、さらにイレーサランプ（図示せず）の照射を受けて除電された後、帯電チャージャ５８により一様に帯電されており、このように一様に帯電した状態で上記露光を受けると、感光体ドラム５６表面に静電潜像が形成される。
【００２２】
現像器５９は、感光体ドラム５６表面に形成された上記静電潜像を可視化する。
【００２３】
一方、給紙部７０には、２つの用紙カセット７１、７２が設けられており、上述の感光体ドラム５６の露光および現像の動作と同期して、必要なサイズの記録シートが、用紙カセット７１、７２のいずれかから、給紙ローラ７１１もしくは７２１の駆動により給紙される。給紙された記録シートは、感光体ドラム５６の下方で当該感光体ドラム５６の表面に接触し、転写チャージャ６０の静電力により、感光体ドラム５６表面に形成されていたトナー像が当該記録シート表面に転写される。
【００２４】
その後、記録シートは、分離チャージャ６１の静電力によって感光体ドラム５６の表面から引き剥され、搬送ベルト６２により定着部６３に搬送される。
記録シートに転写されたトナー像は、定着部６３において内部にヒータを備えた定着ローラ６４で加熱されながら押圧されることにより定着される。定着後の記録シートは、排出ローラ６５により排紙トレイ６６上に排出される。
【００２５】
また、画像読取部３０の前面の操作しやすい位置には、操作パネル９０が設けられており、コピー枚数を入力するテンキーやコピー開始を指示するスタートキー、各種のコピーモードを設定するための設定キー、上記設定キーなどにより設定されたモードをメッセージで表示する表示部などが設けられている。
【００２６】
（２）制御部１００の構成
次に、本実施の形態において、上記複写機の内部に設置される制御部１００の構成を、図２のブロック図を参照しながらより詳細に説明する。
同図に示すように制御部１００は、画像信号処理部１１０と、高解像度画像メモリ１２０と、傾き検出部１２５と、回転処理部１３０と、ＬＤ駆動部１４０と、解像度変換部１５０と、第１低解像度認識用メモリ１６０と、傾き補正部１７０と、第２低解像度認識用メモリ１８０と、ＣＰＵ１９０等からなる。
【００２７】
画像信号処理部１１０は、Ａ／Ｄコンバータ、シェーディング補正部、ＭＴＦ補正部や、変倍部、γ補正部などを備えており、ＣＣＤセンサ３８より入力された原稿の画像データは、Ａ／Ｄコンバータでデジタルの多値信号に変換され、シェーディング補正部で露光ランプ３３の照度ムラやＣＣＤセンサ３８の感度ムラが補正された後、ＭＴＦ補正部でエッジ強調などの画質改善のための処理を受け、さらに変倍部やγ補正部でそれぞれ変倍処理、γ補正処理を加えられた後に、高解像度画像メモリ１２０および解像度変換部１５０に送られる。
【００２８】
なお、本実施の形態のＣＣＤセンサ３８からは、反射率濃度変換された濃度データが出力されているものとする。
【００２９】
傾き検出部１２５は、読み取られた原稿の傾き検出処理を行う。傾き検出処理の詳細については後述する。
回転処理部１３０は、高解像度画像メモリ１２０から目的のページの画像データを読み出して、ＣＰＵ１９０からの回転角情報に基づき、必要に応じて画像データを回転処理してから、ＬＤ駆動部１４０に転送する。なお、この回転処理は、画像データのメモリアドレスを変更する公知の技術（例えば、特開昭６０ー１２６７６９号公報参照）によってなされる。
【００３０】
ＬＤ駆動部１４０は、高解像度画像メモリ１２０から出力された画像データについて、内部のＲＯＭに格納された制御プログラムに基づいてＬＤ５１へ送り、記録シートへの画像形成を実行する。
【００３１】
解像度変換部１５０は、画像信号処理部１１０を経由した高解像度画像データを低解像度の画像データに変換する。解像度変換された画像データは、まず第１低解像度認識用メモリ１６０に書き込まれる。本実施の形態では、ＣＣＤセンサ３８で読み取られた４００ＤＰＩまたは６００ＤＰＩの画像データを、２５ＤＰＩまたは４０ＤＰＩの低解像度に変換する。解像度変換は、具体的には、例えば縦４画素×横４画素の１６画素を取り出し、取り出された１６画素の濃度の最大値を取得して、それを１画素の濃度とする処理を、所定の解像度となるまで繰り返し実行することにより行うことができる。
【００３２】
傾き補正部１７０は、傾き検出部１２５において検出された原稿の傾きに基づいて、第１低解像度認識用メモリ１６０に格納された画像データの補正を行う。ここで、補正とは、具体的には、検出された原稿の傾き分だけ、画像データを回転させることをいう。なお、本実施の形態の説明では、回転処理部１３０において行われる天地認識に基づく回転処理と区別するため、傾き補正のための回転処理を「傾き補正処理」という。
【００３３】
傾き補正部１７０において傾き補正処理された低解像度画像データは、第２低解像度認識用メモリ１８０に格納される。
本実施の形態の画像認識装置では、第２低解像度認識用メモリ１８０に格納される傾き補正処理された低解像度画像データに基づいて天地認識を行うこともできるし、後述の如く、第１低解像度認識用メモリ１６０に格納された画像データ、及び第２低解像度認識用メモリ１８０に格納された画像データについて、それぞれの画像データを天地認識に用いた場合の認識信頼度を求め、より信頼度が高いと判定された方の画像データを用いて天地認識を行うようにすることもできる。
【００３４】
即ち、本実施の形態の画像認識装置では、前述の特開平９−９０４０号公報に開示されている方法を用いて天地認識を行うこととしているが、当該天地認識に用いるヒストグラムの取得にあたっては、原稿の傾き分を補正すれば、正確な天地認識を行うことが可能なヒストグラムを取得することができる場合が多いと考えられる。原稿中の文字行は、原稿と平行に記載されているのが通常であるからである。
【００３５】
しかしながら、例えば原稿に記載されている文字行自体が、原稿に対して傾いているような場合には、原稿の傾き分を補正しても、却って正確な天地認識が妨げられる場合も有り得る。従って、本実施の形態の画像認識装置では、傾き補正前の画像データと、傾き補正後の画像データとについて、それぞれを天地認識に用いた場合の認識信頼度を求め、信頼度の高い方の画像データを用いて天地認識を行うようにすることもできるようにしている。信頼度の算出の方法については後述する。
【００３６】
（３）制御部１００の処理
本実施の形態の制御部１００は、ＣＣＤセンサ３８より読み取られた画像データについて、原稿の傾きの検出処理（以下、「傾き検出処理」という。）、傾きの補正処理（以下、「傾き補正処理」という。）、傾き補正処理が行われていない画像データ、及び傾き補正処理を行った画像データからのヒストグラム取得処理（ヒストグラム取得処理は、傾き補正処理を行った画像データのみについて実行するようにしてもよい。）、取得されたヒストグラムから、原稿方向の認識に用いる場合の認識の信頼度を取得する信頼度取得処理、取得された信頼度を比較する信頼度比較処理、比較の結果、より信頼度の高いヒストグラムに基づいて原稿の方向を認識する天地認識処理を行う。この天地認識処理自体は、上記従来技術と同様にヒストグラムのエッジカウントにより行う。
【００３７】
傾き検出処理、傾き補正処理、ヒストグラム取得処理、信頼度取得処理、信頼度比較処理、天地認識処理は、第１低解像度認識用メモリ１６０、及び第２低解像度認識用メモリ１８０に格納された解像度変換処理後の画像データに基づいてＣＰＵ１９０で実行される。
【００３８】
以下、ＣＰＵ１９０の処理内容について詳細に説明するが、これらの処理を指示するプログラムはＲＯＭ１９２に格納されており、また必要に応じてＲＡＭ１９１が作業用のメモリ領域として利用される。
【００３９】
なお、ヒストグラム取得処理を、傾き補正後の画像データのみについて実行した場合には、信頼度取得処理、信頼度比較処理は不要となり、天地認識処理を、ヒストグラム取得処理にて得られた、傾き補正後の画像データについてのヒストグラムに基づいて行えばよい。以下、本実施の形態の制御部１００の処理内容について説明する。
【００４０】
図３は、信頼度取得処理等を行う場合の本実施の形態における制御部１００の処理内容を示すフローチャートである。以下、同図に基づいて、制御部１００の処理内容について詳細に説明する。
【００４１】
（３−１）原稿の傾き検出処理
制御部１００は、まずＣＣＤセンサ３８による原稿読み取り結果に基づいて原稿の傾き検出処理を行う（Ｓ３０１）。
ここで、原稿ガラスで読み取られる原稿が傾いている場合の傾き検出処理について詳細に説明する。この処理は、傾き検出部１２５によって行われる。図４に示すように、原稿４００は、右上端を基準とする原稿ガラス３１の上に原稿４００の複写面を下向きに置かれている。原稿ガラス３１の長手方向がスキャン読み取りの副走査方向であり、それに垂直な方向が主走査方向である。同図に示した例では、原稿４００は、画像基準から離れて置かれ、その位置は副走査方向と平行ではない。この例では、原稿４００は読み取り領域からはみ出ていない。
【００４２】
また、本実施の形態の複写機では、原稿押さえ（ＡＤＦの搬送ベルト１４）を橙色に着色してあり、露光ランプの光の原稿押さえによる反射光が、ＣＣＤセンサ３８にとって、分光感度が小さい色となるようにしている。即ち、原稿押さえの橙色は、ＣＣＤセンサ３８にとって黒色であるのと同じに考えることができる一方、原稿の地肌は、通常白色であるため、原稿と原稿押さえとを区別することが可能である。このことを利用して、原稿のエッジを検出し、原稿の傾き角度を検出することができる。
【００４３】
図５と図６は、原稿４００（破線で示す）が読み取り領域（実線で示す長方形）からはみ出ていない場合の読み取った画像のイメージを示す。
画像信号処理部１１０では、少なくとも原稿の範囲を含む長方形領域の画像データを処理して、原稿が存在する原稿領域の検出を行い、原稿の周囲（すなわちエッジ部）のすべての座標から、図に示すように長方形の原稿の四隅の座標を検出する。ここで、主走査方向がＸ軸であり、副走査方向がＹ軸であるとする。ＸmaxとＸminとは、最大と最小のＸ座標であり、ＹmaxとＹminとは、最大と最小のＹ座標である。
【００４４】
残りの２つのＸ座標については、Ｙ座標がＹminである頂点のＸ座標をＸ1とし、Ｙ座標がＹmaxである頂点のＸ座標をＸ2とする。また、残りの２つのＹ座標のうちＸ座標がＸminである頂点のＹ座標をＹ1とし、Ｘ座標がＸmaxである頂点のＹ座標をＹ2とする。
従って、図５の例では、原稿の四隅のＸ座標とＹ座標は、（Ｘ1，Ｙmin）、（Ｘmax，Ｙ2）、（Ｘmin，Ｙ1）、（Ｘ2，Ｙmax）となる。原稿４００の４辺の長さを、ａ、ｂ、ｃ、ｄとすると、各辺の長さは原稿の四隅の座標から下記の（数１）を用いて計算できる。
【００４５】
【数１】

【００４６】
また、図６の例では、原稿の傾きの方向は図５の場合と異なるが、原稿の四隅の座標は、（Ｘmin，Ｙ1）、（Ｘ1，Ｙmin）、（Ｘ2，Ｙmax）、（Ｘmax，Ｙ2）となる。従って、この場合にも、四辺の長さａ、ｂ、ｃ、ｄは上記（数１）に示した式で計算できる。
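As a rough illustration of the corner extraction described above, the sketch below picks the four extreme points from a list of edge coordinates and measures the four sides as distances between adjacent corners. Equation (数１) is not reproduced in this text, so the use of the plain Euclidean distance is an assumption, as are the function names `find_corners` and `side_lengths`. The sides are returned unlabelled because the assignment of a, b, c, d to particular sides cannot be recovered here.

```python
import math

def find_corners(edge_points):
    """Return the corners (X1, Ymin), (Xmax, Y2), (X2, Ymax), (Xmin, Y1).

    edge_points: iterable of (x, y), x = main scanning, y = sub-scanning.
    Assumes a tilted rectangle, so each extreme is reached at one vertex only.
    """
    pts = list(edge_points)
    top = min(pts, key=lambda p: p[1])     # vertex with Y = Ymin -> (X1, Ymin)
    right = max(pts, key=lambda p: p[0])   # vertex with X = Xmax -> (Xmax, Y2)
    bottom = max(pts, key=lambda p: p[1])  # vertex with Y = Ymax -> (X2, Ymax)
    left = min(pts, key=lambda p: p[0])    # vertex with X = Xmin -> (Xmin, Y1)
    return top, right, bottom, left

def side_lengths(corners):
    """Euclidean length of each side, walking the corners in order (assumed form)."""
    n = len(corners)
    return [math.dist(corners[i], corners[(i + 1) % n]) for i in range(n)]

# Hypothetical edge points of a tilted rectangle (four vertices plus two edge points).
corners = find_corners([(2, 0), (10, 4), (8, 8), (0, 4), (6, 2), (1, 2)])
lengths = side_lengths(corners)
```

Opposite sides come out equal in this example, which is a simple sanity check that the four detected points really form a rectangle.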
【００４７】
ここで、原稿が傾いていた場合の画像データを補正するための編集処理パラメータは、次のようになる。まず、Ｘ1−Ｘmin＜Ｙ1−Ｙminの場合（図５）、
回転座標：（Ｘ1，Ｙmin）
回転角θ： tan^-1｛（Ｘ1−Ｘmin）／（Ｙ1−Ｙmin）｝
【００４８】
また、Ｘ1−Ｘmin＞Ｙ1−Ｙminの場合（図６）、
回転座標：（Ｘmin，Ｙ1）
回転角θ： tan^-1｛（Ｙ1−Ｙmin）／（Ｘ1−Ｘmin）｝
【００４９】
ここで、回転座標とは、図の左上端に近い隅の座標である。また、回転角θは、上記回転座標の位置を基準として画像データを回転し、読み取り領域に対して平行にするための回転角度である。ここで、検出された編集処理パラメータが傾き補正処理に用いられる。
【００５０】
以上に説明した内容に基づいた、傾き検出処理における制御部１００の処理内容について以下に説明する。
【００５１】
図７は、傾き検出処理の詳細な処理内容を示すフローチャートである。同図に示されるように、傾き検出処理では、まず原稿領域においてエッジ部の座標を抽出し、抽出した複数の座標からなる線分を原稿領域の各辺とみなす（Ｓ７０１）。次に、Ｘmin点とＸmax点の座標及びＹmin点とＹmax点の座標を抽出する（Ｓ７０２、Ｓ７０３）。
【００５２】
Ｘ1−Ｘmin＜Ｙ1−Ｙminであれば（Ｓ７０４：Ｙｅｓ）、−ｔａｎ^-1（（Ｘ1−Ｘmin）／（Ｙ1−Ｙmin））を傾き角θ’とする（Ｓ７０５）。
傾き角θ’は、上記編集処理パラメータとして求められた回転角θに加えて回転方向を考慮するようにしたものである。
【００５３】
また、Ｘ1−Ｘmin＞Ｙ1−Ｙminであれば（Ｓ７０６：Ｙｅｓ）、ｔａｎ^-1（（Ｙ1−Ｙmin）／（Ｘ1−Ｘmin））を傾き角θ’とする（Ｓ７０７）。
また、Ｘ1−Ｘmin＝Ｙ1−Ｙmin＝０であれば（Ｓ７０８：Ｙｅｓ）、主走査方向及び副走査方向のいずれについても原稿が傾いていないと考えられる場合であるから、傾き角θ’を０とする（Ｓ７０９）。
【００５４】
以上のいずれにも該当しない場合、即ち、Ｘ1−Ｘmin＝Ｙ1−Ｙminであって、両者の値が０でない場合には、傾き角θ’を４５°と設定する（Ｓ７１０）。
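The branch structure of steps S704 to S710 can be sketched as follows. The function name `tilt_angle_deg` and the choice of degrees as the unit are assumptions made for illustration; the case analysis and the signs follow the text above.

```python
import math

def tilt_angle_deg(x1, x_min, y1, y_min):
    """Signed tilt angle θ' in degrees, following steps S704 to S710.

    Assumes x1 >= x_min and y1 >= y_min, which holds because Xmin and Ymin
    are the minimum coordinates of the document region.
    """
    dx = x1 - x_min  # X1 - Xmin
    dy = y1 - y_min  # Y1 - Ymin
    if dx < dy:      # S704: Yes (case of FIG. 5), so dy > 0 here
        return -math.degrees(math.atan(dx / dy))  # S705
    if dx > dy:      # S706: Yes (case of FIG. 6), so dx > 0 here
        return math.degrees(math.atan(dy / dx))   # S707
    if dx == 0:      # S708: dx == dy == 0, document is not tilted
        return 0.0                                # S709
    return 45.0      # S710: dx == dy and both non-zero
```

Because the two `atan` branches are only reached with a strictly positive denominator, no division by zero can occur under the stated assumption.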
【００５５】
（３−２）任意角回転による傾き補正処理
傾き検出処理を終了すると、図３のフローチャートに戻って、制御部１００は、検出された傾きに基づく傾き補正処理を行う。即ち、傾き角θ’が０°でない場合（Ｓ３０２：Ｙｅｓ）について、傾き補正処理を行う（Ｓ３０３）。
【００５６】
前述の如く、傾き補正とは、原稿の傾きが検出された場合に、画像データを傾斜角だけ回転処理して、傾いていない画像を得ることをいう。従って、読み取られた原稿が傾いていたときでも、原稿が傾いていない場合と同様の画像データを得ることができる。
【００５７】
傾き補正処理は、傾き検出部１２５において取得された編集パラメータに基づいて、傾き補正部１７０において行われるが、この傾き補正処理は、第１低解像度認識用メモリ１６０に格納されている低解像度画像データに対する、傾き角θ’分の任意角回転処理であり、その処理方法は、既に公知の技術である。従って、ここでの詳細な説明は省略するが、極めて簡略に説明すると、下記の（数２）に示す式を用いることにより、座標（Ｘ，Ｙ）が角度θだけ回転されて座標（Ｕ，Ｖ）となることを利用する。
【００５８】
【数２】
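Equation (数２) is not reproduced in this text. Assuming it is the standard two-dimensional rotation, U = X cos θ − Y sin θ and V = X sin θ + Y cos θ, a minimal sketch of rotating one coordinate is shown below. The function name `rotate_point` and the optional rotation centre (`cx`, `cy`), which stands in for the "rotation coordinates" mentioned earlier, are assumptions, as is the sign convention.

```python
import math

def rotate_point(x, y, theta_deg, cx=0.0, cy=0.0):
    """Rotate (x, y) by theta_deg about (cx, cy) and return (u, v).

    Uses the standard rotation matrix; the sign convention is an assumption.
    """
    t = math.radians(theta_deg)
    dx, dy = x - cx, y - cy
    u = cx + dx * math.cos(t) - dy * math.sin(t)
    v = cy + dx * math.sin(t) + dy * math.cos(t)
    return u, v
```

In practice an image is usually rotated by applying the inverse mapping to every destination pixel, so that no holes appear in the output; that detail is outside what the text describes.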

【００５９】
（３−３）認識信頼度に関する制御部１００の処理
制御部１００は、ＣＣＤセンサ３８により読み取られた原稿の画像データに基づいて、上述のごとく原稿の傾きを検出し、傾き補正処理を行った場合、及び傾き補正処理を行わない場合について、それぞれ信頼度を取得する（図３、Ｓ３０４）。ここで、傾き補正を行った画像データのみを天地認識に利用する場合は信頼度に関する処理が不要となることは前述の通りであるが、信頼度を取得した場合には、より信頼度の高い方の画像データを用いて、以後の天地認識処理を行うようにする（図３、Ｓ３０５以下）。
【００６０】
天地認識処理の結果として、適切な方向で文書画像データが出力されるように、回転処理部１３０に対して回転角情報が出力される。
【００６１】
ここで、信頼度取得処理について、二つの方法を挙げて説明する。まず、第１の方法として、ヒストグラムから算出されるＭＴＦ値を用いる信頼度の取得方法について説明する。
ここで、「ＭＴＦ値」とは、ヒストグラムを取得した場合に、そのヒストグラムの数ライン毎の高さの最大値（以下、「ｍａｘ値」という。）、及び最小値（以下、「ｍｉｎ値」という。）を取得した場合に、以下の式（数３）により算出される値をいう（以下、ＭＴＦ値を取得するために分割された数ラインにより形成される領域を「ライン領域」と称する）。
【００６２】
【数３】

【００６３】
図８は、ＭＴＦ値の算出について説明するための図である。同図（ａ）は、原稿４００に記載された文字行が原稿の向きに平行である場合の例を示す。同図において、４１０は取得された主走査方向のヒストグラムを表し、Ｒはライン領域を示す。
【００６４】
同図（ａ）に示されるように、文字行が原稿の向きに平行である場合には、主走査方向のヒストグラムに、行を表すピークが検出される。一方、文字が存在しない部分では、ヒストグラムのピークは検出されないため、ライン領域毎に見ると、いずれのライン領域でも、ｍｉｎ値は０となる。即ち、上記（数３）より、いずれのライン領域でもＭＴＦ値は１となる。上記（数３）からわかるように、１はＭＴＦ値の最大値である。従って、ライン領域毎のＭＴＦ値の、原稿内の平均値を取ると、文字行に傾きがない場合、即ち、天地認識を行うのに適当なヒストグラムが得られる状態においては、ＭＴＦ値の平均値は高くなるといえる。
【００６５】
一方、図８（ｂ）に示されるように、文字行が傾いている場合には、主走査方向のヒストグラムのピークの幅が広がる場合があるため、ライン領域におけるｍａｘ値とｍｉｎ値の差が小さい場合、即ち、ＭＴＦ値が小さい場合が発生する。従って、原稿内のＭＴＦ値の平均値を取得すると、文字行が傾いている場合、即ち、天地認識を行うのに適当なヒストグラムが得られない状態においては、ＭＴＦ値の平均値が低くなる場合が多いと考えられる。以上の内容から、原稿内のＭＴＦ値の平均値の高い方が、認識信頼度は高いと判断できる。
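The first reliability measure can be sketched as below. Equation (数３) is not reproduced in this text; the sketch assumes the usual modulation form (max − min) / (max + min), which matches the stated properties (the value is 1 when min is 0, and 1 is the maximum). The function name `mean_mtf`, the `lines_per_region` parameter, and the decision to skip completely blank line regions are likewise assumptions.

```python
def mean_mtf(histogram, lines_per_region):
    """Average the per-region MTF value over a projection histogram.

    Assumed form of the MTF value: (max - min) / (max + min) per line region.
    """
    values = []
    for start in range(0, len(histogram), lines_per_region):
        region = histogram[start:start + lines_per_region]
        hi, lo = max(region), min(region)
        if hi + lo == 0:
            continue  # assumption: a blank region carries no information
        values.append((hi - lo) / (hi + lo))
    return sum(values) / len(values) if values else 0.0

# Hypothetical histograms: clean line peaks versus smeared peaks of tilted text.
upright = mean_mtf([0, 5, 9, 0, 0, 7, 8, 0], 4)
tilted = mean_mtf([3, 5, 6, 4, 4, 6, 5, 3], 4)
```

With these inputs the upright histogram scores higher than the tilted one, which is the ordering the text relies on when it treats a higher average as higher recognition reliability.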
【００６６】
次に、第２の方法として、ヒストグラムのエッジカウントを利用した認識信頼度の取得方法について説明する。
ヒストグラムのエッジカウントを使用した方法とは、ヒストグラムを取得した場合に、そのヒストグラム値が増える方向の変化点（以下、「立ち上がりエッジ」という。）と、減る方向の変化点（以下、「立ち下がりエッジ」という。）の数をそれぞれカウントすると、通常の原稿では立ち下がりエッジの数の方が多くなることを利用したものである。これは、従来技術として説明した、特開平９−９０４０号公報に記載された天地認識でも用いている方法である。
【００６７】
図９は、エッジカウントを利用した認識信頼度の取得方法について説明するための図である。同図の例において、４００は原稿、４２０は取得された副走査方向のヒストグラムを表す。また、同図（ａ）は、文字行に傾きがない場合の例、同図（ｂ）は、文字行が原稿方向と平行であるが、センタリングされている場合の例、同図（ｃ）は、文字行が傾いている場合の例を示すものである。
【００６８】
同図（ａ）のように、文字行が傾いていない場合、即ち、天地認識を行うために適切なヒストグラムを取得することができる場合においては、立ち上がりエッジの数と、立ち下がりエッジの数との差が大きくなる。文章の先頭位置がある程度一定しているため、立ち上がりエッジが２となるのに対し、文章の終わりの位置は分散していることから、立ち下がりエッジが４となるからである。
【００６９】
しかし、同図（ｂ）に示されるように文字行がセンタリングされている場合や、同図（ｃ）に示されるように文字行が傾きを持った場合には、立ち上がりエッジの数と、立ち下がりエッジの数との差があまり顕著に現れない。
エッジカウントを利用した信頼度認識とは、以上に説明したような内容に基づき、立ち上がりエッジの数と立ち下がりエッジの数との差が大きいほど、認識信頼度が高いと判定するものである。
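The second reliability measure can be sketched as follows. How an "edge" is detected is not specified in the text, so the sketch makes assumptions: every step between neighbouring bins of at least `min_step` counts as one edge, and the histogram is treated as bordered by blank margins. The names `count_edges` and `edge_reliability` are hypothetical.

```python
def count_edges(histogram, min_step=1):
    """Return (rising, falling) edge counts of a projection histogram.

    Assumption: the histogram is padded with a zero bin on each side.
    """
    padded = [0] + list(histogram) + [0]
    steps = [b - a for a, b in zip(padded, padded[1:])]
    rising = sum(1 for s in steps if s >= min_step)
    falling = sum(1 for s in steps if -s >= min_step)
    return rising, falling

def edge_reliability(histogram, min_step=1):
    """Larger difference between the two edge counts means higher reliability."""
    rising, falling = count_edges(histogram, min_step)
    return abs(falling - rising)

# Hypothetical sub-scanning histograms: left-aligned text versus centred text.
aligned = edge_reliability([0, 3, 5, 5, 5, 4, 3, 1, 0])
centred = edge_reliability([0, 1, 3, 5, 5, 3, 1, 0])
```

The left-aligned example gives two rising and four falling edges, as in the case of FIG. 9(a), while the centred example gives equal counts and therefore a reliability of zero.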
【００７０】
以上に説明したような方法のいずれか、または両方を用いて認識信頼度を取得すると、取得された信頼度を比較し（Ｓ３０５）、信頼度が大きいと判定された方の低解像度画像データを用いて、以下の天地認識処理を行う。即ち、補正後の画像データの信頼度の方が大きい場合には（Ｓ３０５：Ｙｅｓ）、補正後の情報を用いて天地認識処理を行い（Ｓ３０６）、それ以外の場合には（Ｓ３０５：Ｎｏ）、補正前の情報を用いて天地認識処理を行う（Ｓ３０７）。なお、ステップＳ３０２において、傾き角θ’が０であった場合には、傾き補正や、信頼度に関する処理を行う必要がないため、そのままの情報を用いて、天地認識処理を行う（Ｓ３０８）。
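The branch from S302 through S305 to S306, S307, or S308 amounts to a small selection rule, sketched below. The function name and the returned labels are hypothetical; only the branching itself is taken from the text.

```python
def select_image_for_recognition(theta_deg, reliability_before=None,
                                 reliability_after=None):
    """Decide which image data the top-and-bottom recognition should use."""
    if theta_deg == 0:
        return "as_is"        # S302: No -> S308, no correction or reliability step
    if reliability_after > reliability_before:
        return "corrected"    # S305: Yes -> S306
    return "uncorrected"      # S305: No -> S307 (includes equal reliabilities)
```

Note that equal reliabilities fall through to the uncorrected data, since the text sends every case other than "corrected is larger" to S307.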
【００７１】
以上のような処理を行うことにより、原稿ガラス上にセットされた原稿が傾いている場合や、原稿に記載された文字行自体が傾いているような場合においても、より正確な天地認識を行うことができる。
【００７２】
なお、本実施の形態では、主として認識信頼度の取得を行う場合の例について、詳細に説明を行ったが、上述の如く、認識信頼度を考慮せず、原稿の傾き補正処理を行った画像データを用いて天地認識処理を行っても、原稿ガラス上にセットされた原稿が傾いている場合に、より正確な天地認識を行うことが可能となる。その場合には、具体的には、図３のフローチャートにおいて、ステップＳ３０３からステップＳ３０６に進むようにすればよい。
【００７３】
また、本実施の形態では、モノクロ複写機に適用される画像認識装置の場合について説明したが、フルカラー複写機にも適用することは可能である。ただし、この場合には、原稿から生成した画像データから、有彩色のカラーデータを予めキャンセルする回路を組み込んでおき、モノクロのみの画像データのみから天地認識を行う方が望ましい。文字部分のほとんどはモノクロだからである。
【００７４】
さらに、上記認識信頼度の取得に際して、第１の方法及び第２の方法を併用する場合であれば、それぞれの方法により取得された認識信頼度を同等に取り扱うだけでなく、一定の重み付けをして最終的な認識信頼度を算出するようにすることもできる。
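One possible form of such a weighting is a normalised weighted average, sketched below. The text does not specify the weighting scheme, so the formula, the function name, and the default weights are all assumptions; the sketch also assumes both scores have already been scaled to a comparable range such as 0 to 1.

```python
def combined_reliability(mtf_score, edge_score, w_mtf=1.0, w_edge=1.0):
    """Weighted average of the two reliability scores (assumed scheme)."""
    return (w_mtf * mtf_score + w_edge * edge_score) / (w_mtf + w_edge)
```

With the default weights the two methods are treated equally; raising `w_mtf` or `w_edge` shifts the final value toward the corresponding method.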
【００７５】
また、本実施の形態では、認識信頼度に関して、横書きの原稿を用いた場合について説明したが、認識信頼度の取得に関する第１の方法及び第２の方法のいずれを用いる場合でも、主走査方向及び副走査方向の両方のヒストグラムを取得することにより、原稿が横書きである場合でも、縦書きである場合でも適用することが可能である。具体的には、例えば、主走査方向及び副走査方向のヒストグラムから、認識信頼度を表す値をそれぞれ取得し、信頼度の大きい方を採用するようにすればよい。
【００７６】
また、本実施の形態では、原稿方向の判別に際して、画像データのヒストグラムのエッジカウントを求める方法を用いたが、ヒストグラムを利用する他の方法を用いることも可能である。例えば、ヒストグラムを利用して画像データから文字を切り出し、パターンマッチングする方法により、原稿方向の判別を行うようにしてもよい。
【００７７】
【発明の効果】
以上に説明したように、本発明の第１の画像認識装置によれば、原稿傾き検出手段により検出された原稿の傾きに基づいて、画像データを補正する傾き補正手段と、画像読み取り手段により生成された傾き補正前の画像データ、及び前記傾き補正手段により補正された画像データのうち、何れの画像データに基づいて原稿の方向を認識するかを判断する判断手段と、前記判断手段により判断された画像データに基づいて、原稿の方向を認識する原稿方向認識手段とを備えることにより、原稿ガラス上にセットされた原稿が傾いている場合や、原稿に記載された文字行自体が傾いているような場合においても、的確に天地認識を行うことができるという効果がある。
【００７８】
また、本発明の第２の画像認識装置によれば、原稿傾き検出手段により検出された原稿の傾きに基づいて、画像データを補正する傾き補正手段と、画像読み取り手段により生成された傾き補正前の画像データ、及び傾き補正手段により補正された画像データのそれぞれに基づいて、原稿の方向の認識を行う場合の、認識の信頼度に関する情報を取得する信頼度取得手段と、傾き補正前の画像データに基づいて取得された信頼度に関する情報と、傾きが補正された画像データに基づいて取得された信頼度に関する情報とを比較する比較手段と、前記比較手段による比較の結果、信頼度が高い方の画像データに基づいて、原稿の方向を認識する原稿方向認識手段とを備えることにより、原稿に記載された文字行自体が傾いているような場合でも、より正確な天地認識を行うことができるという効果を奏する。
【図面の簡単な説明】
【図１】本発明に係る画像認識装置が適用される複写機の全体の構成を示す図である。
【図２】本発明に係る画像認識装置が適用される複写機の制御部の構成を示すブロック図である。
【図３】本実施の形態の画像認識装置における制御部の処理内容を示すフローチャートである。
【図４】本発明に係る画像認識装置における原稿の傾き検出の方法について説明するための図である。
【図５】本発明に係る画像認識装置における原稿の傾き検出の方法について説明するための図である。
【図６】本発明に係る画像認識装置における原稿の傾き検出の方法について説明するための図である。
【図７】傾き検出処理の詳細な内容を示すフローチャートである。
【図８】ＭＴＦ値の算出について説明するための図である。
【図９】ヒストグラムのエッジカウントを利用した認識信頼度の取得方法について説明するための図である。
【図１０】従来の天地認識処理について説明するための図である。
【図１１】従来の天地認識処理の問題点について説明するための図である。
【符号の説明】
３８ＣＣＤセンサ
１００制御部
１１０画像信号処理部
１２０高解像度画像メモリ
１２５傾き検出部
１３０回転処理部
１４０ＬＤ駆動部
１５０解像度変換部
１６０第１低解像度認識用メモリ
１７０傾き補正部
１８０第２低解像度認識用メモリ
１９０ＣＰＵ
１９１ＲＡＭ
１９２ＲＯＭ
[0001]
[Technical Field of the Invention]
The present invention relates to an image recognition apparatus that recognizes a document image read from an image input apparatus in a digital copying machine, a facsimile, or the like.
[0002]
[Prior art]
Conventionally, for example, in a digital copying machine equipped with an automatic document feeder (hereinafter referred to as “ADF”), when a plurality of originals are read and copied in succession, each copy is discharged in the orientation in which its original was read. It was therefore necessary, when setting the originals on the ADF, to check that every original faces the same direction. This check is quite time-consuming, and the more originals there are, the greater the risk of overlooking one that faces the opposite direction.
[0003]
To avoid this inconvenience, a method has been proposed in which the original is read, the direction of the original is first determined from the generated image data (this determination is hereinafter referred to as “top-and-bottom recognition”), and the image data is rotated before output so that the printed image is oriented appropriately on the sheet. Various top-and-bottom recognition methods have been devised. For example, Japanese Patent Application Laid-Open No. 9-9040 discloses a technique that performs top-and-bottom recognition by obtaining histograms in the main scanning direction and the sub-scanning direction for the portion of the original recognized as a document image.
[0004]
FIG. 10 is a diagram for explaining the conventional top-and-bottom recognition method described above. As shown in the figure, top-and-bottom recognition can be performed by using the histograms acquired in the main scanning direction and the sub-scanning direction from the document image data input to the image input device to recognize the direction of the character lines and the character position at the beginning of each line.
[0005]
More specifically, the number of rising edges, the number of falling edges, and their sum and difference are calculated for the histogram in the main scanning direction and for the histogram in the sub-scanning direction. For a horizontally written document image such as that shown in FIG. 10, the numbers of rising and falling edges in the main scanning direction histogram (a) are substantially equal, and their sum is larger than for the sub-scanning direction histogram (b). This is because, in histogram (a), peaks representing lines alternate with peak-free portions corresponding to the spaces between lines.
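As an illustration of this comparison, the sketch below builds the two projection histograms from a small binary image and decides the writing direction from the total number of edges in each. It is not the method of the cited publication: the edge definition (any step between neighbouring bins of at least `min_step`, with blank margins assumed on both sides), the tie-breaking rule, and all function names are assumptions. Rows of the array are assumed to run along the direction of horizontal text.

```python
def projections(image):
    """image: list of rows of 0/1 pixels. Returns (row_sums, col_sums)."""
    row_sums = [sum(row) for row in image]
    col_sums = [sum(col) for col in zip(*image)]
    return row_sums, col_sums

def count_edges(histogram, min_step=1):
    """Return (rising, falling) counts; zero padding on both sides is assumed."""
    padded = [0] + list(histogram) + [0]
    steps = [b - a for a, b in zip(padded, padded[1:])]
    rising = sum(1 for s in steps if s >= min_step)
    falling = sum(1 for s in steps if -s >= min_step)
    return rising, falling

def writing_direction(image):
    """Pick the direction whose line-peak histogram has more edges in total."""
    row_sums, col_sums = projections(image)
    row_edges = sum(count_edges(row_sums))
    col_edges = sum(count_edges(col_sums))
    return "horizontal" if row_edges >= col_edges else "vertical"

# Hypothetical page: three left-aligned lines of decreasing length.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
```

For this page the row histogram has three rising and three falling edges, while the column histogram has fewer edges in total, so the page is classified as horizontally written.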
[0006]
Further, the character position at the beginning of each line can be recognized from the sub-scanning direction histogram (b) in the example of FIG. 10. Because lines begin at a relatively constant position, the rising edges of the histogram are concentrated in a few places, whereas the position of the last character varies from line to line, so the falling edges produced by line ends are relatively dispersed. Accordingly, if the difference between the number of rising edges and the number of falling edges is large, the top and bottom of the document can be recognized.
[0007]
Through the recognition as described above, the direction of the character line and the character position at the beginning of the line can be detected, and the top-and-bottom recognition can be performed based on the detection result. These recognitions can also be applied to a vertically written document.
[0008]
[Problems to be solved by the invention]
However, the above conventional technique has a problem that, for example, when the document is tilted when being set via the ADF, accurate top-and-bottom recognition may not be performed.
[0009]
That is, when the document is tilted, as shown in FIG. 11, a histogram from which the line spacing and the leading character position can be recognized accurately cannot be obtained, so top-and-bottom recognition performed with the acquired histogram is not accurate. The same problem may also occur when the character lines themselves are tilted on the document.
[0010]
In view of the above problems, an object of the present invention is to provide an image recognition apparatus capable of performing top-and-bottom recognition even when a document is inclined.
[0011]
A second object of the present invention is to provide an image recognition apparatus capable of performing more accurate top-and-bottom recognition even when the character line itself described in the document is tilted.
[0012]
[Means for Solving the Problems]
In order to solve the above problems, the first image recognition apparatus of the present invention comprises: image reading means for reading a document and generating first image data; document inclination detection means for detecting the inclination of the document; inclination correction means for correcting the inclination of the first image data generated by the image reading means, based on the document inclination detected by the document inclination detection means, to generate second image data; determination means for determining on which of the first image data generated by the image reading means and the second image data generated by the inclination correction means the direction of the document is to be recognized; and document direction recognition means for recognizing the direction of the document based on whichever of the first image data and the second image data was determined by the determination means.
[0013]
The second image recognition apparatus of the present invention comprises: image reading means for reading a document and generating image data; document inclination detection means for detecting the inclination of the document; inclination correction means for correcting the inclination of the first image data generated by the image reading means, based on the document inclination detected by the document inclination detection means, to generate second image data; reliability acquisition means for acquiring information on the reliability of recognition when the direction of the document is recognized based on each of the first image data generated by the image reading means and the second image data generated by the inclination correction means; comparing means for comparing the reliability information acquired by the reliability acquisition means based on the first image data with the reliability information acquired by the reliability acquisition means based on the second image data; and document direction recognition means for recognizing the direction of the document based on the image data found to have the higher reliability as a result of the comparison by the comparing means.
[0014]
DETAILED DESCRIPTION OF THE INVENTION
A digital copying machine (hereinafter simply referred to as “copying machine”) as an application example of an image recognition apparatus according to the present invention will be described below with reference to the accompanying drawings.
[0015]
(1) Configuration of the entire copying machine
First, an overall configuration of a copying machine to which an image recognition apparatus according to the present invention is applied will be described with reference to FIG.
As shown in the figure, the copier includes an ADF 10, an image reading unit 30, a printer unit 50, and a paper feeding unit 70.
[0016]
The ADF 10 is a device that automatically conveys documents to the image reading unit 30. Documents placed on the document feed tray 11 are separated one at a time by a feed roller 12 and a separation roller 13, sent downward, and conveyed by the conveying belt 14 to the document reading position on the platen glass 31.
The document conveyed to the document reading position is scanned by the scanner 32 of the image reading unit 30, then sent to the right in the drawing by the conveying belt 14 again and discharged onto the document discharge tray 16 via the paper discharge roller 15.
[0017]
The image reading unit 30 optically reads the image of the document conveyed to the document reading position on the platen glass 31, and comprises the scanner 32, a CCD color image sensor (hereinafter simply referred to as “CCD sensor”) 38, and the like.
[0018]
The scanner 32 is provided with an exposure lamp 33 and a mirror 34 that redirects light reflected from the document under illumination by the exposure lamp 33 into a direction parallel to the platen glass 31. The scanner scans the document on the platen glass 31 by moving in the direction of the arrow in the drawing. The light reflected from the document is reflected by the mirror 34 and then guided via mirrors 35 and 36 and a condenser lens 37 to the CCD sensor 38, where it is converted into an electrical signal to generate image data.
[0019]
The image data is A/D converted into a digital signal by the image signal processing unit 110 (see FIG. 2) in the control unit 100, subjected to shading correction and density conversion, further subjected to known error diffusion processing and the like, and then stored in the high-resolution image memory 120 (see FIG. 2). The image data stored in the high-resolution image memory 120 is rotated according to the result of the top-and-bottom recognition performed by the CPU 190, as described later, and becomes the drive signal for the laser diode (hereinafter also referred to as “LD”) 51 of the printer unit 50.
[0020]
The printer unit 50 forms an image on a recording sheet by a known electrophotographic method. Upon receiving the drive signal, the printer unit 50 drives the laser diode 51 to emit laser light. The laser beam is reflected by the mirror surface on the side of the polygon mirror 52 that rotates at a predetermined angular velocity, and exposes and scans the surface of the photosensitive drum 56 via the fθ lens 53 and the mirrors 54 and 55.
[0021]
Before receiving this exposure, the photosensitive drum 56 has residual toner removed from its surface by the cleaning unit 57, is discharged by irradiation from an eraser lamp (not shown), and is then uniformly charged by the charging charger 58. When the uniformly charged drum is exposed as described above, an electrostatic latent image is formed on the surface of the photosensitive drum 56.
[0022]
The developing device 59 visualizes the electrostatic latent image formed on the surface of the photosensitive drum 56.
[0023]
On the other hand, the paper feed unit 70 is provided with two paper cassettes 71 and 72. In synchronization with the exposure and development of the photosensitive drum 56 described above, a recording sheet of the required size is fed from one of the paper cassettes 71 and 72 by driving the paper feed roller 711 or 721. The fed recording sheet comes into contact with the surface of the photosensitive drum 56 below the drum, and the toner image formed on the surface of the photosensitive drum 56 is transferred to the surface of the recording sheet by the electrostatic force of the transfer charger 60.
[0024]
Thereafter, the recording sheet is peeled off from the surface of the photosensitive drum 56 by the electrostatic force of the separation charger 61 and is transported to the fixing unit 63 by the transport belt 62.
The toner image transferred to the recording sheet is fixed by being pressed by the fixing unit 63 while being heated by a fixing roller 64 having a heater therein. The recording sheet after fixing is discharged onto a discharge tray 66 by a discharge roller 65.
[0025]
An operation panel 90 is provided at an easily operated position on the front of the image reading unit 30. It includes a numeric keypad for entering the number of copies, a start key for instructing the start of copying, setting keys for setting various copy modes, and a display unit that shows, as a message, the mode set with the setting keys and the like.
[0026]
(2) Configuration of control unit 100
Next, in the present embodiment, the configuration of the control unit 100 installed in the copying machine will be described in more detail with reference to the block diagram of FIG.
As shown in the figure, the control unit 100 includes an image signal processing unit 110, a high-resolution image memory 120, an inclination detection unit 125, a rotation processing unit 130, an LD driving unit 140, a resolution conversion unit 150, a first low-resolution recognition memory 160, an inclination correction unit 170, a second low-resolution recognition memory 180, a CPU 190, and the like.
[0027]
The image signal processing unit 110 includes an A/D converter, a shading correction unit, an MTF correction unit, a scaling unit, a γ correction unit, and the like. The document image data input from the CCD sensor 38 is converted into a multi-valued digital signal by the A/D converter. The shading correction unit then corrects for uneven illuminance of the exposure lamp 33 and uneven sensitivity of the CCD sensor 38, and the MTF correction unit applies image quality improvements such as edge enhancement. After scaling and γ correction by the scaling unit and the γ correction unit, respectively, the data is sent to the high-resolution image memory 120 and the resolution conversion unit 150.
[0028]
It is assumed that density data that has undergone reflectance density conversion is output from the CCD sensor 38 of the present embodiment.
[0029]
The inclination detection unit 125 performs an inclination detection process on the read document. Details of the tilt detection process will be described later.
The rotation processing unit 130 reads the image data of the target page from the high-resolution image memory 120, rotates it as necessary based on the rotation angle information from the CPU 190, and then transfers it to the LD driving unit 140. This rotation is performed by a known technique that changes the memory addresses of the image data (see, for example, Japanese Patent Application Laid-Open No. 60-126769).
[0030]
The LD driving unit 140 sends the image data output from the high resolution image memory 120 to the LD 51 based on a control program stored in the internal ROM, and executes image formation on a recording sheet.
[0031]
The resolution conversion unit 150 converts the high-resolution image data that has passed through the image signal processing unit 110 into low-resolution image data. The resolution-converted image data is first written to the first low-resolution recognition memory 160. In the present embodiment, 400 DPI or 600 DPI image data read by the CCD sensor 38 is converted to a low resolution of 25 DPI or 40 DPI. Specifically, the conversion can be performed, for example, by taking a block of 16 pixels (4 vertical × 4 horizontal), obtaining the maximum density among those 16 pixels, using it as the density of a single pixel, and repeating this process until the predetermined resolution is reached.
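The block-maximum reduction described here can be sketched as follows. The function names `max_pool` and `reduce_resolution` and the handling of partial blocks at the image border are assumptions; only the 4 × 4 maximum and the repetition come from the text.

```python
def max_pool(image, block=4):
    """Replace each block x block tile by the maximum density it contains.

    image: list of rows of density values. Partial tiles at the right and
    bottom borders are pooled over the pixels that exist (an assumption).
    """
    h, w = len(image), len(image[0])
    return [
        [
            max(
                image[y][x]
                for y in range(by, min(by + block, h))
                for x in range(bx, min(bx + block, w))
            )
            for bx in range(0, w, block)
        ]
        for by in range(0, h, block)
    ]

def reduce_resolution(image, passes, block=4):
    """Apply the block-maximum reduction repeatedly."""
    for _ in range(passes):
        image = max_pool(image, block)
    return image

# Hypothetical 8 x 8 page with a single dark pixel at row 5, column 6.
img = [[0] * 8 for _ in range(8)]
img[5][6] = 9
```

Two passes of the 4 × 4 reduction shrink each dimension by a factor of 16, which is consistent with going from 400 DPI to 25 DPI. Taking the maximum rather than the mean keeps thin character strokes visible at the reduced resolution.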
[0032]
The inclination correction unit 170 corrects the image data stored in the first low-resolution recognition memory 160 based on the document inclination detected by the inclination detection unit 125. Here, the correction specifically means that the image data is rotated by the detected document inclination. In the description of the present embodiment, the rotation processing for tilt correction is referred to as “tilt correction processing” in order to distinguish it from rotation processing based on top / bottom recognition performed in the rotation processing unit 130.
[0033]
The low resolution image data that has been subjected to the tilt correction processing in the tilt correction unit 170 is stored in the second low resolution recognition memory 180.
In the image recognition apparatus of the present embodiment, top-and-bottom recognition can be performed based on the tilt-corrected low-resolution image data stored in the second low-resolution recognition memory 180. Alternatively, as described later, the recognition reliability obtained when each image data is used for top-and-bottom recognition can be determined for the image data stored in the first low-resolution recognition memory 160 and for the image data stored in the second low-resolution recognition memory 180, and top-and-bottom recognition can be performed using the image data judged to have the higher reliability.
[0034]
That is, in the image recognition apparatus of the present embodiment, top-and-bottom recognition is performed using the method disclosed in the above-mentioned Japanese Patent Application Laid-Open No. 9-9040. In acquiring the histogram used for that recognition, correcting the inclination of the document is expected, in many cases, to yield a histogram from which accurate top-and-bottom recognition is possible. This is because the character lines in a document are usually written parallel to the document.
[0035]
However, when, for example, the character lines themselves are inclined with respect to the document, correcting the inclination of the document may actually hinder accurate top-and-bottom recognition. Therefore, the image recognition apparatus according to the present embodiment can also obtain, for each of the image data before tilt correction and the image data after tilt correction, the recognition reliability that results when that data is used for top-and-bottom recognition, and perform top-and-bottom recognition using the image data with the higher reliability. The method of calculating the reliability will be described later.
[0036]
(3) Processing of control unit 100
The control unit 100 according to the present embodiment performs the following processes on the image data read by the CCD sensor 38: document inclination detection processing (hereinafter referred to as “tilt detection processing”); inclination correction processing (hereinafter referred to as “tilt correction processing”); histogram acquisition processing from both the image data that has not undergone tilt correction processing and the image data that has (alternatively, histogram acquisition processing may be executed only for the image data that has undergone tilt correction processing); reliability acquisition processing for acquiring, from each acquired histogram, the reliability of recognition when that histogram is used to recognize the document direction; reliability comparison processing for comparing the acquired reliabilities; and top-and-bottom recognition processing for recognizing the direction of the document based on the histogram found to be more reliable as a result of the comparison. The top-and-bottom recognition processing itself is performed by counting histogram edges, in the same manner as in the prior art.
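The histogram acquisition processing can be pictured as a projection profile of the low-resolution image. The sketch below assumes the image is a list of rows of 8-bit density values with larger values darker, and that a pixel counts as part of a character when it reaches a hypothetical threshold. The function name and threshold are illustrative, and which of the two profiles corresponds to the main scanning or sub-scanning histogram is left open.

```python
def projection_histograms(image, threshold=128):
    """Count dark pixels in every row and in every column of the image."""
    # One count per row: how many pixels in that row are dark.
    row_hist = [sum(1 for v in row if v >= threshold) for row in image]
    # One count per column: how many rows have a dark pixel at that column.
    width = len(image[0]) if image else 0
    col_hist = [sum(1 for row in image if row[x] >= threshold)
                for x in range(width)]
    return row_hist, col_hist
```

For horizontally written text, the row profile alternates between peaks (lines) and gaps (line spacing), while the column profile reflects where lines start and end.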
[0037]
The tilt detection processing, tilt correction processing, histogram acquisition processing, reliability acquisition processing, reliability comparison processing, and top-and-bottom recognition processing are executed by the CPU 190 based on the resolution-converted image data stored in the first low-resolution recognition memory 160 and the second low-resolution recognition memory 180.
[0038]
Hereinafter, the processing contents of the CPU 190 will be described in detail. A program for instructing these processes is stored in the ROM 192, and the RAM 191 is used as a working memory area as necessary.
[0039]
When the histogram acquisition processing is executed only for the image data after tilt correction, the reliability acquisition processing and the reliability comparison processing are not required, and the top-and-bottom recognition processing need only be performed based on the histogram obtained for the tilt-corrected image data. The processing content of the control unit 100 of the present embodiment is described below.
[0040]
FIG. 3 is a flowchart showing the processing contents of the control unit 100 in the present embodiment when performing the reliability acquisition processing and the like. Hereinafter, the processing content of the control unit 100 will be described in detail with reference to FIG.
[0041]
(3-1) Document skew detection processing
The control unit 100 first performs document inclination detection processing based on the document reading result by the CCD sensor 38 (S301).
Here, the tilt detection processing for the case where a document placed on the document glass is read in a tilted state will be described in detail. This processing is performed by the inclination detection unit 125. As shown in FIG. 4, the document 400 is placed on the document glass 31 with its copy surface facing downward, using the upper right end as a reference. The longitudinal direction of the document glass 31 is the sub-scanning direction for scanning and reading, and the direction perpendicular to it is the main scanning direction. In the example shown in the figure, the document 400 is placed away from the image reference and is not parallel to the sub-scanning direction. In this example, the document 400 does not protrude from the reading area.
[0042]
In the copying machine of the present embodiment, the document presser (ADF transport belt 14) is colored orange so that the light from the exposure lamp reflected by the document presser has a color to which the CCD sensor 38 has low spectral sensitivity. In other words, to the CCD sensor 38 the orange of the document presser is effectively the same as black, whereas the background of a document is normally white, so the document can be distinguished from the document presser. This makes it possible to detect the edges of the document and hence its tilt angle.
[0043]
FIGS. 5 and 6 show the read image when the document 400 (shown by a broken line) does not protrude from the reading area (the rectangle shown by a solid line).
The image signal processing unit 110 processes image data in a rectangular area that includes at least the range of the document, detects the document region in which the document exists, and, from all the coordinates along the periphery of the document (that is, the edge portion), detects the coordinates of the four corners of the rectangular document as shown in the figure. Here, the main scanning direction is taken as the X axis and the sub-scanning direction as the Y axis. Xmax and Xmin are the maximum and minimum X coordinates, and Ymax and Ymin are the maximum and minimum Y coordinates.
[0044]
For the remaining two X coordinates, the X coordinate of the vertex whose Y coordinate is Ymin is X1, and the X coordinate of the vertex whose Y coordinate is Ymax is X2. Of the remaining two Y coordinates, the Y coordinate of the vertex whose X coordinate is Xmin is Y1, and the Y coordinate of the vertex whose X coordinate is Xmax is Y2.
Therefore, in the example of FIG. 5, the X and Y coordinates of the four corners of the document are (X1, Ymin), (Xmax, Y2), (Xmin, Y1), (X2, Ymax). If the lengths of the four sides of the document 400 are a, b, c, and d, the length of each side can be calculated from the coordinates of the four corners of the document using the following (Equation 1).
[0045]
[Equation 1]
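The expression itself is not reproduced in this text. One reading consistent with the corner coordinates given above is the Euclidean distance between adjacent corners. The assignment of the labels a to d to particular sides below is an assumption (taken in order around the document, starting from the side joining (Xmin, Y1) and (X1, Ymin)):

```latex
\begin{aligned}
a &= \sqrt{(X_1 - X_{\min})^2 + (Y_1 - Y_{\min})^2} \\
b &= \sqrt{(X_{\max} - X_1)^2 + (Y_2 - Y_{\min})^2} \\
c &= \sqrt{(X_{\max} - X_2)^2 + (Y_{\max} - Y_2)^2} \\
d &= \sqrt{(X_2 - X_{\min})^2 + (Y_{\max} - Y_1)^2}
\end{aligned}
```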

[0046]
In the example of FIG. 6, the direction of the inclination of the document is different from that in FIG. 5, but the coordinates of the four corners of the document are (Xmin, Y1), (X1, Ymin), (X2, Ymax), (Xmax, Y2). Accordingly, also in this case, the lengths a, b, c, and d of the four sides can be calculated by the equation shown in the above (Equation 1).
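As an illustration of the corner detection described above, the sketch below picks the four extreme points of the detected edge coordinates and measures the sides between them. It assumes the edge points are given as (x, y) tuples and that the document is actually tilted, so that each extreme is a single corner rather than a whole edge. The function names are hypothetical.

```python
import math


def document_corners(edge_points):
    """Return the corners as (top, right, bottom, left) extreme points."""
    top = min(edge_points, key=lambda p: p[1])     # (X1, Ymin)
    right = max(edge_points, key=lambda p: p[0])   # (Xmax, Y2)
    bottom = max(edge_points, key=lambda p: p[1])  # (X2, Ymax)
    left = min(edge_points, key=lambda p: p[0])    # (Xmin, Y1)
    return top, right, bottom, left


def side_lengths(corners):
    """Distances between consecutive corners, going once around the document."""
    return [math.dist(corners[i], corners[(i + 1) % 4]) for i in range(4)]
```

The same two functions cover both tilt directions (FIG. 5 and FIG. 6), because the corners are identified by their extreme coordinates rather than by the direction of the tilt.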
[0047]
Here, the editing process parameters for correcting the image data when the document is tilted are as follows. First, when X1 − Xmin < Y1 − Ymin (FIG. 5),
Rotation coordinates: (X1, Ymin)
Rotation angle θ: tan^-1 {(X1 − Xmin) / (Y1 − Ymin)}
[0048]
Further, when X1 − Xmin > Y1 − Ymin (FIG. 6),
Rotation coordinates: (Xmin, Y1)
Rotation angle θ: tan^-1 {(Y1 − Ymin) / (X1 − Xmin)}
[0049]
Here, the rotation coordinates are the coordinates of the corner closest to the upper left corner of the figure. The rotation angle θ is the angle by which the image data is rotated about the rotation coordinates to make it parallel to the reading area. The editing process parameters detected here are used for the tilt correction processing.
[0050]
The processing content of the control unit 100 in the tilt detection processing based on the content described above will be described below.
[0051]
FIG. 7 is a flowchart showing the detailed processing contents of the tilt detection processing. As shown in the figure, in the tilt detection process, first, the coordinates of the edge portion are extracted from the document area, and the line segment composed of the extracted plurality of coordinates is regarded as each side of the document area (S701). Next, the coordinates of the Xmin point and the Xmax point and the coordinates of the Ymin point and the Ymax point are extracted (S702, S703).
[0052]
If X1 − Xmin < Y1 − Ymin (S704: Yes), −tan^-1 ((X1 − Xmin) / (Y1 − Ymin)) is set as the inclination angle θ′ (S705).
The inclination angle θ′ takes the rotation direction into account in addition to the rotation angle θ obtained as the editing process parameter.
[0053]
If X1 − Xmin > Y1 − Ymin (S706: Yes), tan^-1 ((Y1 − Ymin) / (X1 − Xmin)) is set as the inclination angle θ′ (S707).
If X1 − Xmin = Y1 − Ymin = 0 (S708: Yes), the document is considered not to be inclined in either the main scanning direction or the sub-scanning direction, and the inclination angle θ′ is set to 0° (S709).
[0054]
If none of the above applies, that is, if X1 − Xmin = Y1 − Ymin and both values are nonzero, the inclination angle θ′ is set to 45° (S710).
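The branches S704 to S710 can be sketched as a single function. This is an illustrative reading of the flowchart description, with a hypothetical function name; it relies on X1 ≥ Xmin and Y1 ≥ Ymin, which follows from the definitions of Xmin and Ymin, and returns the signed angle in degrees.

```python
import math


def tilt_angle_deg(x1, xmin, y1, ymin):
    """Signed inclination angle (degrees) following steps S704 to S710."""
    dx = x1 - xmin
    dy = y1 - ymin
    if dx < dy:   # S704: Yes -> S705 (negative sign encodes the rotation direction)
        return -math.degrees(math.atan(dx / dy))
    if dx > dy:   # S706: Yes -> S707
        return math.degrees(math.atan(dy / dx))
    if dx == 0:   # S708: Yes -> S709 (dx == dy == 0, no inclination)
        return 0.0
    return 45.0   # S710: dx == dy and both nonzero
```

Each division is safe because the branch condition guarantees a positive denominator.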
[0055]
(3-2) Inclination correction processing by arbitrary angle rotation
When the tilt detection process ends, the control unit 100 returns to the flowchart of FIG. 3 and performs a tilt correction process based on the detected tilt. That is, when the inclination angle θ ′ is not 0 ° (S302: Yes), the inclination correction process is performed (S303).
[0056]
As described above, the tilt correction refers to obtaining an image that is not tilted by rotating the image data by the tilt angle when the tilt of the document is detected. Therefore, even when the read document is tilted, the same image data as when the document is not tilted can be obtained.
[0057]
The tilt correction processing is performed in the inclination correction unit 170 based on the editing process parameters acquired by the inclination detection unit 125. It is an arbitrary-angle rotation by the inclination angle θ′ applied to the low-resolution image data stored in the first low-resolution recognition memory 160, and the processing method is already known. A detailed explanation is therefore omitted here; put very simply, the coordinates (X, Y) are rotated by the angle θ using the equation shown in (Equation 2) below to obtain the coordinates (U, V).
[0058]
[Equation 2]
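The expression is not reproduced in this text. The sketch below uses the standard two-dimensional rotation, U = X cos θ − Y sin θ and V = X sin θ + Y cos θ, applied about a pivot such as the rotation coordinates described earlier. The sign convention and the function name are assumptions.

```python
import math


def rotate_point(x, y, theta_deg, pivot=(0.0, 0.0)):
    """Rotate (x, y) by theta_deg about pivot and return (u, v)."""
    t = math.radians(theta_deg)
    px, py = pivot
    dx, dy = x - px, y - py
    u = px + dx * math.cos(t) - dy * math.sin(t)
    v = py + dx * math.sin(t) + dy * math.cos(t)
    return u, v
```

In practice an image is usually rotated by mapping each output pixel back to a source coordinate with the inverse angle, which avoids holes in the result; the point-wise form above only illustrates the coordinate relation.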

[0059]
(3-3) Processing of control unit 100 regarding recognition reliability
As described above, the control unit 100 detects the inclination of the document based on the image data of the document read by the CCD sensor 38, and acquires the reliability both for the case where the tilt correction processing is performed and for the case where it is not (FIG. 3, S304). As noted above, when only the tilt-corrected image data is used for top-and-bottom recognition, the processing relating to reliability is unnecessary. When the reliability is acquired, however, the top-and-bottom recognition processing is subsequently performed using whichever image data has the higher reliability (FIG. 3, S305 and subsequent steps).
[0060]
As a result of the top-and-bottom recognition processing, rotation angle information is output to the rotation processing unit 130 so that the document image data is output in an appropriate direction.
[0061]
Here, the reliability acquisition process will be described with two methods. First, as a first method, a reliability acquisition method using an MTF value calculated from a histogram will be described.
Here, the “MTF value” is the value calculated by (Equation 3) below from the maximum value (hereinafter referred to as the “max value”) and the minimum value (hereinafter referred to as the “min value”) of the histogram within each group of several lines (the area formed by the several lines into which the histogram is divided to acquire the MTF value is hereinafter referred to as a “line region”).
[0062]
[Equation 3]

[0063]
FIG. 8 is a diagram for explaining the calculation of the MTF value. FIG. 8A shows an example in which the character lines written on the document 400 are parallel to the direction of the document. In the figure, 410 represents the acquired histogram in the main scanning direction, and R represents a line region.
[0064]
As shown in FIG. 8A, when the character lines are parallel to the document direction, peaks representing the lines are detected in the histogram in the main scanning direction. In portions where no characters exist, on the other hand, no peak is detected, so the min value is 0 in every line region. That is, from (Equation 3) above, the MTF value is 1 in every line region, and, as (Equation 3) also shows, 1 is the maximum MTF value. Accordingly, when the MTF values of the line regions in the document are averaged, the average can be said to be high when the character lines are not tilted, that is, when a histogram suitable for top-and-bottom recognition is obtained.
[0065]
On the other hand, as shown in FIG. 8B, when the character lines are tilted, the peaks of the histogram in the main scanning direction may widen, so that the difference between the max value and the min value within a line region becomes small, that is, the MTF value becomes small. Therefore, when the average MTF value over the document is acquired, it is likely to be low when the character lines are tilted, that is, when a histogram suitable for top-and-bottom recognition cannot be obtained. From the above, the higher the average MTF value over the document, the higher the recognition reliability can be judged to be.
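A minimal sketch of the first method follows. It assumes the MTF value of a line region is (max − min) / (max + min), a form that matches the properties stated above (it equals 1 when min is 0, never exceeds 1, and shrinks as max and min approach each other). Splitting the histogram into fixed-size line regions and the function names are also assumptions.

```python
def mtf_value(region):
    """Modulation of one line region: (max - min) / (max + min)."""
    hi, lo = max(region), min(region)
    if hi + lo == 0:
        return 0.0  # assumption: an all-zero region carries no modulation
    return (hi - lo) / (hi + lo)


def mean_mtf(histogram, region_size):
    """Average MTF value over consecutive line regions of region_size bins."""
    regions = [histogram[i:i + region_size]
               for i in range(0, len(histogram), region_size)]
    values = [mtf_value(r) for r in regions if r]
    return sum(values) / len(values) if values else 0.0
```

For example, a histogram with clean gaps such as `[0, 8, 9, 0, 0, 7, 8, 0]` averages to 1.0 with `region_size=4`, whereas a smeared one such as `[3, 5, 6, 4, 4, 6, 5, 3]` averages to 1/3.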
[0066]
Next, as a second method, a method for acquiring the recognition reliability using the edge count of the histogram will be described.
The method using the histogram edge count counts the change points at which the histogram value increases (hereinafter referred to as “rising edges”) and the change points at which it decreases (hereinafter referred to as “falling edges”), and exploits the fact that, in an ordinary document, the number of falling edges is larger. This is the method used in the top-and-bottom recognition described in Japanese Patent Application Laid-Open No. 9-9040 cited above as the prior art.
[0067]
FIG. 9 is a diagram for explaining a recognition reliability acquisition method using an edge count. In the figure, reference numeral 400 denotes a document, and 420 denotes the acquired histogram in the sub-scanning direction. FIG. 9A shows an example in which the character lines are not tilted, FIG. 9B shows an example in which the character lines are parallel to the document direction but centered, and FIG. 9C shows an example in which the character lines are tilted.
[0068]
As shown in FIG. 9A, when the character lines are not tilted, that is, when a histogram suitable for top-and-bottom recognition is obtained, the difference between the number of rising edges and the number of falling edges becomes large. This is because the start positions of the lines are fixed to some extent, giving 2 rising edges, whereas the end positions of the lines are dispersed, giving 4 falling edges.
[0069]
However, when the character lines are centered as shown in FIG. 9B, or when the character lines are tilted as shown in FIG. 9C, the difference between the number of rising edges and the number of falling edges is not so pronounced.
Based on the above, reliability acquisition using the edge count judges the recognition reliability to be higher the larger the difference between the number of rising edges and the number of falling edges.
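The second method can be sketched as follows. The sketch treats any step between adjacent histogram bins of at least a hypothetical `min_step` as an edge; the exact edge criterion is not given in the text, so this definition and the function names are assumptions.

```python
def count_edges(histogram, min_step=1):
    """Count rising and falling steps between adjacent histogram bins."""
    rising = falling = 0
    for prev, cur in zip(histogram, histogram[1:]):
        if cur - prev >= min_step:
            rising += 1
        elif prev - cur >= min_step:
            falling += 1
    return rising, falling


def edge_reliability(histogram, min_step=1):
    """Larger difference between the two edge counts means higher reliability."""
    rising, falling = count_edges(histogram, min_step)
    return abs(rising - falling)
```

With a left-aligned profile such as `[0, 3, 4, 4, 4, 3, 2, 2, 1, 0]` the counts are 2 rising and 4 falling edges, while a centered profile such as `[0, 1, 2, 4, 4, 2, 1, 0]` gives 3 and 3, so the reliability drops to 0.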
[0070]
Once the recognition reliability has been acquired using one or both of the methods described above, the acquired reliabilities are compared (S305), and the subsequent top-and-bottom recognition processing is performed on the low-resolution image data judged to have the higher reliability. That is, when the reliability of the corrected image data is higher (S305: Yes), the top-and-bottom recognition processing is performed using the corrected information (S306); otherwise (S305: No), it is performed using the information before correction (S307). If the tilt angle θ′ is 0 in step S302, neither tilt correction nor reliability-related processing is necessary, so the top-and-bottom recognition processing is performed using the information as it is (S308).
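The selection in steps S302 to S308 can be sketched as below. The callables `correct` and `reliability` stand in for the tilt correction processing and the reliability acquisition processing; they and the function name are hypothetical placeholders.

```python
def choose_image_for_recognition(tilt_deg, original, correct, reliability):
    """Return (label, image) for the data to pass to top-and-bottom recognition."""
    if tilt_deg == 0:                         # S302: No -> S308
        return "original", original
    corrected = correct(original, tilt_deg)   # S303: tilt correction
    r_before = reliability(original)          # S304: reliability without correction
    r_after = reliability(corrected)          # S304: reliability with correction
    if r_after > r_before:                    # S305: Yes -> S306
        return "corrected", corrected
    return "original", original               # S305: No -> S307
```

On a tie the uncorrected data is used, which follows the “otherwise” branch of S305.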
[0071]
By performing the processing described above, more accurate top-and-bottom recognition can be performed even when the document set on the document glass is tilted or when the character lines written on the document are themselves tilted.
[0072]
In the present embodiment, the case in which the recognition reliability is acquired has mainly been described in detail. However, as described above, even if the top-and-bottom recognition processing is performed using image data that has undergone document tilt correction processing without considering the recognition reliability, top-and-bottom recognition can be performed more accurately when the document set on the document glass is tilted. In that case, specifically, the process in the flowchart of FIG. 3 may proceed from step S303 directly to step S306.
[0073]
In this embodiment, the image recognition apparatus has been described as applied to a monochrome copying machine, but the present invention can also be applied to a full-color copying machine. In that case, however, it is desirable to incorporate a circuit that cancels chromatic color data in advance from the image data generated from the document, and to perform top-and-bottom recognition from monochrome image data only. This is because most text is monochrome.
[0074]
Further, when the first method and the second method are used together to acquire the recognition reliability, the recognition reliabilities acquired by the two methods need not be treated equally; the final recognition reliability may instead be calculated by applying fixed weights to them.
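One possible form of such a weighted combination is sketched below. The weights, the scaling of the edge-count difference into the range 0 to 1, and the function name are all assumptions chosen for illustration; the text does not specify them.

```python
def combined_reliability(mtf_score, edge_score,
                         w_mtf=0.5, w_edge=0.5, edge_scale=10.0):
    """Weighted sum of the MTF-based and edge-count-based reliabilities."""
    # The mean MTF value already lies in [0, 1]; the edge-count difference is an
    # integer, so it is scaled and clipped to [0, 1] (hypothetical normalization).
    edge_norm = min(edge_score / edge_scale, 1.0)
    return w_mtf * mtf_score + w_edge * edge_norm
```

Normalizing first keeps one measure from dominating the other simply because of its numeric range.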
[0075]
In this embodiment, the recognition reliability has been described for a horizontally written document. However, whichever of the first and second methods is used to acquire the recognition reliability, acquiring histograms in both the main scanning direction and the sub-scanning direction makes the method applicable regardless of whether the document is written horizontally or vertically. Specifically, for example, values representing the recognition reliability may be acquired from the histograms in the main scanning direction and the sub-scanning direction, and the one with the higher reliability may be adopted.
[0076]
In the present embodiment, the method of obtaining the edge count of the histogram of the image data is used for determining the document direction, but other methods using the histogram can be used. For example, the document orientation may be determined by a method of cutting out characters from image data using a histogram and performing pattern matching.
[0077]
[Effects of the invention]
As described above, the first image recognition apparatus of the present invention comprises: inclination correction means for correcting the image data based on the inclination of the document detected by the document inclination detection means; determination means for determining which of the image data before tilt correction generated by the image reading means and the image data corrected by the inclination correction means is to be used to recognize the direction of the document; and document direction recognition means for recognizing the direction of the document based on the image data determined by the determination means. Consequently, top-and-bottom recognition can be performed accurately even when the document placed on the document glass is tilted or the character lines in the document are tilted.
[0078]
Further, the second image recognition apparatus of the present invention comprises: inclination correction means for correcting the image data based on the inclination of the document detected by the document inclination detection means; reliability acquisition means for acquiring, for each of the image data before tilt correction generated by the image reading means and the image data corrected by the inclination correction means, information on the reliability of recognition when the direction of the document is recognized from that data; comparison means for comparing the information on reliability acquired based on the image data before tilt correction with the information on reliability acquired based on the tilt-corrected image data; and document direction recognition means for recognizing the direction of the document based on the image data found to be more reliable as a result of the comparison by the comparison means. Consequently, more accurate top-and-bottom recognition can be performed even when the character lines written on the document are themselves tilted.
[Brief description of the drawings]
FIG. 1 is a diagram showing an overall configuration of a copying machine to which an image recognition apparatus according to the present invention is applied.
FIG. 2 is a block diagram illustrating a configuration of a control unit of a copier to which an image recognition apparatus according to the present invention is applied.
FIG. 3 is a flowchart showing processing contents of a control unit in the image recognition apparatus of the present embodiment.
FIG. 4 is a diagram for explaining a method of detecting the inclination of an original in the image recognition apparatus according to the present invention.
FIG. 5 is a diagram for explaining a method of detecting an inclination of an original in the image recognition apparatus according to the present invention.
FIG. 6 is a diagram for explaining a method of detecting the inclination of a document in the image recognition apparatus according to the present invention.
FIG. 7 is a flowchart showing detailed contents of tilt detection processing.
FIG. 8 is a diagram for explaining calculation of an MTF value.
FIG. 9 is a diagram for explaining a recognition reliability acquisition method using an edge count of a histogram.
FIG. 10 is a diagram for explaining conventional top-and-bottom recognition processing.
FIG. 11 is a diagram for explaining a problem of conventional top-and-bottom recognition processing.
[Explanation of symbols]
38 CCD sensor
100 Control unit
110 Image signal processing unit
120 High-resolution image memory
125 Inclination detection unit
130 Rotation processing unit
140 LD driver
150 Resolution conversion processing unit
160 First low-resolution recognition memory
170 Inclination correction unit
180 Second low-resolution recognition memory
190 CPU
191 RAM
192 ROM

Claims

An image recognition device comprising:
image reading means for reading a document and generating first image data;
document inclination detection means for detecting the inclination of the document;
inclination correction means for correcting the inclination of the first image data generated by the image reading means based on the inclination of the document detected by the document inclination detection means, and generating second image data;
determination means for determining which of the first image data generated by the image reading means and the second image data generated by the inclination correction means is to be used to recognize the direction of the document; and
document direction recognition means for recognizing the direction of the document based on whichever of the first image data and the second image data is determined by the determination means.

The image recognition device according to claim 1, wherein
the determination means includes:
reliability acquisition means for acquiring information on the reliability of recognition when the direction of the document is recognized based on each of the first image data generated by the image reading means and the second image data generated by the inclination correction means; and
comparison means for comparing the information on reliability acquired by the reliability acquisition means based on the first image data with the information on reliability acquired by the reliability acquisition means based on the second image data, and
the document direction recognition means recognizes the direction of the document based on the image data found to have the higher reliability as a result of the comparison by the comparison means.

The image recognition device according to claim 2, wherein the reliability acquisition means acquires the information on reliability based on an MTF value calculated from a histogram of the image data.

The image recognition device according to claim 2, wherein the reliability acquisition means acquires the information on reliability based on the number of rising edges and the number of falling edges of a histogram of the image data.

The image recognition device according to any one of claims 1 to 4, wherein
the inclination correction means and the determination means perform no processing when no document inclination is detected by the document inclination detection means, and
the document direction recognition means recognizes the direction of the document based on the first image data generated by the image reading means when no document inclination is detected by the document inclination detection means.