JP2004310292A

JP2004310292A - Document image inspection method and device and program

Info

Publication number: JP2004310292A
Application number: JP2003100639A
Authority: JP
Inventors: Taiichi Saito; 泰一斉藤; Takeshi Masuda; 健増田
Original assignee: National Institute of Advanced Industrial Science and Technology AIST
Current assignee: National Institute of Advanced Industrial Science and Technology AIST
Priority date: 2003-04-03
Filing date: 2003-04-03
Publication date: 2004-11-04

Abstract

PROBLEM TO BE SOLVED: To quickly detect the tilt (rotation) angle of a document image that was captured in a tilted (rotated) state and whose pixel values are of unknown type, that is, either density or luminance.

SOLUTION: A binarization threshold is selected for input image data on an x-y plane, and whether the pixel values of the input image data represent density or luminance is decided from the numbers of pixels above and below that threshold. The background part of the image is also determined from the binarization threshold, and an added pixel value is set, according to whether the pixel values are density or luminance, for the pixels belonging to the object that remains once the background is removed. A parameter transformation to (angle, distance) coordinates, known as the Radon transform or Hough transform, is executed from the (x, y) coordinates of each pixel and its added pixel value, producing an angle/distance parameter table from which linearity is easily extracted. The angle at which straight lines concentrate in this table is detected as the tilt (rotation) angle of the image. Removing the background part thus speeds up processing without degrading performance.

COPYRIGHT: (C)2005, JPO&NCIPI

Description

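[0001]
[Technical field of the invention]
The present invention relates to an inspection method, an inspection apparatus, and an inspection program for detecting the tilt (rotation) angle of a document image loaded into a computer.
[0002]
[Prior art]
When a document is captured with a scanner (scanning device), the original may be fed in a tilted (rotated) state, or the character strings written on it may themselves be tilted (rotated). The captured document is then not only unsightly; the tilt can also lower the compression ratio when the image is compressed and cause misrecognition during character recognition. The lines of text therefore need to be made horizontal or vertical. This can be done by detecting the angle by which the lines of text are tilted (rotated) from the horizontal or vertical and rotating the entire document image back by that angle.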
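[0003]
One way to examine linearity from the image information in a document is a method based on a parameter transformation. The principle of this parameter transformation is first explained with FIGS. 5 and 6.
Suppose input image data g(x, y) lies on the x-y image plane of FIG. 5. (x, y) denotes the position coordinates of each pixel, and g(x, y) denotes the pixel value of the input image data at coordinates (x, y). A straight line l (ell) on the x-y image plane can be expressed by equation (1).
[0004]
ρ = x·cos(θ) + y·sin(θ) ……(1)
Here, let v be the perpendicular dropped from a reference point (here the origin (0, 0)) to the line l. Then ρ (rho) is the length of the perpendicular v, and θ (theta, in radians) is the angle it makes with the x-axis. Applying the parameter transformation with equation (1) maps the x-y image plane of FIG. 5 onto the angle/distance parameter plane of FIG. 6. The parameter transformation is carried out as follows.
[0005]
For each point (x, y) in FIG. 5, the angle θ is varied in fixed steps, and for each θ the value of ρ is computed from equation (1), giving a (θ, ρ) pair. Each pair corresponds to the coordinates (θ, ρ) of the angle/distance parameter table. The pixel value g(x, y) at (x, y) is added to the entry S(θ, ρ) of the table for every (θ, ρ) pair obtained. It is assumed here that pixels on a line have larger values than the background. For example, performing this processing for the point p1(x1, y1) in FIG. 5 and plotting the points where S(θ, ρ) is large yields the sinusoidal curve p1 in FIG. 6. Doing this for the coordinate points (x, y) over the whole domain, or a partial domain, of the x-y image plane yields the angle/distance parameter table S(θ, ρ), which holds the cumulative sums of pixel values for the whole image or for the partial image. When the input image on the x-y image plane consists of the points p1(x1, y1), p2(x2, y2) and p3(x3, y3) on the line l in FIG. 5, three sinusoidal curves appear on the angle/distance parameter plane, corresponding to the curves p1, p2 and p3 in FIG. 6, and these three curves intersect at the point l in FIG. 6. The coordinates (θ, ρ) of this intersection l represent the line l of FIG. 5, which is perpendicular to the perpendicular of length ρ drawn from the reference point at angle θ to the x-axis. The more points lie on the line in FIG. 5, the more curves cross at (θ, ρ) in FIG. 6, and the larger the cumulative sum at the intersection. Finding a coordinate point (θ, ρ) where the cumulative sum S(θ, ρ) is large therefore detects the presence, on the x-y image plane, of the line represented by that point. This parameter transformation is called the Radon transform or the Hough transform.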
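[0006]
Because the parameter transformation works in this way, it detects linearity not only for solid lines but also for broken lines whose segments are disconnected. Looking at a document in which characters are regularly arranged, a person perceives straight lines parallel to the direction of the text. The perceived lines look as if the strokes that make up each character were arranged like a broken line. Applying the parameter transformation to a document image can therefore detect the line direction of a document in which characters are regularly arranged.
A concrete example of prior art that uses this parameter transformation to detect the tilt (rotation) angle of a document image is Japanese Patent Application No. 2001-264824; that method is explained with FIG. 4.
[0007]
An image 6 is input by image input processing 101 to obtain input image data g(x, y) 102. From the position coordinates (x, y) and pixel values g(x, y) of the input image data 102, parameter transformation processing 301 creates an angle/distance parameter table S(θ, ρ) 302, treating the pixel values as density, and a counter table C(θ, ρ) 303. Here ρ is computed with equation (1) above. Whereas the pixel value g(x, y) is added to the angle/distance parameter table, 1 is added to the counter table.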
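[0008]
The relationship between image density and luminance is described here. Image data usually has a background that occupies most of the frame, with the objects appearing within that background. The distinction is especially clear in images containing line drawings or characters. From the viewpoint of object versus background, the pixel values of an input image can be given one of two meanings. If large pixel values correspond to the object and small pixel values to the background, the pixel values are said to represent density. Conversely, if small pixel values correspond to the object and large pixel values to the background, the pixel values are said to represent luminance.
[0009]
Density and luminance are thus inverses of each other. Let M be the largest possible pixel value and let g(x, y) be a luminance value; the density value g'(x, y) is then given by equation (2). For a multi-level image with 8 bits per pixel, pixel values range from 0 to 255 and M = 255. For a binary image with 1 bit per pixel, M = 1.
g'(x, y) = M − g(x, y) ……(2)
[0010]
The prior art of FIG. 4 considers the case where it is unknown whether the pixel values of the input image data 102 are density or luminance. Parameter transformation processing 301 provisionally treats the pixel values as density and creates the angle/distance parameter table 302 as density. To allow for the case where the pixel values are luminance, the parameter transformation would also have to be applied to the pixel values inverted by equation (2) to create an angle/distance parameter table as luminance. Because the parameter transformation is very costly, however, running it twice, once for density and once for luminance, is to be avoided.
[0011]
In FIG. 4, parameter table inversion processing 304 creates the angle/distance parameter table 305 as luminance from the angle/distance parameter table 302 as density and the counter table 303.
With S(θ, ρ) as the angle/distance parameter table as density, the angle/distance parameter table as luminance, T(θ, ρ), is obtained from equation (3). This gives a way of creating both angle/distance parameter tables, for density and for luminance, from a single run of the parameter transformation.
T(θ, ρ) = M·C(θ, ρ) − S(θ, ρ) ……(3)
[0012]
Next, from the angle/distance parameter tables S(θ, ρ) and T(θ, ρ), which treat the pixel values as density and as luminance respectively, maximum line concentration detection processing 306 detects the maximum line concentration 307 as density and the maximum line concentration 308 as luminance. Density/luminance determination processing 309 obtains density/luminance determination information 112 from the maximum line concentration 307 as density and the maximum line concentration 308 as luminance. Finally, tilt (rotation) angle detection processing 310 detects the tilt (rotation) angle 116 from the density/luminance determination information 112 and either the angle/distance parameter table 302 as density or the angle/distance parameter table 305 as luminance.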
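[0013]
[Problem to be solved by the invention]
In the prior art of FIG. 4, using the counter table means that the costly parameter transformation needs to run only once. The cost of that transformation, however, lies in the fact that equation (1) must be evaluated (number of processed pixels of the input image data) × (number of angles θ) times.
The object of the present invention is therefore to solve this problem by shortening the execution time: of the two factors in (number of processed pixels of the input image data) × (number of angles θ), it reduces the number of processed pixels without degrading tilt (rotation) angle detection performance.
[0014]
[Means for solving the problem]
To shorten the processing time of the parameter transformation, one can reduce the number of processed pixels of the input image data or the number of angles θ. Reducing the number of angles θ, however, lowers the precision of the detected tilt (rotation) angle and so worsens detection performance. What is needed is therefore a way to reduce the number of input pixels processed during the parameter transformation without lowering detection performance.
[0015]
A commonly used way to reduce the number of processed pixels of an input image is to thin out the pixels, skipping one pixel, two pixels, or several pixels at a time. Skipping one pixel at a time in both the x and y directions reduces the processing to 1/4, and skipping two pixels at a time reduces it to 1/9. If the thinning interval is made too large, however, information needed for the processing may be discarded. The present invention is a method that reduces the number of processed pixels further, on top of thinning.
Consider the case where the document image is multi-level. FIG. 7 is an example document image, and FIG. 8 is the pixel value histogram that plots the frequency of each pixel value in FIG. 7. In FIG. 8, the part with high frequency near pixel value 50 is the background of the document in FIG. 7, and the pixels with higher values are the character portions, that is, the object. Because the background is information the parameter transformation does not need, it can be excluded, and the number of processed pixels can thus be reduced without degrading performance.
[0016]
In the prior art of FIG. 4, parameter transformation processing 301 is run without knowing whether the input image is density or luminance, the angle/distance parameter table 302 as density and the angle/distance parameter table 305 as luminance are obtained, and the density/luminance decision is made from these two tables. To create two angle/distance parameter tables with a single parameter transformation, all pixels of the input image data must be processed; this is the condition for equation (3) to hold. The desired improvement, however, is to reduce the number of processed pixels, and if that number is reduced by any method other than thinning, it becomes impossible to create the two angle/distance parameter tables from a single parameter transformation. In the present invention, therefore, whether the pixel values of the input image data are density or luminance is determined before the parameter transformation is performed.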
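[0017]
First, a binarization threshold k is selected for the input image data. In the example of FIG. 7, the threshold was 113, as shown in FIG. 8. Because the background usually occupies more of an image than the object, the frequency nl of pixel values at or below the binarization threshold is compared with the frequency nu of pixel values above the binarization threshold; the image is judged to be density when nl ≤ nu and luminance when nl > nu. In FIG. 8, nl ≤ nu, so the image is judged to be density. In some cases the density/luminance determination instead takes nl as the frequency of pixel values at or below k − α and nu as the frequency of pixel values at or above k + β.
[0018]
Once it is known whether the pixel values of the input image data are density or luminance, the processing time can be shortened by treating the small-valued part (for density) or the large-valued part (for luminance) as background and excluding it from the parameter transformation. When the pixel values represent density, using the binarization threshold k itself as the boundary between background and object, and treating every pixel with a value smaller than k as background, would greatly reduce the processing. Selection of the binarization threshold can fail for some images, however. To cope with such failures, the present invention treats as background the pixel values lying a certain value γ (gamma) or more below the binarization threshold when the pixel values are density, and the pixel values lying γ or more above the binarization threshold when they are luminance. Specifically, the lower-limit pixel value of the object is set to kl = k + 1 − γ and the upper-limit pixel value to ku = k + 1 + γ, so that a range of pixel values centered on the binarization threshold k is set as the pixel values of the object. If the pixel values of the input image data represent density, the upper-limit pixel value could be omitted; it is provided partly because it is needed when the pixel values are luminance. The value of γ may be fixed or may vary with the image. One image-dependent choice takes γ from the spread of the pixel value histogram: with σ (sigma) the standard deviation of the histogram distribution, that is, the square root of its total variance, γ is set to D times σ, γ = D·σ. D can be, for example, 1, 0.5 or 2. A smaller value reduces the number of processed pixels, but making it too small risks losing information and degrading the performance of tilt (rotation) angle detection. In the case of FIG. 8 the standard deviation is 57; with γ equal to one standard deviation, pixel values at or below 113 − 57 = 56 are regarded as background. The lower-limit pixel value kl of the object is therefore 57 (= 56 + 1). In this case, removing the background reduces the processing to about 1/3.5.
[0019]
The added pixel value p that is accumulated during the parameter transformation is set as follows. For density, p = 0 when the pixel value g of the input image data is less than kl, and p = g − kl + 1 when g is kl or more. For luminance, p = 0 when g exceeds ku, and p = ku − g + 1 when g is ku or less. In addition to these settings, the added pixel value may be set to p = ku − kl + 1 when g exceeds ku in the density case and when g is less than kl in the luminance case.
[0020]
Next, consider the case where the input image data is binary. Let nl be the number of pixels with value 0 and nu the number of pixels with value 1, and determine density or luminance in the same way as for multi-level input. For density (nl ≤ nu), the parameter transformation processes only the pixels whose value g is 1, with an added pixel value of 1. For luminance (nl > nu), it processes only the pixels whose value g is 0, again with an added pixel value of 1. In binary image data, pixel value 0 is normally the background and 1 the object, but in the present invention, in order to reduce the number of processed pixels, the more numerous value is simply treated as background and the less numerous as the object.
[0021]
Whether the input image data is multi-level or binary, density or luminance is determined by the method above; the added pixel value is determined for each input pixel value from the density/luminance determination information and the lower-limit and upper-limit pixel values; pixels whose added pixel value is 0 are excluded from processing; and the added pixel values are accumulated into the angle/distance parameter table. The present invention is the method of examining the finished angle/distance parameter table for the angle with high linearity and thereby detecting the tilt (rotation) angle of the input image.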
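[0022]
[Embodiments of the invention]
The processing method of the present invention with a reduced number of processed pixels, for a multi-level input image, is illustrated below with FIG. 1.
An image 6 is input by image input processing 101 to obtain input image data g(x, y) 102. Binarization threshold selection processing 103 obtains the binarization threshold (k) 104 and the standard deviation (σ) 105 from the input image data 102. Object pixel value range setting processing 106 computes, from the binarization threshold (k) 104, the lower pixel count (number of pixels at or below k) (nl) 107 and the upper pixel count (number of pixels above k) (nu) 108, as well as the lower-limit pixel value (kl) 109 and the upper-limit pixel value (ku) 110. The lower-limit pixel value (kl) 109 and the upper-limit pixel value (ku) 110 are kl = k + 1 − D·σ and ku = k + 1 + D·σ, where D is a constant. Density/luminance determination processing 111 obtains density/luminance determination information 112 from the lower pixel count (nl) 107 and the upper pixel count (nu) 108: density when nl ≤ nu, luminance when nl > nu.
[0023]
Parameter transformation processing 113 creates the angle/distance parameter table 114 from the density/luminance determination information 112, the lower-limit pixel value (kl) 109, the upper-limit pixel value (ku) 110, and the input image data g(x, y) 102.
When detecting the tilt (rotation) angle, linearity may be examined using only near-vertical directions, only near-horizontal directions, or both. To handle all of these, two angle/distance parameter tables are created, one for near-vertical and one for near-horizontal examination.
[0024]
The angle is expressed as ai (in degrees) instead of θ (in radians), and the near-vertical and near-horizontal angle/distance parameter tables are denoted Sv(ai, ρj) and Sh(ai, ρj) respectively. With Na the number of angle divisions, i = 0, 1, 2, ..., Na − 1.
Handling ρj as a real value on a computer would require an enormous amount of memory, so it must be quantized: the distance is divided into sections of fixed width, and each section is given a number j. With Nr the number of distance sections, j = 0, 1, 2, ..., Nr − 1. The distance is quantized not only to keep memory use from becoming enormous but also because the cumulative sums need to become reasonably large for linearity to be detected.
[0025]
When detecting the tilt (rotation) angle of a document, an angle range ai of about −5 to 5 degrees in steps of 0.1 degree is normally considered sufficient. In this case Na = 101. The conversion from ai to θ is given by equation (4).
θ = ai·π/180 ……(4)
[0026]
In this parameter transformation, the added pixel value p for the angle/distance parameter table is computed first. When the pixel value g of the input image is density, p = 0 for g < kl and p = g − kl + 1 for g ≥ kl; when it is luminance, p = 0 for g > ku and p = ku − g + 1 for g ≤ ku. When p = 0, nothing is added to the angle/distance parameter table. For every g(x, y) with p > 0, equations (4) and (1) are evaluated for the coordinates (x, y) and every specified angle ai to obtain the coordinates (ai, ρj) given by the pair of angle ai and distance ρj, and the value p is added to the angle/distance parameter tables Sv(ai, ρj) and Sh(ai, ρj) at those coordinates.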
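[0027]
Finally, tilt (rotation) angle detection processing 115 of FIG. 1 detects the tilt (rotation) angle 116 by finding, in the angle/distance parameter table 114 produced by parameter transformation processing 113, the angle at which straight lines concentrate.
Tilt (rotation) angle detection processing 115 examines the near-vertical table Sv(ai, ρj) and the near-horizontal table Sh(ai, ρj) for the angle ai at which straight lines concentrate. The examination may use only one of the tables or both. When both are used, one may adopt either the angle ai whose linearity is higher in the vertical or the horizontal table, or the angle ai whose linearity averaged over the vertical and horizontal tables is high. The description below uses only one table, denoted S(ai, ρj). The table restricted to a single angle ai is denoted Sa(j), and Sa(j) sorted in descending order is denoted Ss(j), where j = 0, 1, ..., Nr − 1.
In tilt (rotation) angle detection, a line concentration C is first computed for each angle ai as a measure of how strongly straight lines concentrate there. The line concentration can be computed in the following ways, and combinations of these methods are also conceivable.
[0028]
(a) The line concentration is the sum of the top Nm values of Ss(j). It is computed as C = Ca = ΣSs(j), where j = 0, 1, ..., Nm − 1; the larger Ca is, the higher the linearity.
(b) The line concentration is the variance of Ss(j). With v(j) = {0, 1, −1, 2, −2, ...}, it is computed as C = Cb = ΣSs(j)·{v(j) − μs}**2, where μs = Σv(j)·Ss(j)/Nr and j = 0, 1, ..., Nr − 1; the smaller Cb is, the higher the linearity.
(c) The line concentration is Cb divided by Ca. It is computed as C = Cc = Cb/Ca; the smaller Cc is, the higher the linearity.
(d) The line concentration is Cb divided by the square of Ca. It is computed as C = Cd = Cb/{Ca}**2; the smaller Cd is, the higher the linearity.
(e) The line concentration is the sum of the differences between adjacent values of Sa(j). It is computed as C = Ce = Σ|Sa(j+1) − Sa(j)|, where j = 1, 2, 3, ..., Nr − 1; the larger Ce is, the higher the linearity.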
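Measures (a) and (e) above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function names, the toy table and the choice Nm = 3 are assumptions made for the example, and measure (e) is summed here over every adjacent pair of bins.

```python
def concentration_top(sa, nm):
    """Measure (a): sum of the nm largest accumulated values for one angle."""
    return sum(sorted(sa, reverse=True)[:nm])

def concentration_diff(sa):
    """Measure (e): sum of absolute differences between neighbouring bins."""
    return sum(abs(b - a) for a, b in zip(sa, sa[1:]))

def best_angle(table, angles, nm=3):
    """Return the angle whose row has the largest measure (a)."""
    scores = [concentration_top(row, nm) for row in table]
    return angles[scores.index(max(scores))]

# Hypothetical table: one row Sa(j) per candidate angle, same total mass per row.
table = [
    [4, 4, 4, 4, 4, 4],     # -1 degree: mass spread evenly over the bins
    [0, 12, 0, 0, 12, 0],   #  0 degrees: mass concentrated on two lines
    [2, 6, 4, 2, 6, 4],     # +1 degree: partly concentrated
]
angles = [-1.0, 0.0, 1.0]
detected = best_angle(table, angles)
```

Both measures grow when the accumulated values pile up in a few distance bins, which is what happens when the candidate angle matches the direction of the text lines.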
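[0029]
A line concentration of this kind is computed for each angle ai, and the angle ai with the highest line concentration is output as the tilt (rotation) angle of the input image data. This is the method of the present invention for detecting the tilt (rotation) angle when the input image data is multi-level.
[0030]
Next, the case where the input image data is binary is explained with FIG. 2.
An image 6 is input by image input processing 101 to obtain input image data 102. Lower/upper pixel count calculation processing 201 computes the lower pixel count (nl) 107 and the upper pixel count (nu) 108. Because the input image data is binary, no binarization threshold needs to be found: the lower pixel count is the number of pixels with value 0 and the upper pixel count is the number of pixels with value 1. Density/luminance determination processing 111 obtains density/luminance determination information 112 by comparing the lower pixel count 107 with the upper pixel count 108. From there on, as in the multi-level case, parameter transformation processing 202 and tilt (rotation) angle detection processing 115 detect the tilt (rotation) angle 116.
The present invention is the method characterized by the processing described above.
[0031]
[Example]
FIG. 3 is a block diagram showing the configuration of an apparatus that performs tilt (rotation) angle detection.
Each part of the apparatus is connected to a bus 1, and the overall operation is controlled by a control unit 2. When the type of processing is to be selected, the instruction is given at a console 3. The program describing the processing method is loaded from a hard disk 4 into a memory 50, and the control unit 2 controls the processing according to the program in the memory 50. Alternatively, the memory 50 may be a ROM (read-only memory) with the program built in.
[0032]
In the first step, an image 6 is input with a scanner 7 and stored on the hard disk 4. The image may instead be stored on the hard disk 4 by way of another input device, for example through a network 8. The image stored on the hard disk is loaded into a memory 51 for processing. This stored image is the input image data 102 of FIG. 1. The memory 52 of FIG. 3 is a work area that holds the information handled by each process from the input image data 102 of FIG. 1 up to detection of the tilt (rotation) angle 116. Taking the input image data stored in the memory 51 of FIG. 3 as input, binarization threshold selection processing 103 of FIG. 1 obtains the binarization threshold 104 and the standard deviation 105. Object pixel value range setting processing 106 sets the lower pixel count 107, the upper pixel count 108, the lower-limit pixel value 109 and the upper-limit pixel value 110. Density/luminance determination processing 111 obtains the density/luminance determination information 112. Parameter transformation processing 113 creates the angle/distance parameter table 114 from the density/luminance determination information 112, the lower-limit pixel value 109, the upper-limit pixel value 110 and the input image data 102. Tilt (rotation) angle detection processing 115 detects the tilt (rotation) angle 116. All of this information is stored in the memory 52 of FIG. 3.
[0033]
Once the tilt (rotation) angle (a degrees) has been obtained, the input image data stored in the memory 51 of FIG. 3 or on the hard disk 4 is rotated by −a degrees, and the tilt-corrected image is stored on the hard disk 4 or a similar device.
[0034]
[Effect of the invention]
Because the tilt (rotation) angle detection of the present invention merely excludes unnecessary information from processing, it causes no loss of detection performance and, as a technique based on the Hough transform (Radon transform), can run several times faster than the prior art.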
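[Brief description of the drawings]
[FIG. 1] Block diagram showing the method of the present invention for detecting the tilt (rotation) angle when the target image is multi-level.
[FIG. 2] Block diagram showing the method of the present invention for detecting the tilt (rotation) angle when the target image is binary.
[FIG. 3] Block diagram showing the configuration of the apparatus of the present invention that performs tilt (rotation) angle detection.
[FIG. 4] Block diagram showing the prior-art method for detecting the tilt (rotation) angle.
[FIG. 5] Diagram showing a straight line on the x-y image plane, for explaining the principle of the parameter transformation.
[FIG. 6] Diagram showing sinusoidal curves on the angle/distance parameter plane, for explaining the principle of the parameter transformation.
[FIG. 7] Example of an input document image.
[FIG. 8] Diagram showing the pixel value histogram of the document image of FIG. 7.
[Explanation of reference signs]
1: Bus
2: Control unit
3: Console
4: Hard disk
5, 50, 51, 52: Memory
6: Image
7: Scanner
8: Network
101: Image input processing
102: Input image data
103: Binarization threshold selection processing
104: Binarization threshold
105: Standard deviation
106: Object pixel value range setting processing
107: Lower pixel count
108: Upper pixel count
109: Lower-limit pixel value
110: Upper-limit pixel value
111, 309: Density/luminance determination processing
112: Density/luminance determination information
113, 202, 301: Parameter transformation processing
114: Angle/distance parameter table
115, 310: Tilt (rotation) angle detection processing
116: Tilt (rotation) angle
201: Lower/upper pixel count calculation processing
302: Angle/distance parameter table as density
303: Counter table
304: Parameter table inversion processing
305: Angle/distance parameter table as luminance
306: Maximum line concentration detection processing
307: Maximum line concentration as density
308: Maximum line concentration as luminance
[0001]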
[Technical field of the invention]
The present invention relates to an inspection method, an inspection apparatus, and an inspection program for detecting the tilt (rotation) angle of a document image loaded into a computer.
[0002]
[Prior art]
When a document is captured with a scanner (scanning device), the original may be fed in a tilted (rotated) state, or the character strings written on it may themselves be tilted (rotated). The captured document is then not only unsightly; the tilt can also lower the compression ratio when the image is compressed and cause misrecognition during character recognition. The lines of text therefore need to be made horizontal or vertical. This can be done by detecting the angle by which the lines of text are tilted (rotated) from the horizontal or vertical and rotating the entire document image back by that angle.
[0003]
One way to examine linearity from the image information in a document is a method based on a parameter transformation. The principle of this parameter transformation is first explained with FIGS. 5 and 6.
Suppose input image data g(x, y) lies on the x-y image plane of FIG. 5. (x, y) denotes the position coordinates of each pixel, and g(x, y) denotes the pixel value of the input image data at coordinates (x, y). A straight line l (ell) on the x-y image plane can be expressed by equation (1).
[0004]
ρ = x·cos(θ) + y·sin(θ) ……(1)
Here, let v be the perpendicular dropped from a reference point (here the origin (0, 0)) to the line l. Then ρ (rho) is the length of the perpendicular v, and θ (theta, in radians) is the angle it makes with the x-axis. Applying the parameter transformation with equation (1) maps the x-y image plane of FIG. 5 onto the angle/distance parameter plane of FIG. 6. The parameter transformation is carried out as follows.
[0005]
For each point (x, y) in FIG. 5, the angle θ is varied in fixed steps, and for each θ the value of ρ is computed from equation (1), giving a (θ, ρ) pair. Each pair corresponds to the coordinates (θ, ρ) of the angle/distance parameter table. The pixel value g(x, y) at (x, y) is added to the entry S(θ, ρ) of the table for every (θ, ρ) pair obtained. It is assumed here that pixels on a line have larger values than the background. For example, performing this processing for the point p1(x1, y1) in FIG. 5 and plotting the points where S(θ, ρ) is large yields the sinusoidal curve p1 in FIG. 6. Doing this for the coordinate points (x, y) over the whole domain, or a partial domain, of the x-y image plane yields the angle/distance parameter table S(θ, ρ), which holds the cumulative sums of pixel values for the whole image or for the partial image. When the input image on the x-y image plane consists of the points p1(x1, y1), p2(x2, y2) and p3(x3, y3) on the line l in FIG. 5, three sinusoidal curves appear on the angle/distance parameter plane, corresponding to the curves p1, p2 and p3 in FIG. 6, and these three curves intersect at the point l in FIG. 6. The coordinates (θ, ρ) of this intersection l represent the line l of FIG. 5, which is perpendicular to the perpendicular of length ρ drawn from the reference point at angle θ to the x-axis. The more points lie on the line in FIG. 5, the more curves cross at (θ, ρ) in FIG. 6, and the larger the cumulative sum at the intersection. Finding a coordinate point (θ, ρ) where the cumulative sum S(θ, ρ) is large therefore detects the presence, on the x-y image plane, of the line represented by that point. This parameter transformation is called the Radon transform or the Hough transform.
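The accumulation into S(θ, ρ) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `accumulate`, the dictionary used as the table, and the rounding of ρ to integer distance bins are choices made for the example, not details given in the text.

```python
import math

def accumulate(pixels, thetas, rho_step=1.0):
    """Build S(theta, rho) by adding each pixel value along its sinusoid."""
    table = {}
    for (x, y), g in pixels.items():
        for i, theta in enumerate(thetas):
            rho = x * math.cos(theta) + y * math.sin(theta)   # equation (1)
            j = round(rho / rho_step)                         # assumed distance bin
            table[(i, j)] = table.get((i, j), 0) + g
    return table

# Three collinear points on the horizontal line y = 2, each with pixel value 1.
points = {(0, 2): 1, (3, 2): 1, (7, 2): 1}
thetas = [math.radians(a) for a in (0, 45, 90)]
S = accumulate(points, thetas)
peak = max(S, key=S.get)   # (angle index, distance bin) with the largest sum
```

For these three points the sinusoids meet only at θ = 90 degrees and ρ = 2, so the largest entry is the one at angle index 2, distance bin 2, with an accumulated value of 3.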
[0006]
Because the parameter transformation works in this way, it detects linearity not only for solid lines but also for broken lines whose segments are disconnected. Looking at a document in which characters are regularly arranged, a person perceives straight lines parallel to the direction of the text. The perceived lines look as if the strokes that make up each character were arranged like a broken line. Applying the parameter transformation to a document image can therefore detect the line direction of a document in which characters are regularly arranged.
A concrete example of prior art that uses this parameter transformation to detect the tilt (rotation) angle of a document image is Japanese Patent Application No. 2001-264824; that method is explained with FIG. 4.
[0007]
An image 6 is input by image input processing 101 to obtain input image data g(x, y) 102. From the position coordinates (x, y) and pixel values g(x, y) of the input image data 102, parameter transformation processing 301 creates an angle/distance parameter table S(θ, ρ) 302, treating the pixel values as density, and a counter table C(θ, ρ) 303. Here ρ is computed with equation (1) above. Whereas the pixel value g(x, y) is added to the angle/distance parameter table, 1 is added to the counter table.
[0008]
The relationship between image density and luminance is described here. Image data usually has a background that occupies most of the frame, with the objects appearing within that background. The distinction is especially clear in images containing line drawings or characters. From the viewpoint of object versus background, the pixel values of an input image can be given one of two meanings. If large pixel values correspond to the object and small pixel values to the background, the pixel values are said to represent density. Conversely, if small pixel values correspond to the object and large pixel values to the background, the pixel values are said to represent luminance.
[0009]
Density and luminance are thus inverses of each other. Let M be the largest possible pixel value and let g(x, y) be a luminance value; the density value g'(x, y) is then given by equation (2). For a multi-level image with 8 bits per pixel, pixel values range from 0 to 255 and M = 255. For a binary image with 1 bit per pixel, M = 1.
g'(x, y) = M − g(x, y) ……(2)
[0010]
The prior art of FIG. 4 considers the case where it is unknown whether the pixel values of the input image data 102 are density or luminance. Parameter transformation processing 301 provisionally treats the pixel values as density and creates the angle/distance parameter table 302 as density. To allow for the case where the pixel values are luminance, the parameter transformation would also have to be applied to the pixel values inverted by equation (2) to create an angle/distance parameter table as luminance. Because the parameter transformation is very costly, however, running it twice, once for density and once for luminance, is to be avoided.
[0011]
In FIG. 4, parameter table inversion processing 304 creates the angle/distance parameter table 305 as luminance from the angle/distance parameter table 302 as density and the counter table 303.
With S(θ, ρ) as the angle/distance parameter table as density, the angle/distance parameter table as luminance, T(θ, ρ), is obtained from equation (3). This gives a way of creating both angle/distance parameter tables, for density and for luminance, from a single run of the parameter transformation.
T(θ, ρ) = M·C(θ, ρ) − S(θ, ρ) ……(3)
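Equation (3) follows because every pixel that lands in a cell contributes g to S and 1 to C, so accumulating the inverted value M − g gives M·C − S. The sketch below checks this numerically; the function name, the tiny 2×2 image and the integer rounding of ρ are assumptions made for the illustration.

```python
import math

def build_tables(pixels, thetas, max_value):
    """Accumulate S (density), C (counter) and, for comparison, T computed directly."""
    S, C, T_direct = {}, {}, {}
    for (x, y), g in pixels.items():
        for i, theta in enumerate(thetas):
            j = round(x * math.cos(theta) + y * math.sin(theta))   # equation (1)
            key = (i, j)
            S[key] = S.get(key, 0) + g
            C[key] = C.get(key, 0) + 1
            # What a second pass over the inverted image, equation (2), would add.
            T_direct[key] = T_direct.get(key, 0) + (max_value - g)
    return S, C, T_direct

M = 255
pixels = {(0, 0): 10, (1, 0): 200, (0, 1): 30, (1, 1): 255}
thetas = [math.radians(a) for a in range(-5, 6)]
S, C, T_direct = build_tables(pixels, thetas, M)
T = {key: M * C[key] - S[key] for key in S}   # equation (3), no second pass
```

The identity only holds when every pixel is accumulated, which is why skipping background pixels is incompatible with deriving both tables from one pass.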
[0012]
Next, from the angle/distance parameter tables S(θ, ρ) and T(θ, ρ), which treat the pixel values as density and as luminance respectively, maximum line concentration detection processing 306 detects the maximum line concentration 307 as density and the maximum line concentration 308 as luminance. Density/luminance determination processing 309 obtains density/luminance determination information 112 from the maximum line concentration 307 as density and the maximum line concentration 308 as luminance. Finally, tilt (rotation) angle detection processing 310 detects the tilt (rotation) angle 116 from the density/luminance determination information 112 and either the angle/distance parameter table 302 as density or the angle/distance parameter table 305 as luminance.
[0013]
[Problem to be solved by the invention]
In the prior art of FIG. 4, using the counter table means that the costly parameter transformation needs to run only once. The cost of that transformation, however, lies in the fact that equation (1) must be evaluated (number of processed pixels of the input image data) × (number of angles θ) times.
The object of the present invention is therefore to solve this problem by shortening the execution time: of the two factors in (number of processed pixels of the input image data) × (number of angles θ), it reduces the number of processed pixels without degrading tilt (rotation) angle detection performance.
[0014]
[Means for solving the problem]
To shorten the processing time of the parameter transformation, one can reduce the number of processed pixels of the input image data or the number of angles θ. Reducing the number of angles θ, however, lowers the precision of the detected tilt (rotation) angle and so worsens detection performance. What is needed is therefore a way to reduce the number of input pixels processed during the parameter transformation without lowering detection performance.
[0015]
A commonly used way to reduce the number of processed pixels of an input image is to thin out the pixels, skipping one pixel, two pixels, or several pixels at a time. Skipping one pixel at a time in both the x and y directions reduces the processing to 1/4, and skipping two pixels at a time reduces it to 1/9. If the thinning interval is made too large, however, information needed for the processing may be discarded. The present invention is a method that reduces the number of processed pixels further, on top of thinning.
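The 1/4 and 1/9 figures can be checked with a short sketch. The helper name `thin` and the 12×12 test image are assumptions for the illustration; the image is stored as a list of rows.

```python
def thin(image, skip):
    """Keep one pixel, then skip `skip` pixels, in both the x and y directions."""
    step = skip + 1
    return [row[::step] for row in image[::step]]

image = [[0] * 12 for _ in range(12)]                      # 144 pixels in total
kept_skip_one = sum(len(row) for row in thin(image, 1))    # 36 pixels, 144 / 4
kept_skip_two = sum(len(row) for row in thin(image, 2))    # 16 pixels, 144 / 9
```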
Consider the case where the document image is multi-level. FIG. 7 is an example document image, and FIG. 8 is the pixel value histogram that plots the frequency of each pixel value in FIG. 7. In FIG. 8, the part with high frequency near pixel value 50 is the background of the document in FIG. 7, and the pixels with higher values are the character portions, that is, the object. Because the background is information the parameter transformation does not need, it can be excluded, and the number of processed pixels can thus be reduced without degrading performance.
[0016]
In the prior art of FIG. 4, parameter transformation processing 301 is run without knowing whether the input image is density or luminance, the angle/distance parameter table 302 as density and the angle/distance parameter table 305 as luminance are obtained, and the density/luminance decision is made from these two tables. To create two angle/distance parameter tables with a single parameter transformation, all pixels of the input image data must be processed; this is the condition for equation (3) to hold. The desired improvement, however, is to reduce the number of processed pixels, and if that number is reduced by any method other than thinning, it becomes impossible to create the two angle/distance parameter tables from a single parameter transformation. In the present invention, therefore, whether the pixel values of the input image data are density or luminance is determined before the parameter transformation is performed.
[0017]
First, a binarization threshold k is selected for the input image data. In the example of FIG. 7, the threshold was 113, as shown in FIG. 8. Because the background usually occupies more of an image than the object, the frequency nl of pixel values at or below the binarization threshold is compared with the frequency nu of pixel values above the binarization threshold; the image is judged to be density when nl ≤ nu and luminance when nl > nu. In FIG. 8, nl ≤ nu, so the image is judged to be density. In some cases the density/luminance determination instead takes nl as the frequency of pixel values at or below k − α and nu as the frequency of pixel values at or above k + β.
[0018]
Once it is known whether the pixel values of the input image data are density or luminance, the processing time can be shortened by treating the small-valued part (for density) or the large-valued part (for luminance) as background and excluding it from the parameter transformation. When the pixel values represent density, using the binarization threshold k itself as the boundary between background and object, and treating every pixel with a value smaller than k as background, would greatly reduce the processing. Selection of the binarization threshold can fail for some images, however. To cope with such failures, the present invention treats as background the pixel values lying a certain value γ (gamma) or more below the binarization threshold when the pixel values are density, and the pixel values lying γ or more above the binarization threshold when they are luminance. Specifically, the lower-limit pixel value of the object is set to kl = k + 1 − γ and the upper-limit pixel value to ku = k + 1 + γ, so that a range of pixel values centered on the binarization threshold k is set as the pixel values of the object. If the pixel values of the input image data represent density, the upper-limit pixel value could be omitted; it is provided partly because it is needed when the pixel values are luminance. The value of γ may be fixed or may vary with the image. One image-dependent choice takes γ from the spread of the pixel value histogram: with σ (sigma) the standard deviation of the histogram distribution, that is, the square root of its total variance, γ is set to D times σ, γ = D·σ. D can be, for example, 1, 0.5 or 2. A smaller value reduces the number of processed pixels, but making it too small risks losing information and degrading the performance of tilt (rotation) angle detection. In the case of FIG. 8 the standard deviation is 57; with γ equal to one standard deviation, pixel values at or below 113 − 57 = 56 are regarded as background. The lower-limit pixel value kl of the object is therefore 57 (= 56 + 1). In this case, removing the background reduces the processing to about 1/3.5.
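The object range can be computed as in the sketch below, which reuses the figures quoted for FIG. 8 (k = 113, σ = 57, D = 1). The function names are assumptions for the illustration, and the text does not say whether the limits are rounded or clamped to the valid pixel range, so they are returned as computed.

```python
import math

def histogram_sigma(hist):
    """Standard deviation of pixel values, given hist[value] = count."""
    n = sum(hist)
    mean = sum(value * count for value, count in enumerate(hist)) / n
    variance = sum(count * (value - mean) ** 2 for value, count in enumerate(hist)) / n
    return math.sqrt(variance)

def object_range(k, sigma, d=1.0):
    """Lower and upper object limits: kl = k + 1 - D*sigma, ku = k + 1 + D*sigma."""
    gamma = d * sigma
    return k + 1 - gamma, k + 1 + gamma

kl, ku = object_range(113, 57, 1.0)   # kl = 57, matching the value quoted above
```

With these inputs the upper limit evaluates to 171, a value that follows from the formula rather than one stated in the text.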
[0019]
The added pixel value p that is accumulated during the parameter transformation is set as follows. For density, p = 0 when the pixel value g of the input image data is less than kl, and p = g − kl + 1 when g is kl or more. For luminance, p = 0 when g exceeds ku, and p = ku − g + 1 when g is ku or less. In addition to these settings, the added pixel value may be set to p = ku − kl + 1 when g exceeds ku in the density case and when g is less than kl in the luminance case.
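The rules for p translate directly into a small function. The name `added_value` and the `saturate` flag, which switches on the optional cap p = ku − kl + 1, are assumptions for the illustration; the limits 57 and 171 are example values.

```python
def added_value(g, kl, ku, is_density, saturate=False):
    """Added pixel value p for one pixel value g."""
    if is_density:
        if g < kl:
            return 0                      # background: not accumulated
        if saturate and g > ku:
            return ku - kl + 1            # optional cap above the upper limit
        return g - kl + 1
    if g > ku:
        return 0                          # background in the luminance case
    if saturate and g < kl:
        return ku - kl + 1                # optional cap below the lower limit
    return ku - g + 1

kl, ku = 57, 171
density_examples = [added_value(g, kl, ku, True) for g in (56, 57, 200)]
luminance_examples = [added_value(g, kl, ku, False) for g in (172, 171, 10)]
```

In both cases the weakest object pixel receives p = 1, so every pixel that survives the background test contributes a positive value.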
[0020]
Next, consider the case where the input image data is binary. Let nl be the number of pixels with value 0 and nu the number of pixels with value 1, and determine density or luminance in the same way as for multi-level input. For density (nl ≤ nu), the parameter transformation processes only the pixels whose value g is 1, with an added pixel value of 1. For luminance (nl > nu), it processes only the pixels whose value g is 0, again with an added pixel value of 1. In binary image data, pixel value 0 is normally the background and 1 the object, but in the present invention, in order to reduce the number of processed pixels, the more numerous value is simply treated as background and the less numerous as the object.
[0021]
Whether the input image data is multi-level or binary, density or luminance is determined by the method above; the added pixel value is determined for each input pixel value from the density/luminance determination information and the lower-limit and upper-limit pixel values; pixels whose added pixel value is 0 are excluded from processing; and the added pixel values are accumulated into the angle/distance parameter table. The present invention is the method of examining the finished angle/distance parameter table for the angle with high linearity and thereby detecting the tilt (rotation) angle of the input image.
[0022]
BEST MODE FOR CARRYING OUT THE INVENTION
FIG. 1 illustrates a processing method based on the present invention, in which the number of pixels to be processed is reduced, when the input image is multi-valued.
The image 6 is input by the image input processing 101, and input image data g (x, y) 102 is obtained. A binarization threshold value selection process 103 calculates a binarization threshold value (k) 104 and a standard deviation (σ) 105 from the input image data 102. From the binarization threshold (k) 104, the number of lower pixels (number of pixels less than k) (nl) 107 and the number of upper pixels (number of pixels exceeding k) (nu) 108 by the target pixel value range setting processing 106 , And the lower limit pixel value (kl) 109 and the upper limit pixel value (ku) 110 are calculated. The lower limit pixel value (kl) 109 and the upper limit pixel value (ku) 110 are kl = k + 1−D · σ and ku = k + 1 + D · σ. Where D is a constant. By the density / luminance determination processing 111, density / luminance determination information 112 is obtained from the number of lower pixels (nl) 107 and the number of upper pixels (nu) 108: density when nl ≦ nu, and brightness when nl> nu. .
[0023]
The parameter conversion processing 113 creates an angle / distance parameter table 114 from the density / luminance determination information 112, the lower limit pixel value (kl) 109, the upper limit pixel value (ku) 110, and the input image data g(x, y) 102.
The inclination (rotation) angle can be detected by inspecting linearity near the vertical only, near the horizontal only, or near both. To support all of these, two angle / distance parameter tables are created, one for inspection near the vertical and one for inspection near the horizontal.
[0024]
The angle is expressed as ai (unit: degrees) instead of θ (unit: radians), and the angle / distance parameter tables near the vertical and near the horizontal are denoted Sv(ai, ρj) and Sh(ai, ρj), respectively. Here, with Na being the number of angle divisions, i = 0, 1, 2, ..., Na−1.
Because handling ρj as a real value on a computer would require an enormous amount of memory, the distance must be quantized. With Nr being the number of sections for the distance, j = 0, 1, 2, ..., Nr−1. The distance is quantized not only to keep memory use from becoming enormous, but also so that the accumulated values grow large enough for linearity to be detected.
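One possible way to quantize the distance is shown below. The text only states that ρ is divided into Nr sections, so the uniform bin width, the explicit range `rho_min` to `rho_max` and the function name `quantize_rho` are all assumptions of this sketch.

```python
import math

def quantize_rho(rho, rho_min, rho_max, Nr):
    """Map a real distance rho in [rho_min, rho_max] to an index j in 0..Nr-1.

    Uniform bins are an assumption; the text does not specify the scheme.
    """
    width = (rho_max - rho_min) / Nr
    j = int(math.floor((rho - rho_min) / width))
    # Clamp so that rho == rho_max falls into the last bin.
    return min(max(j, 0), Nr - 1)
```

Wider bins make each table cell collect more pixels, which matches the stated aim of letting accumulated values grow large enough to reveal lines.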
[0025]
When detecting the inclination (rotation) angle of a document, it is generally considered sufficient if the angle range ai is about -5 to 5 degrees in increments of 0.1 degrees. In this case, Na = 101. Conversion from ai to θ can be performed by equation (4).
θ = ai · π / 180 (4)
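The angle grid and equation (4) can be illustrated with a short sketch. The function names `angle_grid` and `to_radians` are hypothetical; the default range and step are the values mentioned in the text.

```python
import math

def angle_grid(start_deg=-5.0, stop_deg=5.0, step_deg=0.1):
    """Return the angles ai in degrees, from start to stop inclusive."""
    Na = int(round((stop_deg - start_deg) / step_deg)) + 1
    # Derive each angle from its index to avoid accumulated rounding error.
    return [start_deg + i * step_deg for i in range(Na)]

def to_radians(ai):
    """Equation (4): theta = ai * pi / 180."""
    return ai * math.pi / 180.0
```

With the defaults, the grid has (5 − (−5)) / 0.1 + 1 = 101 entries, in agreement with Na = 101.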
[0026]
In this parameter conversion processing, the pixel value p to be added to the angle / distance parameter table is calculated first. When the pixel value g of the input image is density, p = 0 when g < kl and p = g − kl + 1 when g ≧ kl. When it is luminance, p = 0 when g > ku and p = ku − g + 1 when g ≦ ku. When p = 0, no addition to the angle / distance parameter table is performed. For every g(x, y) with p > 0, equations (4) and (1) are evaluated at the coordinates (x, y) for all designated angles ai to obtain the (ai, ρj) coordinates, each a pair of an angle ai and a distance ρj, and p is added to the angle / distance parameter tables Sv(ai, ρj) and Sh(ai, ρj) at those coordinates.
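The added pixel value and the accumulation step might be sketched as follows. This is a simplified illustration that builds a single table for whatever angles are passed in, rather than separate Sv and Sh tables. The distance formula is ρ = x·cos(θ) + y·sin(θ) as given in the claims; the uniform quantization of ρ and the names `added_pixel_value` and `accumulate` are assumptions.

```python
import math

def added_pixel_value(g, kind, kl, ku):
    """Added pixel value p for a pixel value g, following the text."""
    if kind == "density":
        return 0 if g < kl else g - kl + 1
    return 0 if g > ku else ku - g + 1  # luminance

def accumulate(image, kind, kl, ku, angles_deg, rho_min, rho_max, Nr):
    """Build one angle/distance table S[i][j] from a list-of-rows image."""
    S = [[0.0] * Nr for _ in angles_deg]
    width = (rho_max - rho_min) / Nr  # uniform bins: an assumption
    for y, row in enumerate(image):
        for x, g in enumerate(row):
            p = added_pixel_value(g, kind, kl, ku)
            if p == 0:
                continue  # background pixel: no addition is performed
            for i, ai in enumerate(angles_deg):
                theta = ai * math.pi / 180.0  # equation (4)
                rho = x * math.cos(theta) + y * math.sin(theta)
                j = min(max(int((rho - rho_min) // width), 0), Nr - 1)
                S[i][j] += p
    return S
```

Because background pixels are skipped before the loop over angles, the cost scales with the number of object pixels times Na rather than with the full image size.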
[0027]
Finally, the inclination (rotation) angle detection processing 115 in FIG. 1 detects the inclination (rotation) angle 116 by finding the angle at which straight lines are concentrated in the angle / distance parameter table 114 created by the parameter conversion processing 113.
In the inclination (rotation) angle detection processing 115, the angle ai at which straight lines are concentrated is sought in the angle / distance parameter table Sv(ai, ρj) near the vertical and the angle / distance parameter table Sh(ai, ρj) near the horizontal. The inspection can use only one of the tables or both. When both are used, one method adopts the angle ai whose linearity is higher in either the vertical or the horizontal table, and another adopts the angle ai whose linearity averaged over the vertical and horizontal tables is highest. The following description uses only one table, denoted S(ai, ρj). The angle / distance parameter table restricted to a single angle ai is denoted Sa(j), and Ss(j) is Sa(j) rearranged in descending order, where j = 0, 1, ..., Nr−1.
In the inclination (rotation) angle detection processing, a straight line concentration degree C, which expresses how strongly straight lines are concentrated, is first calculated for each angle ai. It can be calculated by any of the following methods, or by a combination of them.
[0028]
(A) A method in which the straight line concentration degree is the sum of the top Nm values of Ss(j). It is calculated by C = Ca = Σ Ss(j), where j = 0, 1, ..., Nm−1, and the larger Ca is, the higher the linearity is considered to be.
(B) A method in which the straight line concentration degree is the variance of Ss(j). With v(j) = {0, 1, −1, 2, −2, ...}, it is calculated by C = Cb = Σ Ss(j) · {v(j) − μs}**2, and the smaller Cb is, the higher the linearity is considered to be. Here μs = Σ v(j) · Ss(j) / Nr, and j = 0, 1, ..., Nr−1.
(C) A method in which the straight line concentration degree is Cb divided by Ca. It is calculated by C = Cc = Cb / Ca, and the smaller Cc is, the higher the linearity is considered to be.
(D) A method in which the straight line concentration degree is Cb divided by the square of Ca. It is calculated by C = Cd = Cb / {Ca}**2, and the smaller Cd is, the higher the linearity is considered to be.
(E) A method in which the straight line concentration degree is the sum of the absolute differences between adjacent values of Sa(j). It is calculated by C = Ce = Σ |Sa(j+1) − Sa(j)|, and the larger Ce is, the higher the linearity is considered to be. Here, j = 1, 2, 3, ..., Nr−1.
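The five measures can be written out as a short sketch for one angle's column Sa(j). The function name `concentration_measures` and the dictionary return value are hypothetical. Two details are assumptions: measure (E) sums over every adjacent pair in the column, and (C) and (D) return infinity when Ca is zero to avoid dividing by zero.

```python
def concentration_measures(Sa, Nm):
    """Return the measures (A)-(E) for one angle's column Sa(j), j = 0..Nr-1."""
    Nr = len(Sa)
    Ss = sorted(Sa, reverse=True)  # Ss(j): Sa(j) in descending order
    # v(j) = 0, 1, -1, 2, -2, ...
    v = [((j + 1) // 2) * (1 if j % 2 else -1) for j in range(Nr)]
    Ca = sum(Ss[:Nm])                                     # (A) larger is better
    mu = sum(vj * s for vj, s in zip(v, Ss)) / Nr
    Cb = sum(s * (vj - mu) ** 2 for vj, s in zip(v, Ss))  # (B) smaller is better
    Cc = Cb / Ca if Ca else float("inf")                  # (C) smaller is better
    Cd = Cb / Ca ** 2 if Ca else float("inf")             # (D) smaller is better
    Ce = sum(abs(Sa[j + 1] - Sa[j]) for j in range(Nr - 1))  # (E) larger is better
    return {"Ca": Ca, "Cb": Cb, "Cc": Cc, "Cd": Cd, "Ce": Ce}
```

A column with one sharp peak, such as [0, 10, 0, 0], scores high on (A) and (E) and low on (B), whereas a nearly flat column does the opposite.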
[0029]
For each angle ai, the above-described straight line concentration is calculated, and the angle ai at which the straight line concentration takes the highest value is output as the inclination (rotation) angle of the input image data. The above is the method of detecting a tilt (rotation) angle according to the present invention when input image data is multi-valued.
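Selecting the output angle from the per-angle concentration values might look like the sketch below. The function `detect_angle`, its `measure` callback and the `larger_is_better` flag are hypothetical names; the flag exists because measures (B) to (D) treat a smaller value as indicating higher linearity.

```python
def detect_angle(angles_deg, S, measure, larger_is_better=True):
    """Return the angle whose column of S shows the highest linearity.

    S[i] is the list Sa(j) for angle angles_deg[i]; measure maps such a
    list to a concentration value C.
    """
    scores = [measure(Sa) for Sa in S]
    pick = max if larger_is_better else min
    best = pick(range(len(scores)), key=scores.__getitem__)
    return angles_deg[best]
```

For example, passing `measure=max` corresponds to measure (A) with Nm = 1.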
[0030]
Next, a case where the input image data is binary will be described with reference to FIG.
The image 6 is input by the image input processing 101 to obtain input image data 102. The lower / upper pixel count calculation processing 201 calculates the lower pixel count (nl) 107 and the upper pixel count (nu) 108. Since the input image data is binary, there is no need to determine a binarization threshold, and the number of lower pixels may be the number of pixels having a pixel value of 0, and the number of upper pixels may be the number of pixels having a pixel value of 1. The density / luminance determination information 112 is obtained by comparing the number of lower pixels 107 and the number of upper pixels 108 by the density / luminance determination processing 111. Thereafter, similarly to the case of the multi-value, the inclination (rotation) angle 116 is detected by the parameter conversion processing 202 and the inclination (rotation) angle detection processing 115.
The present invention is a method characterized by the above processing.
[0031]
[Example]
FIG. 3 is a block diagram showing a configuration of an apparatus for detecting a tilt (rotation) angle.
Each part of the device is connected to a bus 1, and the overall operation is controlled by a control unit 2. When selecting the type of processing, the console 3 gives the instruction. The program describing the processing method is stored in the memory 50 from the hard disk 4, and the control unit 2 controls the processing according to the program in the memory 50. Alternatively, the memory 50 may be a ROM (Read Only Memory) and a program may be built in.
[0032]
First, the image 6 is input by the scanner 7 and stored on the hard disk 4. Alternatively, the image may be stored on the hard disk 4 via another input route, for example, the network 8. The image stored on the hard disk 4 is loaded into the memory 51 for processing. This stored image is the input image data 102 in FIG. 1. The memory 52 in FIG. 3 is a work area that holds the information handled in each processing step until the inclination (rotation) angle 116 is detected from the input image data 102 in FIG. 1. With the input image data stored in the memory 51 of FIG. 3 as input, the binarization threshold value selection processing 103 of FIG. 1 obtains the binarization threshold 104 and the standard deviation 105. The target pixel value range setting processing 106 sets the number of lower pixels 107, the number of upper pixels 108, the lower limit pixel value 109, and the upper limit pixel value 110. The density / luminance determination processing 111 obtains the density / luminance determination information 112. The parameter conversion processing 113 creates the angle / distance parameter table 114 from the density / luminance determination information 112, the lower limit pixel value 109, the upper limit pixel value 110, and the input image data 102. The inclination (rotation) angle detection processing 115 detects the inclination (rotation) angle 116. These pieces of information are stored in the memory 52 of FIG. 3.
[0033]
Once the inclination (rotation) angle (a degrees) has been obtained, the input image data stored in the memory 51 of FIG. 3 or on the hard disk 4 is rotated by −a degrees, and the tilt-corrected image is stored on the hard disk 4 or the like.
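The correction step can be illustrated with a small pure-Python rotation. This is a sketch only: the nearest-neighbour inverse mapping, the rotation about the image centre, the sign convention in (x, y) pixel coordinates, the fill value and the names `rotate_image` and `deskew` are all assumptions, since the text says only that the image is rotated by −a degrees.

```python
import math

def rotate_image(image, angle_deg, fill=0):
    """Rotate a list-of-rows image by angle_deg about its centre."""
    h, w = len(image), len(image[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: find the source pixel that lands on (x, y).
            dx, dy = x - cx, y - cy
            sx = int(round(cx + c * dx + s * dy))
            sy = int(round(cy - s * dx + c * dy))
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = image[sy][sx]
    return out

def deskew(image, a_deg):
    """Correct a detected tilt of a_deg degrees by rotating by -a_deg."""
    return rotate_image(image, -a_deg)
```

In practice an image library with interpolation would normally be used for the rotation; the loop above only makes the geometry explicit.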
[0034]
[Effect of the invention]
Because the tilt (rotation) angle detection processing according to the present invention only removes unnecessary information, detection performance does not deteriorate, and, as a method using the Hough transform (Radon transform), it can be processed several times faster than the conventional technique.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a method of detecting a tilt (rotation) angle when a target image is multi-valued according to the present invention.
FIG. 2 is a block diagram showing a method of detecting a tilt (rotation) angle when a target image is binary according to the present invention.
FIG. 3 is a block diagram showing a configuration of an apparatus for detecting a tilt (rotation) angle according to the present invention.
FIG. 4 is a block diagram showing a method of detecting a tilt (rotation) angle according to the related art.
FIG. 5 is a diagram illustrating straight lines on an xy image plane for explaining the principle of parameter conversion processing.
FIG. 6 is a diagram illustrating a sine curve on an angle / distance parameter plane for explaining the principle of parameter conversion processing.
FIG. 7 is an example of an input document image.
FIG. 8 is a diagram showing a pixel value histogram for the document image of FIG. 7.
[Explanation of symbols]
1: Bus
2: Control unit
3: Console
4: Hard disk
5, 50, 51, 52: Memory
6: Image
7: Scanner
8: Network
101: Image input processing
102: Input image data
103: Binarization threshold value selection processing
104: Binarization threshold
105: Standard deviation
106: Target pixel value range setting processing
107: Number of lower pixels
108: Number of upper pixels
109: Lower limit pixel value
110: Upper limit pixel value
111, 309: Density / luminance determination processing
112: Density / luminance determination information
113, 202, 301: Parameter conversion processing
114: Angle / distance parameter table
115, 310: Inclination (rotation) angle detection processing
116: Inclination (rotation) angle
201: Lower / upper pixel count calculation processing
302: Angle / distance parameter table as density
303: Counter table
304: Parameter table inversion processing
305: Angle / distance parameter table as luminance
306: Maximum straight line concentration detection processing
307: Maximum straight line concentration as density
308: Maximum straight line concentration as luminance

Claims

A binary or multi-valued image to be processed lies on the x-y image plane, and it is unknown whether the pixel values of the image are density or luminance. A step of determining, from the pixel values at coordinates in the entire domain or a partial domain of the x-y image plane, whether the pixel values of the image are density or luminance;
A step of calculating, according to whether the pixel values of the image are density or luminance, an added pixel value to be used for addition, obtained by changing the pixel value;
A step of performing parameter conversion processing in which, for each coordinate point (x, y) of the portion of the image excluding the portion corresponding to the background, the distance ρ is calculated for each of a plurality of angles θ taken at regular intervals by the equation ρ = x · cos(θ) + y · sin(θ), each obtained (θ, ρ) pair is taken as (angle, distance) coordinates on the parameter plane, and the added pixel value is added at those coordinates to create an angle / distance parameter table;
A step of detecting the tilt (rotation) angle of the image by finding, from the finally obtained angle / distance parameter table, the angle at which straight lines are concentrated.

A multi-valued image to be processed lies on the x-y image plane, and it is unknown whether the pixel values of the image are density or luminance. A step of selecting a binarization threshold from the pixel values at coordinates in the entire domain or a partial domain of the x-y image plane;
A step of determining, from the number of lower pixels and the number of upper pixels calculated based on the binarization threshold, whether the pixel values of the image are density or luminance;
A step of calculating, according to whether the pixel values of the image are density or luminance, an added pixel value to be used for addition, obtained by changing the pixel value;
A step of performing parameter conversion processing in which, for each coordinate point (x, y) of the portion of the image excluding the portion corresponding to the background, the distance ρ is calculated for each of a plurality of angles θ taken at regular intervals by the equation ρ = x · cos(θ) + y · sin(θ), each obtained (θ, ρ) pair is taken as (angle, distance) coordinates on the parameter plane, and the added pixel value is added at those coordinates to create an angle / distance parameter table;
A step of detecting the tilt (rotation) angle of the image by finding, from the finally obtained angle / distance parameter table, the angle at which straight lines are concentrated.

A binary image to be processed lies on the x-y image plane, and it is unknown whether the pixel values of the image are density or luminance. A step of determining, from the number of pixels having a pixel value of 0 and the number of pixels having a pixel value of 1 among the pixel values at coordinates in the entire domain or a partial domain of the x-y image plane, whether the pixel values of the image are density or luminance;
A step of calculating, according to whether the pixel values of the image are density or luminance, an added pixel value to be used for addition, obtained by changing the pixel value;
A step of performing parameter conversion processing in which, for each coordinate point (x, y) of the portion of the image excluding the portion corresponding to the background, the distance ρ is calculated for each of a plurality of angles θ taken at regular intervals by the equation ρ = x · cos(θ) + y · sin(θ), each obtained (θ, ρ) pair is taken as (angle, distance) coordinates on the parameter plane, and the added pixel value is added at those coordinates to create an angle / distance parameter table;
A step of detecting the tilt (rotation) angle of the image by finding, from the finally obtained angle / distance parameter table, the angle at which straight lines are concentrated.

A binary or multi-valued image to be processed lies on the x-y image plane, and it is unknown whether the pixel values of the image are density or luminance. Means for determining, from the pixel values at coordinates in the entire domain or a partial domain of the x-y image plane, whether the pixel values of the image are density or luminance;
Means for calculating, according to whether the pixel values of the image are density or luminance, an added pixel value to be used for addition, obtained by changing the pixel value;
Means for performing parameter conversion processing in which, for each coordinate point (x, y) of the portion of the image excluding the portion corresponding to the background, the distance ρ is calculated for each of a plurality of angles θ taken at regular intervals by the equation ρ = x · cos(θ) + y · sin(θ), each obtained (θ, ρ) pair is taken as (angle, distance) coordinates on the parameter plane, and the added pixel value is added at those coordinates to create an angle / distance parameter table;
Means for detecting the tilt (rotation) angle of the image by finding, from the finally obtained angle / distance parameter table, the angle at which straight lines are concentrated.

A multi-valued image to be processed lies on the x-y image plane, and it is unknown whether the pixel values of the image are density or luminance. Means for selecting a binarization threshold from the pixel values at coordinates in the entire domain or a partial domain of the x-y image plane;
Means for determining, from the number of lower pixels and the number of upper pixels calculated based on the binarization threshold, whether the pixel values of the image are density or luminance;
Means for calculating, according to whether the pixel values of the image are density or luminance, an added pixel value to be used for addition, obtained by changing the pixel value;
Means for performing parameter conversion processing in which, for each coordinate point (x, y) of the portion of the image excluding the portion corresponding to the background, the distance ρ is calculated for each of a plurality of angles θ taken at regular intervals by the equation ρ = x · cos(θ) + y · sin(θ), each obtained (θ, ρ) pair is taken as (angle, distance) coordinates on the parameter plane, and the added pixel value is added at those coordinates to create an angle / distance parameter table;
Means for detecting the tilt (rotation) angle of the image by finding, from the finally obtained angle / distance parameter table, the angle at which straight lines are concentrated.

A binary image to be processed lies on the x-y image plane, and it is unknown whether the pixel values of the image are density or luminance. Means for determining, from the number of pixels having a pixel value of 0 and the number of pixels having a pixel value of 1 among the pixel values at coordinates in the entire domain or a partial domain of the x-y image plane, whether the pixel values of the image are density or luminance;
Means for calculating, according to whether the pixel values of the image are density or luminance, an added pixel value to be used for addition, obtained by changing the pixel value;
Means for performing parameter conversion processing in which, for each coordinate point (x, y) of the portion of the image excluding the portion corresponding to the background, the distance ρ is calculated for each of a plurality of angles θ taken at regular intervals by the equation ρ = x · cos(θ) + y · sin(θ), each obtained (θ, ρ) pair is taken as (angle, distance) coordinates on the parameter plane, and the added pixel value is added at those coordinates to create an angle / distance parameter table;
Means for detecting the tilt (rotation) angle of the image by finding, from the finally obtained angle / distance parameter table, the angle at which straight lines are concentrated.

A binary or multi-valued image to be processed lies on the x-y image plane, and it is unknown whether the pixel values of the image are density or luminance. A step of determining, from the pixel values at coordinates in the entire domain or a partial domain of the x-y image plane, whether the pixel values of the image are density or luminance;
A step of calculating, according to whether the pixel values of the image are density or luminance, an added pixel value to be used for addition, obtained by changing the pixel value;
A step of performing parameter conversion processing in which, for each coordinate point (x, y) of the portion of the image excluding the portion corresponding to the background, the distance ρ is calculated for each of a plurality of angles θ taken at regular intervals by the equation ρ = x · cos(θ) + y · sin(θ), each obtained (θ, ρ) pair is taken as (angle, distance) coordinates on the parameter plane, and the added pixel value is added at those coordinates to create an angle / distance parameter table;
A step of detecting the inclination (rotation) angle of the image by finding, from the finally obtained angle / distance parameter table, the angle at which straight lines are concentrated.

A multi-valued image to be processed lies on the x-y image plane, and it is unknown whether the pixel values of the image are density or luminance. A step of selecting a binarization threshold from the pixel values at coordinates in the entire domain or a partial domain of the x-y image plane;
A step of determining, from the number of lower pixels and the number of upper pixels calculated based on the binarization threshold, whether the pixel values of the image are density or luminance;
A step of calculating, according to whether the pixel values of the image are density or luminance, an added pixel value to be used for addition, obtained by changing the pixel value;
A step of performing parameter conversion processing in which, for each coordinate point (x, y) of the portion of the image excluding the portion corresponding to the background, the distance ρ is calculated for each of a plurality of angles θ taken at regular intervals by the equation ρ = x · cos(θ) + y · sin(θ), each obtained (θ, ρ) pair is taken as (angle, distance) coordinates on the parameter plane, and the added pixel value is added at those coordinates to create an angle / distance parameter table;
A step of detecting the inclination (rotation) angle of the image by finding, from the finally obtained angle / distance parameter table, the angle at which straight lines are concentrated.

A binary image to be processed lies on the x-y image plane, and it is unknown whether the pixel values of the image are density or luminance. A step of determining, from the number of pixels having a pixel value of 0 and the number of pixels having a pixel value of 1 among the pixel values at coordinates in the entire domain or a partial domain of the x-y image plane, whether the pixel values of the image are density or luminance;
A step of calculating, according to whether the pixel values of the image are density or luminance, an added pixel value to be used for addition, obtained by changing the pixel value;
A step of performing parameter conversion processing in which, for each coordinate point (x, y) of the portion of the image excluding the portion corresponding to the background, the distance ρ is calculated for each of a plurality of angles θ taken at regular intervals by the equation ρ = x · cos(θ) + y · sin(θ), each obtained (θ, ρ) pair is taken as (angle, distance) coordinates on the parameter plane, and the added pixel value is added at those coordinates to create an angle / distance parameter table;
A step of detecting the inclination (rotation) angle of the image by finding, from the finally obtained angle / distance parameter table, the angle at which straight lines are concentrated.