JP2004005715A

JP2004005715A - Method and device for color image recognition

Info

Publication number: JP2004005715A
Application number: JP2003198874A
Authority: JP
Inventors: Michiyoshi Tachikawa; 立川　道義; Toshio Miyazawa; 宮澤　利夫; Akihiko Hirano; 平野　明彦
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1993-12-10
Filing date: 2003-07-18
Publication date: 2004-01-08

Abstract

PROBLEM TO BE SOLVED: To compress the data volume while securing the amount of information required for image recognition, so that an extracted object is recognized precisely when recognition processing is applied to a color image.

SOLUTION: An object extraction part 51, which extracts an object from a color image signal, operates in parallel with a vector quantization processing part 55, which vector-quantizes the color image signal. An object recognizing part 61 matches the object extracted by the object extraction part 51 against a dictionary 64 created in advance, to determine whether the extracted object is the target.

COPYRIGHT: (C)2004,JPO

Description

【０００１】
【産業上の利用分野】
本発明は、カラー画像および白黒画像から特定画像を抽出して認識するカラー画像認識方法および装置に関する。
【０００２】
【従来の技術】
カラー画像を処理する製品、例えばカラー複写機、カラープリンタ、カラースキャナ、カラー画像通信機器などは、今後ますます増加するものと予想される。カラー画像は、ハードウェアの進歩、特にメモリの低価格化および大容量化、通信コストの低下などにより、以前に比べて利用しやすくなってきたものの、カラー画像データはそのデータ量が膨大（例えば、Ａ３サイズで９６Ｍバイト）であるため、２値画像と同じような処理ができないのが現状である。
【０００３】
特に、画像認識（特定画像の認識、ＯＣＲなど）などの複雑な処理を要する技術においては、処理量が膨大になり、カラー画像における画像認識は実現がより困難である。
【０００４】
【発明が解決しようとする課題】
従来、特定のカラー画像を識別する方法として、例えば、画像を構成する各絵柄部分は固有の色空間上での分布を持っているので、各絵柄部分に現われる固有の色空間上での分布を特定し、この特定された特徴と同一の特徴を有する画像部分を抽出する方法がある（特開平４−１８０３４８号公報を参照）。しかし、この方法では、色空間中での拡がりが同じ画像については、その内部での色の分布が異なっていても識別することができず、つまり色空間の拡がりが同じであれば、拡がりの中での色の分布が異なる画像をも特定の画像として誤検出する可能性がある。
【０００５】
また一方、認識処理に必要な対象物の抽出方法として種々の方法が提案されているが、例えば画像から黒連結の矩形を抽出し、予め設定された閾値と比較することにより、文字の矩形と線図形の矩形とを判定する画像抽出方法がある（特開昭５５−１６２１７７号公報を参照）。この方法は、抽出された線図形をさらに詳細に水平罫線、垂直罫線、表、囲み枠などのように識別するものではなく、また回転した対象物の抽出に対応できない。
【０００６】
本発明の第１の目的は、カラー画像の認識処理を行う際に、画像認識に必要な情報量を確保しつつ、データ量を圧縮して対象物を高精度に認識するカラー画像認識方法および装置を提供することにある。
【０００７】
本発明の第２の目的は、カラー画像の認識処理を行う際に、画像認識に必要な情報量を確保しつつ、データ量をテーブル変換によって変換圧縮することにより、効率的にデータ圧縮を行い、対象物を高精度かつ高速に認識するカラー画像認識方法および装置を提供することにある。
【０００８】
本発明の第３の目的は、認識対象原稿のカラー画像が裏写りした場合の影響を抑制して、認識対象とされる画像情報のみにベクトル量子化を施すことにより、認識率と処理速度を向上させたカラー画像認識方法および装置を提供することにある。
【０００９】
本発明の第４の目的は、コードブックとの距離が所定の閾値より大きい場合に、ベクトル量子化値を割り当てないことにより、認識精度と処理速度を向上させたカラー画像認識方法および装置を提供することにある。
【００１０】
本発明の第５の目的は、入力画像中から対象物の画像領域を高精度に抽出するカラー画像認識方法および装置を提供することにある。
【００１１】
本発明の第６の目的は、抽出された対象物に対して、カラー画像の認識処理を行う際に、画像認識に必要な情報量を確保しつつ、データ量を圧縮して対象物を高精度に認識するカラー画像認識方法および装置を提供することにある。
【００１２】
【課題を解決するための手段】
前記各目的を達成するために、請求項１記載の発明では、入力カラー画像信号から特定画像を認識する処理と、前記入力カラー画像信号から所定形状の対象物を抽出する処理を並列に実行し、前記対象物の抽出結果に応じて、前記対象物について認識処理することを特徴としている。
【００１３】
請求項２記載の発明では、前記入力画像信号中から所定形状の対象物を抽出する処理は、前記入力された２値画像信号から黒連結成分の外接矩形を抽出し、該抽出された外接矩形と黒連結成分との接点情報に基づいて、前記対象物を抽出する処理であることを特徴としている。
【００１４】
請求項３記載の発明では、前記抽出される対象物は、所定の辺長を有する矩形であることを特徴としている。
【００１５】
請求項４記載の発明では、前記抽出される対象物は、スキャンラインに対して傾いている対象物を含むことを特徴としている。
【００１６】
請求項５記載の発明では、前記入力画像信号がカラー画像信号であるとき、前記カラー画像信号から明度を求め、該明度と複数の閾値とを比較することにより複数の２値画像を生成し、該複数の２値画像からそれぞれ前記外接矩形を抽出することを特徴としている。
【００１７】
請求項６記載の発明では、前記抽出される第１の外接矩形と、第２の外接矩形との包含関係を調べ、一方の外接矩形が他方の外接矩形を含むとき、一方の外接矩形と黒連結成分との接点情報に基づいて、前記対象物を抽出することを特徴としている。
【００１８】
請求項７記載の発明では、入力カラー画像信号から特定画像を認識する第１の処理手段と、前記入力カラー画像信号から所定形状の対象物を抽出する第２の処理手段と、前記対象物の抽出結果に応じて、前記対象物について認識処理する手段とを備え、前記第１の処理手段と第２の処理手段を並列に実行することを特徴としている。
【００１９】
【作用】
第１の実施例においては、入力されたカラー画像信号ＲＧＢはメッシュ分割部で小領域（メッシュ）に分割される。特徴量抽出部では、分割された小領域毎に、入力カラー画像信号ＲＧＢの色度ヒストグラムを作成して、特徴量メモリに格納する。識別対象物の色度ヒストグラムを予め作成して、コードブックに格納しておく。そして、ベクトル量子化部では、特徴量メモリの色度ヒストグラムと、コードブックとの距離を算出し、その距離が最小であるコードブックのコードを、その小領域のベクトル量子化値としてベクトル量子化値メモリに保持する。辞書には、識別対象物について予めベクトル量子化値ヒストグラムを求めて格納しておく。認識部では、入力画像のベクトル量子化値ヒストグラムと、辞書のベクトル量子化値ヒストグラムとの距離を算出し、入力カラー画像が識別対象であるか否かを判定する。これにより、データ量を圧縮できるとともに、高精度に対象物を認識することが可能となる。
【００２０】
【実施例】
以下、本発明の一実施例を図面を用いて具体的に説明する。
〈実施例１〉
図１は、本発明の実施例１の構成を示す。図１において、入力されたカラー画像信号（ＲＧＢ）１から得られたカラー画像を小領域（メッシュ）に分割するメッシュ分割部２と、小領域内のカラー画像データから特徴量を抽出する特徴量抽出部３と、抽出した特徴量を格納する特徴量メモリ４と、抽出した特徴量を予め作成してあるコードブック５と比較することによりベクトル量子化を行うベクトル量子化部６と、ベクトル量子化値を保持するベクトル量子化値メモリ７と、該メモリと識別対象物の辞書９とを照合して認識処理を行う認識部８と、メモリ管理やマッチング処理の距離計算などの全体の画像認識処理における各段階の制御を行う制御部１０とから構成されている。
【００２１】
入力カラー画像信号ＲＧＢを予め定められた小領域（メッシュ）に分割する。分割された小領域（メッシュ）毎に、入力カラー画像信号ＲＧＢの特徴量を抽出する。本実施例では、その特徴量として色度ヒストグラムを用いる。すなわち、特徴量抽出部３では、入力カラー画像信号ＲＧＢを以下に示す色度Ｐｒ、Ｐｇに変換し、予め定められた小領域（メッシュ）毎に色度Ｐｒ、Ｐｇの値のヒストグラムを作成し、特徴量メモリ４に格納する。このように、色度変換されたカラー画像は、色合い情報だけを持つので、照明むらが除去されて対象物を抽出する場合などに有効となる。
【００２２】
図２は、原画像を小領域（メッシュ）に分割した図を示し、この例では、小領域は６４画素×６４画素のサイズである。
【００２３】
Ｐｒ＝２５６＊Ｒ／（Ｒ＋Ｇ＋Ｂ）
Ｐｇ＝２５６＊Ｇ／（Ｒ＋Ｇ＋Ｂ）
ここで、Ｒ、Ｇ、Ｂは入力された各８ビットのカラー画像信号である。なお、Ｐｒ、Ｐｇを２５６倍しているのはＰｒ、Ｐｇも８ビットで表現するためである。
【００２４】
また、上記した例では、ｒ，ｇの色度ヒストグラムを特徴量としたが、本発明はこれに限定されるものではなく、ｂの色度ヒストグラムを用いてもよいし、色度の他に、カラー画像信号（ＲＧＢ）、色相、彩度などの特徴量を用いることができる。
【００２５】
図３は、小領域内の色度ヒストグラムを、コードブックを参照してベクトル量子化する例を示す図である。
【００２６】
図３において、１１は、特徴量メモリ４に格納された小領域内の色度ヒストグラム（Ｈｉ）を示す。ｉ＝０〜２５５の次元におけるＨ（ｉ）はＰｒを表し、ｉ＝２５６〜５１１の次元におけるＨ（ｉ）はＰｇを表す。また、１２は、予め作成されたコードブック（Ｃ０、Ｃ１、Ｃ２．．．）の内容を示す。
【００２７】
ここで、コードブックは、識別対象物あるいは一般の原稿を多数入力し、同様の条件で色度ヒストグラムのデータを大量に作成し、これらをクラスタリングすることで代表的な色度ヒストグラム（コードブック）を求めて作成する。
【００２８】
ベクトル量子化部６では、この色度ヒストグラム（Ｈｉ）１１と、コードブックの内容１２とをマッチングして距離（ＤＣｊ）を算出し、その距離（一般には、ユークリッド距離の２乗として定義された２乗ひずみ測度）が最小であるコードブックのコード（Ｃｊ）を、その小領域のベクトル量子化値（ＶＱ値）としてベクトル量子化値メモリ７に保持する。
【００２９】
図４は、入力画像のベクトル量子化値（ＶＱ値）の例を示す。各桝目は、前述した一つの小領域に対応し、各桝目内の数値はベクトル量子化値（ＶＱ値）である。そして、これらのベクトル量子化値（ＶＱ値）についてヒストグラムを作成する。
【００３０】
図５は、画像認識時における入力画像のベクトル量子化値のヒストグラムと辞書のベクトル量子化値のヒストグラムとのマッチングを説明する図である。図において１３は、作成されたベクトル量子化値のヒストグラム例を示す。また、１４は、識別対象物の辞書内容を示し、入力画像と同様に、ベクトル量子化値ヒストグラムで表現されて予め作成されている。図の場合は、例えば、識別対象物Ａ，Ｂ，Ｃ．．（コードブック数が６４）のベクトル量子化値ヒストグラムが格納されている。
【００３１】
認識部８では、入力画像のベクトル量子化値ヒストグラム１３と、辞書のベクトル量子化値ヒストグラム１４とマッチングして距離（ＤＴｋ）を算出し、距離（ＤＴｋ）が最小となる識別対象（ｋ）を、認識対象のカラー画像と判定する。
【００３２】
このように、本発明はカラー画像をベクトル量子化してから辞書と照合しているので、従来技術である特開平４−１８０３４８号公報における問題点が解決される。
【００３３】
〈実施例２〉
上記した実施例１では全画素を用いて色度ヒストグラムを作成しているが、これでは処理量が膨大になる。そこで、実施例２では、図６に示すように、色度ヒストグラムを求める画素をＭ画素間隔で間引いて行う。間引きの方法としては、例えば８画素間隔でサンプルして色度を求める画素を選択する方法を採る。また、Ｍ画素間隔で間引くとき、周囲の画素の画素値の平均を求め、この値を該間引き画素値としてもよい（この処理によって雑音が軽減される）。
【００３４】
〈実施例３〉
実施例１において、色度ヒストグラム作成時に、ｒ、ｇ各８ビットでヒストグラムを作成すると、５１２次元の特徴量になり、メモリ容量も増大し、マッチング処理にも時間がかかる。
【００３５】
そこで、本実施例では、例えば以下のような変換を行って、特徴量次元を６４次元に圧縮してから前述したと同様の処理を行う。
【００３６】

〈実施例４〉
図７は、本発明の実施例４の構成を示す。この実施例４の構成は、図１の構成に変換圧縮テーブル１５を付加して、データ量をテーブル変換によって変換圧縮する。つまり本実施例４は、実施例３のように変換式による演算処理を行うことなく効率的にデータ圧縮するものである。変換圧縮テーブル１５は、後述するように、特徴量抽出部３によって抽出された特徴量を変換圧縮する。また、特徴量メモリ４は圧縮された特徴量を保持し、ベクトル量子化部６は圧縮された特徴量を予め作成してあるコードブック５と比較することによりベクトル量子化を行う点が、図１の構成と若干異なる。他の構成要素は図１で説明したものと同様であるので説明を省略する。
【００３７】
実施例１と同様に、入力カラー画像信号は、図２に示すように小領域（メッシュ）に分割され、小領域毎に入力カラー画像信号の特徴量が抽出される。本実施例では、特徴量として入力カラー画像信号ＲＧＢの色度信号を用いる。すなわち、特徴量抽出部３では、小領域（メッシュ）毎に入力カラー画像信号のＲＧＢの色度信号を抽出して、変換圧縮テーブル１５を参照することによって、特徴量を変換圧縮し、圧縮されたＲＧＢの色度信号毎に小領域（メッシュ）内の色度ヒストグラムを作成して、特徴量メモリ４に結果を格納する。
【００３８】
図８は、小領域内の特徴量を変換圧縮してヒストグラムを生成する図である。変換圧縮テーブル１５は、入力される各色度信号を変換特性に従って変換出力する。例えば、入力信号の値が２５５であるとき、その出力値が１５として変換圧縮処理される。このように圧縮された特徴量のヒストグラムは特徴量メモリ４に生成される。このように、特徴量の変換圧縮によってメモリ容量の増加が抑制され、マッチング処理が高速化される。
【００３９】
図８に戻り、入力される各色度信号（Ｒ、Ｇ、Ｂ）は、変換圧縮テーブル１５によってＲ’、Ｇ’、Ｂ’に変換圧縮されて出力される。この例では、Ｒ’、
Ｇ’、Ｂ’はそれぞれ０から１５の値をとる。そして、Ｇ’に１６を加算し、Ｂ’に３２を加算して、０から４７の次元（図の横軸）で色度ヒストグラムＨ（ｉ）１６を作成する。すなわち、小領域内の色度ヒストグラムＨ（ｉ）１６において、次元ｉ＝０〜１５のＨ（ｉ）は圧縮変換されたＲ信号つまりＲ’信号の度数を表し、次元ｉ＝１６〜３１のＨ（ｉ）は圧縮変換されたＧ信号つまりＧ’信号の度数を表し、次元ｉ＝３２〜４７のＨ（ｉ）は圧縮変換されたＢ信号つまりＢ’信号の度数を表している。
【００４０】
なお、上記した実施例において、特徴量のヒストグラムの次元の総数（この例では４８次元）は、変換圧縮テーブルによる圧縮の度合いに応じて決定されるもので、適宜変更可能である。また、入力カラー画像信号のＲＧＢの色度信号からヒストグラムを作成しているが、本発明はこれに限定されるものではなく、入力カラー画像信号を変換処理したＹＭＣ信号やＬａｂ信号を特徴量として、変換圧縮処理とヒストグラム生成処理を行うようにしてもよい。
【００４１】
図９は、小領域内の色度ヒストグラムを、コードブックを参照してベクトル量子化する例を示す図であり、１７は、特徴量メモリ４に格納された小領域内の色度ヒストグラム（Ｈｉ）を示し、１８は、予め作成されたコードブック５（Ｃ０、Ｃ１、Ｃ２．．．）の内容を示す。前述した図３と異なる点は、ヒストグラムの次元数が圧縮されている点である。
【００４２】
ここで、コードブック５は、識別対象物あるいは一般の原稿を多数入力し、同様の条件で色度ヒストグラムのデータを大量に作成し、これらをクラスタリングすることで代表的な色度ヒストグラム（コードブック）を求めて作成する。
【００４３】
ベクトル量子化部６では、処理対象の小領域の色度ヒストグラム（Ｈｉ）１７と、コードブックの内容１８とをマッチングして距離（ＤＣｊ）を算出し、その距離（一般には、ユークリッド距離の２乗として定義された２乗ひずみ測度）が最小であるコードブックのコード（Ｃｊ）を、その小領域のベクトル量子化値
（ＶＱ値）として割り当て、ベクトル量子化値メモリ７に保持する。
【００４４】
入力画像のベクトル量子化値（ＶＱ値）の例は、実施例１で説明した図４の場合と同様であり、ベクトル量子化値メモリ７には各小領域のベクトル量子化値（ＶＱ値）が保持されていて、入力画像のベクトル量子化値（ＶＱ値）についてヒストグラムが作成される。そして、実施例１の図５で説明したと同様にして識別対象物の辞書が構成され、認識部８では、入力画像のベクトル量子化値ヒストグラム１３と、辞書のベクトル量子化値ヒストグラム１４とマッチングして距離（ＤＴｋ）を算出し、距離（ＤＴｋ）が最小となる識別対象（ｋ）を、認識対象のカラー画像と判定する。
【００４５】
〈実施例５〉
上記した実施例４では、入力画像の全ての画素に対して特徴量のヒストグラムを作成してコードブックとの比較処理を行っている。本実施例５は、小領域毎に生成された特徴量のヒストグラム情報に基づいてコードブックとの比較処理を変更して、画像認識に必要のない地肌部（背景部）やノイズ画像の認識処理を制御するもので、これにより認識率と処理速度の向上を図る。
【００４６】
一般に、画像認識に必要のない地肌部（背景部）は、小領域内の濃度がほぼ一定であることから、小領域毎に生成された特徴量のヒストグラムの度数は特定部分に集中していて、度数分布の幅は狭く度数の最大値が大きくなる傾向にある。そこで、本実施例では、ヒストグラムの度数の最大値が予め設定した閾値を超えた場合に、濃度が一様な画像のベタ部領域であると判定して、ベクトル量子化値に０を割り当て、認識対象とされる画像情報のみにベクトル量子化を施す。他の実施態様としては、ヒストグラムの度数が予め設定した閾値を超えた場合にコードブックとの比較処理を行わないように構成してもよい。
【００４７】
また、原稿をスキャナなどで読み取った得られる画像のハイライト部においては、原稿用紙の裏側の画像が裏写りする場合がある。このような裏写りによる画像ノイズを除く、ハイライト部の画像データの特徴量は、多くの場合一様であることから、その特徴量のヒストグラムは特定部分に集中して大きなピークを持ち、また度数の最大値が大きくなり、度数の分布の幅が狭くなる傾向にある。そこで、本実施例では、ヒストグラムの度数分布の情報が、予め設定したヒストグラム特性を備えていると判定された場合には、小領域の特徴量ヒストグラムから予め設定したヒストグラム特性を除去してコードブックとの比較を行って、ベクトル量子化値を割り当てる。これにより、画像ノイズによる影響が抑止され、認識対象とされる画像情報のみにベクトル量子化が施される。
【００４８】
なお、本実施例５では特徴量のヒストグラム情報として、度数の最大値と度数の分布幅を採用しているが、これに限定されるものではなく、認識対象とする画像の特徴量ヒストグラムの特性を分析して設定されるヒストグラム情報であればよい。
【００４９】
〈実施例６〉
上記した実施例４におけるコードブックは、認識対象画像を多数入力し、同様の条件で色度ヒストグラムのデータを大量に作成し、これらをクラスタリングすることによって作成しているので、入力画像が認識対象の画像以外の場合には、どのコードブックからも距離が離れる場合がある。
【００５０】
本実施例６では、認識対象画像をコードブックと比較してベクトル量子化値を割り当てる際に、比較した距離が所定の閾値よりも大きいとき、ベクトル量子化値を割り当てないように構成する。これにより、マッチングした結果、識別候補がない場合には速やかに入力カラー画像に認識対象の画像が存在しないと判定できるようになる。
【００５１】
〈実施例７〉
本実施例７は、実施例１、４において辞書とのマッチングを行う際、有効距離の閾値を設定しておき、求めた距離と閾値との比較を行い、距離が閾値以下ならばその辞書内の識別対象物を識別候補にするが、閾値より大きい場合には、識別候補にしないようにする。これにより、マッチングした結果、識別候補がない場合には入力カラー画像に認識対象の画像が存在しないと判定できるようになる。
【００５２】
〈実施例８〉
本実施例８は、実施例７における前記閾値を各識別対象物毎に設定し、求めた距離と各識別対象物毎の閾値の比較を行い、距離が閾値以下ならばその識別対象物を識別候補にするが、閾値より大きい場合にはその識別対象物を識別候補にしないようにする。これにより、複数の対象物を識別する際に、対象物の特性を活かしたマッチング処理が可能になる。より具体的にいえば、ある対象物ｋが対象物ｋ以外の原稿ｊと間違え易い場合には、この対象物ｋの閾値を低くすることで、対象物ｋと原稿ｊとを高精度に識別することができ、誤認識を防止することが可能となる。
【００５３】
〈実施例９〉
図１０は、本発明の対象物抽出方法に係る実施例９の構成を示す。図１０において、２値画像信号２１から黒連結成分の外接矩形を抽出する矩形抽出部２２と、抽出された矩形データを格納する矩形メモリ２３と、予め設定された閾値と抽出矩形の幅、高さを比較し、抽出すべき対象物が長方形か否かを判定する候補矩形判定部２４と、候補矩形データを格納する候補矩形メモリ２５と、対象物が回転しているか否かを判定する回転判定部２６と、対象物の短辺、長辺を測定する辺長測定部２７と、短辺、長辺の長さと予め設定された閾値とを比較して対象物か否かを判定する対象物判定部２８と、対象物矩形データを格納する対象物矩形メモリ２９と、全体を制御する制御部３０とから構成されている。
【００５４】
図１１は、本発明の対象物抽出および画像認識の処理フローチャートである。この処理フローチャートにおいて、本発明の対象物抽出方法に係る処理はステップ１０１からステップ１０８であり、まず対象物の抽出方法について、以下説明する。
【００５５】
入力画像から２値画像を生成し（ステップ１０１）、矩形抽出部２２は、２値画像から黒連結成分の外接矩形を抽出する（ステップ１０２）。矩形抽出方法としては、例えば本出願人が先に提案した方式（特願平３−３４１８８９、同４−２６７３１３、同４−１６０８６６）などを用いればよい。
【００５６】
図１２は、入力画像２０１から抽出された外接矩形２０２を示す。本発明では、外接矩形２０２の４頂点の座標（Ｘｓ，Ｙｓ）、（Ｘｅ，Ｙｅ）、（Ｘｓ，Ｙｅ）、（Ｘｅ，Ｙｓ）と、黒連結成分（対象物）２０３と外接矩形２０２との接点座標（Ｘｕ，Ｙｓ）、（Ｘｅ，Ｙｒ）、（Ｘｓ，Ｙｌ）、（Ｘｂ，Ｙｅ）を同時に抽出する。
【００５７】
次いで、候補矩形判定部２４では、抽出された外接矩形の高さ、幅が予め与えられた高さ、幅の範囲内にあるか否かを判定し（ステップ１０３）、高さ、幅の何れかが範囲外であれば、対象物でないと判定する（ステップ１１３）。なお、このようなサイズによる対象物の候補判定方法については、前掲した本出願による方式を用いればよい。続いて、候補矩形判定部２４では、抽出すべき対象物が長方形であるか否かをチェックする（ステップ１０４）。これは例えば、候補矩形判定部２４内に予め抽出すべき対象物として長方形データが設定されているものとする。
【００５８】
対象物が長方形であるものについて、回転判定部２６は、対象物がスキャンラインに対して回転しているか否かを判定する（ステップ１０５）。図１３は、回転の判定を説明する図であり、３０１は候補矩形、３０２は対象物である。この回転判定は、対象物３０２が長方形の場合、三角形ＡとＢ、三角形ＣとＤの合同を判定し、もしどちらか一方でも合同でないと判定された場合には回転していないと判定する。また、図１３に示すように、長方形の対角線Ｄ１，Ｄ２の長さ（この長さは矩形データの座標から計算する）を比較し、その差が大きければ菱形と判定し、対象物３０２が回転していないと判定する。
【００５９】
次いで、辺長測定部２７では、回転していると判定された長方形について、図１３のＳ１、Ｓ２の長さを計算し（矩形データの座標から計算する）、それぞれを短辺、長辺の長さとする（ステップ１０６）。一方、ステップ１０４で長方形でないと判定されたもの、ステップ１０５で回転していないと判定されたものについては、外接矩形の高さを短辺、幅を長辺の長さとする（ステップ１０７）。そして、短辺、長辺の長さが予め与えられた対象物の短辺、長辺の長さの範囲にあるか否かを判定し、何れか一方でも範囲外ならば対象物ではないと判定し、この条件に合うものを抽出すべき対象物と判定する（ステップ１０８）。以下の処理（ステップ１０９以降）については、後述する。
【００６０】
なお、上記した実施例において、矩形抽出部２２の前に、入力された２値画像に対して例えば８×８画素を１画素に変換するような画像圧縮部を設け、圧縮された画像から矩形を抽出するように構成を変更することも可能である。
【００６１】
〈実施例１０〉
本実施例１０では、入力画像をカラー画像信号（Ｒ，Ｇ，Ｂ）とし、以下のような明度（Ｌ）を求め、所定の閾値（Ｔｈ１）以下の明度を持つ画素を黒とし、閾値（Ｔｈ１）より大きい画素を白とするような２値画像を作成してから、実施例１と同様の処理を行う。
【００６２】
Ｌ＝Ｒ＋Ｇ＋Ｂ
Ｌ≦Ｔｈ１ならば黒画素
Ｌ＞Ｔｈ１ならば白画素
本実施例は、対象物以外の部分（背景）が白地の場合に対象物を抽出するのに有効な方式となる。つまり例えば、白紙（あるいは淡い地肌の用紙）に対象物の載せてスキャナなどで画像を読み取るような場合に有効な方式となる。
【００６３】
〈実施例１１〉
本実施例１１では、入力画像をカラー画像信号（Ｒ，Ｇ，Ｂ）とし、以下のような明度（Ｌ）を求め、所定の閾値（Ｔｈ２）以上の明度を持つ画素を黒とし、閾値（Ｔｈ２）より小さい画素を白とするような２値画像を作成してから、実施例１と同様の処理を行う。
【００６４】
Ｌ＝Ｒ＋Ｇ＋Ｂ
Ｌ≧Ｔｈ２ならば黒画素
Ｌ＜Ｔｈ２ならば白画素
本実施例は、銀板のような圧板を持つスキャナなどで入力した時に、対象物以外の部分（背景）が黒地になる場合に対象物を抽出するのに有効な方式となる。このように、実施例１０、１１によれば、原稿を押える蓋をした状態で画像を取り込んでも、また蓋を開けた状態で画像を取り込んでも何れにも対応できる。なお、上記実施例における対象物とは、スキャナに載せた原稿全体から抽出される場合、あるいは原稿中のある特定領域から抽出される場合の何れでもよい。
【００６５】
また、上記実施例１０、１１において、明度以外に、カラー画像信号（ＲＧＢ）、色相、彩度などを対象とすることもできる。さらに、実施例１０、１１において、Ｔｈ１≦Ｌ≦Ｔｈ２ならば黒画素、上記以外ならば白画素のように、所定範囲内を黒画素としてもよい。
【００６６】
図１４は、実施例１０、１１の構成を示す。実施例９（図１０）と異なる点は、カラー画像信号（ＲＧＢ）３４から２値画像を生成する２値画像生成部３１が設けられた点と、実施例１０で作成された２値画像を格納するメモリ３２と、実施例１１で作成された２値画像を格納するメモリ３３が設けられた点である。そして、背景が白地の場合にも黒地の場合にも対応できるように、実施例１０および１１をそれぞれ実行し、矩形抽出部２２で外接矩形を抽出する。この抽出された各外接矩形を外接矩形１、２とすると、候補矩形判定部２４において、これら外接矩形１、２の包含関係を判定し、例えば外接矩形１が外接矩形２を完全に含むとき、外接矩形１のみから対象物を抽出する。
【００６７】
〈実施例１２〉
図１５は、実施例１２の全体構成を示す。図において、対象物抽出部４２がカラー画像信号４１から対象物を抽出して、対象物矩形メモリ４３に格納する部分は、前述した図１４に示す構成と全く同一のものである。
【００６８】
本実施例では、カラー画像信号４１をベクトル量子化するベクトル量子化部４４と、ベクトル量子化値を格納するベクトル量子化値メモリ４５と、対象物抽出部４２で抽出された対象物と予め作成された辞書とのマッチングを行い対象物か否かを判定する対象物認識部４６が設けられている。
【００６９】
図１６は、実施例１２の詳細構成を示す。まず、対象物抽出部５１の構成から説明すると、対象物抽出部５１は図１４に示す要素から構成され、対象物が抽出されると、対象物認識部６１に対して起動信号５３を出力する。また抽出された対象物のデータが対象物矩形メモリ５２に格納され、対象物認識部６１に対して起動信号５３と共に、対象物の範囲データ５４が出力される。
【００７０】
ベクトル量子化処理部５５は、実施例１で説明したと同様に、入力されたカラー画像信号（ＲＧＢ）から得られたカラー画像を小領域（メッシュ）に分割するメッシュ分割部５６と、小領域内のカラー画像データから特徴量（色度ヒストグラム）を抽出する特徴量抽出部５７と、作成された色度ヒストグラムと予め作成してあるコードブック５９と比較することによりベクトル量子化を行うベクトル量子化部５８から構成されている。ベクトル量子化された入力カラー画像はベクトル量子化値メモリ６０に保持される。
【００７１】
入力カラー画像をベクトル量子化して、処理対象の小領域のベクトル量子化値（ＶＱ値）を割り当てて、ベクトル量子化値メモリ６０に保持するまでの処理は、前述した実施例１と同様であるので、その説明を省略する。
【００７２】
さて、対象物認識部６１のヒストグラム作成部６２では、上記したベクトル量子化値（前述した図４）からベクトル量子化値のヒストグラムを作成する。すなわち、入力画像のベクトル量子化値（ＶＱ値）について、対象物の範囲内にあるＶＱ値のヒストグラムを作成する（図１１のステップ１０９）。図１７は、対象物が回転している場合における、対象物の範囲内にあるベクトル量子化値のヒストグラム作成を説明する図である。
【００７３】
前述したように対象物が抽出されると対象物認識部６１の起動時に、対象物抽出部５１から対象物認識部６１に対して、対象物の範囲データ５４が渡されるので、図１７の対象物のエッジ５０１の直線データとベクトル量子化値メモリ６０の座標とを比較して包含関係を判定し、対象物の範囲内に完全に含まれる小領域（図１７の黒い部分で示す領域）について、ベクトル量子化値のヒストグラムを作成する。対象物が回転していない場合は、外接矩形の範囲データとベクトル量子化値メモリの座標とを比較して包含関係を判定し、外接矩形の範囲内の小領域について、ベクトル量子化値のヒストグラムを作成する。
【００７４】
そして、実施例１の図５で説明したと同様に、対象物認識部６１では、抽出対象物のベクトル量子化値ヒストグラム１３と、辞書のベクトル量子化値ヒストグラム１４とをマッチング部６３でマッチングして距離（ＤＴｋ）を算出し、距離（ＤＴｋ）が最小となる識別対象（ｋ）を、抽出対象物であると認識し、従って入力画像中に対象物が存在していると判定する（ステップ１１０、１１１）。
【００７５】
辞書との照合の結果、マッチングしていないときは、入力画像中に対象物がないと判定され（ステップ１１３）、抽出されたすべての矩形について同様の処理を行う（ステップ１１２）が、入力画像中に対象物が存在していると判定されたときは、未処理の矩形があっても処理を終了する。
【００７６】
本実施例は、入力カラー画像データに対して、矩形抽出とベクトル量子化を並列的に行うとともに、矩形抽出の処理が完了した矩形から順次サイズ判定を行い、対象物と判定された矩形に対して認識処理しているので、リアルタイム処理が可能になる。
【００７７】
【発明の効果】
以上、説明したように、本発明によれば、以下のような効果が得られる。
（１）対象物から特徴を抽出後にベクトル量子化しているので、認識対象物の情報量が失われることなく、処理データ量を圧縮することができると共に辞書をコンパクトに構成することができ、さらに特定画像を精度よく認識することができる。
【００７８】
（２）特徴量を圧縮しているので、より一層処理データ量を圧縮することができる。
【００７９】
（３）入力画像データから抽出された特徴量を圧縮処理後にベクトル量子化しているので、カラー画像認識に必要な情報量を保持しつつ処理データ量が削減され、処理速度が向上するとともに高精度に画像認識処理を行うことができる。
【００８０】
（４）マッチングを行う際に、得られたヒストグラム情報と設定された閾値とを比較する閾値処理により、マッチング処理の変更を行っているので、画像認識に必要な処理時間が短縮され、効率的なカラー画像認識処理を行うことができる。
【００８１】
（５）辞書とのマッチングを行う際に、算出された距離と設定された閾値とを比較しているので、認識精度をより一層向上させることができる。
【００８２】
（６）矩形対象物の抽出処理を、外接矩形と黒連結成分との接点情報に基づいて行っているので、画像中の対象物が存在する部分を高精度に抽出することができる。
【００８３】
（７）画像中の対象物が回転していても高精度に抽出することができる。
【００８４】
（８）背景が白地または黒地の場合でも対象物を正確に抽出することができ、また白地における外接矩形と黒地における外接矩形の包含関係を調べているので、重複した抽出処理を行う必要がない。
【００８５】
（９）対象物の抽出処理とカラー画像のベクトル量子化処理を並列的に行い、抽出された対象物について認識処理しているので、カラー画像中の特定画像を高速かつ高精度に認識することができる。また、カラー画像をベクトル量子化しているので、認識対象物の情報量が失われることなく、処理データ量を圧縮することができる。
【図面の簡単な説明】
【図１】
本発明の実施例１の構成を示す。
【図２】
原画像を小領域に分割した図を示す。
【図３】
小領域内の色度ヒストグラムを、コードブックを参照してベクトル量子化する例を示す図である。
【図４】
入力画像のベクトル量子化値（ＶＱ値）の例を示す図である。
【図５】
認識時における入力画像のベクトル量子化値ヒストグラムと辞書のベクトル量子化値ヒストグラムとのマッチングを説明する図である。
【図６】
特徴抽出時におけるサンプリング点を示す図である。
【図７】
本発明の実施例４の構成を示す。
【図８】
小領域内の特徴量を変換圧縮してヒストグラムを生成する図である。
【図９】
入力画像の特徴量ヒストグラムとコードブックとのマッチングを説明する図である。
【図１０】
本発明の対象物抽出方法に係る実施例９の構成を示す。
【図１１】
本発明の対象物抽出および画像認識の処理フローチャートである。
【図１２】
入力画像から抽出された外接矩形を示す。
【図１３】
回転の判定を説明する図である。
【図１４】
実施例１０、１１の構成を示す図である。
【図１５】
実施例１２の全体構成を示す図である。
【図１６】
実施例１２の詳細構成を示す図である。
【図１７】
対象物が回転している場合における、対象物の範囲内にあるベクトル量子化値のヒストグラム作成を説明する図である。
【符号の説明】
１　カラー画像信号
２　メッシュ分割部
３　特徴量抽出部
４　特徴量メモリ
５　コードブック
６　ベクトル量子化部
７　ベクトル量子化値メモリ
８　認識部
９　辞書
１０　制御部

[0001]
[Field of Industrial Application]
The present invention relates to a color image recognition method and apparatus for extracting and recognizing a specific image from a color image and a black-and-white image.
[0002]
[Prior art]
Products that process color images, such as color copiers, color printers, color scanners, and color image communication devices, are expected to become increasingly common. Color images have become easier to use than before thanks to advances in hardware, in particular cheaper and larger memory and lower communication costs. However, the volume of color image data is enormous (for example, 96 Mbytes for an A3-size image), so at present color images cannot be processed in the same way as binary images.
[0003]
In particular, in technologies that require complex processing, such as image recognition (recognition of specific images, OCR, and the like), the processing load becomes enormous, which makes image recognition on color images even harder to realize.
[0004]
[Problems to be solved by the invention]
Conventionally, one method of identifying a specific color image exploits the fact that each pattern portion constituting an image has its own distribution in color space: the distribution in color space that appears in each pattern portion is specified, and image portions having the same feature as the specified one are extracted (see Japanese Patent Application Laid-Open No. 4-180348). However, this method cannot distinguish between images that have the same extent in color space, even if the color distribution inside that extent differs. In other words, if the extent in color space is the same, an image with a different color distribution within that extent may be erroneously detected as the specific image.
[0005]
Meanwhile, various methods have been proposed for extracting the object needed for recognition processing. One example is an image extraction method that extracts rectangles of black connected components from an image and compares them with preset thresholds to decide whether each rectangle is a character or a line figure (see Japanese Patent Application Laid-Open No. 55-162177). This method does not classify the extracted line figures in finer detail, such as horizontal ruled lines, vertical ruled lines, tables, or enclosing frames, and it cannot handle the extraction of a rotated object.
[0006]
A first object of the present invention is to provide a color image recognition method and apparatus that, when performing recognition processing on a color image, compress the data volume while securing the amount of information necessary for image recognition, and recognize the object with high accuracy.
[0007]
A second object of the present invention is to provide a color image recognition method and apparatus that, when performing recognition processing on a color image, compress data efficiently by converting and compressing the data volume through table conversion while securing the amount of information necessary for image recognition, and recognize the object with high accuracy and at high speed.
[0008]
A third object of the present invention is to provide a color image recognition method and apparatus that improve the recognition rate and processing speed by suppressing the effect of show-through in the color image of the document to be recognized and applying vector quantization only to the image information to be recognized.
[0009]
A fourth object of the present invention is to provide a color image recognition method and apparatus that improve recognition accuracy and processing speed by not assigning a vector quantization value when the distance from the codebook is larger than a predetermined threshold.
[0010]
A fifth object of the present invention is to provide a color image recognition method and apparatus for extracting an image area of an object from an input image with high accuracy.
[0011]
A sixth object of the present invention is to provide a color image recognition method and apparatus that, when performing color image recognition processing on an extracted object, compress the data volume while securing the amount of information necessary for image recognition, and recognize the object with high accuracy.
[0012]
[Means for Solving the Problems]
To achieve the above objects, the invention according to claim 1 is characterized in that a process of recognizing a specific image from an input color image signal and a process of extracting an object of a predetermined shape from the input color image signal are executed in parallel, and recognition processing is performed on the object in accordance with the result of extracting the object.
[0013]
The invention according to claim 2 is characterized in that the process of extracting an object of a predetermined shape from the input image signal extracts the circumscribed rectangle of a black connected component from the input binary image signal, and extracts the object based on information on the points of contact between the extracted circumscribed rectangle and the black connected component.
[0014]
The invention according to claim 3 is characterized in that the extracted object is a rectangle having predetermined side lengths.
[0015]
The invention according to claim 4 is characterized in that the extracted object includes an object that is inclined with respect to the scan line.
[0016]
In the invention according to claim 5, when the input image signal is a color image signal, a brightness is obtained from the color image signal, and a plurality of binary images are generated by comparing the brightness with a plurality of threshold values. The circumscribed rectangle is extracted from each of the plurality of binary images.
[0017]
The invention according to claim 6 is characterized in that the inclusion relation between the extracted first circumscribed rectangle and second circumscribed rectangle is examined and, when one circumscribed rectangle contains the other, the object is extracted based on information on the points of contact between the containing circumscribed rectangle and the black connected component.
[0018]
The invention according to claim 7 comprises first processing means for recognizing a specific image from an input color image signal, second processing means for extracting an object of a predetermined shape from the input color image signal, and means for performing recognition processing on the object in accordance with the result of extracting the object, and is characterized in that the first processing means and the second processing means are executed in parallel.
[0019]
[Operation]
In the first embodiment, the input color image signal RGB is divided into small areas (meshes) by the mesh dividing unit. For each small area, the feature extraction unit creates a chromaticity histogram of the input color image signal RGB and stores it in the feature memory. Chromaticity histograms of the identification objects are created in advance and stored in the codebook. The vector quantization unit calculates the distance between the chromaticity histogram in the feature memory and each codebook entry, and holds the code of the entry with the minimum distance in the vector quantization value memory as the vector quantization value of that small area. The dictionary stores vector quantization value histograms obtained in advance for the identification objects. The recognition unit calculates the distance between the vector quantization value histogram of the input image and each vector quantization value histogram in the dictionary, and determines whether the input color image is an identification object. As a result, the data volume can be compressed and the object can be recognized with high accuracy.
[0020]
[Examples]
Hereinafter, an embodiment of the present invention will be specifically described with reference to the drawings.
<Example 1>
FIG. 1 shows the configuration of Embodiment 1 of the present invention. The apparatus in FIG. 1 comprises: a mesh dividing unit 2 that divides the color image obtained from an input color image signal (RGB) 1 into small areas (meshes); a feature extraction unit 3 that extracts features from the color image data in each small area; a feature memory 4 that stores the extracted features; a vector quantization unit 6 that performs vector quantization by comparing the extracted features with a codebook 5 created in advance; a vector quantization value memory 7 that holds the vector quantization values; a recognition unit 8 that performs recognition by collating the contents of that memory with a dictionary 9 of identification objects; and a control unit 10 that controls each stage of the overall image recognition processing, including memory management and the distance calculations used for matching.
[0021]
The input color image signal RGB is divided into predetermined small areas (meshes), and the features of the input color image signal RGB are extracted for each small area. In this embodiment, a chromaticity histogram is used as the feature. That is, the feature extraction unit 3 converts the input color image signal RGB into the chromaticities Pr and Pg defined below, creates a histogram of the Pr and Pg values for each predetermined small area (mesh), and stores it in the feature memory 4. Because a chromaticity-converted color image carries only hue information, illumination unevenness is removed, which is effective, for example, when extracting an object.
[0022]
FIG. 2 shows a diagram in which the original image is divided into small areas (mesh). In this example, the small area has a size of 64 pixels × 64 pixels.
[0023]
Pr = 256 * R / (R + G + B)
Pg = 256 * G / (R + G + B)
Here, R, G, and B are the input color image signals, each 8 bits wide. Pr and Pg are multiplied by 256 so that they, too, are represented in 8 bits.
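The conversion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `chromaticity` is hypothetical, integer division is assumed, the result is clamped to 255 (a pure red pixel would otherwise yield 256), and black pixels (R + G + B = 0), which the text does not address, are mapped to (0, 0).

```python
def chromaticity(r, g, b):
    """Return (Pr, Pg) for one 8-bit RGB pixel, following
    Pr = 256 * R / (R + G + B) and Pg = 256 * G / (R + G + B)."""
    total = r + g + b
    if total == 0:
        # Assumption: the source does not say how black pixels are handled.
        return 0, 0
    # Assumption: integer arithmetic, clamped so the value fits in 8 bits.
    pr = min(255, 256 * r // total)
    pg = min(255, 256 * g // total)
    return pr, pg


# Two grey pixels of different brightness map to the same chromaticity,
# which is why uneven illumination drops out of the feature.
print(chromaticity(128, 128, 128), chromaticity(64, 64, 64))
```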
[0024]
In the example above, the chromaticity histograms of r and g are used as the feature. However, the present invention is not limited to this: a chromaticity histogram of b may be used, and besides chromaticity, features such as the color image signals (RGB), hue, or saturation may be used.
[0025]
FIG. 3 is a diagram illustrating an example in which a chromaticity histogram in a small area is vector-quantized with reference to a codebook.
[0026]
In FIG. 3, reference numeral 11 denotes the chromaticity histogram (Hi) of a small area stored in the feature memory 4. Dimensions i = 0 to 255 of H(i) hold the histogram of Pr, and dimensions i = 256 to 511 hold the histogram of Pg. Reference numeral 12 denotes the contents of the codebook (C0, C1, C2, ...) created in advance.
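As a rough sketch of the 512-dimensional layout just described, the following Python builds H(i) for one small area. The function name `chromaticity_histogram`, the list-of-tuples pixel format, the clamping to 255, and the handling of black pixels are assumptions made for illustration, not details given in the text.

```python
def chromaticity_histogram(pixels):
    """Build H(i) for one small area: bins 0..255 count Pr values,
    bins 256..511 count Pg values."""
    hist = [0] * 512
    for r, g, b in pixels:
        total = r + g + b
        if total == 0:
            pr, pg = 0, 0  # assumption: black pixels fall into bin 0
        else:
            # assumption: integer arithmetic, clamped to 8 bits
            pr = min(255, 256 * r // total)
            pg = min(255, 256 * g // total)
        hist[pr] += 1
        hist[256 + pg] += 1
    return hist


# A toy "mesh" of four pixels (a real mesh in the text is 64 x 64 pixels).
mesh = [(100, 100, 56)] * 3 + [(255, 0, 0)]
h = chromaticity_histogram(mesh)
print(h[100], h[356], h[255], h[256])
```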
[0027]
Here, the codebook is created by inputting a large number of identification objects or general documents, generating a large amount of chromaticity histogram data under the same conditions, and clustering that data to obtain representative chromaticity histograms (the codebook).
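The text says only that the histograms are clustered; it does not name an algorithm. The sketch below uses plain k-means purely as an assumed stand-in, with a deterministic initialisation (the first k histograms) so that the result is reproducible. The names `build_codebook` and `squared_distance` are hypothetical, and the function expects at least k training histograms.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_codebook(histograms, k, iterations=10):
    """Cluster training histograms into k representative histograms.
    Assumption: k-means; the source only says 'clustering'."""
    # Assumption: initialise the centres with the first k histograms.
    codebook = [list(h) for h in histograms[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for h in histograms:
            nearest = min(range(k), key=lambda c: squared_distance(h, codebook[c]))
            clusters[nearest].append(h)
        for j, members in enumerate(clusters):
            if members:  # keep the old centre if a cluster is empty
                codebook[j] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


# Toy two-dimensional "histograms" forming two obvious groups.
training = [[0, 0], [10, 10], [0, 2], [10, 12]]
print(build_codebook(training, 2))
```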
[0028]
The vector quantization unit 6 matches the chromaticity histogram (Hi) 11 against the codebook contents 12 to calculate the distances (DCj). The code (Cj) of the codebook entry with the smallest distance (generally the squared distortion measure, defined as the square of the Euclidean distance) is then held in the vector quantization value memory 7 as the vector quantization value (VQ value) of that small area.
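The nearest-entry search described above might look like the following minimal sketch, which uses the squared Euclidean distance mentioned in the text. The function name `vector_quantize` and the tie-breaking rule (the lowest index wins) are assumptions.

```python
def vector_quantize(hist, codebook):
    """Return (j, DCj) for the codebook entry Cj closest to hist,
    using the squared Euclidean distance as the distortion measure."""
    best_code, best_dist = None, None
    for j, entry in enumerate(codebook):
        dist = sum((h - c) ** 2 for h, c in zip(hist, entry))
        # Assumption: on a tie, the entry with the lower index is kept.
        if best_dist is None or dist < best_dist:
            best_code, best_dist = j, dist
    return best_code, best_dist


# Toy 3-dimensional histograms instead of the 512 dimensions in the text.
codebook = [[0, 0, 4], [4, 0, 0], [2, 2, 0]]
print(vector_quantize([4, 1, 0], codebook))
```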
[0029]
FIG. 4 shows an example of a vector quantization value (VQ value) of an input image. Each cell corresponds to one small area described above, and the numerical value in each cell is a vector quantization value (VQ value). Then, a histogram is created for these vector quantization values (VQ values).
[0030]
FIG. 5 illustrates the matching, at recognition time, between the histogram of vector quantization values of the input image and the histograms of vector quantization values in the dictionary. In the figure, reference numeral 13 denotes an example of the created histogram of vector quantization values. Reference numeral 14 denotes the dictionary contents for the identification objects, which, like the input image, are expressed as vector quantization value histograms and are created in advance. In the illustrated case, for example, the vector quantization value histograms of identification objects A, B, C, ... are stored (with a codebook of 64 entries).
[0031]
The recognition unit 8 matches the vector quantization value histogram 13 of the input image against the vector quantization value histograms 14 in the dictionary to calculate the distances (DTk), and determines that the identification object (k) with the minimum distance (DTk) is the color image being recognized.
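The recognition step can be sketched as follows: build a histogram over the grid of VQ values, then pick the dictionary entry with the smallest distance DTk. The text does not say which distance is used for DTk, so the squared Euclidean distance is assumed here. The names `vq_histogram` and `recognize` and the dictionary layout (label mapped to reference histogram) are hypothetical.

```python
def vq_histogram(vq_values, codebook_size):
    """Count how often each VQ value occurs in the grid of small areas."""
    hist = [0] * codebook_size
    for row in vq_values:
        for value in row:
            hist[value] += 1
    return hist


def recognize(vq_values, dictionary, codebook_size):
    """Return (k, DTk) for the dictionary entry closest to the input.
    Assumption: DTk is the squared Euclidean distance."""
    hist = vq_histogram(vq_values, codebook_size)
    distances = {
        label: sum((a - b) ** 2 for a, b in zip(hist, reference))
        for label, reference in dictionary.items()
    }
    best = min(distances, key=distances.get)
    return best, distances[best]


# Toy example: a 2 x 3 grid of VQ values and a codebook of 4 entries.
grid = [[0, 0, 1], [2, 1, 0]]
dictionary = {"A": [3, 2, 1, 0], "B": [0, 0, 3, 3]}
print(recognize(grid, dictionary, 4))
```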
[0032]
As described above, the present invention vector-quantizes the color image before collating it with the dictionary, which solves the problem of the prior art described in Japanese Patent Application Laid-Open No. 4-180348.
[0033]
<Example 2>
In the first embodiment described above, the chromaticity histogram is created using all the pixels, which requires an enormous amount of processing. In the second embodiment, therefore, as shown in FIG. 6, the pixels used for the chromaticity histogram are thinned out at intervals of M pixels. One thinning method, for example, samples at 8-pixel intervals to select the pixels whose chromaticity is computed. When thinning at M-pixel intervals, the average of the surrounding pixel values may also be computed and used as the value of the sampled pixel (this reduces noise).
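A minimal sketch of this thinning step is shown below. The text does not define how large the "surrounding" region is, so a 3 x 3 neighbourhood clipped at the image border is assumed. The function name `sample_pixels` and the nested-list image format are likewise hypothetical.

```python
def sample_pixels(image, m, average=False):
    """Pick every m-th pixel in both directions from a 2-D list of
    (R, G, B) tuples. With average=True, each sample is replaced by the
    mean of its neighbourhood (assumption: 3 x 3, clipped at the border)."""
    height, width = len(image), len(image[0])
    samples = []
    for y in range(0, height, m):
        for x in range(0, width, m):
            if not average:
                samples.append(image[y][x])
                continue
            neighbours = [
                image[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            n = len(neighbours)
            samples.append(tuple(sum(p[c] for p in neighbours) // n for c in range(3)))
    return samples


# 16 x 16 test image whose pixel value encodes its own coordinates.
image = [[(x, y, 0) for x in range(16)] for y in range(16)]
print(sample_pixels(image, 8))
```

With M = 8, a 64 x 64 mesh contributes 64 samples instead of 4096, which is where the reduction in processing comes from.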
[0034]
<Example 3>
In the first embodiment, creating the chromaticity histogram with 8 bits each for r and g yields a 512-dimensional feature, which increases the required memory capacity and makes the matching process slow.
[0035]
Therefore, in the present embodiment, for example, the following conversion is performed to reduce the feature amount dimension to 64 dimensions, and then the same processing as described above is performed.
[0036]

<Example 4>
FIG. 7 shows the configuration of the fourth embodiment of the present invention. This configuration adds a conversion compression table 15 to the configuration of FIG. 1 and compresses the data volume by table conversion. In other words, unlike the third embodiment, the fourth embodiment compresses the data efficiently without performing arithmetic with a conversion formula. As described later, the conversion compression table 15 converts and compresses the features extracted by the feature extraction unit 3. The configuration differs slightly from that of FIG. 1 in that the feature memory 4 holds the compressed features, and the vector quantization unit 6 performs vector quantization by comparing the compressed features with the codebook 5 created in advance. The other components are the same as those described for FIG. 1, so their description is omitted.
[0037]
As in the first embodiment, the input color image signal is divided into small areas (meshes) as shown in FIG. 2, and the features of the input color image signal are extracted for each small area. In this embodiment, the chromaticity signals of the input color image signal RGB are used as the features. That is, for each small area (mesh), the feature extraction unit 3 extracts the RGB chromaticity signals of the input color image signal, converts and compresses them by referring to the conversion compression table 15, creates a chromaticity histogram within the small area for each of the compressed RGB chromaticity signals, and stores the result in the feature memory 4.
[0038]
FIG. 8 illustrates how a histogram is generated by converting and compressing the feature amounts in a small area. The conversion compression table 15 converts the input chromaticity signals according to its conversion characteristic and outputs the result. For example, an input value of 255 is converted and compressed to an output value of 15. The histogram of the compressed feature amounts is generated in the feature memory 4. In this way, the conversion and compression of the feature amounts limits the growth in memory capacity and speeds up the matching process.
[0039]
Referring again to FIG. 8, the input chromaticity signals (R, G, B) are converted and compressed into R', G', and B' by the conversion compression table 15 and output. In this example, R', G', and B' each take values from 0 to 15. Then 16 is added to G' and 32 is added to B', and a chromaticity histogram H(i) 16 is created over dimensions 0 to 47 (the horizontal axis in the drawing). That is, in the chromaticity histogram H(i) 16 of the small area, H(i) for dimensions i = 0 to 15 represents the frequency of the compressed R signal (the R' signal), H(i) for dimensions i = 16 to 31 represents the frequency of the compressed G signal (the G' signal), and H(i) for dimensions i = 32 to 47 represents the frequency of the compressed B signal (the B' signal).
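The table conversion and the 48-dimension histogram layout described above can be sketched as follows. This is only an illustration: the table contents are an assumption (a uniform division by 16, chosen so that 255 maps to 15), since the actual conversion characteristic appears only in FIG. 8, and the names `TABLE` and `compressed_histogram` are hypothetical.

```python
# Hypothetical conversion compression table: 256 input levels -> 16 output levels.
# A uniform division is assumed here; the patent's real characteristic is in FIG. 8.
TABLE = [v // 16 for v in range(256)]  # TABLE[255] == 15


def compressed_histogram(pixels):
    """Build the 48-bin chromaticity histogram H(i) for one small area.

    pixels: iterable of (R, G, B) chromaticity values, each in 0..255.
    """
    hist = [0] * 48
    for r, g, b in pixels:
        hist[TABLE[r]] += 1        # bins 0..15  : R'
        hist[TABLE[g] + 16] += 1   # bins 16..31 : G' + 16
        hist[TABLE[b] + 32] += 1   # bins 32..47 : B' + 32
    return hist
```

Each pixel contributes one count to each of the three 16-bin segments, so the histogram total is three times the number of pixels in the small area.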
[0040]
In the above-described embodiment, the total number of dimensions of the feature amount histogram (48 dimensions in this example) is determined by the degree of compression of the conversion compression table and can be changed as appropriate. Although the histogram is created from the RGB chromaticity signals of the input color image signal, the present invention is not limited to this: a YMC signal or a Lab signal obtained by converting the input color image signal may be used as the feature amount, and the conversion compression process and the histogram generation process may be performed on it.
[0041]
FIG. 9 is a diagram showing an example in which the chromaticity histogram of a small area is vector-quantized with reference to a codebook. Reference numeral 17 denotes the chromaticity histogram (Hi) of a small area stored in the feature memory 4, and 18 denotes the contents (C0, C1, C2, ...) of the codebook 5 created in advance. FIG. 9 differs from FIG. 3 in that the number of dimensions of the histogram is compressed.
[0042]
Here, the codebook 5 is created by inputting a large number of objects to be identified or general documents, generating a large number of chromaticity histograms under similar conditions, and clustering them to obtain representative chromaticity histograms (the codebook).
[0043]
The vector quantization unit 6 matches the chromaticity histogram (Hi) 17 of the small area to be processed against the contents 18 of the codebook and calculates the distance (DCj) to each code (in general, a squared distortion measure defined as the square of the Euclidean distance). The code (Cj) with the smallest distance is assigned as the vector quantization value (VQ value) of the small area and is stored in the vector quantization value memory 7.
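A minimal sketch of this nearest-code search, assuming histograms and codebook entries are plain lists of equal length. The function names are hypothetical, and the squared Euclidean distance is used because the text names it as the general choice.

```python
def squared_distance(h, c):
    """Squared Euclidean distance between a histogram and a codebook entry."""
    return sum((a - b) ** 2 for a, b in zip(h, c))


def vector_quantize(hist, codebook):
    """Return (j, DCj): the index of the nearest code Cj and its distance."""
    distances = [squared_distance(hist, c) for c in codebook]
    j = min(range(len(codebook)), key=distances.__getitem__)
    return j, distances[j]
```

For example, with a toy two-dimensional codebook `[[0, 0], [10, 0], [0, 10]]`, the histogram `[9, 1]` is assigned code 1 with distance 2.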
[0044]
An example of the vector quantization values (VQ values) of an input image is the same as in FIG. 4 described in the first embodiment. The vector quantization value (VQ value) of each small area is held in the vector quantization value memory 7, and a histogram is created from the vector quantization values (VQ values) of the input image. Then a dictionary of the identification targets is constructed in the same manner as described with reference to FIG. 5 of the first embodiment. The recognition unit 8 matches the vector quantization value histogram 13 of the input image against the vector quantization value histograms 14 of the dictionary, calculates the distance (DTk) for each, and determines the identification target (k) with the minimum distance (DTk) to be the color image to be recognized.
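The histogram-and-dictionary matching step can be sketched as below. The distance used for DTk is not specified in this passage, so the squared Euclidean distance is an assumption here, and `vq_histogram`, `recognize`, and the dictionary layout (a mapping from target name to reference histogram) are hypothetical.

```python
def vq_histogram(vq_values, num_codes):
    """Count how often each VQ value occurs among the small areas."""
    hist = [0] * num_codes
    for v in vq_values:
        hist[v] += 1
    return hist


def recognize(input_hist, dictionary):
    """Return (k, DTk) for the dictionary entry closest to input_hist.

    dictionary: dict mapping target name k -> reference VQ-value histogram.
    The squared Euclidean distance is an assumed choice for DTk.
    """
    best = None
    for k, ref in dictionary.items():
        d = sum((a - b) ** 2 for a, b in zip(input_hist, ref))
        if best is None or d < best[1]:
            best = (k, d)
    return best
```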
[0045]
<Example 5>
In the above-described fourth embodiment, feature amount histograms are created for all pixels of the input image and compared with the codebook. In the fifth embodiment, the comparison with the codebook is changed according to the histogram information of the feature amount generated for each small area, so that background portions and noise images unnecessary for image recognition are identified, thereby improving the recognition rate and the processing speed.
[0046]
In general, in a background portion that is not needed for image recognition, the density within a small area is almost constant, so the feature amount histogram generated for that small area is concentrated at a specific position: the frequency distribution is narrow and the maximum frequency tends to be large. Therefore, in the present embodiment, when the maximum frequency of the histogram exceeds a preset threshold, the small area is determined to be a solid region of uniform density and 0 is assigned as its vector quantization value, so that vector quantization is performed only on the image information to be recognized. As another embodiment, the comparison with the codebook may simply be skipped when the frequency of the histogram exceeds a preset threshold.
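The peak test can be sketched as follows. It assumes, for illustration only, that VQ value 0 is reserved for flat areas and that codebook entries are therefore numbered from 1; the function name and the numbering are not taken from the patent.

```python
def quantize_unless_flat(hist, codebook, peak_threshold):
    """Assign VQ value 0 to a flat (background) area, else the nearest code.

    Codebook entries are numbered 1..N so that 0 stays reserved for flat
    areas; this numbering is an assumption made for the sketch.
    """
    if max(hist) > peak_threshold:
        return 0  # uniform-density area: skip the codebook comparison
    distances = [sum((a - b) ** 2 for a, b in zip(hist, c)) for c in codebook]
    return 1 + min(range(len(codebook)), key=distances.__getitem__)
```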
[0047]
Further, in a highlight portion of an image obtained by reading an original with a scanner or the like, the image on the back side of the original paper may show through. Apart from such show-through noise, the feature amounts of the image data in a highlight portion are often uniform, so the feature amount histogram has a large peak concentrated at a specific position: the maximum frequency tends to be large and the frequency distribution tends to be narrow. Therefore, in the present embodiment, when the frequency distribution of the histogram is determined to have a preset histogram characteristic, that characteristic is removed from the feature amount histogram of the small area before the histogram is compared with the codebook and a vector quantization value is assigned. As a result, the effect of image noise is suppressed, and vector quantization is performed only on the image information to be recognized.
[0048]
In the fifth embodiment, the maximum frequency and the width of the frequency distribution are used as the histogram information of the feature amount. However, the present invention is not limited to this; any histogram information may be used as long as it is set by analysis.
[0049]
<Example 6>
The codebook of the fourth embodiment is created by inputting a large number of images to be recognized, generating a large number of chromaticity histograms under the same conditions, and clustering them. Consequently, when the input image is something other than an image to be recognized, its distance from every code in the codebook may be large.
[0050]
In the sixth embodiment, when the recognition target image is compared with the codebook and the vector quantization value is assigned, the vector quantization value is not assigned when the compared distance is larger than a predetermined threshold. As a result, when there is no identification candidate as a result of the matching, it is possible to quickly determine that the image to be recognized does not exist in the input color image.
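A minimal sketch of this rejection rule, assuming the squared Euclidean distance and using `None` to represent "no vector quantization value assigned" (both are assumptions; the function name is hypothetical):

```python
def quantize_with_reject(hist, codebook, max_distance):
    """Return the nearest code index, or None if even it is too far away."""
    distances = [sum((a - b) ** 2 for a, b in zip(hist, c)) for c in codebook]
    j = min(range(len(codebook)), key=distances.__getitem__)
    if distances[j] > max_distance:
        return None  # no VQ value is assigned to this small area
    return j
```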
[0051]
<Example 7>
In the seventh embodiment, a threshold on the effective distance is set for the dictionary matching of the first and fourth embodiments, and the obtained distance is compared with this threshold. If the distance is equal to or less than the threshold, the identification target is set as an identification candidate; if it is larger than the threshold, it is not. As a result, when the matching yields no identification candidate, it can be determined that the image to be recognized does not exist in the input color image.
[0052]
<Example 8>
In the eighth embodiment, the threshold of the seventh embodiment is set individually for each identification target, and the obtained distance is compared with the threshold of the corresponding target. If the distance is equal to or less than the threshold, the identification target is set as an identification candidate; if it is larger than the threshold, it is not. Thereby, when a plurality of objects are to be identified, the matching can take the characteristics of each object into account. More specifically, when a certain object k is apt to be confused with a document j other than the object k, lowering the threshold of the object k allows the object k and the document j to be distinguished with high accuracy, so erroneous recognition can be prevented.
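The per-target thresholding of the seventh and eighth embodiments can be sketched as follows. The dictionary and threshold layouts (mappings keyed by target name) and the squared Euclidean distance are assumptions for illustration. Using the same threshold for every target gives the seventh embodiment.

```python
def identification_candidates(input_hist, dictionary, thresholds):
    """Return [(k, DTk), ...] for targets whose distance is within threshold.

    dictionary: dict k -> reference VQ-value histogram.
    thresholds: dict k -> maximum accepted distance for target k.
    An empty list means no image to be recognized was found.
    """
    candidates = []
    for k, ref in dictionary.items():
        d = sum((a - b) ** 2 for a, b in zip(input_hist, ref))
        if d <= thresholds[k]:
            candidates.append((k, d))
    return sorted(candidates, key=lambda item: item[1])
```

Lowering `thresholds["k"]` makes target k harder to accept, which is the adjustment the text describes for a target that is easily confused with another document.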
[0053]
<Example 9>
FIG. 10 shows the configuration of a ninth embodiment according to the object extraction method of the present invention. The configuration in FIG. 10 comprises a rectangle extraction unit 22 that extracts circumscribed rectangles of black connected components from a binary image signal 21; a rectangle memory 23 that stores the extracted rectangle data; a candidate rectangle determination unit 24 that compares the width and height of each extracted rectangle with preset thresholds and determines whether the object to be extracted is a rectangle; a candidate rectangle memory 25 that stores candidate rectangle data; a rotation determination unit 26 that determines whether the object is rotated; a side length measurement unit 27 that measures the short side and the long side of the object; an object determination unit 28 that determines whether a candidate is the object by comparing the lengths of its short side and long side with preset thresholds; an object rectangle memory 29 that stores object rectangle data; and a control unit 30 that controls the whole.
[0054]
FIG. 11 is a processing flowchart of object extraction and image recognition according to the present invention. In this processing flowchart, the processing according to the object extraction method of the present invention is from step 101 to step 108. First, the method for extracting an object will be described below.
[0055]
A binary image is generated from the input image (step 101), and the rectangle extracting unit 22 extracts a circumscribed rectangle of the black connected component from the binary image (step 102). As a rectangle extraction method, for example, a method proposed by the present applicant (Japanese Patent Application Nos. 3-341889, 4-267313, 4-160866) may be used.
[0056]
FIG. 12 shows a circumscribed rectangle 202 extracted from the input image 201. In the present invention, the coordinates (Xs, Ys), (Xe, Ye), (Xs, Ye), (Xe, Ys) of the four vertices of the circumscribed rectangle 202 and the coordinates (Xu, Ys), (Xe, Yr), (Xs, Yl), (Xb, Ye) of the points at which the black connected component (object) 203 touches the circumscribed rectangle 202 are extracted at the same time.
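As a rough illustration of what is extracted here, the sketch below labels 8-connected black components with a breadth-first search and reports each circumscribed rectangle with its contact coordinates. It is a generic technique swapped in for illustration, not the applicant's cited extraction method, and the rule for picking one contact point when several pixels touch an edge (the smallest coordinate) is an assumption.

```python
from collections import deque


def circumscribed_rectangles(image):
    """image: list of rows, 1 = black, 0 = white.

    Returns one dict per 8-connected black component with the rectangle
    corners (Xs, Ys, Xe, Ye) and the contact coordinates Xu, Yr, Yl, Xb.
    """
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    results = []
    for y0 in range(h):
        for x0 in range(w):
            if image[y0][x0] != 1 or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue, pixels = deque([(x0, y0)]), []
            while queue:
                x, y = queue.popleft()
                pixels.append((x, y))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < w and 0 <= ny < h
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            xs = min(p[0] for p in pixels)
            xe = max(p[0] for p in pixels)
            ys = min(p[1] for p in pixels)
            ye = max(p[1] for p in pixels)
            results.append({
                "Xs": xs, "Ys": ys, "Xe": xe, "Ye": ye,
                "Xu": min(x for x, y in pixels if y == ys),  # top contact
                "Yr": min(y for x, y in pixels if x == xe),  # right contact
                "Yl": min(y for x, y in pixels if x == xs),  # left contact
                "Xb": min(x for x, y in pixels if y == ye),  # bottom contact
            })
    return results
```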
[0057]
Next, the candidate rectangle determination unit 24 determines whether the height and width of the extracted circumscribed rectangle are within the predetermined height and width ranges (step 103). If either is outside its range, the candidate is determined not to be the object (step 113). The method of the applicant's applications cited above may be used to select object candidates by size in this way. Subsequently, the candidate rectangle determination unit 24 checks whether the object to be extracted is a rectangle (step 104). For example, it is assumed that rectangle data has been set in advance in the candidate rectangle determination unit 24 as the object to be extracted.
[0058]
For a rectangular object, the rotation determination unit 26 determines whether the object is rotated with respect to the scan line (step 105). FIG. 13 is a diagram for explaining the rotation determination, where 301 is a candidate rectangle and 302 is an object. In this rotation determination, when the object 302 is a rectangle, the congruence of triangles A and B and of triangles C and D is tested. If either pair is not congruent, the object is determined not to be rotated. In addition, as shown in FIG. 13, the lengths of the diagonals D1 and D2 (calculated from the coordinates of the rectangle data) are compared. If the difference is large, the object is determined to be a rhombus, and the object 302 is determined not to be rotated.
[0059]
Next, for a rectangle determined to be rotated, the side length measuring unit 27 calculates the lengths of S1 and S2 in FIG. 13 (from the coordinates of the rectangle data) and sets them as the lengths of the short side and the long side (step 106). On the other hand, for objects determined to be non-rectangular in step 104 and objects determined not to be rotated in step 105, the height of the circumscribed rectangle is set as the length of the short side and its width as the length of the long side (step 107). It is then determined whether the lengths of the short side and the long side are within the ranges given in advance for the object. If either is out of range, the candidate is determined not to be the object; a candidate satisfying the condition is determined to be the object to be extracted (step 108). The subsequent processing (from step 109) will be described later.
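Steps 105 to 108 can be sketched from the contact coordinates alone. Which corner triangles are labeled A, B, C, and D, and which contact-point distances are S1 and S2, is shown only in FIG. 13, so the pairing of opposite corner triangles below, the tolerance `tol`, and all function names are assumptions.

```python
import math


def rotation_and_sides(r, tol=2.0):
    """Return (rotated, short_side, long_side) for one candidate rectangle.

    r: dict with Xs, Ys, Xe, Ye (circumscribed rectangle) and
       Xu, Yr, Yl, Xb (contact coordinates on the four edges).
    """
    top, right = (r["Xu"], r["Ys"]), (r["Xe"], r["Yr"])
    bottom, left = (r["Xb"], r["Ye"]), (r["Xs"], r["Yl"])
    # Leg lengths of the four corner triangles between rectangle and object.
    tl = (r["Xu"] - r["Xs"], r["Yl"] - r["Ys"])
    br = (r["Xe"] - r["Xb"], r["Ye"] - r["Yr"])
    tr = (r["Xe"] - r["Xu"], r["Yr"] - r["Ys"])
    bl = (r["Xb"] - r["Xs"], r["Ye"] - r["Yl"])

    def congruent(p, q):
        return abs(p[0] - q[0]) <= tol and abs(p[1] - q[1]) <= tol

    d1, d2 = math.dist(top, bottom), math.dist(left, right)  # diagonals
    rotated = congruent(tl, br) and congruent(tr, bl) and abs(d1 - d2) <= tol
    if rotated:  # step 106: sides measured between adjacent contact points
        s1, s2 = math.dist(top, right), math.dist(top, left)
        return True, min(s1, s2), max(s1, s2)
    # step 107: height is taken as the short side, width as the long side
    return False, r["Ye"] - r["Ys"], r["Xe"] - r["Xs"]


def is_target(short, long, short_range, long_range):
    """Step 108: both side lengths must lie within the preset ranges."""
    return (short_range[0] <= short <= short_range[1]
            and long_range[0] <= long <= long_range[1])
```

An axis-aligned rectangle yields degenerate corner triangles under this pairing, so it is reported as not rotated and falls through to the height and width of the circumscribed rectangle.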
[0060]
In the above-described embodiment, an image compression unit that converts, for example, each 8 × 8 pixel block of the input binary image into one pixel may be provided before the rectangle extraction unit 22, so that rectangles are extracted from the compressed image.
[0061]
<Example 10>
In the tenth embodiment, the input image is a color image signal (R, G, B). The lightness (L) below is obtained, a binary image is created in which pixels with a lightness equal to or less than a predetermined threshold (Th1) are black and pixels with a lightness larger than the threshold (Th1) are white, and then the same processing as in the first embodiment is performed.
[0062]
L = R + G + B
Black pixel if L ≦ Th1
White pixel if L > Th1
This embodiment is an effective method for extracting an object when a portion (background) other than the object is a white background. That is, for example, this is an effective method when an object is placed on white paper (or light ground paper) and an image is read by a scanner or the like.
[0063]
<Example 11>
In the eleventh embodiment, the input image is a color image signal (R, G, B). The lightness (L) below is obtained, a binary image is created in which pixels with a lightness equal to or more than a predetermined threshold (Th2) are black and pixels with a lightness smaller than the threshold (Th2) are white, and then the same processing as in the first embodiment is performed.
[0064]
L = R + G + B
Black pixel if L ≧ Th2
White pixel if L < Th2
This embodiment is effective for extracting an object when the portion other than the object (the background) becomes black, as when the image is input with a scanner whose pressure plate is, for example, a silver plate. As described above, the tenth and eleventh embodiments together cover both the case where an image is captured with the lid that holds the original closed and the case where it is captured with the lid open. In the above embodiments, the object may be extracted either from the entire original placed on the scanner or from a specific region within the original.
[0065]
Further, in the tenth and eleventh embodiments, the color image signal (RGB), hue, saturation, and the like may be used in addition to lightness. Furthermore, in the tenth and eleventh embodiments, pixels within a predetermined range may be set to black, for example black if Th1 ≦ L ≦ Th2 and white otherwise.
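The three binarization rules (L ≦ Th1, L ≧ Th2, and Th1 ≦ L ≦ Th2) can be sketched with one hypothetical helper in which an omitted bound is simply not applied. The function name and the 1 = black, 0 = white encoding are assumptions.

```python
def binarize(rgb_image, lo=None, hi=None):
    """Return a binary image with 1 (black) where lo <= L <= hi, L = R + G + B.

    rgb_image: list of rows of (R, G, B) tuples.
    binarize(img, hi=Th1)         -> tenth embodiment  (black if L <= Th1)
    binarize(img, lo=Th2)         -> eleventh embodiment (black if L >= Th2)
    binarize(img, lo=Th1, hi=Th2) -> range variant
    """
    out = []
    for row in rgb_image:
        out_row = []
        for r, g, b in row:
            lightness = r + g + b
            black = ((lo is None or lightness >= lo)
                     and (hi is None or lightness <= hi))
            out_row.append(1 if black else 0)
        out.append(out_row)
    return out
```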
[0066]
FIG. 14 shows the configuration of the tenth and eleventh embodiments. It differs from the ninth embodiment (FIG. 10) in that a binary image generation unit 31 for generating a binary image from a color image signal (RGB) 34 is provided, together with a memory 32 for storing the binary image created in the tenth embodiment and a memory 33 for storing the binary image created in the eleventh embodiment. The tenth and eleventh embodiments are both executed so that the white-background case and the black-background case are both handled, and the rectangle extracting unit 22 extracts circumscribed rectangles from each binary image. Assuming that the extracted circumscribed rectangles are circumscribed rectangles 1 and 2, the candidate rectangle determination unit 24 determines the inclusion relationship between circumscribed rectangles 1 and 2. For example, when circumscribed rectangle 1 completely includes circumscribed rectangle 2, an object is extracted from circumscribed rectangle 1 only.
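The inclusion check can be sketched as below, assuming rectangles are given as `(Xs, Ys, Xe, Ye)` tuples collected from both binary images. The function names and the handling of identical rectangles (the first one is kept) are assumptions.

```python
def contains(outer, inner):
    """True if rectangle outer completely includes rectangle inner."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def remove_included(rects):
    """Drop every rectangle that lies completely inside another one."""
    kept = []
    for i, r in enumerate(rects):
        included = any(
            contains(o, r) and (o != r or j < i)  # keep first of duplicates
            for j, o in enumerate(rects) if j != i
        )
        if not included:
            kept.append(r)
    return kept
```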
[0067]
<Example 12>
FIG. 15 shows the overall configuration of the twelfth embodiment. In the figure, the part where the object extraction unit 42 extracts the object from the color image signal 41 and stores it in the object rectangle memory 43 is exactly the same as the configuration shown in FIG.
[0068]
The present embodiment is provided with a vector quantization unit 44 that vector-quantizes the color image signal 41, a vector quantization value memory 45 that stores the vector quantization values, and an object recognition unit 46 that matches the object extracted by the object extraction unit 42 against a dictionary created in advance and determines whether it is the target object.
[0069]
FIG. 16 shows a detailed configuration of the twelfth embodiment. First, the configuration of the object extraction unit 51 will be described. The object extraction unit 51 includes the elements shown in FIG. 14. The data of the extracted object is stored in the object rectangle memory 52, and when an object is extracted, the activation signal 53 and the object range data 54 are output to the object recognition unit 61.
[0070]
As described in the first embodiment, the vector quantization processing unit 55 comprises a mesh division unit 56 that divides the color image obtained from the input color image signal (RGB) into small regions (meshes), a feature quantity extraction unit 57 that extracts a feature quantity (chromaticity histogram) from the color image data in each small region, and a vector quantization unit 58 that performs vector quantization by comparing the created chromaticity histogram with a codebook 59 created in advance. The vector-quantized input color image is stored in the vector quantization value memory 60.
[0071]
The processes from the vector quantization of the input color image to the assignment of the vector quantization value (VQ value) of the small region to be processed to the storage of the vector quantization value in the vector quantization value memory 60 are the same as those in the first embodiment. Therefore, the description is omitted.
[0072]
The histogram creating unit 62 of the object recognizing unit 61 creates a histogram of vector quantized values from the above-described vector quantized values (FIG. 4 described above). That is, for the vector quantization value (VQ value) of the input image, a histogram of VQ values within the range of the target object is created (step 109 in FIG. 11). FIG. 17 is a diagram illustrating the creation of a histogram of vector quantization values within the range of the object when the object is rotating.
[0073]
As described above, when an object is extracted and the object recognition unit 61 is activated, the object range data 54 is passed from the object extraction unit 51 to the object recognition unit 61. The inclusion relationship is determined by comparing the straight-line data of the edges 501 of the object with the coordinates of the vector quantization value memory 60, and a histogram of vector quantization values is created for the small regions completely included in the range of the object (the regions shown in black in FIG. 17). If the object is not rotated, the inclusion relationship is determined by comparing the range data of the circumscribed rectangle with the coordinates of the vector quantization value memory, and a histogram of vector quantization values is created for the small regions within the circumscribed rectangle.
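A minimal sketch of this inclusion test, assuming the object outline is a convex quadrilateral given by its vertices in order, the VQ values are held in a two-dimensional list indexed by mesh row and column, and every mesh is a square of side `mesh` pixels. All names are hypothetical, and the cross-product sign test is a standard technique swapped in for the comparison with straight-line data.

```python
def inside_convex(point, polygon):
    """True if point is inside or on the border of a convex polygon."""
    sign = 0
    n = len(polygon)
    for i in range(n):
        ax, ay = polygon[i]
        bx, by = polygon[(i + 1) % n]
        cross = (bx - ax) * (point[1] - ay) - (by - ay) * (point[0] - ax)
        if cross != 0:
            s = 1 if cross > 0 else -1
            if sign == 0:
                sign = s
            elif s != sign:
                return False  # point lies on different sides of two edges
    return True


def object_vq_histogram(vq_map, mesh, polygon, num_codes):
    """Histogram of VQ values over meshes completely inside the polygon."""
    hist = [0] * num_codes
    for row, line in enumerate(vq_map):
        for col, v in enumerate(line):
            x0, y0 = col * mesh, row * mesh
            corners = [(x0, y0), (x0 + mesh, y0),
                       (x0, y0 + mesh), (x0 + mesh, y0 + mesh)]
            if all(inside_convex(c, polygon) for c in corners):
                hist[v] += 1
    return hist
```

For an object that is not rotated, passing the four corners of the circumscribed rectangle as the polygon gives the rectangle case described in the text.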
[0074]
Then, in the same manner as described with reference to FIG. 5 of the first embodiment, the matching unit 63 of the object recognition unit 61 matches the vector quantization value histogram 13 of the extracted object against the vector quantization value histograms 14 of the dictionary and calculates the distance (DTk). The identification target (k) with the minimum distance (DTk) is recognized as the extracted object, and it is therefore determined that the object exists in the input image (steps 110, 111).
[0075]
As a result of the collation with the dictionary, if there is no matching, it is determined that there is no target in the input image (step 113), and the same processing is performed for all the extracted rectangles (step 112). If it is determined that the target object is present, the process is terminated even if there is an unprocessed rectangle.
[0076]
In this embodiment, rectangle extraction and vector quantization are performed in parallel on the input color image data, and size determination and recognition processing are performed sequentially, starting with the rectangles for which rectangle extraction has been completed. Real-time processing therefore becomes possible.
[0077]
[Effects of the invention]
As described above, according to the present invention, the following effects can be obtained.
(1) Since vector quantization is performed after the feature is extracted from the object, the amount of data to be processed can be reduced without losing the information of the recognition object, the dictionary can be made compact, and a specific image can be recognized accurately.
[0078]
(2) Since the feature amount is compressed, the processing data amount can be further reduced.
[0079]
(3) Since the feature amount extracted from the input image data is vector-quantized after compression, the amount of processing data is reduced while the information necessary for color image recognition is retained, so that the processing speed is improved and highly accurate image recognition processing can be performed.
[0080]
(4) Since the matching processing is changed by threshold processing that compares the obtained histogram information with a set threshold, the processing time required for image recognition is reduced, and efficient color image recognition processing can be performed.
[0081]
(5) Since the calculated distance is compared with the set threshold when matching with the dictionary, the recognition accuracy can be further improved.
[0082]
(6) Since the extraction processing of the rectangular object is performed based on the contact information between the circumscribed rectangle and the black connected component, the portion where the object exists in the image can be extracted with high accuracy.
[0083]
(7) Even if the object in the image is rotating, it can be extracted with high accuracy.
[0084]
(8) Even if the background is a white background or a black background, the object can be accurately extracted, and the inclusion relationship between the circumscribed rectangle on the white background and the circumscribed rectangle on the black background is checked, so that there is no need to perform redundant extraction processing.
[0085]
(9) Since the object extraction processing and the vector quantization of the color image are performed in parallel, and the recognition processing is performed on the extracted objects, a specific image in the color image can be recognized quickly and accurately. Further, since the color image is vector-quantized, the amount of processed data can be compressed without losing the information of the recognition target.
[Brief description of the drawings]
FIG. 1
Shows the configuration of Embodiment 1 of the present invention.
FIG. 2
A diagram in which an original image is divided into small regions.
FIG. 3
A diagram illustrating an example in which a chromaticity histogram in a small area is vector-quantized with reference to a codebook.
FIG. 4
A diagram illustrating an example of the vector quantization values (VQ values) of an input image.
FIG. 5
A diagram illustrating matching between a vector quantization value histogram of an input image and a vector quantization value histogram of a dictionary during recognition.
FIG. 6
A diagram showing sampling points at the time of feature extraction.
FIG. 7
Shows the configuration of a fourth embodiment of the present invention.
FIG. 8
A diagram for generating a histogram by transforming and compressing a feature amount in a small area.
FIG. 9
A diagram for explaining matching between a feature amount histogram of an input image and a codebook.
FIG. 10
Shows the configuration of Example 9 according to the object extraction method of the present invention.
FIG. 11
A processing flowchart of object extraction and image recognition of the present invention.
FIG. 12
Shows the circumscribed rectangle extracted from the input image.
FIG. 13
A diagram for explaining rotation determination.
FIG. 14
A diagram illustrating the configuration of Examples 10 and 11.
FIG. 15
A diagram illustrating the overall configuration of a twelfth embodiment.
FIG. 16
A diagram illustrating the detailed configuration of a twelfth embodiment.
FIG. 17
A diagram illustrating creation of a histogram of vector quantization values within the range of the object when the object is rotating.
[Explanation of symbols]
1 Color image signal
2 Mesh division
3 Feature extractor
4 Feature memory
5 Codebook
6 Vector quantization unit
7 Vector quantization value memory
8 Recognition unit
9 Dictionary
10 Control unit

Claims

1. A color image recognition method, wherein a process of recognizing a specific image from an input color image signal and a process of extracting an object having a predetermined shape from the input color image signal are executed in parallel, and recognition processing is performed on the object according to the result of extracting the object.

2. The color image recognition method according to claim 1, wherein the process of extracting an object having a predetermined shape from the input image signal extracts a circumscribed rectangle of a black connected component from an input binary image signal and extracts the object based on contact information between the extracted circumscribed rectangle and the black connected component.

3. The color image recognition method according to claim 1, wherein the extracted object is a rectangle having a predetermined side length.

4. The color image recognition method according to claim 1, wherein the extracted object includes an object inclined with respect to a scan line.

5. The color image recognition method according to claim 2, wherein, when the input image signal is a color image signal, brightness is obtained from the color image signal, a plurality of binary images are generated by comparing the brightness with a plurality of threshold values, and the circumscribed rectangle is extracted from each of the plurality of binary images.

6. The color image recognition method according to claim 5, wherein the inclusion relationship between the extracted first circumscribed rectangle and second circumscribed rectangle is examined, and when one circumscribed rectangle includes the other, the object is extracted based on contact information between the one circumscribed rectangle and a black connected component.

7. A color image recognition device comprising: first processing means for recognizing a specific image from an input color image signal; second processing means for extracting an object having a predetermined shape from the input color image signal; and means for performing recognition processing on the object, wherein the first processing means and the second processing means are executed in parallel.