JP3952592B2

JP3952592B2 - Image search apparatus and method

Info

Publication number: JP3952592B2
Application number: JP12106298A
Authority: JP
Inventors: 弘隆椎山
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1998-04-30
Filing date: 1998-04-30
Publication date: 2007-08-01
Anticipated expiration: 2018-04-30
Also published as: JPH11312248A

Description

【０００１】
【発明の属する技術分野】
本発明は、画像を検索する画像検索装置及び方法に関するものである。
【０００２】
【従来の技術】
従来より類似画像を検索するための種々の技術が提案されている。類似画像検索を自然画像について行うための、ある程度実用化されている技術では、色情報を画像特徴量として用いているものが多い。そして、その多くが、色情報に関するヒストグラムを取ることにより、ＲＧＢの割合や画像中に多く存在する色の組み合わせを用いて検索を行うものである。
【０００３】
しかしながら、上記の手法では、色の位置情報が失われてしまうためにその検索精度は必ずしも高くなかった。また、例えば特開平８−２４９３４９号には、画像を複数のブロックに分け夫々の特徴量（代表色）を用いたパターンマッチングが開示されている。しかしながら、この手法では、マッチングを行う２つの画像について各ブロック間の特徴量の距離を計算しなければならず、膨大な計算量が必要となってしまう。特に特徴量として代表色を用いると、ＲＧＢ３個のデータを扱わなければならず、更に計算が複雑なものとなる。また、特徴量そのものを用いて比較を行うので、比較の精度が高くなる反面、画像のアングルが変ったり、物体の位置が変ったりするだけで類似画像検索できなくなってしまうといった問題がある。すなわち、画像のアングルが変ったり、物体の位置が変ったり、あるいは撮影条件による画像特徴量のある程度の違い等を吸収するなど、ある程度の曖昧さを有しながらも適切に画像検索を行うという、いわゆるロバストな類似画像検索を行うことはできなかった。
【０００４】
また、画像中のある物体（一部分）に着目して検索を行いたいような場合には、その物体以外の画像（背景画像）が異なると検索不可能となってしまう等、不都合な場合があった。
【０００５】
【発明が解決しようとする課題】
一般的な画像検索システムとして、自然画像を検索する場合に、画像にキーワードを付与しておき、このキーワードによって画像検索を行うことが知られている。しかし、このキーワード付け作業は人手のかかる作業であり、更に、キーワード付けが行われていない画像に関しては、縮小画を提示してマニュアルにて選択するという作業が生じ、検索操作を煩雑なものとしていた。
【０００６】
本発明は上記の問題点に鑑みてなされたものであり、画像の特徴量の配置を考慮した高速な類似画像の検索を可能とする画像検索装置及び方法を提供することを目的とする。
【０００７】
また、本発明の他の目的は、画像の特徴量の配置を考慮した類似画像の検索を行うとともに、撮影条件の変動等による違いを吸収した類似画像検索を可能とする画像検索装置及び方法を提供することにある。
【０００８】
また、本発明の他の目的は、特徴量群を１つのラベルで表し、画像をラベル行列で表現して画像間の類似度を算出することにより類似度の計算量を減少させ、迅速な類似画像検索を可能とすることにある。
【０００９】
また、本発明の他の目的は、ラベル行列を適切に管理し、ラベルを用いた画像検索処理の処理速度を著しく向上することにある。
【００１０】
また、本発明の他の目的は、元画像と比較先画像の類似度をラベル列もしくはラベル行列の比較によって行う際に、ＤＰマッチングやファジー非決定性オートマトン等のラベル位置の前後の曖昧さを許す手法を適用し、より効果的な類似画像検索を可能とすることにある。
【００１１】
また、本発明の他の目的は、画像中のある物体（一部分）に着目した類似画像検索を可能とすることにある。
【００１２】
【課題を解決するための手段】
上記の目的を達成するための本発明の一態様による画像検索装置は、例えば以下の構成を備える。すなわち、
画像データを複数のブロックに分割し、各ブロックについて取得された特徴量に応じてラベルを付与し、付与されたラベルを所定のブロック順序で並べることによりラベル行列を生成する第１生成手段と、
前記第１生成手段で生成されたラベル行列を前記画像データに対応付けて記憶する記憶手段と、
前記生成されたラベル行列より部分ラベル行列を抽出し、抽出された部分ラベル行列をキーとして画像データを検索可能なテーブルを保持する保持手段と、
検索元の画像データから部分ラベル行列を抽出し、前記テーブルを用いて類似画像を検索する第１検索手段と、
前記第１検索手段で検索された各類似画像を比較先画像として、前記記憶手段から得られた比較先画像のラベル行列と、前記検索元の画像データから得られるラベル行列との間でラベル間距離に基づくマッチング処理を行って類似度を獲得し、獲得した類似度に基づいて検索結果を得る第２検索手段と、
画像の一部を構成する部分画像が検索元の画像として指定された場合、該部分画像を表す部分画像データが含まれるブロックに関して前記特徴量に応じたラベルを付与し、他のブロックに関してはラベルとの間の距離がゼロであることを示すラベルを付与することで検索元の画像のラベル行列を生成する第２生成手段と、
前記部分画像の画像データが含まれるブロックに関して付与されたラベルに基づいて部分ラベル行列を生成する第３生成手段とを備え、
前記第１及び第２検索手段は、前記第２生成手段及び前記第３生成手段によって生成されたラベル行列と部分ラベル行列を用いて画像の検索を行う。
【００１３】
又、上記の目的を達成するための本発明の他の態様による画像検索装置は、以下の構成を備える。すなわち、
画像データを複数のブロックに分割し、各ブロックについて取得された特徴量に応じてラベルを付与し、付与されたラベルを所定のブロック順序で並べることによりラベル行列を生成する第１生成手段と、
前記第１生成手段で生成されたラベル行列を前記画像データに対応付けて記憶する記憶手段と、
前記生成されたラベル行列より部分ラベル行列を抽出し、抽出された部分ラベル行列をキーとして画像データを検索可能なテーブルを保持する保持手段と、
検索元の画像データから部分ラベル行列を抽出し、前記テーブルを用いて類似画像を検索する第１検索手段と、
前記第１検索手段で検索された各類似画像を比較先画像として、前記記憶手段から得られた比較先画像のラベル行列と、前記検索元の画像データから得られるラベル行列との間でラベル間距離に基づくマッチング処理を行って類似度を獲得し、獲得した類似度に基づいて検索結果を得る第２検索手段とを備え、
前記ラベル行列は２次元のラベル行列を表し、
前記第２検索手段が、
前記検索元の画像データのラベル行列より抽出される行単位のラベル列と、前記第１検索手段で得られた比較先画像データのラベル行列より抽出される行単位のラベル列とをＤＰマッチングによって対応づけることによって該比較先画像データの行並びを得る第１マッチング手段と、
前記元画像データのラベル行列の行並びと、前記第１マッチング手段で得られた行並びとの類似度をＤＰマッチングによって求める第２マッチング手段とを含む。
【００１５】
【発明の実施の形態】
以下、添付の図面を参照して本発明の好適な実施形態を説明する。
【００１６】
図１は本実施形態による画像検索装置の制御構成を示すブロック図である。同図において、１０１はＣＰＵであり、本実施形態の画像検索装置における各種制御を実行する。１０２はＲＯＭであり、本装置の立ち上げ時に実行されるブートプログラムや各種データを格納する。１０３はＲＡＭであり、ＣＰＵ１０１が処理するための制御プログラムを格納するとともに、ＣＰＵ１０１が各種制御を実行する際の作業領域を提供する。１０４はキーボード、１０５はマウスであり、ユーザによる各種入力操作環境を提供する。
【００１７】
１０６は外部記憶装置であり、ハードディスクやフロッピーディスク、ＣＤ−ＲＯＭ等で構成される。１０７はネットワークインターフェースであり、ネットワーク上の各機器との通信を可能とする。１０９はインターフェース、１１０は画像読み取りのためのスキャナである。また、１１１は上記の各構成を接続するバスである。なお、後述の各フローチャートに示される処理を実現する制御プログラムは、ＲＯＭ１０２に格納されていてもよいし、外部記憶装置１０６よりＲＡＭ１０３にロードされてもよい。
【００１８】
なお、上記の構成においてスキャナ１１０や外部記憶装置１０６はネットワーク上に配置されたもので代用してもよい。
【００１９】
図２は本実施形態の画像検索装置の機能構成を示すブロック図である。同図において、１１はユーザインターフェース部であり、表示器１０７、キーボード１０４及びマウス１０５を用いて、ユーザからの各種の操作入力を検出する。１２は画像入力部であり、スキャナ１１０による画像の読み取りを行う。１３は画像メモリであり、画像入力部１２によって得られたイメージデータをＲＡＭ１０３の所定の領域に格納する。１４は画像特徴量抽出部であり、画像メモリ１３に格納した画像について、後述の手順で特徴量を抽出する。１５は特徴量ラベル行列化部であり、画像特徴量抽出部１４によって得られた特徴量に基づいてラベル行列を生成する。１６はパターンマッチング部であり、指定された画像のラベル行列と、画像蓄積部１７に蓄積されている画像のラベル行列との間の類似度を算出し、類似画像を検索する。１７は画像蓄積部であり、画像入力部１２等によって得られた画像データを蓄積する。
【００２０】
１８は画像管理データベース（以下、画像管理ＤＢ）であり、図３で示されるデータ形態で画像蓄積部１７に格納された画像データを管理する。また、１９はラベル成分インデックスであり、図４に示されるラベル成分インデックスファイルを格納する。なお、ラベル成分インデックスの利用については、図９のフローチャートにより後述する。
【００２１】
以上のような構成を備えた本実施形態の画像検索装置の動作例を以下に説明する。なお、以下の例では色に着目した画像特徴量として、赤（Ｒ）、緑（Ｇ）、青（Ｂ）の三色を採用し、３次元の色空間での処理を用いて説明する。
【００２２】
［画像の登録処理］
先ず画像登録の際に行う処理を説明する。図５は本実施形態による画像登録処理の手順を表すフローチャートである。まず、ステップＳ１１において、ユーザーインターフェース部１１を介したユーザの指示により、画像入力部１２を用いて画像を読み込み、画像メモリ１３に保持する。次に、ステップＳ１２において、この画像を複数のブロックに分割する。本実施形態では、画像を縦横の複数ブロックに分割する。図６は本実施形態による画像のブロック分割例を示す図である。同図に示されるように、本実施形態では、説明のため３×３の計９個のブロックに画像を分割するものとする。次にステップＳ１３において、分割された各ブロックの特徴量を算出し、得られた特徴量を次の手順でラベル化する。
【００２３】
なお、本実施形態で用いる３×３への分割はあくまで説明のためのものである。実際には、自然画であれば１０×１０以上の分割数とするのが好ましい。また、白の無地背景に商品が写っているような場合であれば、１３×１３以上の分割数とするのが好ましい。
【００２４】
図７は本実施形態による多次元特徴量空間を説明する図である。図７に示すように、多次元特徴量空間（ＲＧＢカラー空間）を複数のブロック（色ブロック）、即ちセル（色セル）に分割し、夫々のセル（色セル）に対して通し番号でユニークなラベルを付与する。ここで、多次元特徴量空間（ＲＧＢカラー空間）を複数のブロックに分けたのは微妙な特徴量（色）の違いを吸収するためである。
【００２５】
なお、多次元特徴量空間に関しては、画像特徴量をそのまま用いるものではなく各パラメータを平均と分散を実験によって求め規格化（正規化）した後、例えば、主成分分析等の直交変換を行い、意味のある次元にしたものを用いることが考えられる。なお、「意味のある次元」とは、主成分分析において、寄与率が大きな主成分軸で構成される次元である。
【００２６】
ステップＳ１３では、ステップＳ１２で得られた各分割ブロックに対して、定められた画像特徴量計算処理を行い、上記多次元特徴量空間上のどのセルに属するかを求め、対応するラベルを求める。この処理を全てのブロックに対して行う。すなわち、分割画像ブロックに対して、全ての画素がどの色セルに属するかの計算処理を行い、もっとも頻度の多い色セルのラベルをその分割画像ブロックのパラメータラベル（カラーラベル）として決定し、この処理を全てのブロックに対して行う。
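The labelling in step S13 can be sketched as follows. This is a minimal illustration, not the patented implementation: the number of colour cells per RGB axis (`cells_per_axis`), the serial numbering of the cells, and the helper names `cell_label`, `block_label`, and `label_matrix` are all assumptions made for the example. Each pixel is mapped to a colour cell, and the most frequent cell in a block becomes that block's label.

```python
from collections import Counter

def cell_label(rgb, cells_per_axis=3):
    """Map an (R, G, B) pixel (0-255 per channel) to a 1-based serial cell label.

    The cell grid and its numbering are hypothetical; the source only says that
    each colour cell receives a unique serial number.
    """
    r, g, b = (min(c * cells_per_axis // 256, cells_per_axis - 1) for c in rgb)
    return (r * cells_per_axis + g) * cells_per_axis + b + 1

def block_label(pixels, cells_per_axis=3):
    """Return the label of the colour cell that occurs most often in the block."""
    counts = Counter(cell_label(p, cells_per_axis) for p in pixels)
    return counts.most_common(1)[0][0]

def label_matrix(image, rows=3, cols=3, cells_per_axis=3):
    """Divide `image` (a list of pixel rows) into rows x cols blocks and label each."""
    height, width = len(image), len(image[0])
    matrix = []
    for by in range(rows):
        y0, y1 = by * height // rows, (by + 1) * height // rows
        row = []
        for bx in range(cols):
            x0, x1 = bx * width // cols, (bx + 1) * width // cols
            pixels = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(block_label(pixels, cells_per_axis))
        matrix.append(row)
    return matrix
```

Quantising colours into cells before counting is what lets slightly different shades share one label, which is the purpose the description gives for dividing the feature space.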
【００２７】
以上のようにして各ブロックに対してパラメータラベルが付与されると、ステップＳ１４において、各ブロックに付与されたパラメータラベルを所定のブロック順序で並べることにより、パラメータラベル行列（以下、ラベル行列とする）が生成される。
【００２８】
図８はラベル列を生成する際のブロック順序例を説明する図である。同図の分割画像ブロックの升にある数字に従って上記のパラメータラベルを並べ、ラベル列を作る。なお、画像管理データベース１８やラベル成分インデックス１９にラベル行列を格納するに際しては、上述のように２次元的なラベル行列を所定の順序で１次元に並べたものを格納するが、本実施形態ではこのような１次元の形態のものもラベル行列と称することにする。
【００２９】
ここで、図８の（ａ）では、分割ブロックを左から右へ水平方向へスキャンし、この水平方向のスキャンを上から下へ行う順序となっている。なお、本実施形態に適用可能なスキャンの方法としては、
・水平方向（図８の（ａ）に示したように、左から右へのスキャンを上から下へ行うという順序の他に、図８の（ｂ）〜（ｄ）に示すように、左から右へのスキャンを下から上へ行う等、４通りのスキャン方法がある）、
・垂直方向（上から下へのスキャンを左から右へ行う等、４通りのスキャン方法が考えられる）、
・図８（ｅ）に示すように、偶数ラインと奇数ラインでスキャンを分ける。
【００３０】
なお、本実施形態では、図８の（ａ）に示すスキャン方法を採用するが、上述した他のスキャン方法も適用可能であることは明らかであろう。
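The horizontal and vertical scan orders listed above can be sketched as below. The function names and boolean parameters are hypothetical. The default arguments of `scan_horizontal` correspond to the order of FIG. 8(a): left to right, repeated from top to bottom. The even/odd line scan of FIG. 8(e) is not covered, since the figure is not available in this text.

```python
def scan_horizontal(matrix, left_to_right=True, top_to_bottom=True):
    """Flatten a 2-D label matrix row by row (four variants via the two flags)."""
    rows = matrix if top_to_bottom else matrix[::-1]
    return [label
            for row in rows
            for label in (row if left_to_right else row[::-1])]

def scan_vertical(matrix, top_to_bottom=True, left_to_right=True):
    """Flatten a 2-D label matrix column by column (four variants via the two flags)."""
    columns = [list(col) for col in zip(*matrix)]
    # After transposing, the order of columns plays the role of the row order.
    return scan_horizontal(columns,
                           left_to_right=top_to_bottom,
                           top_to_bottom=left_to_right)
```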
【００３１】
続いてステップＳ１５において、以上のようにして得たラベル列や画像データを画像蓄積部１７、画像管理ＤＢ１８に反映する。すなわち、ステップＳ１１で読み込んだ画像データに対して画像ＩＤを取得し、これらをペアにして画像蓄積部１７に格納する。また、当該画像ＩＤに対応付けて図３に示した画像管理ＤＢレコードを生成し、これを画像管理ＤＢ１８に登録する。
【００３２】
そして、ステップＳ１６において、当該画像のラベル行列から部分ラベル行列を獲得し、図４に示すようなラベル成分インデックスファイルを更新する。ラベル成分インデックスファイルは、部分ラベル行列（単独のラベルを含む）を検索キーとし、画像ＩＤ群と各画像に含まれる部分ラベル行列の個数を納めるレコードを格納したものである。このラベル成分インデックスによれば、部分ラベル行列（単独のラベルを含む）を与えることにより、当該部分ラベル行列を持つ画像ＩＤ群と、各画像の当該部分ラベル行列（単独のラベルを含む）の含有数が高速に得られる。なお、画像登録時の段階では未加工のレベルで情報を格納しておく。また、ラベル行列を構成するすべてのラベルをラベル成分インデックスに登録する。例えば、ラベルのヒストグラム情報を得て、出現するラベルの全てがラベル成分インデックスに反映されるようにする。例えば、単独のラベルが登録されたラベル成分インデックスを用いた場合は、ヒストグラム情報に現れる全てのラベルをキーとするデータレコードを更新することになる。
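A label component index keyed on single labels, as described for step S16, might look like the following sketch. The class name, the in-memory dictionary layout, and the method names are assumptions; the source stores the index as a file (FIG. 4) and also allows multi-label partial matrices as keys, which this example omits.

```python
from collections import Counter, defaultdict

class LabelComponentIndex:
    """Inverted index: label -> {image_id: number of occurrences in that image}."""

    def __init__(self):
        self._records = defaultdict(dict)

    def register(self, image_id, label_sequence):
        """Update the record of every label that appears in the image's histogram."""
        for label, count in Counter(label_sequence).items():
            self._records[label][image_id] = count

    def lookup(self, label):
        """Return the image IDs containing `label`, with their containment counts."""
        return dict(self._records.get(label, {}))
```

Because the key is the label itself, finding every image that contains a given label and how many times is a single dictionary access, which is the fast lookup the description relies on for the pre-search.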
【００３３】
なお、画像の登録処理においては、後述のドント・ケアー・ラベルは用いられない。すなわち、ラベル成分インデックスには、ドント・ケアー・ラベルは登録されない。以上が画像登録時に行われる処理である。
【００３４】
［画像の検索処理］
次に、図９及び図１０のフローチャートに従って、本実施形態による類似画像検索処理を説明する。検索処理はおおきく分けて画像全体の類似画像検索であるか、あるいは物体に着目した検索であるかにより処理が異なってくる。以下、順を追って説明する。
【００３５】
図９は、本実施形態による画像検索の手順を説明するフローチャートである。まず、ステップＳ２１において、当該検索が画像全体の類似画像検索であるか、所望の物体（或いは部分領域）に着目した検索であるかを指定する。検索方法が指定されるとステップＳ２２において、指定された検索方法に適合したラベル行列と特徴部分ラベル行列が取得される。このステップＳ２２では、指定された検索方法に応じて処理が異なる。以下、図１０のフローチャートを参照してステップＳ２２における処理の手順を詳細に説明する。
【００３６】
図１０は本実施形態による、特徴部分ラベル行列の抽出手順を説明するフローチャートである。
【００３７】
まず、ステップＳ４１において、指定された検索方法を判定する。指定された検索方法が「画像全体の類似検索」であった場合は、ステップＳ４２へ進む。ステップＳ４２では、検索者が画像全体の類似検索を行いたい画像を選択する。すなわち、類似検索元画像を指定する。ステップＳ４３では、指定された類似検索元画像に対応するラベル行列を画像管理ＤＢ１８より獲得する。そして、ステップＳ４４において、特徴部分ラベル行列を取得する。
【００３８】
なお、特徴部分ラベル行列の求め方に関しては、経験・実験による学習や統計的な手法など、実現手段は様々存在するものであり、ここでは特に限定されるものではない。一例を挙げると、このラベル行列を例えばヒストグラム解析を行うことにより特徴的な部分ラベル行列（単一ラベルも含む）を求める。
【００３９】
一方、ステップＳ４１において、着目物体の類似検索が指定されたと判定された場合は、ステップＳ４５へ進み、検索元となる着目物体の画像を決定する。着目物体の決定方法としては、例えば以下の方法が挙げられる。
【００４０】
（１）画像中の着目物体をポインティングデバイスで指示し、画像処理により着目物体を背景画像から分離し、背景画像を取り除くことで着目物体画像を作成する。一例を挙げると、ユーザーが指定した座標からエッジや色の変化具合を考慮して着目物体を抽出する既存の画像処理技術を用いる。この場合、着目物体の抽出内容に、誤りがあれば、抽出輪郭線をポインティングデバイスで補正することで、正確な着目物体の抽出が行える。
【００４１】
（２）物体画像ライブラリから着目物体画像を選択する。物体画像ライブラリに関して一例を挙げれば、予めキーワード検索が可能な物体画像を自らのシステムにインストールしておく方法や、あるいはインターネット等のネットワークを介してサーバーにキーワードで物体画像を検索して物体画像を入手する方法が考えられる。
【００４２】
（３）描画ソフトを用いて着目物体の画像を描いたり、（１）或いは（２）で得られた着目物体画像の色を変更してこれを着目物体画像とする。
【００４３】
次に、ステップＳ４６へ進み、上記抽出した着目物体画像を、その大きさや位置、方向を考慮して画像フレームに貼り付け、類似検索元画像を作成する。そして、ステップＳ４７において、当該類似検索元画像をブロック分割し、上記（１）〜（３）のいずれかの手法によって得た着目物体画像を含む分割ブロックの部分のみに関して、登録のときと同様な画像特徴抽出処理（例えば、代表色抽出）を行いラベル化する。更に、ステップＳ４８では、着目物体画像を含まない分割ブロックに対して、どのラベルともペナルティーが０のドント・ケアー・ラベル（don't care label）を付与し、類似検索元画像のラベル行列を生成する。
【００４４】
なお、上記の（１）から（３）の手法によって得られる画像において、着目物体画像の占める面積と背景領域の面積との比に応じて画像の分割方法を変えるようにしてもよい。例えば、着目物体画像の面積が所定の割合以下であった場合は、当該着目物体画像に外接するような矩形領域について所定数の分割ブロックを得て、ラベル割り当てを行う。また、着目物体画像の面積が所定の割合を越えていれば、上述の画像フレームについて分割ブロックを得るようにする。
【００４５】
そして、ステップＳ４９では、ステップＳ４８で得られた類似検索元画像のラベル行列よりドント・ケアー・ラベルを除いた各ラベルに対して、上述のようなヒストグラム処理を行い、特徴部分ラベル行列を取得する。例えば、この着目物体を含む画像分割ブロック群のみから得たラベル行列にヒストグラム解析を施すことにより特徴的な部分ラベル列を取得する。なお、ヒストグラム解析においては、色の事前の存在確率や色空間上での飽和などを考慮した重み付けヒストグラム処理を行ってもよい。また、着目物体による類似検索においても、特徴的な部分ラベル行列は複数個の存在を可能とする。
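Steps S47 to S49 can be illustrated with the sketch below. The sentinel value `DONT_CARE = 0`, the boolean `object_mask` marking blocks that contain the object of interest, and both function names are assumptions introduced for the example.

```python
DONT_CARE = 0  # hypothetical sentinel for the don't-care label

def query_label_matrix(block_labels, object_mask):
    """Keep labels only for blocks containing the object; others become DONT_CARE."""
    return [[label if inside else DONT_CARE
             for label, inside in zip(label_row, mask_row)]
            for label_row, mask_row in zip(block_labels, object_mask)]

def labels_for_histogram(matrix):
    """Labels passed to the histogram analysis, with don't-care entries removed."""
    return [label for row in matrix for label in row if label != DONT_CARE]
```

Dropping the don't-care entries before the histogram step keeps background blocks from influencing which labels are treated as characteristic of the object.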
【００４６】
以上のようにして、検索方法に適合した類似検索元画像のラベル行列と特徴部分ラベル行列が取得されると、処理は図９のステップＳ２３へ進む。ステップＳ２３では、ラベル成分インデックス１９を検索する際に用いる、ステップＳ２２で得られた特徴部分ラベル行列の含有数の上限値及び下限値を決定する。この上限値及び下限値で決まる含有数の範囲は、ラベル成分インデックス１９を用いて行われるマッチング処理対象画像の絞り込み（プリサーチ）に用いられる。
【００４７】
特徴的な部分ラベル行列の含有数の上限下限の決定手法に関しては経験・実験による学習や統計的な手法など手段はさまざま存在し、ここでは特に限定は行わない。一例を挙げると、検索元画像に取得された特徴部分ラベル行列が含まれる個数ｎ0に対する割合で指定を行う。例えば、値が１以上の曖昧度fuzziness（大きいほど曖昧な検索を行う）を導入し、上限は曖昧度に比例したｎ0×fuzzinessとし、下限は反比例したｎ0÷fuzzinessとする。
【００４８】
このように、拡大縮小した画像を検索したい場合や、多少異なった画像をも検索したい場合の曖昧さの度合いに応じた曖昧度fuzzinessを与え、プリサーチの曖昧さを調節することが可能となる。
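The example rule for the containment range (upper limit n0 × fuzziness, lower limit n0 ÷ fuzziness, with fuzziness at least 1) is small enough to state directly. The function name is hypothetical.

```python
def containment_range(n0, fuzziness):
    """Return (lower, upper) bounds on the containment count for the pre-search.

    n0 is the number of times the characteristic partial label matrix occurs in
    the query image; a larger fuzziness widens the accepted range.
    """
    if fuzziness < 1:
        raise ValueError("fuzziness must be 1 or greater")
    return n0 / fuzziness, n0 * fuzziness
```

With `fuzziness` equal to 1 the range collapses to exactly n0, so only images with the same containment count pass the pre-search.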
【００４９】
次に、ステップＳ２４において、ラベル成分インデックスファイルを参照して、ステップＳ２３で決定した上限値および下限値の範囲内の個数の特徴部分ラベル行列を含む画像ＩＤ群を求める。ラベル成分インデックスは図４に示したように部分ラベル行列をキーとして、当該部分ラベル行列を含む画像の画像ＩＤとその含有数が登録されている。従って、ステップＳ４４或いはステップＳ４９で得られる特徴部分ラベル行列を用いて検索が行え、当該特徴部分ラベル行列を所望の含有数範囲で含む画像ＩＤを容易、かつ高速に得ることができる。
【００５０】
部分特徴ラベル行列を用いて取得される画像ＩＤの数が、所定の目標数以下となるまで絞り込みを行う。例えば、ヒストグラム上位のものから順に選ばれた特徴部分ラベル列を用いて上記の手法により順次画像ＩＤ群を取得し、順次取得される画像ＩＤ群の論理積を取って目標数の画像ＩＤに絞り込んでいく。なお、ヒストグラム解析では、色の事前の存在確率や色空間上での飽和などを考慮した重み付けヒストグラム処理を行ってもよい。また、特徴的な部分ラベル行列は複数個存在してもよい。
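The narrowing described above (take characteristic labels from the top of the histogram, fetch the image IDs whose containment count lies in range, and intersect until the target number is reached) can be sketched as follows. The plain dictionary used for the index and all parameter names are assumptions.

```python
def presearch(index, ranked_labels, query_counts, fuzziness, target):
    """Narrow candidate image IDs by intersecting per-label hits.

    index         -- label -> {image_id: containment count} (hypothetical layout)
    ranked_labels -- characteristic labels, most significant first
    query_counts  -- label -> count n0 in the query image
    target        -- stop once this many candidates or fewer remain
    """
    candidates = None
    for label in ranked_labels:
        n0 = query_counts[label]
        lower, upper = n0 / fuzziness, n0 * fuzziness
        hits = {image_id
                for image_id, count in index.get(label, {}).items()
                if lower <= count <= upper}
        candidates = hits if candidates is None else candidates & hits
        if len(candidates) <= target:
            break
    return candidates if candidates is not None else set()
```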
【００５１】
次に、ある程度以上同一のラベルを含むラベル行列群と類似検索元の画像のラベル行列とを比較し、類似検索したい画像のラベル行列に近いラベル行列群を類似度とともに検索結果として出力する。
【００５２】
検索者が類似検索を行いたい画像を選択すると画像管理ＤＢからこれに対応するカラーラベル行列を得て、インデックスファイルから既に登録している画像のカラーラベル行列群を得て、これとの比較により、類似検索したい画像のカラーラベル行列に近いカラーラベル行列群を類似度とともに検索結果として出力する。
【００５３】
上記のステップＳ２４における処理は、登録してある全ての画像についてラベル行列間のマッチング（比較）を行うと処理が遅くなるので、予め似ているものを抽出し、抽出された画像について類似検索元画像のラベル行列との比較を行うためである。もちろん、処理が遅くなってもよければ、登録した画像の全てのラベル行列との比較を行うようにして、精度の高い検索を行ってもよい。
【００５４】
次に、ステップＳ２５において、検索元画像のラベル行列と、ステップＳ２４で取得された画像のラベル行列との間でパターンマッチングを行う。すなわち、ある程度以上共通なラベルを持った画像のラベル行列と類似検索元画像のラベル行列とを、ラベル間ペナルティーマトリックスを考慮したマッチング処理を行って比較する。ペナルティーマトリックスについては図１１により後述する。この比較の結果得られるペナルティー値（ラベル間の距離）に基づいて類似度を決定する。本実施形態では、ラベル行列におけるラベルの２次元的配置を考慮した２次元ＤＰマッチングを用いる。なお、２次元ＤＰマッチングについては後述する。
【００５５】
ステップＳ２６では、ステップＳ２５におけるマッチング処理の結果として得られた類似度が所定のしきい値以上であるかどうかを判断する。類似度が所定のしきい値以上であれば、ステップＳ２７へ進み、当該画像ＩＤを検索結果に登録するとともに、類似度の大きい順にソートする。一方、ステップＳ２６において類似度が所定のしきい値に達していないと判定された場合は、ステップＳ２７をスキップし、検索結果から除外する。ステップＳ２８では、ステップＳ２４で取得された全ての画像について上記ステップＳ２５からＳ２７の処理を終えたかどうかを判断し、未処理の画像があればステップＳ２５へ戻り、全画像について処理を終えていればステップＳ２９へ進む。
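Steps S26 and S27 amount to filtering by a threshold and sorting by similarity. A minimal sketch, assuming the similarities have already been computed and are held in a dictionary keyed by image ID (the function name and data layout are hypothetical):

```python
def rank_results(similarities, threshold):
    """Keep images whose similarity reaches the threshold, most similar first."""
    hits = [(image_id, score)
            for image_id, score in similarities.items()
            if score >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)
```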
【００５６】
ステップＳ２９では、画像管理データベース１８を参照して、検索結果として類似度の大きい順に登録された画像ＩＤに対応するフルパスのファイル名を得て、これをユーザに提示する。
【００５７】
［２次元ＤＰマッチング］
次にラベル行列同士の類似比較を行うための２次元ＤＰマッチングについて述べる。
【００５８】
図１１はラベル列を比較し類似度を求める際に用いるラベル間のペナルティマトリックスの一例を示す図である。マトリクス中の値が小さい程類似していることになる。すなわち、カラーラベル間のパターンマッチングに際して、隣接する色セル同士ではペナルティー（距離）を小さくし、遠いものには大きなペナルティーを与えるものである。例えば、ラベル２とラベル６のペナルティは「７」である。また、同じラベル同士のペナルティは当然のことながら「０」となっている。本マトリクスの使用目的はラベルの類似に応じた距離判定を行うことにある。すなわち、本実施形態では、特徴量空間としてＲＧＢカラー空間を用いているので、色の類似に応じた距離判定が行えることになる。但し、ドント・ケアー・ラベルに関しては全てのラベルに対するペナルティーを０と定義する。
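The actual values of the penalty matrix in FIG. 11 are not available in this text, so the sketch below substitutes a hypothetical penalty: the Manhattan distance between colour cells on an assumed 3 × 3 × 3 grid with serial labels 1 to 27. It is meant only to show the three properties the description states: identical labels cost 0, nearby cells cost less than distant ones, and the don't-care label costs 0 against every label.

```python
DONT_CARE = 0  # hypothetical sentinel for the don't-care label

def cell_coords(label, cells_per_axis=3):
    """Recover the (r, g, b) cell coordinates from a 1-based serial label."""
    index = label - 1
    return (index // cells_per_axis ** 2,
            (index // cells_per_axis) % cells_per_axis,
            index % cells_per_axis)

def penalty(label_a, label_b, cells_per_axis=3):
    """Illustrative inter-label penalty; NOT the values of FIG. 11."""
    if label_a == DONT_CARE or label_b == DONT_CARE:
        return 0
    coords_a = cell_coords(label_a, cells_per_axis)
    coords_b = cell_coords(label_b, cells_per_axis)
    return sum(abs(x - y) for x, y in zip(coords_a, coords_b))
```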
【００５９】
例えば、図１２に示す例では、検索元画像のラベル列が「１１２３１３４４１」であり、検索対象画像のラベル列が「１１３２２４４５２」である。このような一次元のラベル列のＤＰマッチング、すなわち一次元のＤＰマッチングは一般によく知られたものである。すなわち、図１２のラベル列に関して、図１１のペナルティマトリクスを用いてＤＰマッチングを行うと、図１３に示されるように両ラベル列間の距離（最終解）が求まる。なお、この例では、傾斜制限として次の条件を用いている。すなわち、図１４において、格子点(i-1,j),(i-1,j-1),(i,j-1)上のコストをそれぞれg(i-1,j), g(i-1,j-1), g(i,j-1)とし、格子点（i,j）上のペナルティをd(i,j)とした場合に、格子点（i,j）上のコストg（i,j）を以下の漸化式によって求めている。
【００６０】
【数１】
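The recurrence of Equation 1 itself is not reproduced in this text. The sketch below therefore assumes a common unweighted symmetric form, g(i,j) = d(i,j) + min{g(i-1,j), g(i-1,j-1), g(i,j-1)}, which uses the same three predecessor lattice points named in the description; the weights in the original formula may differ. The function name and the `penalty` callback are hypothetical.

```python
def dp_distance(seq_a, seq_b, penalty):
    """DP matching distance between two non-empty label sequences.

    Assumed recurrence (not taken from the source):
        g(i, j) = d(i, j) + min(g(i-1, j), g(i-1, j-1), g(i, j-1))
    """
    inf = float("inf")
    g = [[inf] * len(seq_b) for _ in seq_a]
    for i, a in enumerate(seq_a):
        for j, b in enumerate(seq_b):
            if i == 0 and j == 0:
                best_previous = 0
            else:
                best_previous = min(
                    g[i - 1][j] if i > 0 else inf,
                    g[i - 1][j - 1] if i > 0 and j > 0 else inf,
                    g[i][j - 1] if j > 0 else inf,
                )
            g[i][j] = best_previous + penalty(a, b)
    return g[-1][-1]
```

Because a path may stay on the same element of one sequence while advancing in the other, a repeated or stretched label run adds no cost beyond its per-cell penalties, which is the elasticity the description attributes to DP matching.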

【００６１】
なお、ラベル列同士の比較においては、オートマトン等のラベルシーケンスを曖昧に比較できるマッチングを行うようにしてもよい。このような曖昧化の手法を用いることにより、余分なラベルの付加、ラベルの欠落や同じラベルの繰り返しに対しては低いペナルティが与えられるとともに、ラベル間のペナルティには図１１のカラーラベル間のペナルティマトリックスを用いてラベル列同士の距離計算を行うことで、曖昧なパターンマッチングが行えるようになる。なお、オートマトンとしては、「特開平８−２４１３３５のファジー非決定性有限オートマトンを使用した曖昧な文字列検索方法およびシステム」に記載されている「ファジー非決定性有限オートマトン」を適用することができる。
【００６２】
本実施形態では、上記の一次元ＤＰマッチングを２次元に拡張した２次元ＤＰマッチングを用いてラベル行列同士の類似比較（類似度の算出）を行う。以下、２次元ＤＰマッチングを説明する。
【００６３】
図１５は本実施形態による類似度算出処理を説明する図である。上述のステップＳ２２（ステップＳ４３、Ｓ４８）において取得された類似検索元画像のラベル行列は、そのスキャン方法に従って図１５の（ｂ）のように並べることができる。また、ステップＳ２４において抽出された画像ＩＤのラベル行列群のうちの一つを類似比較先画像とすると、そのラベル行列は図１５の（ａ）のように並べることができる。
【００６４】
まず、類似比較先画像の第１行目のラベル列「ａｂｃ」と、類似検索元画像の第１〜第３行目のラベル列（「１２３」、「４５６」、「７８９」）のそれぞれとの距離をＤＰマッチングによって求め、その距離が最小となるラベル列の類似検索元画像における行番号を類似ライン行列（図１５の（ｃ））の該当する位置に記憶する。また、得られた最小距離が所定の閾値よりも大きい場合には、どの行とも類似していないと判断し、類似ライン行列の該当する位置に「！」を記憶する。ＤＰマッチングの性質により、たとえば画像のアングルが水平方向へ多少変わっていたとしても、上記処理により類似する行（ライン）を検出可能である。以上の処理を、類似比較先画像の全ての行（「ｄｅｆ」「ｇｈｉ」）について行うことで、図１５の（ｃ）のような列方向の類似ライン行列が得られる。なお、この処理において、類似検索元画像中のドント・ケアー・ラベルのみからなる列（ドント・ケアー・ラベル列）に関しては処理を行わない。これは、ドント・ケアー・ラベルのみからなる列は類似比較先画像の全ての列と距離がゼロとなってしまうからである。
【００６５】
図１５の（ｃ）では、「ａｂｃ」に類似した行が類似検索元画像に存在せず、「ｄｅｆ」に類似した行が類似検索元画像の第１行目、「ｇｈｉ」に類似した行が類似検索元画像の第２行目であったことを示している。以上のようにして得られた類似ライン行列と標準ライン行列（類似検索元画像の行の並びであり、本例では「１２３」となる）との類似度をＤＰマッチングを用いて算出し、これを当該類似検索元画像と当該類似比較先画像との類似度として出力する。なお、ＤＰマッチングでは、周知のように、比較するラベルシーケンスが最も類似距離が小さくなるように、比較するラベルシーケンスを伸縮（比較する相手を次に進めないで我慢する）させて比較するという処理を行う。また、何処まで伸縮（我慢）できるかを制約条件（整合窓の幅）で与えることも可能である。
【００６６】
図１６は本実施形態による２次元ＤＰマッチングを採用した類似度算出の手順を説明するフローチャートである。上記図１５を参照して説明した処理を、図１６のフローチャートを参照して更に説明する。
【００６７】
まず、ステップＳ１０１において、類似比較先画像の行番号を表す変数ｉと、類似検索元画像の行番号を表す変数ｊを１に初期化し、ともに第１行目を示すように設定する。次に、ステップＳ１０２において、類似比較先画像の第ｉ行目のラベル列を取得する。例えば図１５の場合、ｉ＝１であれば「ａｂｃ」が取得される。そして、ステップＳ１０３において、類似検索元画像の第ｊ行目のラベル列を取得する。例えば、図１５において、ｊ＝１であれば、「１２３」が取得される。ステップＳ１０３ａでは、ステップＳ１０３で取得されたラベル列がドント・ケアー・ラベル列か否かを判断し、ドント・ケアー・ラベル列であればステップＳ１０６へ、そうでなければステップＳ１０４へそれぞれ進む。
【００６８】
次にステップＳ１０４では、上記ステップＳ１０２、Ｓ１０３で得られた２つのラベル列間の距離を、図１１で説明した色セルペナルティーマトリクスを用いて、ＤＰマッチングによって求める。そして、ステップＳ１０５において、ステップＳ１０４で得た距離が、第ｉ行目に関してそれまでに得られた距離の最小値であれば、当該行番号（ｊ）をライン行列要素ＬＩＮＥ［ｉ］に記憶する。
【００６９】
以上のステップＳ１０３からステップＳ１０５の処理を、類似検索元画像の全ての行について行う（ステップＳ１０６、Ｓ１０７）。以上のようにして、類似比較先画像の第ｉ行目のラベル列に対して、類似検索元画像に含まれる行のうち最も距離の近い行の番号がＬＩＮＥ［ｉ］に格納されることになる。
【００７０】
そして、ステップＳ１０８において、上記処理で得られたＬＩＮＥ［ｉ］に対応する最小距離と所定の閾値（Ｔｈｒｅｓｈ）とを比較する。そして、この最小距離がＴｈｒｅｓｈ以上であればステップＳ１０９へ進み、いずれの行とも類似していないことを表す「！」をＬＩＮＥ［ｉ］に格納する。
【００７１】
以上説明したステップＳ１０２からＳ１０８の処理を類似比較先画像の全ての行について実行する（ステップＳ１１０、Ｓ１１１）ことにより、ＬＩＮＥ［１］〜ＬＩＮＥ［ｉｍａｘ］が得られるので、これを類似ライン行列ＬＩＮＥ［］として出力する。
【００７２】
次に、ステップＳ１１３において、標準ライン行列「１２…ｉｍａｘ」と類似ライン行列ＬＩＮＥ［］とのＤＰマッチングを行い、両者の距離を算出する。なお、ここで、標準ライン行列とは、１から始まり、列方向に１ずつ増加する行列である。
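The flow of FIG. 16 (steps S101 to S113) can be sketched in two stages: build the similar line matrix row by row, then DP-match it against the standard line matrix. The sketch makes several assumptions: the DP recurrence is the unweighted symmetric form (the source's Equation 1 is not reproduced in this text), both images have the same number of rows, and the names `dp_distance`, `similar_line_matrix`, `image_distance`, `NO_MATCH`, and the `dont_care` sentinel are hypothetical.

```python
NO_MATCH = "!"  # marks a row that is not similar to any row of the query

def dp_distance(seq_a, seq_b, penalty):
    """DP matching distance between two non-empty sequences (assumed recurrence)."""
    inf = float("inf")
    g = [[inf] * len(seq_b) for _ in seq_a]
    for i, a in enumerate(seq_a):
        for j, b in enumerate(seq_b):
            prev = 0 if i == j == 0 else min(
                g[i - 1][j] if i else inf,
                g[i - 1][j - 1] if i and j else inf,
                g[i][j - 1] if j else inf,
            )
            g[i][j] = prev + penalty(a, b)
    return g[-1][-1]

def similar_line_matrix(target, source, label_penalty, thresh, dont_care=0):
    """For each row of the comparison image, find the closest row of the query."""
    lines = []
    for target_row in target:                                    # S102
        best_row, best_dist = NO_MATCH, float("inf")
        for j, source_row in enumerate(source, start=1):         # S103
            if all(label == dont_care for label in source_row):  # S103a
                continue
            dist = dp_distance(target_row, source_row, label_penalty)  # S104
            if dist < best_dist:                                 # S105
                best_row, best_dist = j, dist
        # S108: a minimum distance at or above the threshold means "no similar row".
        lines.append(best_row if best_dist < thresh else NO_MATCH)
    return lines

def image_distance(target, source, label_penalty, line_penalty, thresh):
    """S113: DP-match the standard line matrix 1..imax against LINE[]."""
    lines = similar_line_matrix(target, source, label_penalty, thresh)
    standard = list(range(1, len(lines) + 1))
    return dp_distance(standard, lines, line_penalty)
```

As a usage note, a comparison image whose content is shifted down by one block row relative to the query yields a similar line matrix of the form `["!", 1, 2]`, matching the downward-shift case discussed in the description.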
【００７３】
ここで、標準ライン行列と類似ライン行列間のＤＰマッチングにおいて用いられるペナルティーについて説明する。列方向の類似ライン行列と標準ライン行列とのＤＰマッチングのペナルティーの設定としては２つの方法が考えられる。すなわち、動的なペナルティーと固定的なペナルティーの２つである。
【００７４】
動的なペナルティーとは、動的にライン番号間のペナルティーを設定するものであり、画像によってライン番号間のペナルティーは変化する。本実施形態では、類似検索元画像の自分自身の横（行）方向のラベル行列の距離を求め、これに基づいて各行間のペナルティーを求める。
【００７５】
図１７は本実施形態による動的なペナルティー値の設定手順を示すフローチャートである。ステップＳ１２１において、変数ｉを１に、変数ｊを２にそれぞれセットする。次に、ステップＳ１２２において、類似検索元画像の第ｉ行目のラベル列を取得し、ステップＳ１２３において、類似検索元画像の第ｊ行目のラベル列を取得する。そして、ステップＳ１２４において、類似検索元画像の第ｉ行目のラベル列と第ｊ行目のラベル列とについて、色ペナルティーマトリクスを用いてＤＰマッチングを行い、距離を獲得する。ステップＳ１２５では、ステップＳ１２４で得たＤＰマッチングの距離を、類似検索元画像のｉ行目のラベル列とｊ行目のラベル列間のペナルティーとしてＬＩＮＥ[i][j]に記憶すると共に、類似検索元画像のｊ行目のラベル列とｉ行目のラベル列間のペナルティーとしてＬＩＮＥ[j][i]に記憶する。
【００７６】
ステップＳ１２６によって、変数ｊの値がｊｍａｘとなるまで、ステップＳ１２３〜Ｓ１２５の処理が繰返される。この結果、第ｉ行目のラベル列について、ｉ＋１〜ｊｍａｘ行目の各ラベル列との間のペナルティー値が決定される。そして、ステップＳ１２８、Ｓ１２９、Ｓ１３０により、ステップＳ１２３〜Ｓ１２６の処理が変数ｉの値がｉｍａｘ−１となるまで繰返される。この結果、ＬＩＮＥ[i][j]には、ｉ＝ｊの対角成分を除く全てに、上記処理で決定されたペナルティー値が記憶されることになる。
【００７７】
次にステップＳ１３１では、上記処理で決定されていないＬＩＮＥ[i][j]の対角成分のペナルティーを決定する。この部分はｉ＝ｊであり、同一のラベル列であるから、その距離は０であり、従ってペナルティー０が記憶される。また、ステップＳ１３２では、「！」に関してペナルティーを決定する。すなわち、「！」に対するペナルティーは、ＬＩＮＥ[i][j]の全てのペナルティー値の中で、最大のペナルティー値よりもある程度大きな値を設定する。ただし、このペナルティー値を極端に大きくすると、曖昧にヒットする性質が損なわれてしまう。
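The dynamic penalty table of FIG. 17 can be sketched as follows. The DP recurrence is again the assumed unweighted symmetric form, and the factor `margin` used to make the penalty for "!" somewhat larger than the largest row-to-row penalty is a hypothetical choice; the source only says the value should be larger to some degree but not extreme.

```python
def dp_distance(seq_a, seq_b, penalty):
    """DP matching distance between two non-empty sequences (assumed recurrence)."""
    inf = float("inf")
    g = [[inf] * len(seq_b) for _ in seq_a]
    for i, a in enumerate(seq_a):
        for j, b in enumerate(seq_b):
            prev = 0 if i == j == 0 else min(
                g[i - 1][j] if i else inf,
                g[i - 1][j - 1] if i and j else inf,
                g[i][j - 1] if j else inf,
            )
            g[i][j] = prev + penalty(a, b)
    return g[-1][-1]

def dynamic_line_penalties(source, label_penalty, margin=1.5):
    """Build a line-number penalty function from the query image's own rows."""
    n = len(source)
    table = [[0] * n for _ in range(n)]          # diagonal stays 0 (S131)
    for i in range(n - 1):                       # S121 to S130
        for j in range(i + 1, n):
            dist = dp_distance(source[i], source[j], label_penalty)
            table[i][j] = table[j][i] = dist     # S125: store symmetrically
    largest = max(max(row) for row in table)
    no_match_penalty = max(largest, 1) * margin  # S132: somewhat above the maximum

    def line_penalty(line_a, line_b):
        """Penalty between two 1-based line numbers, or "!" for no similar row."""
        if line_a == "!" or line_b == "!":
            return no_match_penalty
        return table[line_a - 1][line_b - 1]

    return line_penalty
```

When the rows of the query image resemble one another, the table entries are small, so a similar line matrix such as "111" is penalised only lightly against the standard line matrix.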
【００７８】
以上のようにして類似検索元画像に関して計算されたラベル列間のペナルティーを用いて、上記ステップＳ１１３におけるＤＰマッチングを行い、類似検索元画像と類似比較先画像の類似度を獲得する。
【００７９】
一方、固定的なペナルティーでは、ＤＰマッチングのペナルティーとして、ラベルが一致すればペナルティー「０」を、一致しない場合、もしくは「！」との比較であった場合にはある程度の大きさのペナルティーを与える。この場合は類似検索元画像によらず、常に同じペナルティーとなる。このような固定的なペナルティーを用いてステップＳ１１３の処理を実行し、類似検索元画像と類似比較先画像の類似度を決定する。
【００８０】
以上説明したマッチング処理は次のような特徴を有する。もし、図１５の（ａ）と（ｂ）が極めて類似していれば、類似ライン行列は「１２３」となり、その距離は０となる。また、類似ライン行列が「！１２」や「２１２」であれば、類似検索元画像に対して類似比較先画像は下方向へずれたものである可能性があるし、類似ライン行列が「２３！」や「２３３」であれば類似検索元画像に対して類似比較先画像が上方向へずれたものである可能性がある。また、類似ライン行列が「１３！」や「！１３」であれば、類似検索元画像に対して類似比較先画像が縮小したものである可能性がある。同様に、類似比較先画像が類似検索元画像を拡大したようなものである場合も検出可能である。
【００８１】
上述のステップＳ１１３で説明したように、類似ライン行列と標準ライン行列との間でＤＰマッチングを行うことにより、垂直方向へのずれが効果的に吸収される。このため、上述したような、上方向や下方向へのずれ、拡大、縮小等に起因する類似検索元画像と類似比較先画像との相違が効果的に吸収され、良好な類似検索を行うことが可能となる。
【００８２】
すなわち、本実施形態の２次元ＤＰマッチングは、ラベル行列の各ラベル列における前後の曖昧さを許容するマッチングであり、画像の位置ずれの影響を吸収する性質を有する。また、アングルの違い等により物体の位置が変わり、ブロックによって切りとられる物体の位置が変わることにより、ブロックの色合いも微妙に異なることが予想されるが、この違いは上述のペナルティーマトリクスにより吸収されることになる。このように、本実施形態の２次元ＤＰマッチングによる曖昧さを許容するマッチングと、ペナルティーマトリクスによる特徴量の曖昧さの許容との相乗効果によって、上下左右拡大、縮小のずれに対しても影響の少ないマッチングを可能としている。
【００８３】
ただし、動的なペナルティーと固定的なペナルティーとでは、動的なペナルティーを用いる方が好ましい。例えば、一面麦畑の類似検索元画像があったとした場合、どのラインも似たようなラベル列となることが考えられる。一方、類似比較先画像にも一面麦畑の画像があったとした場合に、この画像の類似ライン行列には全て最初のライン番号１が入り、「１１１」となってしまう可能性がある。この場合、類似検索元画像のどのラインも似たような画像となっており、ライン番号間のペナルティーが極めて小さくなければ低い距離でのヒットはしない。しかしながら、動的なペナルティーを用いた場合は、ライン番号間のペナルティーが低くなり、類似度の高い結果を得ることができる。
【００８４】
一方、固定的なペナルティーを用いると、「１２３」と「１１１」ではペナルティー値が大きくなり、類似度が低くなってしまう。
【００８５】
以上のようにして、ＤＰマッチングを水平・鉛直方向、すなわち２次元に行うことにより、水平や鉛直方向、更に斜め方向に画像アングルが変わったり、物体が移動していても検索を行うことが可能である。また、ＤＰマッチングの時系列伸縮特性により、ズームアップ撮影画像やマクロ撮影画像をも検索することが可能となる。
【００８６】
なお、上記実施形態では、水平方向のブロック並びに対応するラベル列を用いて類似ライン行列を得たが、垂直方向のブロック並びに対応するラベル列を用いて類似ライン行列を得るようにすることも、上記と同様の手法で実現可能である。
【００８７】
以上説明したように、本実施形態によれば、特徴量群（特徴量空間を分割して得られる特徴量のグループ）を１つのシンボルで表現し（すなわちラベル化し）、ラベル同士の類似度に基づく距離を上述の２次元ＤＰマッチングとペナルティーマトリクスによって与える。このため、２つの画像のブロック間の距離の計算量を大幅に減少させることができるとともに、類似した特徴量が同じラベルで表されることになるので、類似画像の検索を良好に行うことができる。
【００８８】
また、（１）ペナルティマトリクスによるラベル同士の距離概念を導入し、（２）比較するラベル位置を前後曖昧に移動させることが出来、トータルの距離が最小（類似度が最大）となるようなラベル行列の比較を実現する上記２次元ＤＰマッチングを導入したことにより、画像のアングルが多少変わっても検索することが可能となり、雰囲気の似ている画像を検索できるようになる。
【００８９】
更に上記実施形態によれば、インデックスデータベース（ラベル成分インデックス）を用いたことにより、画像検索が更に高速化する。
【００９０】
すなわち、上記実施形態によれば、画像の特徴量の配置を考慮した類似画像の高速な検索が行われるとともに、撮影条件の変動等による違いを吸収した類似画像の検索が可能となり、従来難しかった画像のアングルが変ったり、物体の位置が変ったり、あるいは他の撮影条件が変動したりすることによる画像の特徴量のある程度の違いを吸収するなど、ロバストな類似画像検索を行うことが可能となる。
【００９１】
なお、上記各実施形態においては、自然画像検索を行う例を説明したが、本発明はＣＧやＣＡＤ等の人工的な画像の検索にも適応可能な技術であることは当業者には明らかである。
【００９２】
また、上記各実施形態では画像特徴量として色情報を選んだが、本発明はこれに限られるものではなく、その他の画像パラメータを画像分割ブロックごとに求めることで実施することも可能である。
【００９３】
また、上記実施形態では１つの特徴量での認識の例を挙げたが、その他の特徴量での検索結果との論理演算を行うことにより、複数の特徴量からの高速な検索を行うことも可能である。
【００９４】
１つの画像に対して複数の画像特徴量を用いた検索を行う場合には、本発明で得られる類似度を１つの新たなる画像特徴量とみなし、複数のパラメータを用いた多変量解析を行い統計的な距離尺度を用いた検索を行うことも可能である。また、上記実施形態では、類似度が所定値を越える類似画像を検索結果として得るが、類似度の高い画像から順に前もって指定された個数の画像を検索結果として出力するようにしてもよいことはいうまでもない。
【００９５】
なお、曖昧度を指定することにより、ＤＰマッチングにおけるいわゆる整合窓の幅を変更することにより、検索の曖昧度を所望に設定可能とすることもできる。図１８はＤＰマッチングにおける整合窓を説明する図である。図１８において直線ＡはＪ＝Ｉ＋ｒで表され、直線ＢはＪ＝Ｉ−ｒで表される。整合窓の幅はｒの値を変更することで行える。したがって、キーボード１０４から曖昧度を指定することにより、このｒの値が変更されるように構成すれば、ユーザの所望の曖昧度（整合窓の幅）で類似度検索を行えるようになる。
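The matching window bounded by the lines J = I + r and J = I − r can be added to a DP matcher by skipping lattice points outside the band. The sketch below assumes the unweighted symmetric recurrence (the source's Equation 1 is not reproduced in this text); the function name is hypothetical, and `r` is the window half-width that the user's fuzziness setting would control.

```python
def dp_distance_windowed(seq_a, seq_b, penalty, r):
    """DP matching restricted to lattice points with |i - j| <= r.

    Returns float("inf") when no path from the first to the last lattice point
    fits inside the window (for example, when the lengths differ by more than r).
    """
    inf = float("inf")
    g = [[inf] * len(seq_b) for _ in seq_a]
    for i, a in enumerate(seq_a):
        for j, b in enumerate(seq_b):
            if abs(i - j) > r:
                continue  # outside the matching window
            if i == 0 and j == 0:
                best_previous = 0
            else:
                best_previous = min(
                    g[i - 1][j] if i > 0 else inf,
                    g[i - 1][j - 1] if i > 0 and j > 0 else inf,
                    g[i][j - 1] if j > 0 else inf,
                )
            g[i][j] = best_previous + penalty(a, b)
    return g[-1][-1]
```

A wider window lets the matcher stretch one sequence further against the other, so the distance can only stay the same or decrease as `r` grows.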
【００９６】
なお、上記実施形態のような２次元ＤＰマッチングにおいては、水平方向のＤＰマッチングにおける整合窓の幅と、垂直方向のＤＰマッチングにおける整合窓の幅とを別々に設定できるようにしてもよい。或いは、両整合窓が異なる変化率で変化するように構成してもよい。このようにすれば、ユーザは、類似画像検索に際してよりきめ細かい曖昧さの設定を行えるようになる。例えば、図８で示されたようなブロック順序を用いた場合において、検索元の画像中における注目物体の横方向への移動を許容したいような場合や、検索元の画像が横長画像であるような場合には、横方向への曖昧度を大きくするために水平方向のＤＰマッチングにおける整合窓の幅をより大きくすればよい。
【００９７】
なお、ブロック化できない１つの画像に対して１つのパラメータを加味した類似検索の場合には、本発明で得られる類似度（ペナルティの総和を用いて作る）を１つの新たなる特徴量として、統計的な距離尺度に基づく検索を行うことも可能である。また、上記実施形態では、類似度が所定値を越える類似画像を検索結果として得るが、類似度の高い画像から順に前もって指定された個数の画像を検索結果として出力するようにしてもよいことはいうまでもない。
【００９８】
更に、ＤＰマッチング処理の代わりにファジー非決定性オートマトン等の、比較するラベル位置を前後曖昧に移動させることが出来、トータルの距離が最小（類似度が最大）となるようなラベル列の比較を実現する手法を導入することも可能であり、これにより画像のアングルが多少変わっても検索することが可能となり、雰囲気の似ている画像を検索できるようになる。
【００９９】
更に上記実施形態によれば、インデックスデータベース（ラベル成分インデックス（図４））を用いたことにより、画像検索が更に高速化する。
【０１００】
以上のように、本実施形態によれば、キーワードの付いていない画像を検索するのに好適な画像検索装置、方法が提供される。
【０１０１】
画像認識技術への壁が厚い現在、自分の欲しい画像に近い縮小画を見つけ、その画像に類似する画像を検索する手段を提供する事により、縮小画提示、類似画像検索を繰り返す事により、高い確率で検索者の欲しい画像を得る手段が考えられる。
【０１０２】
その際、本方式では、従来難しかった画像のアングルが変わったり、物体の位置が変わったり、或いは撮影条件による画像特徴量のある程度の違い等を吸収するなどロバストな類似画像検索を高速に行うことが可能となる。また、従来の類似画像検索の弱点であった背景による検索の影響を受けない、着目物体を重視した検索も可能となった。
【０１０３】
なお、本発明は、複数の機器（例えばホストコンピュータ，インタフェイス機器，リーダ，プリンタなど）から構成されるシステムに適用しても、一つの機器からなる装置（例えば、複写機，ファクシミリ装置など）に適用してもよい。
【０１０４】
また、本発明の目的は、前述した実施形態の機能を実現するソフトウェアのプログラムコードを記録した記憶媒体を、システムあるいは装置に供給し、そのシステムあるいは装置のコンピュータ（またはＣＰＵやＭＰＵ）が記憶媒体に格納されたプログラムコードを読出し実行することによっても、達成されることは言うまでもない。
【０１０５】
この場合、記憶媒体から読出されたプログラムコード自体が前述した実施形態の機能を実現することになり、そのプログラムコードを記憶した記憶媒体は本発明を構成することになる。
【０１０６】
プログラムコードを供給するための記憶媒体としては、例えば、フロッピディスク，ハードディスク，光ディスク，光磁気ディスク，ＣＤ−ＲＯＭ，ＣＤ−Ｒ，磁気テープ，不揮発性のメモリカード，ＲＯＭなどを用いることができる。
【０１０７】
また、コンピュータが読出したプログラムコードを実行することにより、前述した実施形態の機能が実現されるだけでなく、そのプログラムコードの指示に基づき、コンピュータ上で稼働しているＯＳ（オペレーティングシステム）などが実際の処理の一部または全部を行い、その処理によって前述した実施形態の機能が実現される場合も含まれることは言うまでもない。
【０１０８】
さらに、記憶媒体から読出されたプログラムコードが、コンピュータに挿入された機能拡張ボードやコンピュータに接続された機能拡張ユニットに備わるメモリに書込まれた後、そのプログラムコードの指示に基づき、その機能拡張ボードや機能拡張ユニットに備わるＣＰＵなどが実際の処理の一部または全部を行い、その処理によって前述した実施形態の機能が実現される場合も含まれることは言うまでもない。
【０１０９】
【発明の効果】
以上説明したように、本発明によれば、画像の特徴量の配置を考慮した高速な類似画像の検索が可能となる。
【０１１０】
また、本発明によれば、画像の特徴量の配置を考慮した類似画像の検索を行うとともに撮影条件の変動等による違いを吸収した類似画像検索が可能となる。
【０１１１】
また、本発明によれば、特徴量群を１つのラベルで表し、画像をラベル行列で表現して画像間の類似度を算出することにより類似度の計算量を減少させるので、迅速な類似画像検索が可能となる。
【０１１２】
また、本発明によれば、ラベル行列を適切に管理し、ラベルを用いた画像検索処理の処理速度が著しく向上する。
【０１１３】
また、本発明によれば、元画像と比較先画像の類似度をラベル列もしくはラベル行列の比較によって行う際に、ＤＰマッチングやファジー非決定性オートマトン等のラベル位置の前後の曖昧さを許す手法を適用し、より効果的な類似画像検索が可能となる。
【０１１４】
また、本発明によれば、画像中のある物体（一部分）に着目した類似画像検索が可能となる。
【０１１５】
【図面の簡単な説明】
【図１】本実施形態による画像検索装置の制御構成を示すブロック図である。
【図２】本実施形態の画像検索装置の機能構成を示すブロック図である。
【図３】画像管理データベースのデータ構成例を示す図である。
【図４】ラベル成分インデックスのデータ構成例を示す図である。
【図５】本実施形態による画像登録処理の手順を表すフローチャートである。
【図６】本実施形態による画像のブロック分割例を示す図である。
【図７】本実施形態による多次元特徴量空間を説明する図である。
【図８】ラベル列を生成する際のブロック順序例を説明する図である。
【図９】本実施形態による画像検索の手順を説明するフローチャートである。
【図１０】本実施形態による、特徴部分ラベル行列の抽出手順を説明するフローチャートである。
【図１１】ラベル列を比較し類似度を求める際に用いるラベル間のペナルティマトリックスの一例を示す図である。
【図１２】類似検索元画像のラベル列と類似検索先画像のラベル列の一例を示す図である。
【図１３】一次元のＤＰマッチングを説明する図である。
【図１４】ＤＰマッチングの傾斜制限を説明する図である。
【図１５】本実施形態による類似度算出処理を説明する図である。
【図１６】本実施形態による２次元ＤＰマッチングを採用した類似度算出の手順を説明するフローチャートである。
【図１７】本実施形態による動的なペナルティー値の設定手順を示すフローチャートである。
【図１８】ＤＰマッチングにおける整合窓を説明する図である。
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to an image search apparatus and method for searching for an image.
[0002]
[Prior art]
Various techniques for searching for similar images have been proposed. Among the techniques that have reached some degree of practical use for similar image search on natural images, many use color information as the image feature amount. Most of these take a histogram of the color information and search using the RGB proportions or the combination of colors that occur most frequently in the image.
[0003]
However, the above method loses the positional information of the colors, so its search accuracy is not necessarily high. Japanese Patent Laid-Open No. 8-249349, for example, discloses pattern matching in which an image is divided into a plurality of blocks and the feature amount (representative color) of each block is used. With this method, however, the distance between the feature amounts of the blocks must be calculated for the two images being matched, which requires an enormous amount of calculation. In particular, when a representative color is used as the feature amount, three values (R, G, and B) must be handled, which complicates the calculation further. In addition, because the comparison uses the feature amounts themselves, the comparison is precise, but a similar image can no longer be found once the angle of the image or the position of an object changes. In other words, it has not been possible to perform a so-called robust similar image search, that is, a search that tolerates a certain degree of ambiguity (absorbing changes in the image angle, changes in object position, and moderate differences in image feature amounts due to shooting conditions) while still retrieving images appropriately.
[0004]
In addition, when a search should focus on a certain object (a part) in an image, there are inconvenient cases; for example, the search fails if the rest of the image (the background image) is different.
[0005]
[Problems to be solved by the invention]
As a general image search system for natural images, it is known to assign keywords to images and to search for images using those keywords. However, assigning keywords is labor-intensive work, and for images to which no keyword has been assigned, reduced images must be presented and selected manually, which makes the search operation cumbersome.
[0006]
The present invention has been made in view of the above-described problems, and an object of the present invention is to provide an image search apparatus and method capable of searching for similar images at high speed in consideration of the arrangement of image feature amounts.
[0007]
Another object of the present invention is to provide an image search apparatus and method that search for similar images in consideration of the arrangement of image feature amounts while absorbing differences caused by changes in shooting conditions and the like.
[0008]
Another object of the present invention is to reduce the amount of calculation of similarity by representing a group of feature amounts with one label, expressing an image as a label matrix, and calculating the similarity between images from the label matrices, thereby enabling a quick similar image search.
[0009]
Another object of the present invention is to appropriately manage the label matrix and to significantly improve the processing speed of image search processing using labels.
[0010]
Another object of the present invention is to enable a more effective similar image search by applying a technique that tolerates ambiguity in the position of labels, such as DP matching or a fuzzy nondeterministic automaton, when the similarity between a source image and a comparison destination image is determined by comparing label sequences or label matrices.
[0011]
Another object of the present invention is to enable a similar image search focusing on a certain object (part) in an image.
[0012]
[Means for Solving the Problems]
  In order to achieve the above object, an image search apparatus according to one aspect of the present invention has, for example, the following configuration. That is, the apparatus comprises:
  first generation means for dividing image data into a plurality of blocks, assigning a label to each block according to the feature amount acquired for that block, and generating a label matrix by arranging the assigned labels in a predetermined block order;
  storage means for storing the label matrix generated by the first generation means in association with the image data;
  holding means for extracting a partial label matrix from the generated label matrix and holding a table from which image data can be searched using the extracted partial label matrix as a key;
  first search means for extracting a partial label matrix from image data of a search source and searching for similar images using the table;
  second search means for taking each similar image found by the first search means as a comparison destination image, performing a matching process based on inter-label distances between the label matrix of the comparison destination image obtained from the storage means and the label matrix obtained from the image data of the search source to obtain a similarity, and obtaining a search result based on the obtained similarity;
  second generation means for, when a partial image constituting a part of an image is designated as the search source image, generating the label matrix of the search source image by assigning labels corresponding to the feature amounts to the blocks that contain partial image data representing the partial image and assigning to the other blocks a label indicating that its distance to any label is zero; and
  third generation means for generating a partial label matrix based on the labels assigned to the blocks that contain the image data of the partial image,
  wherein the first and second search means search for an image using the label matrix and the partial label matrix generated by the second generation means and the third generation means.
[0013]
  An image search apparatus according to another aspect of the present invention for achieving the above object has the following configuration. That is,
  First generation means for dividing the image data into a plurality of blocks, assigning labels according to the feature amounts acquired for each block, and generating a label matrix by arranging the assigned labels in a predetermined block order;
  Storage means for storing the label matrix generated by the first generation means in association with the image data;
  A holding means for extracting a partial label matrix from the generated label matrix and holding a table from which the image data can be searched using the extracted partial label matrix as a key;
  A first search means for extracting a partial label matrix from image data of a search source and searching for a similar image using the table;
  Using each similar image searched by the first search means as a comparison destination image, a label between the label matrix of the comparison destination image obtained from the storage means and the label matrix obtained from the image data of the search source A second search means for performing a matching process based on a distance to obtain a similarity, and obtaining a search result based on the obtained similarity;
  The label matrix represents a two-dimensional label matrix;
  The second search means includes:
  first matching means for obtaining a row arrangement of the comparison destination image data by associating, through DP matching, the row-wise label sequences extracted from the label matrix of the image data of the search source with the row-wise label sequences extracted from the label matrix of the comparison destination image data obtained by the first search means; and
  second matching means for obtaining, by DP matching, the similarity between the row arrangement of the label matrix of the source image data and the row arrangement obtained by the first matching means.
[0015]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, preferred embodiments of the present invention will be described with reference to the accompanying drawings.
[0016]
FIG. 1 is a block diagram showing a control configuration of the image search apparatus according to the present embodiment. In the figure, reference numeral 101 denotes a CPU which executes various controls in the image search apparatus of this embodiment. Reference numeral 102 denotes a ROM which stores a boot program executed when the apparatus is started up and various data. Reference numeral 103 denotes a RAM which stores a control program for processing by the CPU 101 and provides a work area when the CPU 101 executes various controls. A keyboard 104 and a mouse 105 provide various input operation environments for the user.
[0017]
An external storage device 106 includes a hard disk, a floppy disk, a CD-ROM, and the like. Reference numeral 107 denotes a network interface, which enables communication with each device on the network. Reference numeral 109 denotes an interface, and 110 denotes a scanner for reading an image. Reference numeral 111 denotes a bus connecting the above-described components. It should be noted that a control program that realizes processing shown in each flowchart described below may be stored in the ROM 102 or may be loaded into the RAM 103 from the external storage device 106.
[0018]
In the above configuration, the scanner 110 and the external storage device 106 may be replaced with those arranged on the network.
[0019]
FIG. 2 is a block diagram showing a functional configuration of the image search apparatus according to the present embodiment. In the figure, reference numeral 11 denotes a user interface unit, which detects various operation inputs from the user using the display 107, the keyboard 104, and the mouse 105. An image input unit 12 reads an image by the scanner 110. An image memory 13 stores image data obtained by the image input unit 12 in a predetermined area of the RAM 103. Reference numeral 14 denotes an image feature amount extraction unit, which extracts a feature amount of an image stored in the image memory 13 by a procedure described later. Reference numeral 15 denotes a feature amount label matrix forming unit, which generates a label matrix based on the feature amount obtained by the image feature amount extracting unit 14. Reference numeral 16 denotes a pattern matching unit, which calculates the similarity between the label matrix of the designated image and the label matrix of the image stored in the image storage unit 17 and searches for similar images. An image storage unit 17 stores image data obtained by the image input unit 12 or the like.
[0020]
Reference numeral 18 denotes an image management database (hereinafter referred to as an image management DB), which manages the image data stored in the image storage unit 17 in the data format shown in FIG. 3. Reference numeral 19 denotes a label component index, which stores the label component index file shown in FIG. 4. The use of the label component index will be described later with reference to a flowchart.
[0021]
An operation example of the image search apparatus of the present embodiment having the above configuration will be described below. In the following example, three colors of red (R), green (G), and blue (B) are adopted as image feature amounts focused on color, and description will be made using processing in a three-dimensional color space.
[0022]
[Image registration process]
First, processing performed at the time of image registration will be described. FIG. 5 is a flowchart showing the procedure of image registration processing according to this embodiment. First, in step S11, an image is read using the image input unit 12 in accordance with a user instruction via the user interface unit 11 and held in the image memory 13. Next, in step S12, this image is divided into a plurality of blocks. In this embodiment, an image is divided vertically and horizontally into a plurality of blocks. FIG. 6 is a diagram showing an example of block division of an image according to the present embodiment. As shown in the figure, in this embodiment, it is assumed for the sake of explanation that an image is divided into 3 × 3 blocks, 9 in total. Next, in step S13, the feature amount of each divided block is calculated, and the obtained feature amount is labeled according to the following procedure.
[0023]
Note that the 3 × 3 division used in the present embodiment is for illustrative purposes only. In practice, it is preferable to set the number of divisions to 10 × 10 or more for natural images. For an image such as a product shown against a plain white background, the number of divisions is preferably 13 × 13 or more.
[0024]
FIG. 7 is a diagram for explaining a multidimensional feature amount space according to the present embodiment. As shown in FIG. 7, the multidimensional feature amount space (RGB color space) is divided into a plurality of blocks (color blocks), that is, cells (color cells), and a unique label is given to each cell (color cell) as a serial number. Here, the reason why the multidimensional feature amount space (RGB color space) is divided into a plurality of blocks is to absorb subtle differences in feature amount (color).
[0025]
For the multidimensional feature amount space, instead of using the image feature amounts as they are, it is conceivable to obtain the average and variance of each parameter by experiment, standardize (normalize) the parameters, then perform an orthogonal transformation such as principal component analysis, and use only the significant dimensions. A "significant dimension" is a dimension composed of a principal component axis having a large contribution rate in the principal component analysis.
[0026]
In step S13, a predetermined image feature amount calculation process is performed on each divided block obtained in step S12 to determine which cell in the multidimensional feature amount space the block belongs to, and the corresponding label is obtained. That is, for each divided image block, calculation processing is performed to determine which color cell each of its pixels belongs to, and the label of the color cell with the highest frequency is determined as the parameter label (color label) of that divided image block. This processing is performed for all blocks.
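The labeling of step S13 can be sketched as follows. This is only an illustration under stated assumptions: the number of cells per RGB axis (`CELLS_PER_AXIS`), the function names, and the representation of an image as a list of rows of (r, g, b) tuples are all hypothetical and are not taken from the embodiment.

```python
from collections import Counter

CELLS_PER_AXIS = 3  # hypothetical: 3 cells per RGB axis gives 27 color cells


def color_cell_label(r, g, b, cells=CELLS_PER_AXIS):
    """Serial-number label (starting at 1) of the color cell of an 8-bit RGB pixel."""
    ri, gi, bi = r * cells // 256, g * cells // 256, b * cells // 256
    return 1 + ri * cells * cells + gi * cells + bi


def block_label(pixels, cells=CELLS_PER_AXIS):
    """Label of the color cell to which the largest number of pixels belongs."""
    counts = Counter(color_cell_label(r, g, b, cells) for r, g, b in pixels)
    return counts.most_common(1)[0][0]


def label_blocks(image, rows, cols, cells=CELLS_PER_AXIS):
    """Divide an image (list of rows of (r, g, b)) into rows x cols blocks and label each."""
    height, width = len(image), len(image[0])
    labels = []
    for br in range(rows):
        row_labels = []
        for bc in range(cols):
            pixels = [image[y][x]
                      for y in range(br * height // rows, (br + 1) * height // rows)
                      for x in range(bc * width // cols, (bc + 1) * width // cols)]
            row_labels.append(block_label(pixels, cells))
        labels.append(row_labels)
    return labels
```

Because nearby colors fall into the same cell, two blocks whose colors differ only slightly receive the same label, which is the absorbing effect described for the cell division.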
[0027]
When parameter labels have been assigned to the respective blocks as described above, in step S14, the parameter labels assigned to the respective blocks are arranged in a predetermined block order, thereby generating a parameter label matrix (hereinafter referred to as a label matrix).
[0028]
FIG. 8 is a diagram illustrating an example of a block order when generating a label matrix. The above-mentioned parameter labels are arranged according to the numbers in the divided image blocks shown in the figure. When a label matrix is stored in the image management DB 18 or the label component index 19, the two-dimensional label matrix is stored in a one-dimensional form arranged in the predetermined order described above. In this embodiment, such a one-dimensional form is also referred to as a label matrix.
[0029]
Here, in FIG. 8A, the divided blocks are scanned in the horizontal direction from left to right, and this horizontal scanning is repeated from top to bottom. The scanning methods applicable to this embodiment include:
・ Horizontal direction (four scanning methods: in addition to scanning from left to right, row by row from top to bottom, as shown in FIG. 8A, there are variations such as scanning from right to left or row by row from bottom to top, as shown in FIGS. 8B to 8D)
・ Vertical direction (four scanning methods are possible, such as scanning from top to bottom, column by column from left to right)
・ Scanning divided into even lines and odd lines, as shown in FIG. 8E.
[0030]
In this embodiment, the scanning method shown in FIG. 8A is adopted, but it is obvious that the other scanning methods described above can be applied.
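As a small illustration of arranging a two-dimensional label matrix in a predetermined block order, the following sketch covers only the horizontal order described for FIG. 8A and one vertical order. The function name and the `order` keywords are hypothetical.

```python
def flatten_labels(matrix, order="horizontal"):
    """Arrange a 2-D label matrix into a 1-D sequence in a fixed block order.

    "horizontal": each row left to right, rows taken top to bottom
                  (the order described for FIG. 8A).
    "vertical":   each column top to bottom, columns taken left to right.
    """
    if order == "horizontal":
        return [label for row in matrix for label in row]
    if order == "vertical":
        return [row[c] for c in range(len(matrix[0])) for row in matrix]
    raise ValueError(f"unknown scan order: {order}")
```

Whichever order is chosen, the same order has to be used for registration and for search so that positions in the two sequences correspond.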
[0031]
In step S15, the label sequence and image data obtained as described above are reflected in the image storage unit 17 and the image management DB 18. That is, an image ID is acquired for the image data read in step S11, and these are paired and stored in the image storage unit 17. Further, the image management DB record shown in FIG. 3 is generated in association with the image ID, and this is registered in the image management DB 18.
[0032]
In step S16, partial label matrices are acquired from the label matrix of the image, and the label component index file shown in FIG. 4 is updated. The label component index file uses a partial label matrix (which may be a single label) as a search key, and holds records that store the group of image IDs containing that partial label matrix together with the number of times it is contained in each image (the content count). With this label component index, giving a partial label matrix (which may be a single label) yields, at high speed, the group of image IDs having that partial label matrix and its content count in each image. At the stage of image registration, the information is stored without further processing. In addition, all labels constituting the label matrix are registered in the label component index. For example, histogram information of the labels is obtained, and all labels that appear are reflected in the label component index. When a label component index keyed by single labels is used, the data records whose keys are the labels appearing in the histogram information are updated.
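The single-label case of the label component index can be pictured as an inverted index from a label to the image IDs that contain it, with the content count for each image. The sketch below is an assumption about one possible in-memory layout, not the file format of FIG. 4; the names `update_label_index` and `index` are hypothetical, and the two label sequences are the ones quoted later for FIG. 12.

```python
from collections import Counter, defaultdict


def update_label_index(index, image_id, label_sequence):
    """Record, for every label appearing in the image, how many times it appears."""
    for label, count in Counter(label_sequence).items():
        index[label][image_id] = count


# label -> {image_id: content count}
index = defaultdict(dict)
update_label_index(index, 1, [1, 1, 2, 3, 1, 3, 4, 4, 1])
update_label_index(index, 2, [1, 1, 3, 2, 2, 4, 4, 4, 2])
```

Looking up a label then returns both the image ID group and the content counts in a single step, which is what the pre-search relies on.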
[0033]
Note that the don't care label described later is not used in the image registration process. That is, the don't care label is not registered in the label component index. The above is the process performed at the time of image registration.
[0034]
[Image search processing]
Next, similar image search processing according to the present embodiment will be described with reference to the flowcharts of FIGS. 9 and 10. The search process is roughly divided into different processes depending on whether the search is a similar image search of the entire image or a search focusing on an object. In the following, description will be given in order.
[0035]
FIG. 9 is a flowchart for explaining the image search procedure according to this embodiment. First, in step S21, it is specified whether the search is a similar image search of the entire image or a search focusing on a desired object (or partial area). When the search method is designated, in step S22, a label matrix and a feature part label matrix suitable for the designated search method are acquired. In step S22, the process differs depending on the designated search method. Hereinafter, the procedure of the process in step S22 will be described in detail with reference to the flowchart of FIG.
[0036]
FIG. 10 is a flowchart for explaining a feature part label matrix extraction procedure according to this embodiment.
[0037]
First, in step S41, the designated search method is determined. If the designated search method is “similar search of entire image”, the process proceeds to step S42. In step S42, the searcher selects an image for which a similar search is to be performed on the entire image. That is, a similar search source image is designated. In step S43, a label matrix corresponding to the designated similar search source image is acquired from the image DB 18. In step S44, a feature part label matrix is acquired.
[0038]
Note that there are various means for obtaining the feature part label matrix, such as learning through experience and experiment, and statistical methods, and there is no particular limitation here. For example, a characteristic partial label matrix (which may be a single label) is obtained by performing histogram analysis on the label matrix.
[0039]
On the other hand, if it is determined in step S41 that the similarity search for the target object has been designated, the process proceeds to step S45 to determine an image of the target object as a search source. As a method of determining the object of interest, for example, the following method can be cited.
[0040]
(1) An object of interest in an image is indicated by a pointing device, the object of interest is separated from a background image by image processing, and the object image of interest is created by removing the background image. As an example, an existing image processing technique for extracting a target object from coordinates specified by a user in consideration of an edge or color change is used. In this case, if there is an error in the extracted content of the target object, the target object can be accurately extracted by correcting the extracted contour line with the pointing device.
[0041]
(2) A target object image is selected from an object image library. Examples of the object image library include object images that are installed in advance in the system itself and can be searched by keyword, and object images obtained by keyword search on a server via a network such as the Internet.
[0042]
(3) Draw the image of the object of interest using the drawing software, or change the color of the object of interest image obtained in (1) or (2) and use it as the object of interest image.
[0043]
Next, proceeding to step S46, the extracted object image of interest is pasted on an image frame in consideration of its size, position, and direction, and a similar search source image is created. In step S47, the similar search source image is divided into blocks, and labeling is performed by applying the same image feature extraction processing as at registration (for example, representative color extraction) only to the divided blocks that include the target object image obtained by any one of the methods (1) to (3). Further, in step S48, a don't care label with a penalty of 0 is assigned to every divided block that does not include the target object image, and the label matrix of the similar search source image is generated.
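Steps S47 and S48 can be sketched as a masking operation over the block labels. The marker `DONT_CARE`, the function name, and the boolean `object_mask` (one flag per block saying whether the block includes the target object image) are hypothetical names introduced for this illustration.

```python
DONT_CARE = "*"  # hypothetical marker for the don't care label


def source_label_matrix(block_labels, object_mask):
    """Keep labels only for blocks that include the object; others become don't care."""
    return [[label if has_object else DONT_CARE
             for label, has_object in zip(label_row, mask_row)]
            for label_row, mask_row in zip(block_labels, object_mask)]
```

Since the don't care label carries a penalty of 0 against every label, the background blocks of a comparison image do not influence the distance.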
[0044]
In the image obtained by the above methods (1) to (3), the image division method may be changed according to the ratio of the area occupied by the object image of interest and the area of the background region. For example, when the area of the target object image is equal to or less than a predetermined ratio, a predetermined number of divided blocks are obtained for a rectangular area circumscribing the target object image, and label allocation is performed. Further, if the area of the object image of interest exceeds a predetermined ratio, a divided block is obtained for the above-described image frame.
[0045]
In step S49, the above-described histogram processing is performed on the labels that remain after removing the don't care label from the label matrix of the similar search source image obtained in step S48, thereby obtaining a feature part label matrix. For example, a characteristic partial label matrix is acquired by performing histogram analysis on the label matrix obtained only from the group of divided image blocks that include the object of interest. In the histogram analysis, weighted histogram processing may be performed in consideration of the prior existence probability of color or saturation in the color space. A plurality of characteristic partial label matrices may also exist in the similarity search using the object of interest.
[0046]
When the label matrix and the feature part label matrix of the similar search source image suited to the search method have been acquired as described above, the process proceeds to step S23 in FIG. 9. In step S23, the upper limit value and the lower limit value of the content count, in the label component index 19, of the feature part label matrix obtained in step S22 are determined. The range of the content count defined by the upper limit value and the lower limit value is used for narrowing down (pre-searching) the images to be subjected to matching processing, which is performed using the label component index 19.
[0047]
There are various means for determining the upper and lower limits of the content count of the feature part label matrix, such as learning through experience and experiment, and statistical methods, and there is no particular limitation here. As an example, the limits are specified as ratios to the number n0 of feature part label matrices contained in the search source image. For example, an ambiguity parameter fuzziness with a value of 1 or more is introduced (the larger the value, the more ambiguous the search); the upper limit is set to n0 × fuzziness, proportional to the ambiguity, and the lower limit is set to n0 ÷ fuzziness, inversely proportional to it.
[0048]
In this way, the ambiguity of the pre-search can be adjusted by setting the ambiguity parameter fuzziness according to the degree of ambiguity desired, for example when searching for enlarged or reduced images or for slightly different images.
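The example given for the upper and lower limits amounts to the following arithmetic. The function name is hypothetical; n0 and fuzziness are as defined above.

```python
def content_count_range(n0, fuzziness):
    """Return (lower, upper) limits of the content count for the pre-search."""
    if fuzziness < 1:
        raise ValueError("fuzziness must be 1 or more")
    return n0 / fuzziness, n0 * fuzziness
```

With fuzziness equal to 1 the range collapses to exactly n0, and larger values widen the range in both directions.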
[0049]
Next, in step S24, the label component index file is referenced, and an image ID group including the number of feature portion label matrices within the range of the upper limit value and the lower limit value determined in step S23 is obtained. As shown in FIG. 4, the label component index is registered with the image ID of the image including the partial label matrix and the number of its contents, using the partial label matrix as a key. Therefore, a search can be performed using the feature part label matrix obtained in step S44 or step S49, and an image ID including the feature part label matrix in a desired content number range can be obtained easily and at high speed.
[0050]
Narrowing is performed until the number of image IDs acquired using the feature part label matrices is equal to or less than a predetermined target number. For example, image ID groups are acquired one after another by the above-described method using feature part label matrices selected in order from the top of the histogram, and the logical product of the acquired image ID groups is taken, thereby narrowing down to the target number of image IDs. In the histogram analysis, weighted histogram processing may be performed in consideration of the prior existence probability of color or saturation in the color space. A plurality of characteristic partial label matrices may exist.
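The narrowing of step S24 can be sketched as repeated set intersection. This assumes a single-label index held as a dictionary of the form label -> {image_id: content count}; the function name, the argument layout, and the early exit once the target number is reached are illustrative assumptions.

```python
def presearch(index, feature_labels, fuzziness, target_count):
    """Narrow candidate image IDs using the label component index.

    feature_labels: list of (label, n0) pairs in descending histogram order,
    where n0 is the content count of the label in the search source image.
    """
    candidates = None
    for label, n0 in feature_labels:
        lower, upper = n0 / fuzziness, n0 * fuzziness
        hits = {image_id
                for image_id, count in index.get(label, {}).items()
                if lower <= count <= upper}
        # Logical product (intersection) of the image ID groups obtained so far.
        candidates = hits if candidates is None else candidates & hits
        if len(candidates) <= target_count:
            break
    return candidates if candidates is not None else set()
```

Only the images that survive this step are passed on to the more expensive label matrix matching.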
[0051]
Next, the group of label matrices that share labels with the search source to a certain degree is compared with the label matrix of the similar search source image, and the label matrices close to that of the similar search source image are output as a search result together with their similarities.
[0052]
That is, when the searcher selects an image for which a similarity search is to be performed, the corresponding color label matrix is obtained from the image management DB, the color label matrix group of the already registered images is obtained by way of the index file, and the two are compared. The color label matrices close to the color label matrix of the image to be searched for are output as a search result together with their similarities.
[0053]
The processing in step S24 described above is performed because matching (comparison) between label matrices is slow when carried out for all registered images; similar images are therefore extracted in advance, and only the extracted images are compared with the label matrix of the similar search source image. Of course, if slower processing is acceptable, a high-accuracy search may be performed by comparing with the label matrices of all registered images.
[0054]
Next, in step S25, pattern matching is performed between the label matrix of the search source image and the label matrix of each image acquired in step S24. That is, the label matrix of an image sharing labels to a certain degree and the label matrix of the similar search source image are compared by a matching process that takes an inter-label penalty matrix into account. The penalty matrix will be described later with reference to FIG. 11. The degree of similarity is determined based on the penalty value (distance between labels) obtained as a result of this comparison. In the present embodiment, two-dimensional DP matching that considers the two-dimensional arrangement of labels in the label matrix is used. The two-dimensional DP matching will be described later.
[0055]
In step S26, it is determined whether the similarity obtained as a result of the matching process in step S25 is greater than or equal to a predetermined threshold value. If the similarity is greater than or equal to the threshold value, the process proceeds to step S27, where the image ID is registered in the search result and the result is sorted in descending order of similarity. On the other hand, when it is determined in step S26 that the similarity does not reach the threshold value, step S27 is skipped and the image is excluded from the search result. In step S28, it is determined whether or not the processing in steps S25 to S27 has been completed for all the images acquired in step S24. If there is an unprocessed image, the process returns to step S25; if all the images have been processed, the process proceeds to step S29.
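The loop of steps S25 to S28 reduces to filtering by a threshold and sorting. In the sketch below, `similarity_of` is a hypothetical callable standing in for the matching process of step S25, and the function name is likewise an assumption.

```python
def rank_results(candidate_ids, similarity_of, threshold):
    """Keep images whose similarity reaches the threshold, most similar first."""
    scored = [(image_id, similarity_of(image_id)) for image_id in candidate_ids]
    kept = [(image_id, s) for image_id, s in scored if s >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```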
[0056]
In step S29, referring to the image management database 18, a full path file name corresponding to the image ID registered in descending order of similarity is obtained as a search result, and this is presented to the user.
[0057]
[2D DP matching]
Next, two-dimensional DP matching for comparing similarities between label matrices will be described.
[0058]
FIG. 11 is a diagram illustrating an example of a penalty matrix between labels used when comparing label strings to obtain similarity. The smaller the value in the matrix, the more similar the labels. That is, in pattern matching between color labels, a small penalty (distance) is given between adjacent color cells, and a large penalty is given between distant ones. For example, the penalty between label 2 and label 6 is "7". The penalty between identical labels is, as a matter of course, "0". The purpose of using this matrix is to perform distance determination according to the similarity of labels. That is, in this embodiment, since the RGB color space is used as the feature amount space, distance determination according to the similarity of colors can be performed. For the don't care label, however, the penalty is defined as 0 with respect to all labels.
[0059]
For example, in the example illustrated in FIG. 12, the label string of the search source image is "112313441", and the label string of the search target image is "113224442". DP matching of such one-dimensional label strings, that is, one-dimensional DP matching, is generally well known. When DP matching is performed on the label strings of FIG. 12 using the penalty matrix of FIG. 11, the distance (final solution) between the two label strings is obtained, as illustrated in the drawings. In this example, the following condition is used as the slope constraint. That is, in FIG. 14, let the costs on the lattice points (i−1, j), (i−1, j−1), and (i, j−1) be g(i−1, j), g(i−1, j−1), and g(i, j−1), respectively, and let the penalty on the lattice point (i, j) be d(i, j). The cost g(i, j) on the lattice point (i, j) is then obtained by the following recurrence formula.
[0060]
[Expression 1]
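The expression itself is not reproduced in this text. The sketch below therefore uses one commonly used symmetric form that is consistent with the three predecessor lattice points named above: g(i, j) = min{ g(i−1, j) + d(i, j), g(i−1, j−1) + 2·d(i, j), g(i, j−1) + d(i, j) }. The weights 1, 2, 1 and the boundary value are assumptions, not taken from the source, and `label_penalty` is a hypothetical stand-in for the penalty matrix of FIG. 11 (absolute difference of label numbers), with the don't care label given a penalty of 0.

```python
DONT_CARE = "*"  # hypothetical marker for the don't care label


def label_penalty(a, b):
    """Hypothetical stand-in for the inter-label penalty matrix of FIG. 11."""
    if a == DONT_CARE or b == DONT_CARE:
        return 0  # the don't care label has penalty 0 against every label
    return abs(int(a) - int(b))


def dp_distance(source, target, penalty=label_penalty):
    """DP matching distance between two label sequences.

    Assumed recurrence (weights not confirmed by the text):
    g(i,j) = min(g(i-1,j) + d, g(i-1,j-1) + 2*d, g(i,j-1) + d), d = d(i,j)
    """
    inf = float("inf")
    g = [[inf] * (len(target) + 1) for _ in range(len(source) + 1)]
    g[0][0] = 0  # assumed boundary condition
    for i in range(1, len(source) + 1):
        for j in range(1, len(target) + 1):
            d = penalty(source[i - 1], target[j - 1])
            g[i][j] = min(g[i - 1][j] + d,
                          g[i - 1][j - 1] + 2 * d,
                          g[i][j - 1] + d)
    return g[-1][-1]
```

Under these assumptions, identical label strings have distance 0, and a string consisting only of don't care labels has distance 0 from any string.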

[0061]
In the comparison between label strings, matching that allows an ambiguous comparison of label sequences, such as an automaton, may also be performed. With such an ambiguity-tolerant technique, a low penalty is given to the addition of an extra label, the loss of a label, and the repetition of the same label, and the distance between label sequences is calculated using the penalty matrix between color labels in FIG. 11 as the penalty between labels, so that ambiguous pattern matching can be performed. As the automaton, the "fuzzy nondeterministic finite automaton" described in JP-A-8-241335, "Ambiguous Character String Retrieval Method and System Using Fuzzy Nondeterministic Finite Automaton", can be applied.
[0062]
In the present embodiment, similarity comparison (similarity calculation) between label matrices is performed using two-dimensional DP matching obtained by extending the one-dimensional DP matching to two dimensions. Hereinafter, two-dimensional DP matching will be described.
[0063]
FIG. 15 is a diagram for explaining similarity calculation processing according to this embodiment. The label matrix of the similar search source image acquired in the above-described step S22 (steps S43 and S48) can be arranged as shown in (b) of FIG. 15 according to the scanning method. If one of the label matrices of the image IDs extracted in step S24 is taken as the similar comparison destination image, its label matrix can be arranged as shown in (a) of FIG. 15.
[0064]
First, the distances between the label sequence "abc" in the first row of the similar comparison destination image and each of the label sequences ("123", "456", "789") in the first to third rows of the similar search source image are obtained by DP matching, and the row number in the similar search source image of the label sequence having the smallest distance is stored at the corresponding position of the similar line matrix ((c) of FIG. 15). If the obtained minimum distance is larger than a predetermined threshold, it is determined that no row is similar, and "!" is stored at the corresponding position in the similar line matrix. Owing to the nature of DP matching, a similar row (line) can be detected by the above processing even if, for example, the angle of the image has changed slightly in the horizontal direction. By performing the above processing for all rows ("def" and "ghi") of the similar comparison destination image, a similar line matrix in the column direction as shown in (c) of FIG. 15 is obtained. In this processing, rows of the similar search source image that consist only of don't care labels (don't care label rows) are not processed. This is because a row consisting only of don't care labels has a distance of zero from every row of the similar comparison destination image.
[0065]
In (c) of FIG. 15, no row similar to "abc" exists in the similar search source image, the row similar to "def" is the first row of the similar search source image, and the row similar to "ghi" is the second row of the similar search source image. The similarity between the similar line matrix obtained as described above and the standard line matrix (the row sequence of the similar search source image, which is "123" in this example) is calculated using DP matching, and this is output as the similarity between the similar search source image and the similar comparison destination image. As is well known, DP matching performs the comparison while expanding and contracting the label sequences being compared (holding the counterpart in place without advancing) so that the distance between them becomes smallest. It is also possible to impose a constraint (the width of the matching window) on how far this expansion and contraction (holding) may go.
[0066]
FIG. 16 is a flowchart for explaining the similarity calculation procedure employing the two-dimensional DP matching according to this embodiment. The process described with reference to FIG. 15 will be further described with reference to the flowchart of FIG. 16.
[0067]
First, in step S101, a variable i representing the row number of the similar comparison destination image and a variable j representing the row number of the similar search source image are initialized to 1 so that each indicates the first row. In step S102, the label sequence in the i-th row of the similar comparison destination image is acquired. For example, in the case of FIG. 15, if i = 1, "abc" is acquired. In step S103, the label sequence in the j-th row of the similar search source image is acquired. For example, in FIG. 15, if j = 1, "123" is acquired. In step S103a, it is determined whether or not the label sequence acquired in step S103 is a don't care label sequence. If it is, the process proceeds to step S106; if not, the process proceeds to step S104.
[0068]
In step S104, the distance between the two label sequences obtained in steps S102 and S103 is obtained by DP matching using the color cell penalty matrix described with reference to FIG. 11. In step S105, if the distance obtained in step S104 is the minimum distance obtained so far for the i-th row, the row number (j) is stored in the line matrix element LINE[i].
[0069]
The processing from step S103 to step S105 is performed for all rows of the similar search source image (steps S106 and S107). As a result, for the label sequence in the i-th row of the similar comparison destination image, the number of the closest row among the rows of the similar search source image is stored in LINE[i].
[0070]
In step S108, the minimum distance obtained by the above processing for LINE[i] is compared with a predetermined threshold (Thresh). If this distance is equal to or greater than Thresh, the process proceeds to step S109, and "!", indicating that no row is similar, is stored in LINE[i].
[0071]
By executing the processing of steps S102 to S108 described above for all the rows of the similar comparison destination image (steps S110 and S111), LINE[1] to LINE[imax] are obtained, and these are output as the similar line matrix LINE[].
[0072]
Next, in step S113, DP matching between the standard line matrix “12... Imax” and the similar line matrix LINE [] is performed, and the distance between the two is calculated. Here, the standard line matrix is a matrix starting from 1 and increasing by 1 in the column direction.
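The flow of FIG. 16 can be sketched as below. The one-dimensional DP routine uses assumed step weights (1, 2, 1), since the recurrence formula is not reproduced in this text; `DONT_CARE`, `NO_MATCH`, and all function names are hypothetical. The standard line matrix is built here from the number of rows of the similar search source image.

```python
DONT_CARE = "*"   # hypothetical marker for the don't care label
NO_MATCH = "!"    # stored when no row of the source image is similar


def dp(seq_a, seq_b, penalty):
    """1-D DP matching distance; the step weights 1, 2, 1 are an assumption."""
    inf = float("inf")
    g = [[inf] * (len(seq_b) + 1) for _ in range(len(seq_a) + 1)]
    g[0][0] = 0
    for i in range(1, len(seq_a) + 1):
        for j in range(1, len(seq_b) + 1):
            d = penalty(seq_a[i - 1], seq_b[j - 1])
            g[i][j] = min(g[i - 1][j] + d, g[i - 1][j - 1] + 2 * d, g[i][j - 1] + d)
    return g[-1][-1]


def similar_line_matrix(source_rows, target_rows, label_penalty, thresh):
    """For each row of the comparison image, the number of the closest source row."""
    line = []
    for target_row in target_rows:                    # loop over i (steps S102 to S111)
        best_j, best_dist = None, float("inf")
        for j, source_row in enumerate(source_rows, start=1):
            if all(label == DONT_CARE for label in source_row):
                continue                              # don't care row is skipped (step S103a)
            dist = dp(target_row, source_row, label_penalty)
            if dist < best_dist:                      # keep the minimum so far (step S105)
                best_j, best_dist = j, dist
        if best_j is None or best_dist >= thresh:     # threshold check (step S108)
            line.append(NO_MATCH)
        else:
            line.append(best_j)
    return line


def image_distance(source_rows, target_rows, label_penalty, line_penalty, thresh):
    """Distance between images: DP matching of standard and similar line matrices."""
    line = similar_line_matrix(source_rows, target_rows, label_penalty, thresh)
    standard = list(range(1, len(source_rows) + 1))   # "12...imax"
    return dp(standard, line, line_penalty)           # step S113
```

The `line_penalty` argument is where the dynamic or fixed penalty between line numbers is supplied.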
[0073]
Here, the penalty used in DP matching between the standard line matrix and the similar line matrix will be described. Two methods are conceivable for setting the DP matching penalty between the similar line matrix in the column direction and the standard line matrix, namely a dynamic penalty and a fixed penalty.
[0074]
The dynamic penalty sets the penalty between line numbers dynamically, so that the penalty between line numbers varies depending on the image. In the present embodiment, the distances between the label sequences in the horizontal (row) direction of the similar search source image itself are obtained, and the penalty between rows is determined based on these distances.
[0075]
FIG. 17 is a flowchart showing a dynamic penalty value setting procedure according to this embodiment. In step S121, variable i is set to 1 and variable j is set to 2. Next, in step S122, the label sequence in the i-th row of the similar search source image is acquired, and in step S123, the label sequence in the j-th row of the similar search source image is acquired. In step S124, DP matching is performed on the i-th row label sequence and the j-th row label sequence of the similar search source image using the color penalty matrix to obtain a distance. In step S125, the DP matching distance obtained in step S124 is stored in LINE[i][j] as the penalty between the i-th row label sequence and the j-th row label sequence of the similar search source image, and is also stored in LINE[j][i] as the penalty between the j-th row label sequence and the i-th row label sequence.
[0076]
Steps S123 to S125 are repeated until the value of variable j reaches jmax in step S126. As a result, for the i-th row label sequence, the penalty values between it and each of the (i+1)-th to jmax-th row label sequences are determined. Then, in steps S128, S129, and S130, the processes in steps S123 to S126 are repeated until the value of variable i becomes imax−1. As a result, the penalty values determined in the above process are stored in LINE[i][j] for all elements except the diagonal components where i = j.
[0077]
Next, in step S131, the penalty of the diagonal component of LINE [i] [j] that has not been determined in the above process is determined. Since this portion is i = j and is the same label string, the distance is 0, and thus penalty 0 is stored. In step S132, a penalty is determined for “!”. That is, the penalty for “!” Is set to a value that is somewhat larger than the maximum penalty value among all the penalty values of LINE [i] [j]. However, if this penalty value is extremely increased, the ambiguous hit property is lost.
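The procedure of FIG. 17 can be sketched as building a symmetric table of row-to-row distances for the similar search source image. The DP routine again uses assumed step weights, and `margin` (how much larger than the maximum the "!" penalty is made) is a hypothetical parameter, since the text only says "somewhat larger".

```python
NO_MATCH = "!"


def dp(seq_a, seq_b, penalty):
    """1-D DP matching distance; the step weights 1, 2, 1 are an assumption."""
    inf = float("inf")
    g = [[inf] * (len(seq_b) + 1) for _ in range(len(seq_a) + 1)]
    g[0][0] = 0
    for i in range(1, len(seq_a) + 1):
        for j in range(1, len(seq_b) + 1):
            d = penalty(seq_a[i - 1], seq_b[j - 1])
            g[i][j] = min(g[i - 1][j] + d, g[i - 1][j - 1] + 2 * d, g[i][j - 1] + d)
    return g[-1][-1]


def dynamic_line_penalty(source_rows, label_penalty, margin=1):
    """Build the penalty between line numbers from the source image itself."""
    n = len(source_rows)
    table = {}
    for i in range(1, n + 1):
        table[(i, i)] = 0                              # diagonal component (step S131)
        for j in range(i + 1, n + 1):
            dist = dp(source_rows[i - 1], source_rows[j - 1], label_penalty)
            table[(i, j)] = table[(j, i)] = dist       # symmetric storage (step S125)
    no_match = max(table.values()) + margin            # penalty for "!" (step S132)

    def line_penalty(a, b):
        if a == NO_MATCH or b == NO_MATCH:
            return no_match
        return table[(a, b)]

    return line_penalty
```

The returned function can be passed as the penalty between line numbers when matching the standard line matrix against the similar line matrix.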
[0078]
DP matching in step S113 is performed using the penalty between the label strings calculated for the similar search source image as described above, and the similarity between the similarity search source image and the similar comparison target image is acquired.
[0079]
On the other hand, in the case of a fixed penalty, a penalty of "0" is given as the DP matching penalty if the labels match, and a certain fixed penalty is given if they do not match or if the comparison is with "!". In this case, the penalty is always the same regardless of the similar search source image. The process of step S113 is executed using such a fixed penalty, and the similarity between the similar search source image and the similar comparison destination image is determined.
[0080]
The matching process described above has the following characteristics. If (a) and (b) in FIG. 15 are very similar, the similar line matrix is "123", and the distance is zero. If the similar line matrix is "!12" or "212", the similar comparison destination image may be shifted downward with respect to the similar search source image, and if the similar line matrix is "23!" or "233", the similar comparison destination image may be shifted upward with respect to the similar search source image. If the similar line matrix is "13!" or "!13", the similar comparison destination image may be a reduced version of the similar search source image. Similarly, it is possible to detect a case where the similar comparison destination image is an enlargement of the similar search source image.
[0081]
As described for step S113 above, performing DP matching between the similar line matrix and the standard line matrix effectively absorbs shifts in the vertical direction. Therefore, differences between the similar search source image and the similar comparison destination image caused by upward or downward shift, enlargement, reduction, and the like are effectively absorbed, and a good similarity search is possible.
[0082]
That is, the two-dimensional DP matching according to the present embodiment is a matching that allows ambiguity before and after each label sequence of the label matrix, and has the property of absorbing the influence of positional deviation of the image. In addition, when the position of an object changes because of a difference in angle, the part of the object cut out by each block changes, so the color of the block is expected to differ slightly; this difference is absorbed by the above penalty matrix. As described above, the synergistic effect of the matching that allows ambiguity through the two-dimensional DP matching of the present embodiment and the tolerance of ambiguity in the feature amount through the penalty matrix enables matching that is little affected by vertical and horizontal shifts, enlargement, and reduction.
[0083]
Of the dynamic penalty and the fixed penalty, however, the dynamic penalty is preferable. For example, if the similar search source image shows nothing but a wheat field, every line can be expected to have a similar label sequence. If the similar comparison destination images also include an image showing nothing but a wheat field, every entry of the similar line matrix of that image may take the first line number, 1, giving “111”. In this case all lines of the similar search source image resemble one another, so the image is not hit at a low distance unless the penalty between line numbers is extremely small. When a dynamic penalty is used, the penalty between line numbers is low, and a result with high similarity is obtained.
[0084]
On the other hand, if a fixed penalty is used, the penalty value between “123” and “111” becomes large, and the similarity decreases.
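The difference between the fixed and dynamic penalties can be illustrated with a short Python sketch. It is not taken from the embodiment: the function names, the use of |i − j| as the fixed penalty, the absolute label difference standing in for the penalty matrix between labels, and the omission of slope restrictions and matching windows are all assumptions made for illustration only.

```python
from math import inf

def dp_match(a, b, cost):
    """Plain DP matching: minimum total cost of aligning sequence a with b."""
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = min(d[i - 1][j - 1], d[i - 1][j], d[i][j - 1])
            d[i][j] = step + cost(a[i - 1], b[j - 1])
    return d[n][m]

def label_penalty(x, y):
    # Hypothetical stand-in for the penalty matrix between labels.
    return abs(x - y)

def fixed_line_penalty(n):
    # Assumed fixed penalty: grows with the difference between line numbers.
    return [[abs(i - j) for j in range(n)] for i in range(n)]

def dynamic_line_penalty(source_rows):
    # Penalty between lines i and j = DP distance between the label
    # sequences of those two lines of the search source image.
    return [[dp_match(ri, rj, label_penalty) for rj in source_rows]
            for ri in source_rows]

def line_distance(standard, similar, table):
    # Line numbers are 1-based, as in "123".
    return dp_match(standard, similar, lambda s, t: table[s - 1][t - 1])

# A search source image in which every line has the same label sequence.
wheat_field = [[5, 5, 5, 5], [5, 5, 5, 5], [5, 5, 5, 5]]
standard, similar = [1, 2, 3], [1, 1, 1]

fixed_result = line_distance(standard, similar, fixed_line_penalty(3))
dynamic_result = line_distance(standard, similar,
                               dynamic_line_penalty(wheat_field))
print(fixed_result, dynamic_result)  # 3 0
```

Under these assumptions the fixed table penalizes “111” against “123”, whereas the dynamic table, built from lines that are all alike, gives a distance of zero.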
[0085]
By performing DP matching in the horizontal and vertical directions, that is, two-dimensionally, as described above, a search remains possible even if the image angle changes horizontally, vertically, or obliquely, or if an object moves. In addition, the time-series expansion and contraction property of DP matching makes it possible to retrieve zoomed-in images and macro shots.
[0086]
In the above embodiment, the similar line matrix is obtained using the horizontal block and the corresponding label column. However, the similar line matrix may be obtained using the vertical block and the corresponding label column. It can be realized by the same method as described above.
[0087]
As described above, according to the present embodiment, a feature amount group (a group of feature amounts obtained by dividing the feature amount space) is represented by one symbol (that is, labeled), and a distance based on the similarity between labels is given by the two-dimensional DP matching and the penalty matrix described above. The amount of calculation for the distance between the blocks of two images can therefore be greatly reduced, and because similar feature amounts are represented by the same label, similar images can be retrieved well.
[0088]
In addition, by introducing (1) the concept of a distance between labels based on a penalty matrix and (2) the two-dimensional DP matching described above, which compares label matrices while allowing the label positions being compared to shift back and forth so that the total distance is minimized (the similarity is maximized), a search is possible even if the angle of the image changes slightly, and images with a similar atmosphere can be retrieved.
[0089]
Furthermore, according to the above embodiment, the use of the index database (label component index) further speeds up the image search.
[0090]
That is, according to the above-described embodiment, similar images can be searched at high speed with the arrangement of image feature amounts taken into account, and a similar image search that absorbs differences caused by changes in shooting conditions becomes possible. A robust similar image search, which has been difficult in the past, can thus be performed: one that absorbs a certain amount of difference in image feature amounts caused by a change in image angle, a change in object position, or variation in other shooting conditions.
[0091]
In each of the above embodiments, an example of searching natural images has been described. However, it is obvious to those skilled in the art that the present invention can also be applied to searching artificial images such as CG and CAD images.
[0092]
In each of the above embodiments, color information is selected as the image feature amount. However, the present invention is not limited to this, and the present invention can be implemented by obtaining other image parameters for each image division block.
[0093]
In the above-described embodiment, an example using a single feature amount has been described. However, a high-speed search based on a plurality of feature amounts is also possible by performing a logical operation with the search results obtained using other feature amounts.
[0094]
When a search using a plurality of image feature amounts is performed on one image, the similarity obtained by the present invention can be regarded as one new image feature amount, and a search using a statistical distance measure can be performed by multivariate analysis using a plurality of parameters. In the above-described embodiment, similar images whose similarity exceeds a predetermined value are obtained as the search result. Needless to say, however, a number of images specified in advance may instead be output as the search result in descending order of similarity.
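The two ways of forming the search result mentioned here, a similarity threshold or a fixed number of images in descending order of similarity, can be sketched as follows. The function names, the (image ID, similarity) tuples, and the sample values are hypothetical and serve only as an illustration.

```python
def select_by_threshold(scored, threshold):
    """Return the images whose similarity exceeds the threshold, best first."""
    hits = [item for item in scored if item[1] > threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)

def select_top_n(scored, n):
    """Return the n most similar images, best first."""
    return sorted(scored, key=lambda item: item[1], reverse=True)[:n]

# Hypothetical (image ID, similarity) pairs produced by the matching stage.
scored = [("img_a", 0.9), ("img_b", 0.4), ("img_c", 0.7)]

by_threshold = select_by_threshold(scored, 0.5)
top_one = select_top_n(scored, 1)
```

With the threshold variant the number of results depends on the data, while the top-n variant always returns at most n images even when none of them is very similar.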
[0095]
Note that the ambiguity of the search can be set as desired by changing the width of the so-called matching window in DP matching in accordance with a specified ambiguity. FIG. 18 is a diagram for explaining a matching window in DP matching. In FIG. 18, the straight line A is represented by J = I + r, and the straight line B by J = I−r. The width of the matching window can be changed by changing the value of r. Therefore, if the value of r is changed by designating the degree of ambiguity from the keyboard 104, the similarity search can be performed with the degree of ambiguity (matching window width) that the user desires.
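The effect of the window half-width r can be illustrated with a minimal Python sketch of one-dimensional DP matching restricted to the band between J = I + r and J = I − r. The function names, the absolute label difference used as the penalty, and the sample label sequences are assumptions; slope restrictions such as those of FIG. 14 are omitted.

```python
from math import inf

def dp_distance(a, b, penalty, r=None):
    """DP-matching distance between label sequences a and b.

    Only cells with |i - j| <= r are visited; r=None disables the window.
    Returns inf when no path fits inside the window.
    """
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if r is not None and abs(i - j) > r:
                continue  # outside the band between J = I + r and J = I - r
            step = min(d[i - 1][j - 1], d[i - 1][j], d[i][j - 1])
            d[i][j] = step + penalty(a[i - 1], b[j - 1])
    return d[n][m]

def label_penalty(x, y):
    # Hypothetical stand-in for the penalty matrix between labels.
    return abs(x - y)

source = [1, 1, 2, 3]   # label sequence of the search source (made up)
target = [1, 2, 3, 3]   # same content shifted by one position

strict = dp_distance(source, target, label_penalty, r=0)
loose = dp_distance(source, target, label_penalty, r=1)
print(strict, loose)  # 2 0
```

With r = 0 only position-by-position comparison is allowed and the shift costs 2; widening the window to r = 1 lets the matching absorb the shift, and the distance drops to 0.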
[0096]
In the two-dimensional DP matching of the above embodiment, the width of the matching window for DP matching in the horizontal direction and the width of the matching window for DP matching in the vertical direction may be set separately. Alternatively, the two matching windows may be configured to change at different rates. In this way, the user can set the ambiguity more finely when searching for similar images. For example, when the block order shown in FIG. 8 is used and the object of interest in the search source image should be allowed to move horizontally, or when the search source image is horizontally long, the width of the matching window in the horizontal DP matching may be increased to raise the ambiguity in the horizontal direction.
[0097]
Note that when a similarity search also uses a feature that cannot be divided into blocks, that is, one for which a single parameter is attached to each image, the similarity obtained by the present invention (produced from the total sum of penalties) can be used as one new feature amount, and a search based on a statistical distance measure can be performed. In the above-described embodiment, similar images whose similarity exceeds a predetermined value are obtained as the search result. Needless to say, however, a number of images specified in advance may instead be output as the search result in descending order of similarity.
[0098]
Furthermore, instead of DP matching, another technique may be introduced, such as a fuzzy nondeterministic automaton, that compares label sequences while allowing the label positions being compared to shift back and forth so that the total distance is minimized (the similarity is maximized). This also makes it possible to search even if the angle of the image changes slightly, and to retrieve images with a similar atmosphere.
[0099]
Furthermore, according to the above embodiment, the use of the index database (label component index (FIG. 4)) further speeds up the image search.
[0100]
As described above, according to the present embodiment, an image search apparatus and method suitable for searching for an image without a keyword are provided.
[0101]
Image recognition technology still faces many barriers. A conceivable approach is therefore to let the searcher find a reduced image close to the desired image and to provide a means of searching for images similar to it, so that the image the searcher wants is obtained with high probability.
[0102]
In this case, the present method can perform, at high speed, a robust similar image search that absorbs a change in the angle of an image, a change in the position of an object, or a certain amount of difference in image feature amounts due to shooting conditions. In addition, a search that emphasizes the object of interest without being affected by the background, which was a weak point of conventional similar image search, has become possible.
[0103]
Note that the present invention may be applied to a system composed of a plurality of devices (for example, a host computer, an interface device, a reader, a printer, etc.) or to an apparatus consisting of a single device (for example, a copier, a facsimile machine, etc.).
[0104]
Needless to say, the object of the present invention can also be achieved by supplying a storage medium storing software program code that implements the functions of the above-described embodiments to a system or apparatus, and having the computer (or CPU or MPU) of the system or apparatus read and execute the program code stored in the storage medium.
[0105]
In this case, the program code itself read from the storage medium realizes the functions of the above-described embodiments, and the storage medium storing the program code constitutes the present invention.
[0106]
As a storage medium for supplying the program code, for example, a floppy disk, a hard disk, an optical disk, a magneto-optical disk, a CD-ROM, a CD-R, a magnetic tape, a nonvolatile memory card, a ROM, or the like can be used.
[0107]
Needless to say, the functions of the above-described embodiments are realized not only when the computer executes the read program code, but also when an OS (operating system) or the like running on the computer performs part or all of the actual processing based on the instructions of the program code, and the functions of the above-described embodiments are realized by that processing.
[0108]
Further, after the program code read from the storage medium is written into a memory provided in a function expansion board inserted into the computer or a function expansion unit connected to the computer, the function expansion is performed based on the instruction of the program code. It goes without saying that the CPU or the like provided in the board or the function expansion unit performs part or all of the actual processing, and the functions of the above-described embodiments are realized by the processing.
[0109]
[Effects of the Invention]
As described above, according to the present invention, it is possible to search for a similar image at high speed in consideration of the arrangement of the feature amount of the image.
[0110]
Further, according to the present invention, it is possible to search for a similar image in consideration of the arrangement of the feature amount of the image and to search for a similar image that absorbs a difference due to a change in shooting conditions.
[0111]
Further, according to the present invention, a feature amount group is represented by one label and an image is represented by a label matrix when the similarity between images is calculated, so the amount of calculation for the similarity is reduced and a rapid similar image search is possible.
[0112]
Further, according to the present invention, the label matrix is appropriately managed, and the processing speed of the image search process using the label is remarkably improved.
[0113]
Further, according to the present invention, when the similarity between the original image and the comparison destination image is determined by comparing label sequences or label matrices, a technique that allows ambiguity in the label positions before and after, such as DP matching or a fuzzy nondeterministic automaton, is applied, making a more effective similar image search possible.
[0114]
Further, according to the present invention, it is possible to perform a similar image search focusing on a certain object (part) in the image.
[0115]
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating a control configuration of an image search apparatus according to an embodiment.
FIG. 2 is a block diagram illustrating a functional configuration of the image search apparatus according to the present embodiment.
FIG. 3 is a diagram illustrating a data configuration example of an image management database.
FIG. 4 is a diagram illustrating a data configuration example of a label component index.
FIG. 5 is a flowchart illustrating a procedure of image registration processing according to the present embodiment.
FIG. 6 is a diagram illustrating an example of block division of an image according to the present embodiment.
FIG. 7 is a diagram illustrating a multidimensional feature amount space according to the present embodiment.
FIG. 8 is a diagram illustrating an example of a block order when generating a label string.
FIG. 9 is a flowchart illustrating an image search procedure according to the present embodiment.
FIG. 10 is a flowchart for explaining a feature part label matrix extraction procedure according to the present embodiment.
FIG. 11 is a diagram illustrating an example of a penalty matrix between labels used when comparing a sequence of labels to obtain a similarity.
FIG. 12 is a diagram illustrating an example of a label string of a similar search source image and a label string of a similar search destination image.
FIG. 13 is a diagram illustrating one-dimensional DP matching.
FIG. 14 is a diagram for explaining a tilt restriction of DP matching.
FIG. 15 is a diagram illustrating similarity calculation processing according to the present embodiment.
FIG. 16 is a flowchart for explaining a similarity calculation procedure employing two-dimensional DP matching according to the present embodiment.
FIG. 17 is a flowchart showing a procedure for setting a dynamic penalty value according to the present embodiment.
FIG. 18 is a diagram for explaining a matching window in DP matching.

Claims

First generation means for dividing the image data into a plurality of blocks, assigning labels according to the feature amounts acquired for each block, and generating a label matrix by arranging the assigned labels in a predetermined block order;
Storage means for storing the label matrix generated by the first generation means in association with the image data;
A holding means for extracting a partial label matrix from the generated label matrix and holding a table capable of searching image data using the extracted partial label matrix as a key;
A first search means for extracting a partial label matrix from image data of a search source and searching for a similar image using the table;
A second search means for, using each similar image retrieved by the first search means as a comparison destination image, performing a matching process based on the distance between labels on the label matrix of the comparison destination image obtained from the storage means and the label matrix obtained from the image data of the search source to obtain a similarity, and obtaining a search result based on the obtained similarity;
Second generation means for generating, when a partial image constituting a part of an image is designated as a search source image, a label matrix of the search source image by assigning a label corresponding to the feature amount to each block including partial image data representing the partial image, and assigning to the other blocks a label indicating that its distance from any other label is zero;
A third generation means for generating a partial label matrix based on the labels assigned to the blocks including image data of the partial image;
The image search apparatus, wherein the first search means and the second search means search for an image using the label matrix and the partial label matrix generated by the second generation means and the third generation means.

The image search apparatus, wherein the identifier of image data including a partial label matrix and the content number of that partial label matrix in each image data are registered in the table using each partial label matrix as a key.

The image search apparatus according to claim 1, wherein the label is a unique label given to each of the cells obtained by dividing the multidimensional feature amount space into a plurality of cells, and the generation means calculates the feature amount for each of the blocks and assigns to the block the label given to the cell to which the calculated feature amount belongs.

A designation means for designating a desired partial image in the image;
The image search apparatus according to claim 1, further comprising an extraction means that extracts partial image data corresponding to the partial image designated by the designation means and provides the partial image data to the second generation means.

A designation means for designating image data representing a desired object of interest; and
The image search apparatus according to claim 1, further comprising a combining means that combines the image data designated by the designation means with an image having a predetermined background to generate a search source image, and provides the generated image to the second generation means.

First generation means for dividing the image data into a plurality of blocks, assigning labels according to the feature amounts acquired for each block, and generating a label matrix by arranging the assigned labels in a predetermined block order;
Storage means for storing the label matrix generated by the first generation means in association with the image data;
A holding means for extracting a partial label matrix from the generated label matrix and holding a table from which the image data can be searched using the extracted partial label matrix as a key;
A first search means for extracting a partial label matrix from image data of a search source and searching for a similar image using the table;
A second search means for, using each similar image retrieved by the first search means as a comparison destination image, performing a matching process based on the distance between labels on the label matrix of the comparison destination image obtained from the storage means and the label matrix obtained from the image data of the search source to obtain a similarity, and obtaining a search result based on the obtained similarity;
The label matrix represents a two-dimensional label matrix;
The second search means is
First matching means for obtaining a row arrangement of the comparison destination image data by associating, by DP matching, the row-unit label columns extracted from the label matrix of the image data of the search source with the row-unit label columns extracted from the label matrix of the comparison destination image data obtained by the first search means;
An image search apparatus comprising: second matching means for obtaining a similarity between the row arrangement of the label matrix of the original image data and the row arrangement obtained by the first matching means by DP matching.

The image search apparatus according to claim 6, wherein the first matching means has a penalty table that holds a penalty value for each pair of labels, and refers to the penalty table when calculating the distance between the label sequence of the search source image and the label sequence of the comparison destination image using the DP matching method.

The image search apparatus according to claim 6, wherein the second matching means has a line-to-line penalty table that holds a penalty value for each pair of line numbers in a line arrangement, and refers to the line-to-line penalty table when calculating the similarity between the line arrangement of the search source image and the line arrangement of the comparison destination image using the DP matching method.

The image search apparatus according to claim 8, further comprising a holding means that determines a penalty value for each pair of rows based on the similarity of each label column in the row direction of the original image data, and holds the penalty values as the line-to-line penalty table.

The image search apparatus according to claim 6, further comprising first setting means for setting the width of a matching window for DP matching used in the first matching means.

The image search apparatus according to claim 6, further comprising second setting means for setting the width of a matching window for DP matching used in the second matching means.

The image search apparatus according to any one of claims 1 to 11, wherein the first search means extracts a partial label matrix from the image data of the search source by the same method as the holding means, and uses the table to search for images that include the partial label matrix with a content number within a predetermined range.

The image search apparatus according to claim 12, further comprising setting means for setting the range of the content number for a partial label matrix extracted from the image data of the search source, based on the number of that partial label matrix contained in the image data of the search source.

The image search apparatus according to claim 13 , wherein the setting unit determines the range of the content number based on a number included in the image data of the search source and a specified ambiguity.

A first generation step of dividing the image data into a plurality of blocks, giving a label according to the feature amount acquired for each block, and generating a label matrix by arranging the given labels in a predetermined block order;
A storage step of storing the label matrix generated in the first generation step in a storage device in association with the image data;
Extracting a partial label matrix from the generated label matrix, and holding a table capable of searching for image data using the extracted partial label matrix as a key;
A first search step of extracting a partial label matrix from the image data of the search source and searching for similar images using the table;
A second search step of, using each similar image retrieved in the first search step as a comparison destination image, performing a matching process based on the distance between labels on the label matrix of the comparison destination image obtained from the storage device and the label matrix obtained from the image data of the search source to obtain a similarity, and obtaining a search result based on the obtained similarity;
A second generation step of generating, when a partial image constituting a part of an image is designated as a search source image, a label matrix of the search source image by assigning a label corresponding to the feature amount to each block including partial image data representing the partial image, and assigning to the other blocks a label indicating that its distance from any other label is zero;
A third generation step of generating a partial label matrix based on the labels assigned to the blocks including image data of the partial image,
In the image search method, the first and second search steps search for an image using the label matrix and the partial label matrix generated in the second generation step and the third generation step.

A first generation step of dividing the image data into a plurality of blocks, giving a label according to the feature amount acquired for each block, and generating a label matrix by arranging the given labels in a predetermined block order;
A storage step of storing the label matrix generated by the first generation unit in a storage device in association with the image data;
Extracting a partial label matrix from the generated label matrix, and holding a table capable of searching for image data using the extracted partial label matrix as a key;
A first search step of extracting a partial label matrix from the image data of the search source and searching for similar images using the table;
A second search step of, using each similar image retrieved in the first search step as a comparison destination image, performing a matching process based on the distance between labels on the label matrix of the comparison destination image obtained from the storage device and the label matrix obtained from the image data of the search source to obtain a similarity, and obtaining a search result based on the obtained similarity;
The label matrix represents a two-dimensional label matrix;
The second search step includes
A first matching step of obtaining a row arrangement of the comparison destination image data by associating, by DP matching, the row-unit label columns extracted from the label matrix of the image data of the search source with the row-unit label columns extracted from the label matrix of the comparison destination image data obtained by the first search unit;
An image retrieval method comprising: a second matching step of obtaining a similarity between the row arrangement of the label matrix of the original image data and the row arrangement obtained by the first matching means by DP matching.

A computer-readable storage medium storing a control program for causing a computer to execute the image search method according to claim 15 or 16.