JP2004361987A

JP2004361987A - Image retrieval system, image classification system, image retrieval program, image classification program, image retrieval method, and image classification method

Info

Publication number: JP2004361987A
Application number: JP2003155886A
Authority: JP
Inventors: Toshinori Nagahashi; 敏則長橋; Takashi Hiuga; 崇日向
Original assignee: Seiko Epson Corp
Current assignee: Seiko Epson Corp
Priority date: 2003-05-30
Filing date: 2003-05-30
Publication date: 2004-12-24
Also published as: EP1482428A2; CN1573742A; US7440638B2; CN100357944C; EP1482428A3; US20050008263A1

Abstract

PROBLEM TO BE SOLVED: To provide an image retrieval system suitable for obtaining retrieval or classification results that meet the user's wishes.
SOLUTION: For a retrieval key image and each retrieval target image, a region of interest is extracted from the image, and a feature vector V of the image is generated based on the extracted region of interest. An image similar to the retrieval key image is then retrieved from a retrieval target image registration DB 10 based on the generated feature vectors V.
COPYRIGHT: (C)2005, JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は、複数の画像のなかから検索キー画像に適合する画像を検索しまたは複数の画像を分類するシステムおよびプログラム、並びに方法に係り、特に、利用者の希望に添う検索結果または分類結果を得るのに好適な画像検索システム、画像分類システム、画像検索プログラムおよび画像分類プログラム、並びに画像検索方法および画像分類方法に関する。
【０００２】
【従来の技術】
従来、与えられた検索キー画像をもとに、複数の検索対象画像のなかから検索キー画像に類似する画像を検索する技術としては、例えば、特許文献１に開示されている画像検索装置があった。
特許文献１記載の画像検索装置は、獲得された対象画像データまたは参照画像データより、複数の特徴量を抽出して特徴べクトルを生成する特徴ベクトル抽出部と、獲得された複数の参照画像データに対して、各参照画像データごとに特徴ベクトル抽出部で抽出された特徴べクトルとこの参照画像のアドレス情報を統合して参照ベクトルをつくり、参照べクトル群を生成する参照ベクトル群処理部と、獲得された対象画像データより特徴ベクトル抽出部で抽出された特徴べクトルと、参照ベクトル群より選択された参照画像データの特徴べクトルとの類似度を計算する類似度計算部と、計算された類似度を所定の基準と比較する類似度判定部と、類似していると判断された画像のアドレス情報を、参照ベクトル群より取出部とを備える。
【０００３】
ここで、画像の特徴量としては、色、テクスチャー、構造的特徴、時間的特徴を用いている。テクスチャーに関しては、濃度ヒストグラム、同時生起行列、差分統計量等の計算により、またエッジや線、輪郭等の構造的特徴に関しては、ラプラシアンフィルターのコンボルーションやハフ変換等により、さらに色に関しては、ＲＧＢ空間やＨＳＶ空間、あるいはスペクトルへの変換等により、時間的特徴に関してはオプティカルフローの計算やウェーブレットへの変換により、それぞれ、特徴量が求められている。
【０００４】
【特許文献１】
特開２００１−５２１７５号公報
【０００５】
【発明が解決しようとする課題】
類似の概念は、人の主観に依存するところが大きいので、ある人にとっては類似していると感じる画像でも、他の人にとっては類似していないと感じる場合がある。そのため、画像の類似検索を行う場合、類似の概念をどのように定義するかが重要である。
【０００６】
画像の全体および部分について検討すれば、検索キー画像に対して、例えば、全体的に類似しているが特徴的な部分については類似していない画像や、逆に特徴的な部分については類似しているが全体的に類似していない画像が存在するが、それらの画像についてはそれぞれ類似度を適切に評価する必要がある。利用者が画像を見る場合、利用者は、画像のなかで特徴的な部分（花が主体的に写っている画像であれば花の部分）に注目している。そのため、全体的に類似しているが特徴的な部分については類似していない画像よりは、特徴的な部分については類似しているが全体的に類似していない画像の方が類似していると感じるはずである。したがって、画像全体をとらえて類似度を評価するよりは、画像のうち特徴的な部分については類似度を重視し、そうでない部分については類似度を軽視するようにして画像の類似度を評価する方が実情に沿うことになる。
【０００７】
しかしながら、特許文献１記載の画像検索装置にあっては、画像データ全体から複数の特徴量を抽出して特徴ベクトルを生成し、生成した特徴ベクトルに基づいて画像の類似検索を行うようになっているため、利用者が注目する箇所を考慮して画像の類似度を評価してない。したがって、検索結果に利用者の主観を十分に反映できず、利用者の希望に添う検索結果を得にくいという問題があった。
【０００８】
このことは、画像の類似検索を行う場合に限らず、複数の画像をその類似度に応じて分類する場合についても同様の問題が想定される。
そこで、本発明は、このような従来の技術の有する未解決の課題に着目してなされたものであって、利用者の希望に添う検索結果または分類結果を得るのに好適な画像検索システム、画像分類システム、画像検索プログラムおよび画像分類プログラム、並びに画像検索方法および画像分類方法を提供することを目的としている。
【０００９】
【課題を解決するための手段】
〔発明１〕
上記目的を達成するために、発明１の画像検索システムは、
与えられた検索キー画像をもとに、複数の検索対象画像のなかから前記検索キー画像に適合する画像を検索するシステムであって、
前記検索キー画像および前記各検索対象画像について、当該画像から注目領域を抽出し、抽出した注目領域に基づいて当該画像の特徴を示す特徴ベクトルを生成し、
生成した特徴ベクトルに基づいて、前記複数の検索対象画像のなかから前記検索キー画像に適合する画像を検索するようになっていることを特徴とする。
【００１０】
このような構成であれば、検索キー画像が与えられると、検索キー画像から注目領域が抽出され、抽出された注目領域に基づいて検索キー画像の特徴を示す特徴ベクトルが生成される。また同様に、各検索対象画像ごとに、その検索対象画像から注目領域が抽出され、抽出された注目領域に基づいてその検索対象画像の特徴を示す特徴ベクトルが生成される。そして、生成された特徴ベクトルに基づいて、複数の検索対象画像のなかから検索キー画像に適合する画像が検索される。
【００１１】
これにより、利用者が注目する箇所を考慮して検索が行われるので、利用者の主観が検索結果に反映しやすくなる。したがって、従来に比して、利用者の希望に比較的添った検索結果を得ることができるという効果が得られる。
ここで、注目領域とは、検索キー画像または検索対象画像のなかで利用者が注目すると思われる領域をいう。以下、発明２の画像検索システム、発明８および９の画像分類システム、発明１４の画像検索プログラム、発明１５の画像分類プログラム、発明１６の画像検索方法、並びに発明１７の画像分類方法において同じである。
【００１２】
また、本システムは、単一の装置、端末その他の機器として実現するようにしてもよいし、複数の装置、端末その他の機器を通信可能に接続したネットワークシステムとして実現するようにしてもよい。後者の場合、各構成要素は、それぞれ通信可能に接続されていれば、複数の機器等のうちいずれに属していてもよい。以下、発明２の画像検索システム、並びに発明８および９の画像分類システムにおいて同じである。
〔発明２〕
さらに、発明２の画像検索システムは、
与えられた検索キー画像をもとに、複数の検索対象画像のなかから前記検索キー画像に適合する画像を検索するシステムであって、
前記複数の検索対象画像を記憶するための検索対象画像記憶手段と、前記検索キー画像を入力する検索キー画像入力手段と、前記検索キー画像入力手段で入力した検索キー画像および前記検索対象画像記憶手段の各検索対象画像について当該画像から注目領域を抽出する注目領域抽出手段と、前記検索キー画像および前記各検索対象画像について前記注目領域抽出手段で抽出した注目領域に基づいて当該画像の特徴を示す特徴ベクトルを生成する特徴ベクトル生成手段と、前記特徴ベクトル生成手段で生成した特徴ベクトルに基づいて前記検索対象画像記憶手段のなかから前記検索キー画像に適合する画像を検索する画像検索手段とを備えることを特徴とする。
【００１３】
このような構成であれば、検索キー画像入力手段から検索キー画像が入力されると、注目領域抽出手段により、入力された検索キー画像から注目領域が抽出され、特徴ベクトル生成手段により、抽出された注目領域に基づいて検索キー画像の特徴を示す特徴ベクトルが生成される。また同様に、検索対象画像記憶手段の各検索対象画像ごとに、注目領域抽出手段により、その検索対象画像から注目領域が抽出され、特徴ベクトル生成手段により、抽出された注目領域に基づいてその検索対象画像の特徴を示す特徴ベクトルが生成される。そして、画像検索手段により、生成された特徴ベクトルに基づいて、検索対象画像記憶手段のなかから検索キー画像に適合する画像が検索される。
【００１４】
これにより、利用者が注目する箇所を考慮して検索が行われるので、利用者の主観が検索結果に反映しやすくなる。したがって、従来に比して、利用者の希望に比較的添った検索結果を得ることができるという効果が得られる。
ここで、注目領域抽出手段は、検索キー画像および各検索対象画像についてその画像から注目領域を抽出するようになっていればよい。例えば、検索キー画像または検索対象画像に基づいて誘目度を算出し、算出した誘目度に基づいて注目領域を抽出することができる。以下、発明９の画像分類システム、発明１４の画像検索プログラム、および発明１５の画像分類プログラムにおいて同じである。
【００１５】
また、検索キー画像入力手段は、検索キー画像を入力するようになっていればどのような構成であってもよく、例えば、検索対象画像記憶手段のなかから選択した検索対象画像を検索キー画像として入力するようになっていてもよいし、画像記憶媒体、ネットワークまたは他の画像記憶手段から検索キー画像を入力するようになっていてもよい。以下、発明９の画像分類システム、発明１４の画像検索プログラム、および発明１５の画像分類プログラムにおいて同じである。
【００１６】
また、検索対象画像記憶手段は、検索対象画像をあらゆる手段でかつあらゆる時期に記憶するものであり、検索対象画像をあらかじめ記憶してあるものであってもよいし、検索対象画像をあらかじめ記憶することなく、本システムの動作時に外部からの入力等によって検索対象画像を記憶するようになっていてもよい。〔発明３〕
さらに、発明３の画像検索システムは、発明２の画像検索システムにおいて、さらに、前記検索キー画像および前記各検索対象画像について当該画像に含まれる人物画像の顔情報を判定する顔情報判定手段を備え、
前記特徴ベクトル生成手段は、前記検索キー画像および前記各検索対象画像について、前記顔情報判定手段で判定した顔情報に基づいて、当該画像の特徴を示す特徴ベクトルを生成するようになっていることを特徴とする。
【００１７】
このような構成であれば、顔情報判定手段により、検索キー画像に含まれる人物画像の顔情報が判定され、特徴ベクトル生成手段により、判定された顔情報に基づいて、検索キー画像の特徴を示す特徴ベクトルが生成される。また同様に、各検索対象画像ごとに、顔情報判定手段により、その検索対象画像に含まれる人物画像の顔情報が判定され、特徴ベクトル生成手段により、判定された顔情報に基づいて、その検索対象画像の特徴を示す特徴ベクトルが生成される。
【００１８】
これにより、人物画像の顔情報を考慮して検索が行われるので、検索キー画像に含まれる人物画像の顔に適合する検索結果を得ることができるという効果も得られる。
〔発明４〕
さらに、発明４の画像検索システムは、発明２および３のいずれかの画像検索システムにおいて、
さらに、前記各検索対象画像ごとに当該検索対象画像に含まれる人物画像の顔と前記検索キー画像に含まれる人物画像の顔との類似度を判定する類似度判定手段を備え、
前記特徴ベクトル生成手段は、前記検索キー画像および前記各検索対象画像について、前記類似度判定手段の判定結果に基づいて、当該画像の特徴を示す特徴ベクトルを生成するようになっていることを特徴とする。
【００１９】
このような構成であれば、類似度判定手段により、各検索対象画像ごとに、その検索対象画像に含まれる人物画像の顔と検索キー画像に含まれる人物画像の顔との類似度が判定され、特徴ベクトル生成手段により、その判定結果に基づいて、検索キー画像の特徴を示す特徴ベクトルおよび各検索対象画像の特徴を示す特徴ベクトルが生成される。
【００２０】
これにより、人物画像の顔同士の類似度を考慮して検索が行われるので、検索キー画像に含まれる人物画像の顔に類似する検索結果を得ることができるという効果も得られる。
〔発明５〕
さらに、発明５の画像検索システムは、発明２ないし４のいずれかの画像検索システムにおいて、
前記画像検索手段は、前記検索キー画像の特徴ベクトルとのベクトル間距離が最も小さい特徴ベクトルに対応する検索対象画像を前記検索対象画像記憶手段のなかから索出するようになっていることを特徴とする。
【００２１】
このような構成であれば、画像検索手段により、検索キー画像の特徴ベクトルとのベクトル間距離が最も小さい特徴ベクトルに対応する検索対象画像が検索対象画像記憶手段のなかから索出される。
これにより、利用者の希望に最も添うと思われる検索結果を得ることができるという効果も得られる。
〔発明６〕
さらに、発明６の画像検索システムは、発明２ないし４のいずれかの画像検索システムにおいて、
前記画像検索手段は、前記検索対象画像の特徴ベクトル同士のベクトル間距離に基づいて前記各検索対象画像を複数のグループに分類し、前記複数のグループのうち前記検索キー画像が属するもののすべての検索対象画像を前記検索対象画像記憶手段のなかから索出するようになっていることを特徴とする。
【００２２】
このような構成であれば、画像検索手段により、検索対象画像の特徴ベクトル同士のベクトル間距離に基づいて各検索対象画像が複数のグループに分類され、複数のグループのうち検索キー画像が属するもののすべての検索対象画像が検索対象画像記憶手段のなかから索出される。
これにより、利用者の希望に添うと思われるいくつかの検索結果を得ることができるという効果が得られる。
〔発明７〕
さらに、発明７の画像検索システムは、発明２ないし６のいずれかの画像検索システムにおいて、
前記特徴ベクトル生成手段は、前記検索キー画像と前記検索対象画像とのアスペクト比、または前記検索対象画像同士のアスペクト比が異なるときは、アスペクト比が異なる第１画像および第２画像を重ね合わせ、前記第１画像のうち重複領域について前記第１画像の特徴ベクトルを生成し、前記第２画像のうち重複領域について前記第２画像の特徴ベクトルを生成するようになっていることを特徴とする。
【００２３】
このような構成であれば、検索キー画像と検索対象画像とのアスペクト比、または検索対象画像同士のアスペクト比が異なると、特徴ベクトル生成手段により、アスペクト比が異なる第１画像および第２画像を重ね合わせ、第１画像のうち重複領域について第１画像の特徴ベクトルが生成され、第２画像のうち重複領域について第２画像の特徴ベクトルが生成される。
【００２４】
これにより、アスペクト比が異なる画像同士であっても、比較的正確に類否を判定することができるので、利用者の希望にさらに添った検索結果を得ることができるという効果も得られる。
〔発明８〕
一方、上記目的を達成するために、発明８の画像分類システムは、
複数の分類対象画像を分類するシステムであって、
前記各分類対象画像ごとに、当該分類対象画像から注目領域を抽出し、抽出した注目領域に基づいて当該分類対象画像の特徴を示す特徴ベクトルを生成し、
生成した特徴ベクトルに基づいて、前記各分類対象画像を複数のグループに分類するようになっていることを特徴とする。
【００２５】
このような構成であれば、各分類対象画像ごとに、その分類対象画像から注目領域が抽出され、抽出された注目領域に基づいてその分類対象画像の特徴を示す特徴ベクトルが生成される。そして、生成された特徴ベクトルに基づいて、各分類対象画像が複数のグループに分類される。
これにより、利用者が注目する箇所を考慮して分類が行われるので、利用者の主観が分類結果に反映しやすくなる。したがって、従来に比して、利用者の希望に比較的添った分類結果を得ることができるという効果が得られる。
〔発明９〕
さらに、発明９の画像分類システムは、
複数の分類対象画像を分類するシステムであって、
前記複数の分類対象画像を記憶するための分類対象画像記憶手段と、前記分類対象画像記憶手段の各分類対象画像ごとに当該分類対象画像から注目領域を抽出する注目領域抽出手段と、前記注目領域抽出手段で抽出した注目領域に基づいて前記各分類対象画像ごとに当該分類対象画像の特徴を示す特徴ベクトルを生成する特徴ベクトル生成手段と、前記特徴ベクトル生成手段で生成した特徴ベクトルに基づいて前記各分類対象画像を複数のグループに分類する画像分類手段とを備えることを特徴とする。
【００２６】
このような構成であれば、分類対象画像記憶手段の各分類対象画像ごとに、注目領域抽出手段により、その分類対象画像から注目領域が抽出され、特徴ベクトル生成手段により、抽出された注目領域に基づいてその分類対象画像の特徴を示す特徴ベクトルが生成される。そして、画像分類手段により、生成された特徴ベクトルに基づいて各分類対象画像が複数のグループに分類される。
【００２７】
これにより、利用者が注目する箇所を考慮して分類が行われるので、利用者の主観が分類結果に反映しやすくなる。したがって、従来に比して、利用者の希望に比較的添った分類結果を得ることができるという効果が得られる。
ここで、分類対象画像記憶手段は、分類対象画像をあらゆる手段でかつあらゆる時期に記憶するものであり、分類対象画像をあらかじめ記憶してあるものであってもよいし、分類対象画像をあらかじめ記憶することなく、本システムの動作時に外部からの入力等によって分類対象画像を記憶するようになっていてもよい。
〔発明１０〕
さらに、発明１０の画像分類システムは、発明９の画像分類システムにおいて、
さらに、前記各分類対象画像ごとに当該分類対象画像に含まれる人物画像の顔情報を判定する顔情報判定手段を備え、
前記特徴ベクトル生成手段は、前記顔情報判定手段で判定した顔情報に基づいて、前記各分類対象画像ごとに、当該分類対象画像の特徴を示す特徴ベクトルを生成するようになっていることを特徴とする。
【００２８】
このような構成であれば、各分類対象画像ごとに、顔情報判定手段により、その分類対象画像に含まれる人物画像の顔情報が判定され、特徴ベクトル生成手段により、判定された顔情報に基づいて、その分類対象画像の特徴を示す特徴ベクトルが生成される。
これにより、人物画像の顔情報を考慮して分類が行われるので、人物画像の顔が適合する画像同士が同一グループに属するような分類結果を得ることができるという効果も得られる。
〔発明１１〕
さらに、発明１１の画像分類システムは、発明９および１０のいずれかの画像分類システムにおいて、
さらに、前記各分類対象画像ごとに当該分類対象画像に含まれる人物画像の顔と特定画像に含まれる人物画像の顔との類似度を判定する類似度判定手段を備え、
前記特徴ベクトル生成手段は、前記類似度判定手段の判定結果に基づいて、前記各分類対象画像ごとに、当該分類対象画像の特徴を示す特徴ベクトルを生成するようになっていることを特徴とする。
【００２９】
このような構成であれば、各分類対象画像ごとに、類似度判定手段により、その分類対象画像に含まれる人物画像の顔と特定画像に含まれる人物画像の顔との類似度が判定され、特徴ベクトル生成手段により、その判定結果に基づいて、その分類対象画像の特徴を示す特徴ベクトルが生成される。
これにより、人物画像の顔同士の類似度を考慮して分類が行われるので、人物画像の顔が類似する画像同士が同一グループに属するような分類結果を得ることができるという効果も得られる。
〔発明１２〕
さらに、発明１２の画像分類システムは、発明９ないし１１のいずれかの画像分類システムにおいて、
前記画像分類手段は、前記分類対象画像の特徴ベクトル同士のベクトル間距離に基づいて前記各分類対象画像を複数のグループに分類し、前記各グループごとに所定数の分類対象画像を前記分類対象画像記憶手段のなかから索出するようになっていることを特徴とする。
【００３０】
このような構成であれば、画像分類手段により、分類対象画像の特徴ベクトル同士のベクトル間距離に基づいて各分類対象画像が複数のグループに分類され、各グループごとに所定数の分類対象画像が分類対象画像記憶手段のなかから索出される。
これにより、異なるグループから所定数の分類対象画像が索出されるので、多様な検索結果を得ることができるという効果も得られる。
〔発明１３〕
さらに、発明１３の画像分類システムは、発明９ないし１２のいずれかの画像分類システムにおいて、
前記特徴ベクトル生成手段は、前記分類対象画像同士のアスペクト比が異なるときは、アスペクト比が異なる第１画像および第２画像を重ね合わせ、前記第１画像のうち重複領域について前記第１画像の特徴ベクトルを生成し、前記第２画像のうち重複領域について前記第２画像の特徴ベクトルを生成するようになっていることを特徴とする。
【００３１】
このような構成であれば、分類対象画像同士のアスペクト比が異なると、特徴ベクトル生成手段により、アスペクト比が異なる第１画像および第２画像を重ね合わせ、第１画像のうち重複領域について第１画像の特徴ベクトルが生成され、第２画像のうち重複領域について第２画像の特徴ベクトルが生成される。
これにより、アスペクト比が異なる画像同士であっても、比較的正確に類否を判定することができるので、利用者の希望にさらに添った分類結果を得ることができるという効果も得られる。
〔発明１４〕
一方、上記目的を達成するために、発明１４の画像検索プログラムは、
与えられた検索キー画像をもとに、複数の検索対象画像のなかから前記検索キー画像に適合する画像を検索するプログラムであって、
前記複数の検索対象画像を記憶するための検索対象画像記憶手段と、前記検索キー画像を入力する検索キー画像入力手段とを利用可能なコンピュータに対して、
前記検索キー画像入力手段で入力した検索キー画像および前記検索対象画像記憶手段の各検索対象画像について当該画像から注目領域を抽出する注目領域抽出手段、前記検索キー画像および前記各検索対象画像について前記注目領域抽出手段で抽出した注目領域に基づいて当該画像の特徴を示す特徴ベクトルを生成する特徴ベクトル生成手段、並びに前記特徴ベクトル生成手段で生成した特徴ベクトルに基づいて前記検索対象画像記憶手段のなかから前記検索キー画像に適合する画像を検索する画像検索手段として実現される処理を実行させるためのプログラムであることを特徴とする。
【００３２】
このような構成であれば、コンピュータによってプログラムが読み取られ、読み取られたプログラムに従ってコンピュータが処理を実行すると、発明２の画像検索システムと同等の作用および効果が得られる。
〔発明１５〕
一方、上記目的を達成するために、発明１５の画像分類プログラムは、
複数の分類対象画像を分類するプログラムであって、
前記複数の分類対象画像を記憶するための分類対象画像記憶手段を利用可能なコンピュータに対して、
前記分類対象画像記憶手段の各分類対象画像ごとに当該分類対象画像から注目領域を抽出する注目領域抽出手段、前記注目領域抽出手段で抽出した注目領域に基づいて前記各分類対象画像ごとに当該分類対象画像の特徴を示す特徴ベクトルを生成する特徴ベクトル生成手段、および前記特徴ベクトル生成手段で生成した特徴ベクトルに基づいて前記各分類対象画像を複数のグループに分類する画像分類手段として実現される処理を実行させるためのプログラムであることを特徴とする。
【００３３】
このような構成であれば、コンピュータによってプログラムが読み取られ、読み取られたプログラムに従ってコンピュータが処理を実行すると、発明９の画像分類システムと同等の作用および効果が得られる。
〔発明１６〕
一方、上記目的を達成するために、発明１６の画像検索方法は、
与えられた検索キー画像をもとに、複数の検索対象画像を記憶した検索対象画像記憶手段のなかから前記検索キー画像に適合する画像を検索する方法であって、
前記検索キー画像を入力する検索キー画像入力ステップと、
前記検索キー画像入力ステップで入力した検索キー画像から注目領域を抽出する第１注目領域抽出ステップと、
前記第１注目領域抽出ステップで抽出した注目領域に基づいて前記検索キー画像の特徴を示す特徴ベクトルを生成する第１特徴ベクトル生成ステップと、
前記検索対象画像から注目領域を抽出する第２注目領域抽出ステップと、
前記第２注目領域抽出ステップで抽出した注目領域に基づいて前記検索対象画像の特徴を示す特徴ベクトルを生成する第２特徴ベクトル生成ステップと、
前記第２注目領域抽出ステップおよび前記第２特徴ベクトル生成ステップを前記検索対象画像記憶手段の各検索対象画像ごとに繰り返し行う繰返ステップと、前記第１特徴ベクトル生成ステップおよび前記第２特徴ベクトル生成ステップで生成した特徴ベクトルに基づいて前記検索対象画像記憶手段のなかから前記検索キー画像に適合する画像を検索する画像検索ステップとを含むことを特徴とする。
【００３４】
これにより、発明２の画像検索システムと同等の効果が得られる。
ここで、注目領域抽出ステップは、検索キー画像および各検索対象画像についてその画像から注目領域を抽出すればよい。例えば、検索キー画像または検索対象画像に基づいて誘目度を算出し、算出した誘目度に基づいて注目領域を抽出することができる。以下、発明１７の画像分類方法において同じである。
【００３５】
また、検索キー画像入力ステップは、検索キー画像を入力すればどのような方法であってもよく、例えば、検索対象画像記憶手段のなかから選択した検索対象画像を検索キー画像として入力してもよいし、画像記憶媒体、ネットワークまたは他の画像記憶手段から検索キー画像を入力してもよい。
〔発明１７〕
一方、上記目的を達成するために、発明１７の画像分類方法は、
複数の分類対象画像を分類する方法であって、
前記分類対象画像から注目領域を抽出する注目領域抽出ステップと、
前記注目領域抽出ステップで抽出した注目領域に基づいて前記分類対象画像の特徴を示す特徴ベクトルを生成する特徴ベクトル生成ステップと、
前記注目領域抽出ステップおよび前記特徴ベクトル生成ステップを前記各分類対象画像ごとに繰り返し行う繰返ステップと、
前記特徴ベクトル生成ステップで生成した特徴ベクトルに基づいて前記各分類対象画像を複数のグループに分類する画像分類ステップとを含むことを特徴とする。
【００３６】
これにより、発明９の画像分類システムと同等の効果が得られる。
【００３７】
【発明の実施の形態】
以下、本発明の実施の形態を図面を参照しながら説明する。図１ないし図７は、本発明に係る画像検索システム、画像分類システム、画像検索プログラムおよび画像分類プログラム、並びに画像検索方法および画像分類方法の実施の形態を示す図である。
【００３８】
本実施の形態は、本発明に係る画像検索システム、画像分類システム、画像検索プログラムおよび画像分類プログラム、並びに画像検索方法および画像分類方法を、利用者が注目する箇所を考慮して画像の類似検索を行う場合について適用したものである。
本実施の形態では、画像のなかで利用者が注目すると思われる箇所（以下、注目領域という。）の抽出基準として「誘目度」という概念を用いる。誘目度の算出方法は、例えば、「特開２００１−１２６０７０号公報（注目領域抽出装置およびそれを用いた自動構図決定装置）に詳細に開示されている。
【００３９】
誘目度について簡単に説明する。
注目領域の抽出のために、原画像の物理的特徴に従って誘目度を評価する。ここで、誘目度とは、人間の主観に合ったパラメータをいう。注目領域の抽出は、評価結果から一番目立つ領域を注目領域として抽出する。つまり、注目領域の評価の際は、物理的特徴に従って人間の主観に合った評価をするので、人間の主観に適合した注目領域を抽出することができる。
【００４０】
例えば、物理的特徴が色の異質度を含む場合、各領域の色の違いに基づいて誘目度を評価することができる。
また、物理的特徴が、色の異質度に加えて、形の異質度、面積の異質度およびテクスチャ（模様）の異質度をさらに含むので、この４つの異質度の少なくとも１つの異質度に基づいて誘目度を評価すれば、原画像の特徴に応じて的確に誘目度を評価することができる。
【００４１】
また、色の３要素（色相、彩度、明度）についても評価する場合であれば、人間の主観による目立つ色（赤色）に近い領域を最も目立つ領域と評価することができる。
さらに、空間周波数や原画像における各領域の面積についても評価すれば、最も目立つ領域の評価をさらに的確に判定することができる。
【００４２】
また、誘目度の評価は、以下の手順により行う。
（１）最初に原画像を領域分割する。この場合、原画像を図領域と絵領域に分割する。領域分割の方法には、１９９７ＩＥＥＥにおいてＷ．Ｙ．ＭａやＢ．Ｓ．Ｍａｎｊｕｎａｔｈらが「ＥｄｇｅＦｌｏｗ：ＡＦｒａｍｅｗｏｒｋｏｆＢｏｕｎｄａｒｙＤｅｔｅｃｔｉｏｎａｎｄＩｍａｇｅＳｅｇｍｅｎｔａｔｉｏｎ」に記載した”ｅｄｇｅｆｌｏｗ”に基づく境界検出方法が適用される。
（２）次に、分割した図領域を抽出し、領域の誘目度を評価する。
【００４３】
誘目度の評価は、概略以下のようにして行う。
最初に、各領域の異質性誘目度を求める。この場合、色の異質度、テクスチャの異質度、形の異質度および面積の異質度を各々求め、それぞれに重み係数を付与して線形結合し、各領域の異質性誘目度を求める。
次に、各領域における特徴誘目度を求める。この場合、色の誘目度、空間周波数の誘目度、面積の誘目度を求め、それぞれに重み係数を付与して線形結合し、各領域の特徴誘目度を求める。
【００４４】
次に、各領域の異質性誘目度と特徴誘目度を加算し、特徴量統合値を求め、特徴量統合値を所定のベータ関数により評価して、誘目度を算出する。
（３）また、原画像から誘目度を評価したパターン図を生成する。
次に、本発明に係る画像検索装置１００の構成を図１を参照しながら説明する。
【００４５】
図１は、本発明に係る画像検索装置１００の構成を示す機能ブロック図である。
画像検索装置１００は、図１に示すように、複数の検索対象画像を登録した検索対象画像登録データベース（以下、データベースのことを単にＤＢと略記する。）１０と、検索キー画像を指定する検索キー画像指定部１２と、検索キー画像指定部１２で指定された検索対象画像を検索キー画像として検索対象画像登録ＤＢ１０から読み出す検索キー画像読出部１４とを有して構成されている。さらに、検索キー画像読出部１４で読み出した検索キー画像および検索対象画像登録ＤＢ１０の各検索対象画像についてその画像から注目領域を抽出する注目領域抽出部１６と、検索キー画像読出部１４で読み出した検索キー画像および検索対象画像登録ＤＢ１０の各検索対象画像について顔情報および類似度を判定する顔画像処理部１８と、注目領域抽出部１６で抽出した注目領域および顔画像処理部１８の判定結果に基づいて画像の特徴を示す特徴ベクトルを生成する特徴ベクトル生成部２０とを有して構成されている。さらに、検索条件を指定する検索条件指定部２２と、検索条件指定部２２で指定された検索条件および特徴ベクトル生成部２０で生成した特徴ベクトルに基づいて検索対象画像登録ＤＢ１０のなかから画像を検索する画像検索部２４と、検索結果の表示形態を指定する表示形態指定部２６と、表示形態指定部２６で指定された表示形態で検索結果の画像を表示する画像表示部２８とを有している。
【００４６】
顔画像処理部１８は、検索キー画像読出部１４で読み出した検索キー画像および検索対象画像登録ＤＢ１０の各検索対象画像についてその画像に人物画像の顔に相当する領域（以下、顔領域という。）が含まれているか否かを判定する顔領域判定部３４と、顔領域判定部３４の判定結果に基づいて画像に含まれる人物画像の顔の向き、大きさおよび重心位置を判定する顔情報判定部３６と、顔領域判定部３４の判定結果に基づいて各検索対象画像ごとにその検索対象画像に含まれる人物画像の顔と検索キー画像に含まれる人物画像の顔との類似度を判定する類似度判定部３８とを有して構成されている。
【００４７】
具体的に、画像検索装置１００は、図２に示すように、コンピュータ２００およびこれに実行させるプログラムとして実現することができる。コンピュータ２００の構成を図２を参照しながら説明する。
図２は、コンピュータ２００の構成を示すブロック図である。
コンピュータ２００は、図２に示すように、制御プログラムに基づいて演算およびシステム全体を制御するＣＰＵ５０と、所定領域にあらかじめＣＰＵ５０の制御プログラム等を格納しているＲＯＭ５２と、ＲＯＭ５２等から読み出したデータやＣＰＵ５０の演算過程で必要な演算結果を格納するためのＲＡＭ５４と、外部装置に対してデータの入出力を媒介するＩ／Ｆ５８とで構成されており、これらは、データを転送するための信号線であるバス５９で相互にかつデータ授受可能に接続されている。
【００４８】
Ｉ／Ｆ５８には、外部装置として、検索対象画像登録ＤＢ１０と、ヒューマンインターフェースとしてデータの入力が可能なキーボードやマウス等からなる入力装置６０と、画像信号に基づいて画面を表示する表示装置６４とが接続されている。
ＣＰＵ５０は、マイクロプロセッシングユニット（ＭＰＵ）等からなり、ＲＯＭ５２の所定領域に格納されている所定のプログラムを起動させ、そのプログラムに従って、図３のフローチャートに示す画像検索処理を実行するようになっている。
【００４９】
図３は、画像検索処理を示すフローチャートである。
画像検索処理は、入力装置６０から検索要求の入力を受けて実行される処理であって、ＣＰＵ５０において実行されると、図３に示すように、まず、ステップＳ１００に移行するようになっている。
ステップＳ１００では、検索条件の指定を入力する。検索条件としては、検索対象画像登録ＤＢ１０のなかから検索キー画像に最も類似する画像を検索する類似画像検索モード、検索対象画像登録ＤＢ１０のなかから検索キー画像に類似する複数の画像を検索する類似画像群検索モード、および検索対象画像登録ＤＢ１０のなかから性質の異なる複数の画像を検索するバラエティ検索モードを指定することができる。
【００５０】
次いで、ステップＳ１０２に移行して、表示形態の指定を入力する。表示形態としては、検索条件に合致する画像を大きく表示し検索条件に合致しない画像を小さく表示する拡大表示モード、および検索条件に合致する画像を鮮明に表示し検索条件に合致しない画像をぼかして表示する鮮明表示モードを指定することができる。
【００５１】
次いで、ステップＳ１０４に移行して、検索対象画像登録ＤＢ１０のなかから検索キー画像を指定する。なお、検索条件としてバラエティ検索モードを指定した場合は、検索キー画像の指定は不要となる。以下、ステップＳ１０６〜Ｓ１２６およびステップＳ１３４の処理は、検索キー画像を含むすべての検索対象画像について行う。
【００５２】
次いで、ステップＳ１０６に移行して、検索対象画像登録ＤＢ１０のなかから先頭の検索対象画像を読み出し、ステップＳ１０８に移行する。
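In step S108, saliency is calculated from the read search target image, and regions of interest are extracted based on the calculated saliency. The regions of interest are extracted by the method described above. Because the absolute value of the saliency can be influenced by the individual search target image, the saliency is normalized and the degree of attention of each region of interest is divided into a predetermined number of levels (for example, 10) so that all search target images are evaluated equally. Hereinafter, the saliency calculated for each pixel of the search target image is denoted e'_xy, where x and y are the X and Y coordinates of the pixel in the search target image.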
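The normalization and level binning of step S108 can be sketched as follows. The source does not say how the saliency is normalized, so per-image min-max scaling, the function name `quantize_saliency`, and the numbering of levels from 1 upward are all assumptions made for illustration.

```python
from typing import List


def quantize_saliency(saliency: List[List[float]], levels: int = 10) -> List[List[int]]:
    """Min-max normalize a per-pixel saliency map e'_xy and bin it into 1..levels."""
    flat = [v for row in saliency for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # ASSUMPTION: a flat map carries no ranking, so every pixel gets level 1.
        return [[1 for _ in row] for row in saliency]
    span = hi - lo
    result = []
    for row in saliency:
        out_row = []
        for v in row:
            norm = (v - lo) / span                        # scaled into [0, 1]
            out_row.append(min(int(norm * levels) + 1, levels))
        result.append(out_row)
    return result
```

Scaling each image independently means the most salient pixel of every image lands in the top level, which is one way to make images with different absolute saliency comparable.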
【００５３】
図４は、縦向きの検索対象画像の一例を示す図である。
図４（ａ）の例では、撮影向きが縦方向となっており、右下に花の画像が配置されている。この場合、注目領域を算出すると、例えば、図４（ｂ）に示すように、花の画像のうち花の部分およびその近傍に相当する領域が最も注目度合いの高い注目領域Ａとして抽出され、花の画像のうち茎および葉の部分並びにその近傍に相当する領域が２番目に注目度合いの高い注目領域Ｂとして抽出される。その他の領域は、注目度合いの低い領域Ｃとして抽出される。
【００５４】
図５は、横向きの検索対象画像の一例を示す図である。
図５（ａ）の例では、撮影向きが横方向となっており、右下に花の画像が配置されている。この場合、注目領域を算出すると、例えば、図５（ｂ）に示すように、花の画像のうち花の部分およびその近傍に相当する領域が最も注目度合いの高い注目領域Ａとして抽出され、花の画像のうち茎および葉の部分並びにその近傍に相当する領域が２番目に注目度合いの高い注目領域Ｂとして抽出される。その他の領域は、注目度合いの低い領域Ｃとして抽出される。このように、図４の検索対象画像とほぼ同様の領域が同様の注目の度合いで抽出されることが分かる。
【００５５】
次いで、ステップＳ１１０に移行して、読み出した検索対象画像に顔領域が含まれているか否かを判定し、ステップＳ１１８に移行する。
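In step S118, the orientation, size, and centroid position of the faces of the person images contained in the search target image are determined based on the result of step S110. Specifically, assuming that the search target image contains a plurality of face regions, referred to collectively as the detected face region group, the following are calculated: the total area f1 that the detected face region group occupies in the search target image; the mean f2 of those areas; the variance f3 of those areas; the mean f4 (−π/2 to π/2) of the horizontal frontality, which indicates how directly each face in the group faces the front in the horizontal direction; the variance f5 of the horizontal frontality; the mean f6 (−π/2 to π/2) of the vertical frontality, which indicates how directly each face faces the front in the vertical direction; the variance f7 of the vertical frontality; the mean f8 of the centroid positions of the detected face regions; and the variance f9 of those centroid positions. When the search target image contains only one face region, f1 and f2 are the area of that face region, and f4 and f6 are its horizontal and vertical frontality, respectively. The horizontal frontality takes a smaller value the more the face in the detected face region is turned horizontally away from the front, and the vertical frontality takes a smaller value the more the face is turned vertically away from the front. Hereinafter, unless a distinction is needed, the horizontal and vertical frontality are collectively called frontality. The area of a face region is normalized by the size of the search target image.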
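A minimal sketch of how the face information f1 to f9 might be computed from a list of detected faces. The `Face` record, the use of population variance, the all-zero result when no face is found, and the reduction of each centroid to a single scalar are assumptions; the source does not specify any of them.

```python
from dataclasses import dataclass
from statistics import mean, pvariance
from typing import List, Sequence


@dataclass
class Face:
    area: float          # face area divided by image area (normalized)
    h_frontality: float  # horizontal frontality, -pi/2 .. pi/2
    v_frontality: float  # vertical frontality, -pi/2 .. pi/2
    centroid: float      # ASSUMPTION: centroid position reduced to one scalar


def face_info(faces: Sequence[Face]) -> List[float]:
    """Return [f1, ..., f9] for the detected face region group."""
    if not faces:
        return [0.0] * 9  # ASSUMPTION: no detected face gives all zeros
    areas = [f.area for f in faces]
    h = [f.h_frontality for f in faces]
    v = [f.v_frontality for f in faces]
    c = [f.centroid for f in faces]
    return [
        sum(areas), mean(areas), pvariance(areas),  # f1, f2, f3
        mean(h), pvariance(h),                      # f4, f5
        mean(v), pvariance(v),                      # f6, f7
        mean(c), pvariance(c),                      # f8, f9
    ]
```

With a single face the population variance is zero, so f1 and f2 equal the face area and f4 and f6 equal its frontality, which matches the single-face case described above.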
【００５６】
次いで、ステップＳ１２０に移行して、検索対象画像に含まれる人物画像の顔と検索キー画像に含まれる人物画像の顔との類似度を判定する。例えば、検索キー画像に被写体Ａ，Ｂ，Ｃの人物画像が含まれている場合、検索対象画像に含まれる各顔領域ごとに、被写体Ａの顔領域の顔との類似度、被写体Ｂの顔領域の顔との類似度、および被写体Ｃの顔領域の顔との類似度をそれぞれ判定する。なお、検索条件としてバラエティ検索モードを指定した場合は、検索キー画像が存在しないので、検索対象画像に含まれる人物画像の顔と、あらかじめ設定した特定画像に含まれる人物画像の顔との類似度を判定する。
【００５７】
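Next, the process proceeds to step S124, where the feature vector V of the search target image is generated based on the regions of interest extracted in step S108 and the determination results of steps S118 and S120. The feature vector V consists broadly of a first element group corresponding to the saliency of the regions of interest, a second element group corresponding to the face information f1 to f9, and a third element group corresponding to the similarity.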
【００５８】
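The first element group of the feature vector V is determined as follows: the search target image is divided into a plurality of regions (for example, N rectangular regions horizontally by M vertically), the mean saliency e_ij of each divided region (i, j) is calculated by equation (1) below, and the elements are determined based on e_ij. The divided region (i, j) is the i-th region horizontally (i = 1 to N) and the j-th region vertically (j = 1 to M) in the search target image.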
【００５９】
【数１】
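The image of equation (1) did not survive extraction. A form consistent with the surrounding description (a square block of 2s × 2s pixels centered at (x_i, y_j), averaged over the per-pixel saliency e'_xy) would be the following; the exact summation bounds are an assumption.

```latex
e_{ij} = \frac{1}{(2s)^{2}} \sum_{x = x_i - s}^{x_i + s - 1} \; \sum_{y = y_j - s}^{y_j + s - 1} e'_{xy} \qquad (1)
```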

【００６０】
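Equation (1) above calculates the mean saliency e_ij of the divided region (i, j) when each divided region is a square of 2s × 2s pixels. In equation (1), x_i is the x-coordinate of the center point of the divided region (i, j), and y_j is the y-coordinate of that center point.
Accordingly, by equation (2) below, the first element group of the feature vector V is obtained by multiplying the mean saliency e_ij of each divided region by its own independent coefficient E_ij and listing the products as elements. When the search target image is divided into N regions horizontally and M regions vertically, the first element group consists of N × M elements.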
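The block averaging and weighting described here can be sketched as below. The function name `first_element_group`, the row-by-row ordering of the output, and the default weight of 1.0 are assumptions for illustration; the source only states that each block mean e_ij is multiplied by its own coefficient E_ij.

```python
from typing import List, Optional


def first_element_group(
    saliency: List[List[float]],
    s: int,
    weights: Optional[List[List[float]]] = None,
) -> List[float]:
    """Average saliency over 2s x 2s blocks and scale each mean by E_ij."""
    side = 2 * s
    rows = len(saliency) // side       # M blocks vertically
    cols = len(saliency[0]) // side    # N blocks horizontally
    elements = []
    for j in range(rows):
        for i in range(cols):
            total = sum(
                saliency[y][x]
                for y in range(j * side, (j + 1) * side)
                for x in range(i * side, (i + 1) * side)
            )
            e_ij = total / (side * side)                    # block mean
            w = 1.0 if weights is None else weights[j][i]   # coefficient E_ij
            elements.append(w * e_ij)
    return elements                    # N * M elements in total
```

Pixels that do not fill a whole block at the right or bottom edge are ignored in this sketch.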
【００６１】
【数２】
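The image of equation (2) is missing. From the description, the first element group is the list of products of each block mean and its coefficient; the element ordering shown is an assumption.

```latex
\left( E_{11} e_{11},\; E_{12} e_{12},\; \dots,\; E_{NM} e_{NM} \right) \qquad (2)
```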

【００６２】
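By equation (3) below, the second element group of the feature vector V is obtained by multiplying the face information f1 to f9 determined in step S118 by independent coefficients F_1 to F_9, respectively, and listing the products as elements.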
【００６３】
【数３】
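The image of equation (3) is missing. Read from the description, the second element group would take this form.

```latex
\left( F_{1} f_{1},\; F_{2} f_{2},\; \dots,\; F_{9} f_{9} \right) \qquad (3)
```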

【００６４】
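By equation (4) below, the third element group of the feature vector V is obtained by multiplying each similarity p_k determined in step S120 by its own independent coefficient P_k and listing the products as elements. For example, when the search key image contains K person images, the similarity to the face of face region k (k = 1 to K) in the search key image is calculated for each face region in the search target image. When a face is similar to the face of face region k (the similarity is at or above a predetermined value), p_k = 1; when it is not (the similarity is below the predetermined value), p_k = 0.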
【００６５】
【数４】
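The image of equation (4) is missing. Read from the description, with p_k taking the value 0 or 1, the third element group would take this form.

```latex
\left( P_{1} p_{1},\; P_{2} p_{2},\; \dots,\; P_{K} p_{K} \right), \qquad p_{k} \in \{0, 1\} \qquad (4)
```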

【００６６】
Thus, by equation (5) below, the feature vector V is expressed as the list of the elements of the first, second, and third element groups.
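Assembling the three element groups into one vector can be sketched as follows. The function name, the flat-list representation, and the default threshold of 0.5 are assumptions; the source says only that the similarity is compared against a predetermined value. The sketch also assumes the caller supplies one raw similarity score per face region k of the search key image.

```python
from typing import List, Sequence


def build_feature_vector(
    block_saliency: Sequence[float],  # e_ij, flattened
    E: Sequence[float],               # coefficients E_ij, same order
    face_info: Sequence[float],       # f1 .. f9
    F: Sequence[float],               # coefficients F_1 .. F_9
    similarities: Sequence[float],    # raw similarity to key face k
    P: Sequence[float],               # coefficients P_1 .. P_K
    threshold: float = 0.5,           # ASSUMPTION: the "predetermined value"
) -> List[float]:
    """Concatenate the weighted first, second, and third element groups."""
    if len(block_saliency) != len(E) or len(face_info) != len(F) or len(similarities) != len(P):
        raise ValueError("each element group needs one coefficient per element")
    first = [w * e for w, e in zip(E, block_saliency)]
    second = [w * f for w, f in zip(F, face_info)]
    # p_k is 1 when the face is judged similar to key face k, else 0.
    p = [1.0 if sim >= threshold else 0.0 for sim in similarities]
    third = [w * pk for w, pk in zip(P, p)]
    return first + second + third
```

Because the coefficients scale each group independently, raising the P values makes face identity dominate the distance between vectors, while raising the E values makes the layout of salient regions dominate.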
【００６７】
【数５】
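The image of equation (5) is missing. Concatenating the three element groups as described gives the following form; the element ordering within each group is an assumption.

```latex
V = \left( E_{11} e_{11}, \dots, E_{NM} e_{NM},\; F_{1} f_{1}, \dots, F_{9} f_{9},\; P_{1} p_{1}, \dots, P_{K} p_{K} \right) \qquad (5)
```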

【００６８】
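Next, the process proceeds to step S126, where it is determined whether steps S108 to S124 have been completed for all search target images in the search target image registration DB 10. When they have (Yes), the process proceeds to step S128.
In step S128, the search target images are clustered into a plurality of clusters based on the inter-vector distances between their feature vectors V. The clustering can be performed, for example, with the conventional K-means method. In the K-means method, as a first process, K feature vectors V are chosen arbitrarily, and each chosen feature vector V_k (k = 1 to K) is set as the center position of cluster k. The center position of cluster k is denoted m_k. Next, as a second process, the inter-vector distance between the feature vector V_i (i = 1 to N, where N is the total number of search target images) and the center position m_k of each cluster k is calculated by equation (6) below, and V_i is assigned to the cluster k for which the distance is smallest. Equation (6) gives the inter-vector distance between a feature vector V_A and a feature vector V_B.
S = |V_A − V_B| … (6)
Next, as a third process, the center position m_k of cluster k is replaced with the mean of the feature vectors V_i belonging to cluster k. Next, as a fourth process, if i < N, i is incremented by 1 and the second and third processes are performed again. Then, as a fifth process, if m_k changed in the third process, i is reset to 1 and the second and third processes are repeated. When m_k no longer changes in the third process, or when the processing has been repeated a fixed number of times or more, the processing ends, and the center position m_k of each cluster k and the feature vectors V_i belonging to it are fixed.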
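The five processes above can be sketched as follows. This follows the per-vector update order described (assign one vector, then refresh the affected centers). Taking the first K vectors as initial centers, reading equation (6) as the Euclidean norm, and the names `k_means` and `max_passes` are assumptions for illustration.

```python
import math
from typing import List, Optional, Sequence, Tuple


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Inter-vector distance of equation (6), read here as the Euclidean norm."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def k_means(
    vectors: Sequence[Sequence[float]], k: int, max_passes: int = 100
) -> Tuple[List[List[float]], List[Optional[int]]]:
    # Process 1: choose K vectors as initial centers (ASSUMPTION: the first K).
    centers = [list(v) for v in vectors[:k]]
    labels: List[Optional[int]] = [None] * len(vectors)
    for _ in range(max_passes):            # fixed upper bound on repetitions
        before = [c[:] for c in centers]
        for i, v in enumerate(vectors):    # process 4: step through i = 1..N
            # Process 2: assign V_i to the cluster with the nearest center.
            nearest = min(range(k), key=lambda c: distance(v, centers[c]))
            old, labels[i] = labels[i], nearest
            # Process 3: replace affected centers with the mean of their members.
            for c in {nearest, old} - {None}:
                members = [vectors[j] for j, lab in enumerate(labels) if lab == c]
                if members:
                    centers[c] = mean_vector(members)
        if centers == before:              # process 5: stop when m_k is unchanged
            break
    return centers, labels
```

A batch variant that reassigns all vectors before updating any center is the more common formulation; the sequential version is used here only because it mirrors the order of the processes in the text.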
【００６９】
図６は、ｎ個の特徴ベクトルＶを２つのクラスタにクラスタリングした場合を示す図である。
図６の例では、特徴ベクトルＶ_１，Ｖ_２，Ｖ_３，Ｖ_ｎは、中心位置ｍ_１のクラスタに属し、特徴ベクトルＶ_４，Ｖ_５，Ｖ_６，Ｖ_ｎ−１は、中心位置ｍ_２のクラスタに属している。
【００７０】
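Next, the process proceeds to step S130, where images are retrieved from the search target image registration DB 10 according to the specified search condition. When the similar image search mode is specified, the search target image whose feature vector V has the smallest inter-vector distance to the feature vector V of the search key image is retrieved from the search target image registration DB 10. When the similar image group search mode is specified, all search target images in the cluster to which the search key image belongs are retrieved. When the variety search mode is specified, a predetermined number of search target images is retrieved from each cluster.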
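The three search modes of step S130 can be sketched as a single dispatch function. The names `SearchMode` and `search`, the exclusion of the key image from the nearest-neighbor result, and the choice of the first images in database order for the variety mode are assumptions; the source does not say which images of a cluster are picked.

```python
import math
from enum import Enum
from typing import List, Optional, Sequence


class SearchMode(Enum):
    SIMILAR_IMAGE = 1   # single most similar image
    SIMILAR_GROUP = 2   # every image in the key image's cluster
    VARIETY = 3         # a few images from every cluster


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(
    mode: SearchMode,
    vectors: Sequence[Sequence[float]],
    labels: Sequence[int],
    key_index: Optional[int] = None,
    per_cluster: int = 1,
) -> List[int]:
    """Return indices of the retrieved search target images."""
    if mode is not SearchMode.VARIETY and key_index is None:
        raise ValueError("this mode needs a search key image")
    if mode is SearchMode.SIMILAR_IMAGE:
        # ASSUMPTION: the key image itself is skipped, since its distance is 0.
        others = [i for i in range(len(vectors)) if i != key_index]
        return [min(others, key=lambda i: distance(vectors[i], vectors[key_index]))]
    if mode is SearchMode.SIMILAR_GROUP:
        return [i for i, lab in enumerate(labels) if lab == labels[key_index]]
    picked = {}
    for i, lab in enumerate(labels):
        bucket = picked.setdefault(lab, [])
        if len(bucket) < per_cluster:      # ASSUMPTION: first images in DB order
            bucket.append(i)
    return [i for lab in sorted(picked) for i in picked[lab]]
```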
【００７１】
次いで、ステップＳ１３２に移行して、索出した検索対象画像を指定の表示形態で表示装置６４に表示し、一連の処理を終了して元の処理に復帰させる。
図７は、検索結果を表示した表示画面を示す図である。
図７の例では、検索対象画像１〜ｎを拡大表示モードおよび鮮明表示モードで表示した場合であり、画像１、２、３、ｎは、検索結果として索出された画像であるので、大きくかつ鮮明に表示されているのに対して、その他の画像４、５、６、ｎ−１は、小さくかつぼかして表示されている。
【００７２】
一方、ステップＳ１２６で、検索対象画像登録ＤＢ１０のすべての検索対象画像についてステップＳ１０８〜Ｓ１２４の処理が終了していないと判定したとき（Ｎｏ）は、ステップＳ１３４に移行して、検索対象画像登録ＤＢ１０のなかから次の検索対象画像を読み込み、ステップＳ１０８に移行する。
次に、本実施の形態の動作を説明する。
【００７３】
まず、類似画像検索モードにより画像の類似検索を行う場合を説明する。
類似画像検索モードにより画像の類似検索を行う場合、利用者は、検索要求を入力し、検索モードとして類似画像検索モードを指定するとともに検索キー画像を指定する。また、併せて表示形態も指定する。
画像検索装置１００では、類似画像検索モードおよび検索キー画像が指定されると、ステップＳ１０６〜Ｓ１１０を経て、検索対象画像登録ＤＢ１０のなかから先頭の検索対象画像が読み出され、読み出された検索対象画像から注目領域が抽出されるととともに検索対象画像に顔領域が含まれているか否かが判定される。次いで、ステップＳ１１８，Ｓ１２０を経て、ステップＳ１１０の判定結果に基づいて、検索対象画像に含まれる人物画像の顔の向き、大きさおよび重心位置が判定され、検索対象画像に含まれる人物画像の顔と検索キー画像に含まれる人物画像の顔との類似度が判定される。検索対象画像に複数の被写体の人物画像が含まれている場合は、ステップＳ１１０〜Ｓ１２０を繰り返し経て、各顔領域ごとに顔情報および類似度が判定される。
【００７４】
次いで、検索対象画像のすべての顔領域について顔情報および類似度が判定されると、ステップＳ１２４を経て、抽出された注目領域、並びに判定された顔情報および類似度に基づいて検索対象画像の特徴ベクトルＶが生成される。
このような処理が検索対象画像登録ＤＢ１０のすべての検索対象画像について行われると、ステップＳ１２８，Ｓ１３０を経て、検索対象画像の特徴ベクトルＶ同士のベクトル間距離に基づいて各検索対象画像が複数のクラスタにクラスタリングされ、類似画像検索モードが指定されていることから、検索キー画像の特徴ベクトルＶとのベクトル間距離が最も小さい特徴ベクトルＶに対応する検索対象画像が検索対象画像登録ＤＢ１０のなかから索出される。そして、ステップＳ１３２を経て、索出された検索対象画像が指定の表示形態で表示される。
【００７５】
次に、類似画像群検索モードにより画像の類似検索を行う場合を説明する。
類似画像群検索モードにより画像の類似検索を行う場合、利用者は、検索要求を入力し、検索モードとして類似画像群検索モードを指定するとともに検索キー画像を指定する。また、併せて表示形態も指定する。
各検索対象画像をクラスタリングするまでは、類似画像検索モードで類似検索する場合と同様である。画像検索装置１００では、各検索対象画像がクラスタリングされると、ステップＳ１３０を経て、類似画像群検索モードが指定されていることから、複数のクラスタのうち検索キー画像が属するもののすべての検索対象画像が検索対象画像登録ＤＢ１０のなかから索出される。そして、ステップＳ１３２を経て、索出された検索対象画像が指定の表示形態で表示される。
【００７６】
次に、バラエティ検索モードにより画像の類似検索を行う場合を説明する。
バラエティ検索モードにより画像の類似検索を行う場合、利用者は、検索要求を入力し、検索モードとしてバラエティ検索モードを指定する。また、併せて表示形態も指定する。
各検索対象画像をクラスタリングするまでは、類似画像検索モードで類似検索する場合と同様である。画像検索装置１００では、各検索対象画像がクラスタリングされると、ステップＳ１３０を経て、バラエティ検索モードが指定されていることから、各クラスタごとに所定数の検索対象画像が検索対象画像登録ＤＢ１０のなかから索出される。そして、ステップＳ１３２を経て、索出された検索対象画像が指定の表示形態で表示される。
【００７７】
このようにして、本実施の形態では、検索キー画像および各検索対象画像についてその画像から注目領域を抽出し、検索キー画像および各検索対象画像について抽出した注目領域に基づいてその画像の特徴ベクトルＶを生成し、生成した特徴ベクトルＶに基づいて検索対象画像登録ＤＢ１０のなかから検索キー画像に適合する画像を検索するようになっている。
【００７８】
これにより、利用者が注目する箇所を考慮して検索が行われるので、利用者の主観が検索結果に反映しやすくなる。したがって、従来に比して、利用者の希望に比較的添った検索結果を得ることができる。
さらに、本実施の形態では、検索キー画像および各検索対象画像についてその画像に含まれる人物画像の顔の向き、大きさまたは重心位置を判定し、検索キー画像および各検索対象画像について、判定した顔の向き、大きさまたは重心位置に基づいて、その画像の特徴ベクトルＶを生成するようになっている。
【００７９】
これにより、人物画像の顔の向き、大きさまたは重心位置を考慮して検索が行われるので、検索キー画像に含まれる人物画像の顔に適合する検索結果を得ることができる。
さらに、本実施の形態では、各検索対象画像ごとに、その検索対象画像に含まれる人物画像の顔と検索キー画像に含まれる人物画像の顔との類似度を判定し、検索キー画像および各検索対象画像について、判定した類似度に基づいて、その画像の特徴ベクトルＶを生成するようになっている。
【００８０】
これにより、人物画像の顔同士の類似度を考慮して検索が行われるので、検索キー画像に含まれる人物画像の顔に類似する検索結果を得ることができる。
さらに、本実施の形態では、検索キー画像の特徴ベクトルＶとのベクトル間距離が最も小さい特徴ベクトルＶに対応する検索対象画像を検索対象画像登録ＤＢ１０のなかから索出するようになっている。
【００８１】
これにより、利用者の希望に最も添うと思われる検索結果を得ることができる。
さらに、本実施の形態では、検索対象画像の特徴ベクトルＶ同士のベクトル間距離に基づいて各検索対象画像を複数のクラスタに分類し、複数のクラスタのうち検索キー画像が属するもののすべての検索対象画像を検索対象画像登録ＤＢ１０のなかから索出するようになっている。
【００８２】
これにより、利用者の希望に添うと思われるいくつかの検索結果を得ることができる。
さらに、本実施の形態では、検索対象画像の特徴ベクトルＶ同士のベクトル間距離に基づいて各検索対象画像を複数のクラスタに分類し、各クラスタごとに所定数の検索対象画像を検索対象画像登録ＤＢ１０のなかから索出するようになっている。
【００８３】
これにより、異なるクラスタから所定数の検索対象画像が索出されるので、多様な検索結果を得ることができる。
上記実施の形態において、検索対象画像は、発明８ないし１２、１５または１７の分類対象画像に対応し、検索対象画像登録ＤＢ１０は、発明２、５、６、１４若しくは１６の検索対象画像記憶手段、または発明９、１２若しくは１５の分類対象画像記憶手段に対応している。また、ステップＳ１０４および検索キー画像指定部１２は、発明２若しくは１４の検索キー画像入力手段、または発明１６の検索キー画像入力ステップに対応し、ステップＳ１０８および注目領域抽出部１６は、発明２、９、１４若しくは１５の注目領域抽出手段、発明１７の注目領域抽出ステップ、発明１６の第１注目領域抽出ステップ、または発明１６の第２注目領域抽出ステップに対応している。
【００８４】
また、上記実施の形態において、ステップＳ１１８および顔情報判定部３６は、発明３または１０の顔情報判定手段に対応し、ステップＳ１２０および類似度判定部３８は、発明４または１１の類似度判定手段に対応し、ステップＳ１２４および特徴ベクトル生成部２０は、発明２ないし４、９ないし１１、１４若しくは１５の特徴ベクトル生成手段、発明１６の第２特徴ベクトル生成ステップ、または発明１６の第１特徴ベクトル生成ステップに対応している。また、ステップＳ１２８，Ｓ１３０および画像検索部２４は、発明２、５、６若しくは１４の画像検索手段、発明９、１２若しくは１５の画像分類手段、発明１６の画像検索ステップ、または発明１７の画像分類ステップに対応している。
【００８５】
なお、上記実施の形態においては、検索対象画像のアスペクト比について特に説明しなかったが、検索対象画像のアスペクト比が異なる場合は、次のように画像の類否を判定する。
図８は、アスペクト比が異なる検索対象画像Ａ，Ｂを重ね合わせた場合を示す図である。
【００８６】
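To judge whether search target images A and B with different aspect ratios are similar, the two images are overlaid as shown in FIG. 8; a feature vector V_A of search target image A is generated for the part of A in the overlapping region, a feature vector V_B of search target image B is generated for the part of B in the overlapping region, and the similarity of A and B is judged from V_A and V_B.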
【００８７】
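In this case, the images A and B may additionally be overlaid in different ways so that the overlapping region differs; the mean of the feature vectors V_Ai of image A calculated for each combination (i = 1 to N, where N is the total number of combinations) is then generated as the feature vector V_A of image A, and the mean of the feature vectors V_Bi of image B calculated for each combination is generated as the feature vector V_B of image B.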
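The overlay step can be sketched as computing, for a given placement of B on A, the crop rectangle in each image that falls inside the overlap, and then averaging the feature vectors obtained from several placements. FIG. 8 is not available here, so the placement by pixel offset without rescaling, and the names `overlap_rects` and `average_vectors`, are assumptions.

```python
from typing import List, Optional, Sequence, Tuple

Rect = Tuple[int, int, int, int]  # left, top, right, bottom (right/bottom exclusive)


def overlap_rects(
    size_a: Tuple[int, int], size_b: Tuple[int, int], offset: Tuple[int, int] = (0, 0)
) -> Optional[Tuple[Rect, Rect]]:
    """Place B's top-left corner at `offset` in A's coordinates and return
    the overlapping region expressed in A's and in B's own coordinates."""
    (wa, ha), (wb, hb), (dx, dy) = size_a, size_b, offset
    left, top = max(0, dx), max(0, dy)
    right, bottom = min(wa, dx + wb), min(ha, dy + hb)
    if right <= left or bottom <= top:
        return None                       # the two images do not overlap
    rect_a = (left, top, right, bottom)
    rect_b = (left - dx, top - dy, right - dx, bottom - dy)
    return rect_a, rect_b


def average_vectors(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Mean of the per-placement feature vectors V_Ai (or V_Bi)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]
```

For a 400 × 300 landscape image and a 300 × 400 portrait image aligned at the top-left corner, both crop rectangles cover the same 300 × 300 area, so the feature vectors are computed over regions of identical shape.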
【００８８】
これにより、アスペクト比が異なる検索対象画像同士であっても、比較的正確に類否を判定することができるので、利用者の希望にさらに添った検索結果を得ることができる。
この場合において、ステップＳ１２４および特徴ベクトル生成部２０は、発明７または１３の特徴ベクトル生成手段に対応している。
【００８９】
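In the embodiment above, the first element group of the feature vector V is generated by equation (2): the mean saliency e_ij of each divided region is multiplied by its own independent coefficient E_ij and the products are listed as elements. The first element group is not limited to this. When saliency is used to calculate the regions of interest, the saliency is constant within each segmented region, so in step S108 the group can also be generated as follows. First, H regions of interest are selected from the search target image in descending order of saliency. Next, by equation (7) below, the horizontal center coordinate x_h of region of interest h (h = 1 to H) is multiplied by a coefficient X, and its vertical center coordinate y_h by a coefficient Y. The saliency e_h of region h is multiplied by a coefficient E, and its area s_h by a coefficient S. The products X x_h, Y y_h, E e_h, and S s_h are then listed as elements to form the first element group of the feature vector V.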
【００９０】
【数６】
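The image of equation (7) is missing. Listing the four weighted quantities for each of the H selected regions of interest gives the following form; the grouping of the elements by region is an assumption.

```latex
\left( X x_{1},\; Y y_{1},\; E e_{1},\; S s_{1},\; \dots,\; X x_{H},\; Y y_{H},\; E e_{H},\; S s_{H} \right) \qquad (7)
```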

【００９１】
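In this case, if the number of extracted regions of interest is less than a predetermined number (for example, 10), all elements of the first element group of the feature vector V are set to 0.
In the embodiment above, the processing shown in the flowchart of FIG. 3 is executed by running a control program stored in advance in the ROM 52; however, the program describing these procedures may instead be read into the RAM 54 from a storage medium on which it is stored and then executed.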
また、上記実施の形態において、図３のフローチャートに示す処理を実行するにあたっては、ＲＯＭ５２にあらかじめ格納されている制御プログラムを実行する場合について説明したが、これに限らず、これらの手順を示したプログラムが記憶された記憶媒体から、そのプログラムをＲＡＭ５４に読み込んで実行するようにしてもよい。
【００９２】
ここで、記憶媒体とは、ＲＡＭ、ＲＯＭ等の半導体記憶媒体、ＦＤ、ＨＤ等の磁気記憶型記憶媒体、ＣＤ、ＣＤＶ、ＬＤ、ＤＶＤ等の光学的読取方式記憶媒体、ＭＯ等の磁気記憶型／光学的読取方式記憶媒体であって、電子的、磁気的、光学的等の読み取り方法のいかんにかかわらず、コンピュータで読み取り可能な記憶媒体であれば、あらゆる記憶媒体を含むものである。
【００９３】
また、上記実施の形態においては、本発明に係る画像検索システム、画像分類システム、画像検索プログラムおよび画像分類プログラム、並びに画像検索方法および画像分類方法を、利用者が注目する箇所を考慮して画像の類似検索を行う場合について適用したが、これに限らず、本発明の主旨を逸脱しない範囲で他の場合にも適用可能である。例えば、画像を分類する場合について適用することができる。
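[Brief Description of the Drawings]
[FIG. 1] A functional block diagram showing the configuration of the image search apparatus 100 according to the present invention.
[FIG. 2] A block diagram showing the configuration of the computer 200.
[FIG. 3] A flowchart showing the image search processing.
[FIG. 4] A diagram showing an example of a portrait-orientation search target image.
[FIG. 5] A diagram showing an example of a landscape-orientation search target image.
[FIG. 6] A diagram showing n feature vectors V clustered into two clusters.
[FIG. 7] A diagram showing a display screen on which search results are displayed.
[FIG. 8] A diagram showing search target images A and B with different aspect ratios overlaid on each other.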
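[Description of Reference Numerals]
100: image search apparatus, 200: computer, 10: search target image registration DB, 12: search key image designation unit, 14: search key image reading unit, 16: region-of-interest extraction unit, 18: face image processing unit, 20: feature vector generation unit, 22: search condition designation unit, 24: image search unit, 26: display mode designation unit, 28: image display unit, 34: face region determination unit, 36: face information determination unit, 38: similarity determination unit, 50: CPU, 52: ROM, 54: RAM, 58: I/F, 60: input device, 64: display device
[0001]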
[Technical Field of the Invention]
The present invention relates to a system, a program, and a method for searching a plurality of images for an image that matches a search key image, or for classifying a plurality of images. In particular, it relates to an image search system, an image classification system, an image search program, an image classification program, an image search method, and an image classification method suitable for obtaining search results or classification results that meet the user's wishes.
[0002]
[Prior art]
Conventionally, one technique for searching a plurality of search target images for an image similar to a given search key image is the image search apparatus disclosed in Patent Document 1.
The image search apparatus described in Patent Document 1 includes: a feature vector extraction unit that extracts a plurality of feature amounts from acquired target image data or reference image data to generate a feature vector; a reference vector group processing unit that, for a plurality of acquired reference image data, combines the feature vector extracted by the feature vector extraction unit for each reference image data with the address information of that reference image to create a reference vector, and generates a reference vector group; a similarity calculation unit that calculates the similarity between the feature vector extracted from the acquired target image data and the feature vector of reference image data selected from the reference vector group; a similarity determination unit that compares the calculated similarity with a predetermined criterion; and an extraction unit that extracts, from the reference vector group, the address information of images determined to be similar.
[0003]
Here, color, texture, structural features, and temporal features are used as image feature amounts. Texture features are obtained by calculating density histograms, co-occurrence matrices, difference statistics, and the like; structural features such as edges, lines, and contours are obtained by Laplacian filter convolution, the Hough transform, and the like; color features are obtained by conversion to RGB space, HSV space, or a spectrum; and temporal features are obtained by calculating optical flow or by wavelet transform.
[0004]
[Patent Document 1]
JP 2001-52175 A
[0005]
[Problems to be solved by the invention]
The concept of similarity depends largely on individual subjectivity, so images that one person perceives as similar may not seem similar to another. When performing an image similarity search, it is therefore important how the concept of similarity is defined.
[0006]
Considering the whole of an image and its parts, there exist, relative to a search key image, images that are similar overall but not in their characteristic parts and, conversely, images that are similar in their characteristic parts but not overall, and the similarity of each must be evaluated appropriately. When a user looks at an image, the user pays attention to its characteristic part (the flower, in an image whose main subject is a flower). The user should therefore feel that an image similar in the characteristic part but not overall is more similar than an image similar overall but not in the characteristic part. Consequently, evaluating image similarity by giving weight to similarity in the characteristic parts and less weight to similarity elsewhere matches actual perception better than evaluating similarity over the whole image.
[0007]
However, the image search apparatus described in Patent Document 1 extracts a plurality of feature amounts from the entire image data to generate a feature vector and performs the image similarity search based on that feature vector, so it does not evaluate image similarity with regard to the part the user pays attention to. As a result, the user's subjectivity cannot be sufficiently reflected in the search results, and it is difficult to obtain search results that meet the user's wishes.
[0008]
The same problem is expected not only when performing an image similarity search but also when classifying a plurality of images according to their similarity.
The present invention was made in view of these unresolved problems of the conventional technology, and its object is to provide an image search system, an image classification system, an image search program, an image classification program, an image search method, and an image classification method suitable for obtaining search results or classification results that meet the user's wishes.
[0009]
[Means for Solving the Problems]
[Invention 1]
To achieve the above object, the image search system of Invention 1 is
a system that searches a plurality of search target images for an image that matches a given search key image, characterized in that,
for the search key image and each of the search target images, a region of interest is extracted from the image and a feature vector indicating a feature of the image is generated based on the extracted region of interest, and
an image that matches the search key image is searched for among the plurality of search target images based on the generated feature vectors.
[0010]
With such a configuration, when a search key image is given, a region of interest is extracted from the search key image, and a feature vector indicating a feature of the search key image is generated based on the extracted region of interest. Similarly, for each search target image, a region of interest is extracted from the search target image, and a feature vector indicating the feature of the search target image is generated based on the extracted region of interest. Then, based on the generated feature vector, an image matching the search key image is searched from among the plurality of search target images.
[0011]
As a result, the search is performed in consideration of the part that the user pays attention to, so that the subjectivity of the user is easily reflected in the search result. Therefore, an effect is obtained that a search result that relatively complies with the user's wish can be obtained as compared with the related art.
Here, the attention area (region of interest) refers to an area in the search key image or the search target image that is considered to be noticed by the user. Hereinafter, the same applies to the image search system of Invention 2, the image classification systems of Inventions 8 and 9, the image search program of Invention 14, the image classification program of Invention 15, the image search method of Invention 16, and the image classification method of Invention 17.
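The generation of a feature vector from the attention area can be illustrated by the following minimal sketch. The per-pixel attention score, the threshold, and the use of the mean RGB color as the feature are all assumptions made for illustration; the function name `feature_vector` is hypothetical and does not appear in the specification.

```python
def feature_vector(pixels, attention, threshold=0.5):
    """Mean RGB over the pixels whose attention score exceeds the threshold.

    pixels:    list of (r, g, b) tuples
    attention: list of floats, one score per pixel (assumed representation)
    """
    if not pixels or len(pixels) != len(attention):
        raise ValueError("pixels and attention must be non-empty and equal in length")
    # Keep only the pixels that belong to the (assumed) attention area.
    region = [p for p, a in zip(pixels, attention) if a > threshold]
    if not region:
        # No pixel stood out: fall back to the whole image.
        region = list(pixels)
    n = len(region)
    return tuple(sum(p[c] for p in region) / n for c in range(3))
```

The same function would be applied to the search key image and to every search target image, so that all feature vectors are comparable.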
[0012]
The present system may be realized as a single device, terminal, or other device, or may be realized as a network system in which a plurality of devices, terminals, or other devices are communicably connected. In the latter case, each component may belong to any of a plurality of devices and the like as long as they are communicably connected to each other. Hereinafter, the same applies to the image retrieval system of the second aspect and the image classification systems of the eighth and ninth aspects.
[Invention 2]
Further, the image search system of Invention 2 is
a system for searching, based on a given search key image, for an image that matches the search key image from among a plurality of search target images,
comprising: search target image storage means for storing the plurality of search target images; search key image input means for inputting the search key image; region of interest extraction means for extracting a region of interest from the search key image input by the search key image input means and from each of the search target images in the search target image storage means; feature vector generation means for generating, for the search key image and each of the search target images, a feature vector indicating a feature of the image based on the region of interest extracted by the region of interest extraction means; and image search means for searching the search target image storage means for an image matching the search key image based on the feature vectors generated by the feature vector generation means.
[0013]
With such a configuration, when the search key image is input from the search key image input means, the attention area extraction means extracts the attention area from the input search key image, and the feature vector generation means generates a feature vector indicating the feature of the search key image based on the extracted attention area. Similarly, for each search target image in the search target image storage means, the attention area extraction means extracts an attention area from the search target image, and the feature vector generation means generates a feature vector indicating the feature of the search target image based on the extracted attention area. Then, based on the generated feature vectors, the image search means searches the search target image storage means for an image that matches the search key image.
[0014]
As a result, the search is performed in consideration of the part that the user pays attention to, so that the subjectivity of the user is easily reflected in the search result. Therefore, an effect is obtained that a search result that relatively complies with the user's wish can be obtained as compared with the related art.
Here, the attention area extracting means only needs to extract the attention area from the search key image and each search target image. For example, the degree of interest can be calculated based on the search key image or the search target image, and the attention area can be extracted based on the calculated degree of interest. The same applies to the image classification system of the ninth aspect, the image search program of the fourteenth aspect, and the image classification program of the fifteenth aspect.
[0015]
The search key image input means may have any configuration as long as it inputs the search key image. For example, a search target image selected from the search target image storage means may be input as the search key image, or a search key image may be input from an image storage medium, a network, or other image storage means. The same applies to the image classification system of the ninth aspect, the image search program of the fourteenth aspect, and the image classification program of the fifteenth aspect.
[0016]
The search target image storage means may store the search target images by any means and at any time. It may store the search target images in advance, or, without storing them in advance, it may store search target images supplied by an external input or the like during the operation of the present system.
[Invention 3]
The image search system of Invention 3 is the image search system of Invention 2, further comprising face information determination means for determining face information of a person image included in the search key image and each of the search target images,
wherein the feature vector generation means generates, for the search key image and each of the search target images, a feature vector indicating a feature of the image based on the face information determined by the face information determination means.
[0017]
With such a configuration, the face information determination means determines the face information of the person image included in the search key image, and the feature vector generation means generates a feature vector indicating the feature of the search key image based on the determined face information. Similarly, for each search target image, the face information determination means determines the face information of the person image included in the search target image, and the feature vector generation means generates a feature vector indicating the feature of the search target image based on the determined face information.
[0018]
Accordingly, since the search is performed in consideration of the face information of the person image, a search result matching the face of the person image included in the search key image can also be obtained.
[Invention 4]
Further, the image search system of Invention 4 is the image search system of Invention 2 or 3,
further comprising similarity determination means for determining, for each of the search target images, the similarity between the face of a person image included in the search target image and the face of a person image included in the search key image,
wherein the feature vector generation means generates, for the search key image and each of the search target images, a feature vector indicating a feature of the image based on the determination result of the similarity determination means.
[0019]
With such a configuration, the similarity determination unit determines, for each search target image, the similarity between the face of the person image included in the search target image and the face of the person image included in the search key image. The feature vector generation unit generates a feature vector indicating the feature of the search key image and a feature vector indicating the feature of each search target image based on the determination result.
[0020]
Accordingly, since the search is performed in consideration of the similarity between the faces of the person images, a search result containing a face similar to that of the person image included in the search key image can also be obtained.
[Invention 5]
Further, the image search system of Invention 5 is the image search system of any one of Inventions 2 to 4,
wherein the image search means searches the search target image storage means for the search target image corresponding to the feature vector having the smallest inter-vector distance from the feature vector of the search key image.
[0021]
With such a configuration, the image search means searches the search target image storage means for the search target image corresponding to the feature vector having the smallest inter-vector distance from the feature vector of the search key image.
As a result, it is possible to obtain a search result most likely to meet the user's wish.
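The minimum-distance search of Invention 5 can be sketched as follows. The specification speaks only of an "inter-vector distance"; the Euclidean distance used here is an assumption, and the function name `search_most_similar` and the dictionary layout are hypothetical.

```python
from math import dist

def search_most_similar(key_vector, target_vectors):
    """Return the identifier of the search target image whose feature vector
    has the smallest Euclidean distance to the key image's feature vector.

    target_vectors: dict mapping image identifier -> feature vector
    """
    if not target_vectors:
        raise ValueError("no search target images registered")
    return min(target_vectors, key=lambda name: dist(key_vector, target_vectors[name]))

targets = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (1.0, 1.0)}
best = search_most_similar((1.0, 0.9), targets)  # "c" is the nearest vector
```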
[Invention 6]
Further, the image search system of Invention 6 is the image search system of any one of Inventions 2 to 4,
wherein the image search means classifies the search target images into a plurality of groups based on the inter-vector distance between the feature vectors of the search target images, and retrieves from the search target image storage means all the search target images of the group, among the plurality of groups, to which the search key image belongs.
[0022]
With such a configuration, the image search means classifies the search target images into a plurality of groups based on the inter-vector distance between the feature vectors of the search target images, and retrieves from the search target image storage means all the search target images of the group to which the search key image belongs.
As a result, it is possible to obtain some search results that are considered to satisfy the user's wish.
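The group-based search of Invention 6 can be sketched as follows. The specification does not name a grouping algorithm; this sketch assumes single-linkage grouping with a distance threshold, Euclidean distance, and a key image that is assigned to the group of its nearest search target image. All function names are hypothetical.

```python
from math import dist

def group_by_distance(vectors, threshold):
    """Group images so that two images share a group when a chain of
    neighbours closer than `threshold` connects them (single linkage)."""
    names = list(vectors)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the group representative, shortening the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if dist(vectors[a], vectors[b]) < threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return list(groups.values())

def search_group(key_vector, vectors, threshold):
    """Return every image in the group that contains the image nearest the key."""
    nearest = min(vectors, key=lambda n: dist(key_vector, vectors[n]))
    for group in group_by_distance(vectors, threshold):
        if nearest in group:
            return group
    return []
```

Unlike the minimum-distance search, this returns several candidates at once, which corresponds to the effect described for Invention 6.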
[Invention 7]
Further, the image search system of Invention 7 is the image search system of any one of Inventions 2 to 6,
wherein, when the aspect ratio differs between the search key image and a search target image or between search target images, the feature vector generation means superimposes the first image and the second image having different aspect ratios, generates the feature vector of the first image for the overlapping area in the first image, and generates the feature vector of the second image for the overlapping area in the second image.
[0023]
With such a configuration, when the aspect ratio differs between the search key image and a search target image or between search target images, the feature vector generation means superimposes the first image and the second image having different aspect ratios, generates the feature vector of the first image for the overlapping area in the first image, and generates the feature vector of the second image for the overlapping area in the second image.
[0024]
As a result, similarity can be determined relatively accurately even between images having different aspect ratios, so that an effect of obtaining a search result that further meets the user's wish can be obtained.
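The superposition of two images with different aspect ratios can be sketched as follows. How the images are aligned is not stated in this passage; the sketch assumes that the two images are superimposed centre to centre without rescaling, so that the overlapping area is the central rectangle common to both. The function name `overlap_boxes` is hypothetical.

```python
def overlap_boxes(size_a, size_b):
    """Return the crop box (left, top, right, bottom) of the overlapping area
    in each image, assuming centre alignment and no rescaling.

    size_a, size_b: (width, height) in pixels
    """
    (wa, ha), (wb, hb) = size_a, size_b
    w, h = min(wa, wb), min(ha, hb)  # size of the common central rectangle

    def centred(width, height):
        left, top = (width - w) // 2, (height - h) // 2
        return (left, top, left + w, top + h)

    return centred(wa, ha), centred(wb, hb)

# A landscape image and a portrait image share a 300 x 300 central square.
box_a, box_b = overlap_boxes((400, 300), (300, 400))
```

The feature vector of each image would then be generated only from the pixels inside its own crop box.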
[Invention 8]
On the other hand, in order to achieve the above object, the image classification system of Invention 8 is
a system for classifying a plurality of classification target images, wherein,
for each of the classification target images, a region of interest is extracted from the classification target image, and a feature vector indicating a feature of the classification target image is generated based on the extracted region of interest, and
each of the classification target images is classified into one of a plurality of groups based on the generated feature vectors.
[0025]
With such a configuration, for each classification target image, a region of interest is extracted from the classification target image, and a feature vector indicating the characteristics of the classification target image is generated based on the extracted region of interest. Then, each classification target image is classified into a plurality of groups based on the generated feature vectors.
As a result, the classification is performed in consideration of the part to which the user pays attention, so that the subjectivity of the user is easily reflected in the classification result. Therefore, an effect is obtained that a classification result that relatively complies with the user's wish can be obtained as compared with the related art.
[Invention 9]
Furthermore, the image classification system of Invention 9 is
a system for classifying a plurality of classification target images,
comprising: classification target image storage means for storing the plurality of classification target images; attention region extraction means for extracting, for each classification target image in the classification target image storage means, a region of interest from the classification target image; feature vector generation means for generating, for each of the classification target images, a feature vector indicating a feature of the classification target image based on the region of interest extracted by the attention region extraction means; and image classification means for classifying each classification target image into one of a plurality of groups based on the feature vectors generated by the feature vector generation means.
[0026]
With such a configuration, for each classification target image in the classification target image storage means, the attention region extraction means extracts the region of interest from the classification target image, and the feature vector generation means generates a feature vector indicating the feature of the classification target image based on the extracted region of interest. Then, the image classification means classifies the classification target images into a plurality of groups based on the generated feature vectors.
[0027]
As a result, the classification is performed in consideration of the part to which the user pays attention, so that the subjectivity of the user is easily reflected in the classification result. Therefore, an effect is obtained that a classification result that relatively complies with the user's wish can be obtained as compared with the related art.
Here, the classification target image storage means may store the classification target images by any means and at any time. It may store the classification target images in advance, or, without storing them in advance, it may store classification target images supplied by an external input or the like during the operation of the present system.
[Invention 10]
Further, the image classification system of Invention 10 is the image classification system of Invention 9,
further comprising face information determination means for determining, for each of the classification target images, face information of a person image included in the classification target image,
wherein the feature vector generation means generates, for each of the classification target images, a feature vector indicating a feature of the classification target image based on the face information determined by the face information determination means.
[0028]
With such a configuration, for each classification target image, the face information determination means determines the face information of the person image included in the classification target image, and the feature vector generation means generates a feature vector indicating the feature of the classification target image based on the determined face information.
Thus, since the classification is performed in consideration of the face information of the person image, it is possible to obtain an effect of obtaining a classification result in which images having matching faces of the person image belong to the same group.
[Invention 11]
Further, the image classification system of Invention 11 is the image classification system of Invention 9 or 10,
further comprising similarity determination means for determining, for each of the classification target images, the similarity between the face of a person image included in the classification target image and the face of a person image included in a specific image,
wherein the feature vector generation means generates, for each of the classification target images, a feature vector indicating a feature of the classification target image based on the determination result of the similarity determination means.
[0029]
With such a configuration, for each classification target image, the similarity determination unit determines the similarity between the face of the person image included in the classification target image and the face of the person image included in the specific image, The feature vector generating means generates a feature vector indicating the feature of the classification target image based on the determination result.
Thus, since the classification is performed in consideration of the similarity between the faces of the human images, it is possible to obtain an effect of obtaining a classification result in which images having similar human image faces belong to the same group.
[Invention 12]
Further, the image classification system of Invention 12 is the image classification system of any one of Inventions 9 to 11,
wherein the image classification means classifies the classification target images into a plurality of groups based on the inter-vector distance between the feature vectors of the classification target images, and retrieves a predetermined number of classification target images for each group from the classification target image storage means.
[0030]
With such a configuration, the image classification means classifies the classification target images into a plurality of groups based on the inter-vector distance between the feature vectors of the classification target images, and retrieves a predetermined number of classification target images for each group from the classification target image storage means.
As a result, a predetermined number of classification target images are retrieved from different groups, so that an effect that various search results can be obtained is also obtained.
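Retrieving a predetermined number of images from each group can be sketched as follows. The specification does not say which members of a group are chosen; this sketch simply assumes the first `count` members in stored order. The function name `pick_per_group` is hypothetical.

```python
def pick_per_group(groups, count):
    """Return up to `count` image identifiers from every group, in group order.

    groups: list of lists of image identifiers (one inner list per group)
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    return [name for group in groups for name in group[:count]]

picked = pick_per_group([["a", "b", "c"], ["d"], ["e", "f"]], 2)
```

Because every group contributes at most `count` images, the result stays varied even when one group is much larger than the others.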
[Invention 13]
Further, the image classification system of Invention 13 is the image classification system of any one of Inventions 9 to 12,
wherein, when the aspect ratios of the classification target images differ, the feature vector generation means superimposes the first image and the second image having different aspect ratios, generates the feature vector of the first image for the overlapping area in the first image, and generates the feature vector of the second image for the overlapping area in the second image.
[0031]
With such a configuration, when the aspect ratios of the classification target images differ, the feature vector generation means superimposes the first image and the second image having different aspect ratios, generates the feature vector of the first image for the overlapping area in the first image, and generates the feature vector of the second image for the overlapping area in the second image.
As a result, similarity can be determined relatively accurately even between images having different aspect ratios, so that an effect of obtaining a classification result that further meets the user's desire can be obtained.
[Invention 14]
On the other hand, in order to achieve the above object, the image search program of Invention 14 is
a program for searching, based on a given search key image, for an image that matches the search key image from among a plurality of search target images,
the program causing a computer that can use search target image storage means for storing the plurality of search target images and search key image input means for inputting the search key image
to execute processing realized as: attention area extraction means for extracting an attention area from the search key image input by the search key image input means and from each of the search target images in the search target image storage means; feature vector generation means for generating, for the search key image and each of the search target images, a feature vector indicating a feature of the image based on the attention area extracted by the attention area extraction means; and image search means for searching the search target image storage means for an image matching the search key image based on the feature vectors generated by the feature vector generation means.
[0032]
With such a configuration, when the program is read by the computer and the computer executes the processing in accordance with the read program, an operation and an advantage equivalent to those of the image search system in Aspect 2 are attained.
[Invention 15]
On the other hand, in order to achieve the above object, the image classification program of Invention 15 is
a program for classifying a plurality of classification target images,
the program causing a computer that can use classification target image storage means for storing the plurality of classification target images
to execute processing realized as: attention area extraction means for extracting, for each classification target image in the classification target image storage means, an attention area from the classification target image; feature vector generation means for generating, for each of the classification target images, a feature vector indicating a feature of the classification target image based on the attention area extracted by the attention area extraction means; and image classification means for classifying each of the classification target images into one of a plurality of groups based on the feature vectors generated by the feature vector generation means.
[0033]
With such a configuration, when the program is read by the computer and the computer executes the processing in accordance with the read program, an operation and an advantage equivalent to those of the image classification system according to Aspect 9 are attained.
[Invention 16]
On the other hand, in order to achieve the above object, the image search method of Invention 16 is
a method for searching, based on a given search key image, a search target image storage means that stores a plurality of search target images for an image that matches the search key image, the method including:
a search key image input step of inputting the search key image;
a first attention area extraction step of extracting an attention area from the search key image input in the search key image input step;
a first feature vector generation step of generating a feature vector indicating a feature of the search key image based on the attention area extracted in the first attention area extraction step;
a second attention area extraction step of extracting an attention area from a search target image;
a second feature vector generation step of generating a feature vector indicating a feature of the search target image based on the attention area extracted in the second attention area extraction step;
a repetition step of repeating the second attention area extraction step and the second feature vector generation step for each search target image in the search target image storage means; and
an image search step of searching the search target image storage means for an image that matches the search key image based on the feature vectors generated in the first feature vector generation step and the second feature vector generation step.
[0034]
Thereby, the same effect as that of the image search system of the second aspect is obtained.
Here, the attention area extraction steps only need to extract the attention area from the search key image and from each search target image. For example, the degree of attraction can be calculated based on the search key image or the search target image, and the attention area can be extracted based on the calculated degree of attraction. Hereinafter, the same applies to the image classification method of Invention 17.
[0035]
The search key image input step may be performed by any method as long as the search key image is input. For example, a search target image selected from the search target image storage unit may be input as the search key image. Alternatively, a search key image may be input from an image storage medium, a network, or other image storage means.
[Invention 17]
On the other hand, in order to achieve the above object, the image classification method of Invention 17 is
a method for classifying a plurality of classification target images, the method including:
an attention area extraction step of extracting an attention area from a classification target image;
a feature vector generation step of generating a feature vector indicating a feature of the classification target image based on the attention area extracted in the attention area extraction step;
a repetition step of repeating the attention area extraction step and the feature vector generation step for each of the classification target images; and
an image classification step of classifying each of the classification target images into one of a plurality of groups based on the feature vectors generated in the feature vector generation step.
[0036]
Thereby, the same effect as that of the image classification system of Aspect 9 is obtained.
[0037]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the drawings. FIGS. 1 to 7 are diagrams showing an embodiment of an image search system, an image classification system, an image search program, an image classification program, an image search method, and an image classification method according to the present invention.
[0038]
In the present embodiment, the image search system, the image classification system, the image search program, the image classification program, the image search method, and the image classification method according to the present invention are applied to the case of performing an image similarity search.
In the present embodiment, the concept of “attraction level” is used as an extraction criterion for a portion (hereinafter, referred to as a region of interest) that is likely to be noticed by a user in an image. The method of calculating the degree of attraction is disclosed in detail, for example, in Japanese Patent Application Laid-Open No. 2001-126070 (attention region extraction device and automatic composition determination device using the same).
[0039]
A brief description of the degree of attraction will be given.
To extract a region of interest, the degree of attraction is evaluated according to the physical characteristics of the original image. Here, the degree of attraction is a parameter that matches human subjectivity. A region that stands out in the evaluation result is extracted as the region of interest. In other words, because the evaluation follows human subjectivity in accordance with the physical characteristics, a region of interest that matches human subjectivity can be extracted.
[0040]
For example, when the physical characteristics include the degree of heterogeneity of color, the degree of attraction can be evaluated based on the difference in color of each region.
In addition to the color heterogeneity, the physical characteristics further include shape heterogeneity, area heterogeneity, and texture (pattern) heterogeneity. By evaluating the degree of attraction based on at least one of these four heterogeneities, the degree of attraction can be evaluated accurately according to the characteristics of the original image.
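The idea of scoring a region by how much its color differs from the other regions can be sketched as follows. The exact definition of color heterogeneity is given in the cited publication and not in this passage, so the measure below (the mean Euclidean distance from a region's representative color to the colors of all other regions) is only an assumption for illustration. The function name `color_heterogeneity` is hypothetical.

```python
from math import dist

def color_heterogeneity(region_colors):
    """For each region, return the mean distance between its representative
    colour and the representative colours of all other regions.

    region_colors: list of colour tuples, one per region
    """
    n = len(region_colors)
    if n < 2:
        return [0.0] * n  # a single region has nothing to differ from
    return [
        sum(dist(c, other) for j, other in enumerate(region_colors) if j != i) / (n - 1)
        for i, c in enumerate(region_colors)
    ]

# Two identical regions and one region with a different colour.
scores = color_heterogeneity([(0, 0, 0), (0, 0, 0), (3, 4, 0)])
```

In this toy input the third region receives the highest score, which matches the intuition that a region whose color differs from its surroundings attracts attention.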
[0041]
In the case where three color components (hue, saturation, and lightness) are also evaluated, a region close to a prominent color (red) according to human subjectivity can be evaluated as the most prominent region.
Furthermore, by evaluating the spatial frequency and the area of each region in the original image, the evaluation of the most prominent region can be determined more accurately.
[0042]
The evaluation of the degree of attraction is performed according to the following procedure.
(1) First, the original image is divided into regions. In this case, the original image is divided into a figure region and a picture region. For the region division, the boundary detection method based on “edge flow” described in “Edge Flow: A Framework of Boundary Detection and Image Segmentation” by W. Y. Ma, B. S. Manjunath, et al. is applied.
(2) Next, the divided figure region is extracted, and the degree of attraction of the region is evaluated.
[0043]
The evaluation of the degree of attraction is roughly performed as follows.
First, the degree of heterogeneity attraction in each region is determined. In this case, the heterogeneity of color, the heterogeneity of texture, the heterogeneity of shape, and the heterogeneity of area are respectively obtained, and weighting factors are assigned to each of them to perform linear combination, thereby obtaining the heterogeneity attracting degree of each region.
Next, the degree of feature attraction in each region is determined. In this case, the attraction of color, the attraction of spatial frequency, and the attraction of area are obtained, and weighting factors are assigned to each of them to perform linear combination to obtain the characteristic attraction of each region.
[0044]
Next, the degree of attraction of heterogeneity and the degree of feature attraction of each region are added to obtain a feature amount integrated value, and the feature amount integrated value is evaluated by a predetermined beta function to calculate the degree of attraction.
(3) Also, a pattern diagram in which the degree of attraction is evaluated is generated from the original image.
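The evaluation in step (2) can be sketched as follows. The weighting factors and the form of the "predetermined beta function" are not given in this passage, so the sketch takes the weights and the function `beta` as parameters supplied by the caller; the logistic curve shown is only a placeholder assumption, and all function names are hypothetical.

```python
from math import exp

def linear_combination(values, weights):
    """Weighted sum of the given values."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * v for v, w in zip(values, weights))

def attraction_degree(heterogeneity, hetero_weights, feature, feature_weights, beta):
    """Degree of attraction of one region.

    heterogeneity: (colour, texture, shape, area) heterogeneity values
    feature:       (colour, spatial frequency, area) attraction values
    beta:          callable applied to the integrated value (assumed form)
    """
    hetero_attraction = linear_combination(heterogeneity, hetero_weights)
    feature_attraction = linear_combination(feature, feature_weights)
    integrated = hetero_attraction + feature_attraction  # feature amount integrated value
    return beta(integrated)

def logistic(x):
    """Placeholder for the beta function; an assumption, not the patent's definition."""
    return 1.0 / (1.0 + exp(-x))

degree = attraction_degree(
    (1.0, 0.0, 0.0, 0.0), (0.5, 0.2, 0.2, 0.1),
    (0.0, 0.0, 0.0), (1.0, 1.0, 1.0),
    beta=lambda x: x,  # identity, to expose the integrated value itself
)
```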
Next, the configuration of the image search device 100 according to the present invention will be described with reference to FIG. 1.
[0045]
FIG. 1 is a functional block diagram showing a configuration of an image search device 100 according to the present invention.
As shown in FIG. 1, the image search device 100 includes a search target image registration database (hereinafter, simply referred to as DB) 10 in which a plurality of search target images are registered, a search key image specifying unit 12 for designating a search key image, and a search key image reading unit 14 that reads out the search target image specified by the search key image specifying unit 12 from the search target image registration DB 10 as a search key image. It further includes an attention area extraction unit 16 that extracts an attention area from the search key image read out by the search key image reading unit 14 and from each search target image in the search target image registration DB 10, a face image processing unit 18 that determines face information and similarity for the search key image read out by the search key image reading unit 14 and for each search target image in the search target image registration DB 10, and a feature vector generation unit 20 that generates a feature vector indicating the feature of each image based on the attention area extracted by the attention area extraction unit 16 and the determination results of the face image processing unit 18. It also includes a search condition specifying unit 22 for specifying a search condition, an image search unit 24 that searches the search target image registration DB 10 for images based on the search condition specified by the search condition specifying unit 22 and the feature vectors generated by the feature vector generation unit 20, a display form designating unit 26 for designating the display form of the search result, and an image display unit 28 that displays the images of the search result in the display form designated by the display form designating unit 26.
[0046]
The face image processing unit 18 includes a face area determination unit 34 that determines, for the search key image read by the search key image reading unit 14 and for each search target image in the search target image registration DB 10, whether the image includes an area corresponding to the face of a person image (hereinafter referred to as a face area); a face information determination unit 36 that determines the orientation, size, and center-of-gravity position of the face of the person image included in the image based on the determination result of the face area determination unit 34; and a similarity determination unit 38 that determines, for each search target image, the similarity between the face of the person image included in the search target image and the face of the person image included in the search key image based on the determination result of the face area determination unit 34.
[0047]
Specifically, as shown in FIG. 2, the image search device 100 can be realized as a computer 200 and a program executed by the computer 200. The configuration of the computer 200 will be described with reference to FIG. 2.
FIG. 2 is a block diagram showing a configuration of the computer 200.
As shown in FIG. 2, the computer 200 includes a CPU 50 that controls computation and the entire system based on a control program, a ROM 52 that stores the control program of the CPU 50 and the like in a predetermined area in advance, a RAM 54 for storing data read from the ROM 52 and the like and calculation results required in the calculation process of the CPU 50, and an I/F 58 that mediates input and output of data with external devices. These are connected to one another by a bus, which is a signal line for transferring data, so that they can exchange data.
[0048]
The I/F 58 is connected, as external devices, to the search target image registration DB 10, an input device 60 such as a keyboard and a mouse that serves as a human interface for inputting data, and a display device 64 that displays a screen based on image signals.
The CPU 50 comprises a microprocessing unit (MPU) or the like, starts a predetermined program stored in a predetermined area of the ROM 52, and executes the image search process shown in the flowchart of FIG. 3 according to the program.
[0049]
FIG. 3 is a flowchart showing the image search process.
The image search process is executed in response to a search request input from the input device 60. When the image search process is executed by the CPU 50, the process first proceeds to step S100 as shown in FIG. 3.
In step S100, designation of a search condition is input. As the search condition, the following can be specified: a similar image search mode, which searches the search target image registration DB 10 for the image most similar to the search key image; a similar image group search mode, which searches the search target image registration DB 10 for a plurality of images similar to the search key image; and a variety search mode, which searches the search target image registration DB 10 for a plurality of images having different properties.
[0050]
Next, the process proceeds to step S102, where designation of a display mode is input. As the display mode, the following can be specified: an enlarged display mode, in which images that match the search condition are displayed large and images that do not match are displayed small; and a clear display mode, in which images that match the search condition are displayed clearly and images that do not match are displayed blurred.
[0051]
Next, the process proceeds to step S104 to specify a search key image from the search target image registration DB 10. When the variety search mode is specified as the search condition, it is not necessary to specify the search key image. Hereinafter, the processing of steps S106 to S126 and step S134 is performed for all the search target images including the search key image.
[0052]
Next, the process moves to step S106, reads the first search target image from the search target image registration DB 10, and moves to step S108.
In step S108, the degree of attraction is calculated from the read search target image, and an attention area is extracted based on the calculated degree of attraction. The attention area is extracted by the method described above. Because the absolute value of the degree of attraction can vary with the search target image, the degree of attraction is normalized so that all search target images are evaluated equally, and the degree of attention of each attention area is classified into a predetermined number of levels (for example, 10 levels). Hereinafter, the degree of attraction calculated for each pixel of the search target image is denoted e'_xy, where x and y are the X and Y coordinates of the pixel in the search target image.
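The normalization and level assignment in step S108 can be sketched as follows. This is only an illustrative sketch: the text does not state how the normalization is done, so min-max scaling is assumed here, and the function name `quantize_attention` and the per-pixel list-of-lists layout are hypothetical.

```python
def quantize_attention(attraction, levels=10):
    """Normalize a 2-D attraction map and bin it into `levels` stages.

    attraction[y][x] holds the per-pixel degree of attraction e'_xy.
    Min-max scaling is an assumption; the source only says "normalized".
    """
    flat = [v for row in attraction for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    result = []
    for row in attraction:
        out_row = []
        for v in row:
            # Scale to [0, 1]; a completely flat map maps to 0.
            norm = (v - lo) / span if span > 0 else 0.0
            # Map [0, 1] onto stages 1..levels; the maximum lands in the top stage.
            out_row.append(min(levels, int(norm * levels) + 1))
        result.append(out_row)
    return result


stages = quantize_attention([[0, 5], [10, 2.5]])
```

Because every image is scaled to the same 1 to 10 range, the stages of different search target images become directly comparable.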
[0053]
FIG. 4 is a diagram illustrating an example of a vertical search target image.
In the example of FIG. 4A, the image is captured in the vertical orientation, and a flower is located at the lower right. When the attention area is calculated for this image, as shown in FIG. 4B for example, the area corresponding to the blossom and its vicinity is extracted as attention area A, which has the highest degree of attention, and the area corresponding to the stem and leaves and their vicinity is extracted as attention area B, which has the second highest degree of attention. The remaining area is extracted as area C, which has a low degree of attention.
[0054]
FIG. 5 is a diagram illustrating an example of a horizontal search target image.
In the example of FIG. 5A, the image is captured in the horizontal orientation, and a flower is located at the lower right. When the attention area is calculated for this image, as shown in FIG. 5B for example, the area corresponding to the blossom and its vicinity is extracted as attention area A, which has the highest degree of attention, and the area corresponding to the stem and leaves and their vicinity is extracted as attention area B, which has the second highest degree of attention. The remaining area is extracted as area C, which has a low degree of attention. As can be seen, areas substantially the same as those in the search target image of FIG. 4 are extracted with the same degrees of attention.
[0055]
Next, the process shifts to step S110 to determine whether or not the read search target image includes a face area, and shifts to step S118.
In step S118, the orientation, size, and center-of-gravity position of the face of each person image included in the search target image are determined based on the determination result of step S110. Specifically, assuming that the search target image includes a plurality of face areas, referred to collectively as the detected face area group, the following are calculated: the total area f1 of the detected face area group in the search target image; the average value f2 of the areas of the detected face area group; the variance f3 of those areas; the average value f4 of the horizontal frontal degree (−π/2 to π/2), which indicates how far each face in the detected face area group faces the front in the horizontal direction; the variance f5 of the horizontal frontal degree; the average value f6 of the vertical frontal degree (−π/2 to π/2), which indicates how far each face faces the front in the vertical direction; the variance f7 of the vertical frontal degree; the average value f8 of the center-of-gravity positions of the detected face area group; and the variance f9 of those center-of-gravity positions. When the search target image includes only one face area, f1 and f2 are calculated as the area of that face area, and f4 and f6 as its horizontal and vertical frontal degrees, respectively. The horizontal frontal degree becomes smaller as the face in the detected face area turns away from the front in the horizontal direction, and the vertical frontal degree becomes smaller as the face turns away from the front in the vertical direction.
Hereinafter, the horizontal frontal degree and the vertical frontal degree are collectively referred to as the frontal degree unless otherwise specified. The area of a face area is normalized by the size of the search target image.
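As a rough illustration of the face information f1 to f9, the statistics can be computed from a list of detected faces as below. The tuple layout and the function name `face_statistics` are hypothetical. The text does not say how the two-dimensional center-of-gravity positions are reduced to f8 and f9, so this sketch assumes f8 is the mean position (an x, y pair) and f9 is the mean squared distance from that mean; population variance is also an assumption.

```python
from statistics import fmean, pvariance


def face_statistics(faces, image_area):
    """Compute f1..f9 for one image.

    faces: list of (area_px, horizontal_frontal, vertical_frontal, cx, cy).
    image_area: image size in pixels, used to normalize face areas.
    Returns None when the image contains no face area.
    """
    if not faces:
        return None
    areas_px, horiz, vert, cxs, cys = zip(*faces)
    areas = [a / image_area for a in areas_px]  # normalized by image size
    return {
        "f1": sum(areas),            # total face area
        "f2": fmean(areas),          # mean face area
        "f3": pvariance(areas),      # variance of face area
        "f4": fmean(horiz),          # mean horizontal frontal degree
        "f5": pvariance(horiz),      # its variance
        "f6": fmean(vert),           # mean vertical frontal degree
        "f7": pvariance(vert),       # its variance
        "f8": (fmean(cxs), fmean(cys)),           # mean centroid (assumed form)
        "f9": pvariance(cxs) + pvariance(cys),    # centroid spread (assumed form)
    }


stats = face_statistics(
    [(100, 0.0, 0.0, 10, 20), (300, 0.2, -0.2, 30, 40)], image_area=1000
)
```

With a single face the variances come out as zero and f1 equals f2, which matches the single-face case described above.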
[0056]
Next, the process proceeds to step S120 to determine the similarity between the face of the person image included in the search target image and the face of the person image included in the search key image. For example, when the search key image includes person images of subjects A, B, and C, the similarity to the face in the face area of subject A, the similarity to the face in the face area of subject B, and the similarity to the face in the face area of subject C are determined for each face area included in the search target image. When the variety search mode is specified as the search condition, no search key image exists, so the similarity between the face of the person image included in the search target image and the face of a person image included in a predetermined specific image is determined instead.
[0057]
Next, the process proceeds to step S124, and the feature vector V of the search target image is generated based on the attention area extracted in step S108 and the determination results in steps S118 and S120. The feature vector V is roughly divided into a first element group corresponding to the degree of attraction of the attention area, a second element group corresponding to the face information f1 to f9, and a third element group corresponding to the similarity.
[0058]
The first element group of the feature vector V is determined as follows: the search target image is divided into a plurality of areas (for example, N rectangular areas in the horizontal direction by M rectangular areas in the vertical direction), the average value e_ij of the degree of attraction is calculated for each divided area (i, j) by the following equation (1), and the elements are determined based on the average values e_ij. Here, divided area (i, j) denotes the area that is i-th in the horizontal direction (i = 1 to N) and j-th in the vertical direction (j = 1 to M) in the search target image.
[0059]
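(Equation 1)
$$ e_{ij} = \frac{1}{(2s)^2} \sum_{x = x_i - s}^{x_i + s - 1} \; \sum_{y = y_j - s}^{y_j + s - 1} e'_{xy} \tag{1} $$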

[0060]
The above equation (1) calculates the average value e_ij of the degree of attraction of divided area (i, j) when each divided area is a square area of 2s × 2s pixels. In equation (1), x_i is the x coordinate of the center point of divided area (i, j), and y_j is the y coordinate of that center point.
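The block averaging described for equation (1) can be sketched as follows, assuming the per-pixel degree of attraction is available as a list of rows. The function name `block_averages` is hypothetical, and dropping partial blocks at the image edge is an assumption not stated in the text.

```python
def block_averages(attraction, s):
    """Average a per-pixel attraction map over square blocks of 2s x 2s pixels.

    attraction[y][x] is e'_xy. Returns e[j][i], the mean for the block that is
    i-th horizontally and j-th vertically (0-based here).
    """
    size = 2 * s
    height, width = len(attraction), len(attraction[0])
    n_cols = width // size   # N; partial edge blocks are dropped (assumption)
    m_rows = height // size  # M
    e = []
    for j in range(m_rows):
        row = []
        for i in range(n_cols):
            total = sum(
                attraction[y][x]
                for y in range(j * size, (j + 1) * size)
                for x in range(i * size, (i + 1) * size)
            )
            row.append(total / (size * size))
        e.append(row)
    return e


e = block_averages(
    [[1, 1, 2, 2],
     [1, 1, 2, 2],
     [3, 3, 4, 4],
     [3, 3, 4, 4]],
    s=1,
)
```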
Therefore, the first element group of the feature vector V is given by the following equation (2): the average value e_ij of the degree of attraction of each divided area is multiplied by an independent coefficient E_ij, and the products are enumerated as elements. If the search target image is divided into N areas in the horizontal direction and M areas in the vertical direction, the first element group of the feature vector V consists of N × M elements.
[0061]
(Equation 2)
$$ (E_{11} e_{11},\ E_{12} e_{12},\ \ldots,\ E_{NM} e_{NM}) \tag{2} $$

[0062]
The second element group of the feature vector V is given by the following equation (3): the face information f1 to f9 determined in step S118 is multiplied by independent coefficients F_1 to F_9, respectively, and the products are enumerated as elements.
[0063]
(Equation 3)
$$ (F_1 f_1,\ F_2 f_2,\ \ldots,\ F_9 f_9) \tag{3} $$

[0064]
The third element group of the feature vector V is given by the following equation (4): each similarity p_k determined in step S120 is multiplied by an independent coefficient P_k, and the products are enumerated as elements. For example, when the search key image includes K person images, the similarity between each face area included in the search target image and the face in face area k (k = 1 to K) included in the search key image is calculated. If a face is similar to the face in face area k (the similarity is at or above a predetermined value), p_k = 1; if not (the similarity is below the predetermined value), p_k = 0.
[0065]
(Equation 4)
$$ (P_1 p_1,\ P_2 p_2,\ \ldots,\ P_K p_K) \tag{4} $$

[0066]
As described above, the feature vector V is given by the following equation (5), which enumerates the elements of the first element group, the second element group, and the third element group.
[0067]
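(Equation 5)
$$ V = (E_{11} e_{11},\ \ldots,\ E_{NM} e_{NM},\ F_1 f_1,\ \ldots,\ F_9 f_9,\ P_1 p_1,\ \ldots,\ P_K p_K) \tag{5} $$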
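The concatenation of the three element groups into V can be sketched as below. This is a minimal illustration, not the patented implementation: the function name `build_feature_vector`, the default coefficient of 1.0, and passing the block averages as an already flattened list are all assumptions.

```python
def build_feature_vector(e_flat, f, similarities, threshold,
                         E=None, F=None, P=None):
    """Concatenate the three weighted element groups into one vector V.

    e_flat: N*M block averages of the degree of attraction (any fixed order).
    f: the nine face statistics f1..f9 as numbers.
    similarities: raw similarity scores, one per face area k of the key image.
    threshold: score at or above which p_k is set to 1, else 0.
    E, F, P: optional per-element coefficients (default 1.0, an assumption).
    """
    # Binarize the similarities as described for the third element group.
    p = [1.0 if s >= threshold else 0.0 for s in similarities]
    if E is None:
        E = [1.0] * len(e_flat)
    if F is None:
        F = [1.0] * len(f)
    if P is None:
        P = [1.0] * len(p)
    first = [w * v for w, v in zip(E, e_flat)]
    second = [w * v for w, v in zip(F, f)]
    third = [w * v for w, v in zip(P, p)]
    return first + second + third


V = build_feature_vector([0.5, 1.0], [2.0] * 9, [0.9, 0.1],
                         threshold=0.8, E=[2, 2])
```

Raising a coefficient makes the corresponding element contribute more to the inter-vector distance, which is how the relative weight of attention, face information, and similarity could be tuned.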

[0068]
Next, the process proceeds to step S126 to determine whether the processing of steps S108 to S124 has been completed for all the search target images in the search target image registration DB 10. If it is determined that the processing has been completed for all the search target images (Yes), the process proceeds to step S128.
In step S128, the search target images are clustered into a plurality of clusters based on the inter-vector distances between their feature vectors V. The clustering can be performed, for example, by the conventional K-means method. In the K-means method, as a first process, K feature vectors V are appropriately selected, and each selected feature vector V_k (k = 1 to K) is set as the center position of cluster k. The center position of cluster k is denoted m_k. Next, as a second process, the inter-vector distance between feature vector V_i (i = 1 to N, where N is the total number of search target images) and the center position m_k of each cluster k is calculated, and feature vector V_i is assigned to the cluster k for which the calculated inter-vector distance is smallest. The inter-vector distance S between feature vector V_A and feature vector V_B is calculated by the following equation (6).
S = |V_A − V_B| … (6)
Next, as a third process, the center position m_k of cluster k is replaced with the average value of the feature vectors V_i belonging to cluster k. Next, as a fourth process, if i < N, 1 is added to i and the second and third processes are performed again. Then, as a fifth process, if any m_k changed in the third process, the second and third processes are repeated starting from i = 1. If no m_k changed in the third process, or if the processing has been repeated a predetermined number of times or more, the processing ends, and the center position m_k of each cluster k and the feature vectors V_i belonging to it are determined.
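A compact sketch of this clustering step follows. Note that it uses the common batch form of K-means (reassign every vector, then recompute every center), not the one-vector-at-a-time update described above; the Euclidean norm for equation (6) and the choice of the first K vectors as initial centers are also assumptions. The names `distance` and `k_means` are hypothetical.

```python
import math


def distance(a, b):
    """Inter-vector distance S = |V_A - V_B| (Euclidean norm assumed)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_means(vectors, k, max_iter=100):
    """Batch K-means; returns (labels, centers)."""
    # First process: take the first k vectors as initial centers (assumption).
    centers = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(max_iter):
        # Second process: assign each vector to its nearest center.
        labels = [
            min(range(k), key=lambda c: distance(v, centers[c]))
            for v in vectors
        ]
        # Third process: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                new_centers.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        # Fifth process: stop when no center changed.
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


labels, centers = k_means([[0, 0], [10, 10], [0, 1], [10, 11]], k=2)
```

The `max_iter` cap plays the role of the "repeated a predetermined number of times" termination condition.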
[0069]
FIG. 6 is a diagram illustrating a case where n feature vectors V are clustered into two clusters.
In the example of FIG. 6, feature vectors V_1, V_2, V_3, and V_n belong to the cluster with center position m_1, and feature vectors V_4, V_5, V_6, and V_(n-1) belong to the cluster with center position m_2.
[0070]
Next, the process proceeds to step S130 to search for an image from the search target image registration DB 10 based on the specified search condition. When the similar image search mode is specified as the search condition, a search target image corresponding to the feature vector V having the smallest inter-vector distance from the feature vector V of the search key image is searched from the search target image registration DB 10. When the similar image group search mode is specified as the search condition, all the search target images of the cluster to which the search key image belongs are searched from the search target image registration DB 10. When the variety search mode is specified as a search condition, a predetermined number of search target images are searched out of the search target image registration DB 10 for each cluster.
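The three search modes of step S130 can be sketched on top of precomputed feature vectors and cluster labels. The function name `search`, the mode strings, and the index-based return values are hypothetical. Excluding the key image from its own nearest-neighbor result, and picking the first images encountered in each cluster for the variety mode, are assumptions; the text does not specify either.

```python
import math


def search(mode, vectors, labels, key_index=None, per_cluster=1):
    """Return indices of retrieved images for the given search mode."""
    if mode == "similar_image":
        # Nearest feature vector to the key image (the key itself is skipped).
        others = [i for i in range(len(vectors)) if i != key_index]
        return [min(others,
                    key=lambda i: math.dist(vectors[i], vectors[key_index]))]
    if mode == "similar_image_group":
        # Every image in the cluster that the key image belongs to.
        return [i for i, lab in enumerate(labels) if lab == labels[key_index]]
    if mode == "variety":
        # A fixed number of images from each cluster; no key image needed.
        picked, counts = [], {}
        for i, lab in enumerate(labels):
            if counts.get(lab, 0) < per_cluster:
                picked.append(i)
                counts[lab] = counts.get(lab, 0) + 1
        return picked
    raise ValueError(f"unknown search mode: {mode}")


vectors = [[0, 0], [0, 1], [10, 10], [10, 11], [0, 3]]
labels = [0, 0, 1, 1, 0]
nearest = search("similar_image", vectors, labels, key_index=0)
group = search("similar_image_group", vectors, labels, key_index=0)
varied = search("variety", vectors, labels)
```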
[0071]
Next, the process proceeds to step S132, where the retrieved search target image is displayed on the display device 64 in a specified display form, and a series of processing is completed to return to the original processing.
FIG. 7 is a diagram showing a display screen displaying the search result.
In the example of FIG. 7, the search target images 1 to n are displayed in the enlarged display mode and the clear display mode. Since images 1, 2, 3, and n are the images retrieved as search results, they are displayed large and clearly, while the other images 4, 5, 6, and n-1 are displayed small and blurred.
[0072]
On the other hand, if it is determined in step S126 that the processing of steps S108 to S124 has not been completed for all the search target images in the search target image registration DB 10 (No), the process proceeds to step S134, the next search target image is read from the search target image registration DB 10, and the process proceeds to step S108.
Next, the operation of the present embodiment will be described.
[0073]
First, a case where an image similarity search is performed in the similar image search mode will be described.
When performing an image similarity search in the similar image search mode, the user inputs a search request, specifies a similar image search mode as the search mode, and specifies a search key image. In addition, a display mode is specified.
When the similar image search mode and the search key image are designated, the image search device 100, through steps S106 to S110, reads the first search target image from the search target image registration DB 10, extracts the attention area from the read search target image, and determines whether a face area is included in the search target image. Next, through steps S118 and S120, the orientation, size, and center-of-gravity position of the face of the person image included in the search target image are determined based on the determination result of step S110, and the similarity between the face of the person image included in the search target image and the face of the person image included in the search key image is determined. If the search target image includes a plurality of person images, steps S110 to S120 are repeated, and the face information and the similarity are determined for each face area.
[0074]
Next, when the face information and the similarity are determined for all the face areas of the search target image, the feature of the search target image is determined based on the extracted attention area and the determined face information and similarity through step S124. A vector V is generated.
When this processing has been performed on all the search target images in the search target image registration DB 10, through steps S128 and S130 the search target images are clustered into a plurality of clusters based on the inter-vector distances between their feature vectors V, and, because the similar image search mode is specified, the search target image whose feature vector V has the smallest inter-vector distance from the feature vector V of the search key image is retrieved from the search target image registration DB 10. Then, through step S132, the retrieved search target image is displayed in the designated display form.
[0075]
Next, a case where an image similarity search is performed in the similar image group search mode will be described.
When performing an image similarity search in the similar image group search mode, the user inputs a search request, specifies the similar image group search mode as the search mode, and specifies a search key image. In addition, a display mode is specified.
The processing up to the clustering of the search target images is the same as in the similarity search in the similar image search mode. In the image search device 100, once the search target images have been clustered, because the similar image group search mode is specified, through step S130 all the search target images of the cluster to which the search key image belongs, among the plurality of clusters, are retrieved from the search target image registration DB 10. Then, through step S132, the retrieved search target images are displayed in the designated display form.
[0076]
Next, a case where an image similarity search is performed in the variety search mode will be described.
When performing an image similarity search in the variety search mode, the user inputs a search request and specifies the variety search mode as the search mode. In addition, a display mode is specified.
The processing up to the clustering of the search target images is the same as in the similarity search in the similar image search mode. In the image search device 100, once the search target images have been clustered, because the variety search mode is specified, through step S130 a predetermined number of search target images are retrieved from the search target image registration DB 10 for each cluster. Then, through step S132, the retrieved search target images are displayed in the designated display form.
[0077]
In this manner, in the present embodiment, an attention area is extracted from the search key image and from each search target image, a feature vector V of each image is generated based on the extracted attention area, and an image matching the search key image is retrieved from the search target image registration DB 10 based on the generated feature vectors V.
[0078]
As a result, the search is performed in consideration of the part that the user pays attention to, so that the subjectivity of the user is easily reflected in the search result. Therefore, compared with the related art, it is possible to obtain search results that relatively meet the user's wishes.
Further, in the present embodiment, for the search key image and each search target image, the orientation, size, or center-of-gravity position of the face of a person image included in the image is determined, and the feature vector V of the image is generated based on the determined orientation, size, or center-of-gravity position of the face.
[0079]
Thus, the search is performed in consideration of the orientation, size, or center of gravity position of the face of the person image, so that a search result that matches the face of the person image included in the search key image can be obtained.
Further, in the present embodiment, for each search target image, the similarity between the face of the person image included in the search target image and the face of the person image included in the search key image is determined, and, for the search key image and each search target image, a feature vector V of the image is generated based on the determined similarity.
[0080]
Thus, since the search is performed in consideration of the similarity between the faces of the person image, a search result similar to the face of the person image included in the search key image can be obtained.
Furthermore, in the present embodiment, a search target image corresponding to the feature vector V having the smallest inter-vector distance from the feature vector V of the search key image is searched from the search target image registration DB 10.
[0081]
As a result, it is possible to obtain a search result that seems to best meet the user's wishes.
Further, in the present embodiment, the search target images are classified into a plurality of clusters based on the inter-vector distances between their feature vectors V, and all the search target images of the cluster to which the search key image belongs, among the plurality of clusters, are retrieved from the search target image registration DB 10.
[0082]
As a result, it is possible to obtain some search results that seem to meet the user's wishes.
Further, in the present embodiment, the search target images are classified into a plurality of clusters based on the inter-vector distances between their feature vectors V, and a predetermined number of search target images are retrieved from the search target image registration DB 10 for each cluster.
[0083]
As a result, a predetermined number of search target images are retrieved from different clusters, so that various search results can be obtained.
In the above embodiment, the search target image corresponds to the classification target image of Inventions 8 to 12, 15, or 17, and the search target image registration DB 10 corresponds to the search target image storage unit of Invention 2, 5, 6, 14, or 16, or the classification target image storage unit of Invention 9, 12, or 15. Step S104 and the search key image specifying unit 12 correspond to the search key image input unit of Invention 2 or 14, or the search key image input step of Invention 16. Step S108 and the attention area extraction unit 16 correspond to the attention area extraction means of Invention 2, 9, 14, or 15, the attention area extraction step of Invention 17, the first attention area extraction step of Invention 16, or the second attention area extraction step of Invention 16.
[0084]
In the above embodiment, step S118 and the face information determination unit 36 correspond to the face information determination unit of Invention 3 or 10, and step S120 and the similarity determination unit 38 correspond to the similarity determination unit of Invention 4 or 11. Step S124 and the feature vector generation unit 20 correspond to the feature vector generation unit of Inventions 2 to 4, 9 to 11, 14, or 15, the first feature vector generation step of Invention 16, or the second feature vector generation step of Invention 16. Steps S128 and S130 and the image search unit 24 correspond to the image search means of Invention 2, 5, 6, or 14, the image classification means of Invention 9, 12, or 15, the image search step of Invention 16, or the image classification step of Invention 17.
[0085]
In the above embodiment, the aspect ratio of the search target image is not particularly described. However, when the search target images have different aspect ratios, the similarity of the images is determined as follows.
FIG. 8 is a diagram illustrating a case where search target images A and B having different aspect ratios are superimposed.
[0086]
When judging the similarity of search target images A and B having different aspect ratios, the search target images A and B are superimposed as shown in FIG. 8, a feature vector V_A of search target image A is generated for the overlapping area within search target image A, a feature vector V_B of search target image B is generated for the overlapping area within search target image B, and the similarity of search target images A and B is determined based on the generated feature vectors V_A and V_B.
[0087]
In this case, search target images A and B may additionally be superimposed in several different ways so that the overlapping area differs each time. The average of the feature vectors V_Ai (i = 1 to N, where N is the total number of combinations) of search target image A calculated for the respective combinations may then be used as the feature vector V_A of search target image A, and the average of the feature vectors V_Bi of search target image B calculated for the respective combinations may be used as the feature vector V_B of search target image B.
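The averaging over several superimpositions can be sketched as follows, assuming the per-overlap feature vectors V_Ai and V_Bi have already been computed. How the overlaps are chosen is not specified in the text and is left out here; the names `mean_vector` and `overlay_distance` are hypothetical, and the Euclidean distance is an assumption.

```python
import math


def mean_vector(vectors):
    """Element-wise average of equally sized feature vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def overlay_distance(vectors_a, vectors_b):
    """Distance between images A and B from per-overlap feature vectors.

    vectors_a[i] and vectors_b[i] are the feature vectors of A and B computed
    on the i-th way of superimposing the two images.
    """
    v_a = mean_vector(vectors_a)  # V_A: average over all overlaps
    v_b = mean_vector(vectors_b)  # V_B: average over all overlaps
    return math.dist(v_a, v_b)


d = overlay_distance([[0, 0], [2, 0]], [[1, 3], [1, 5]])
```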
[0088]
Accordingly, similarity can be relatively accurately determined even between search target images having different aspect ratios, and thus a search result that further meets the user's wish can be obtained.
In this case, step S124 and the feature vector generation unit 20 correspond to the feature vector generation unit of the invention 7 or 13.
[0089]
Further, in the above embodiment, the first element group of the feature vector V is generated by the above equation (2), in which the average value e_ij of the degree of attraction of each divided area is multiplied by an independent coefficient E_ij and the products are enumerated as elements. However, the present invention is not limited to this. When the degree of attraction is used to calculate the attention areas, the degree of attraction is constant within each area, so the first element group can also be generated from the attention areas of step S108 as follows. First, H attention areas are selected from the search target image in descending order of the degree of attraction. Next, according to the following equation (7), for each attention area h (h = 1 to H), the horizontal center coordinate x_h is multiplied by a coefficient X, the vertical center coordinate y_h is multiplied by a coefficient Y, the degree of attention e_h is multiplied by a coefficient E, and the area s_h is multiplied by a coefficient S. The products X x_h, Y y_h, E e_h, and S s_h are then enumerated as the first element group of the feature vector V.
[0090]
(Equation 6)
$$ (X x_1,\ Y y_1,\ E e_1,\ S s_1,\ \ldots,\ X x_H,\ Y y_H,\ E e_H,\ S s_H) \tag{7} $$

[0091]
In this case, when the number h of the extracted attention areas is less than a predetermined number (for example, 10), all the first element groups of the feature vector V are set to “0”.
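This alternative first element group can be sketched as below. The tuple layout, the function name `region_elements`, the default coefficients of 1.0, and the per-area ordering of the elements are assumptions. The zero-fill rule is read here as "fewer than H areas available yields an all-zero group".

```python
def region_elements(regions, H, X=1.0, Y=1.0, E=1.0, S=1.0):
    """Build the first element group from the top-H attention areas.

    regions: list of (x_center, y_center, attention, area) tuples.
    Returns 4*H numbers: X*x_h, Y*y_h, E*e_h, S*s_h for each selected area.
    """
    if len(regions) < H:
        # Too few attention areas were extracted: the whole group is zero.
        return [0.0] * (4 * H)
    # Select H areas in descending order of the degree of attention.
    top = sorted(regions, key=lambda r: r[2], reverse=True)[:H]
    elements = []
    for x, y, e, s in top:
        elements += [X * x, Y * y, E * e, S * s]
    return elements


group = region_elements(
    [(1, 2, 0.5, 10), (3, 4, 0.9, 20), (5, 6, 0.1, 30)], H=2
)
```

Unlike the grid-based group of equation (2), this form has a fixed length of 4H regardless of image size, at the cost of depending on how the attention areas are segmented.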
Further, in the above embodiment, the case where the control program stored in the ROM 52 is executed in executing the processing shown in the flowchart of FIG. 3 has been described. However, the present invention is not limited to this. The program may be read from the storage medium storing the program into the RAM 54 and executed.
[0092]
Here, the storage medium may be a semiconductor storage medium such as a RAM or ROM, a magnetic storage medium such as an FD or HD, an optically read storage medium such as a CD, CDV, LD, or DVD, or a magnetically stored / optically read storage medium such as an MO. It includes any computer-readable storage medium, regardless of whether the reading method is electronic, magnetic, optical, or otherwise.
[0093]
Further, in the above embodiment, the image search system, the image classification system, the image search program and the image classification program, and the image search method and the image classification method according to the present invention are applied to the case where an image similarity search is performed. However, the present invention is not limited to this, and can be applied to other cases without departing from the gist of the present invention. For example, the present invention can be applied to the case of classifying images.
[Brief description of the drawings]
FIG. 1 is a functional block diagram showing a configuration of an image search device 100 according to the present invention.
FIG. 2 is a block diagram illustrating a configuration of a computer 200.
FIG. 3 is a flowchart illustrating an image search process.
FIG. 4 is a diagram showing an example of a vertical search target image.
FIG. 5 is a diagram illustrating an example of a horizontal search target image.
FIG. 6 is a diagram illustrating a case where n feature vectors V are clustered into two clusters.
FIG. 7 is a diagram showing a display screen on which search results are displayed.
FIG. 8 is a diagram illustrating a case where search target images A and B having different aspect ratios are superimposed.
[Explanation of symbols]
100 ... image search device, 200 ... computer, 10 ... search target image registration DB, 12 ... search key image designation unit, 14 ... search key image reading unit, 16 ... attention area extraction unit, 18 ... face image processing unit, 20 ... feature vector generation unit, 22 ... search condition specification unit, 24 ... image search unit, 26 ... display form specification unit, 28 ... image display unit, 34 ... face area determination unit, 36 ... face information determination unit, 38 ... similarity determination unit, 50 ... CPU, 52 ... ROM, 54 ... RAM, 58 ... I/F, 60 ... input device, 64 ... display device

Claims

Based on a given search key image, a system for searching for an image that matches the search key image from among a plurality of search target images,
For the search key image and each of the search target images, an attention area is extracted from the image, and a feature vector indicating a feature of the image is generated based on the extracted attention area,
An image search system characterized in that an image matching the search key image is searched from the plurality of search target images based on the generated feature vector.

Based on a given search key image, a system for searching for an image that matches the search key image from among a plurality of search target images,
An image search system comprising: search target image storage means for storing the plurality of search target images; search key image input means for inputting the search key image; attention area extraction means for extracting an attention area from each of the search key image input by the search key image input means and the search target images in the search target image storage means; feature vector generation means for generating, for the search key image and each of the search target images, a feature vector indicating a feature of the image based on the attention area extracted by the attention area extraction means; and image search means for searching the search target image storage means for an image matching the search key image based on the feature vectors generated by the feature vector generation means.

In claim 2,
The system further comprises face information determination means for determining, for the search key image and each of the search target images, face information of a person image included in the image,
An image search system wherein the feature vector generation means generates, for the search key image and each of the search target images, a feature vector indicating a feature of the image based on the face information determined by the face information determination means.

In claim 2 or 3,
The system further comprises similarity determination means for determining, for each of the search target images, a similarity between a face of a person image included in the search target image and a face of a person image included in the search key image,
An image search system wherein the feature vector generation means generates, for the search key image and each of the search target images, a feature vector indicating a feature of the image based on a determination result of the similarity determination means.

In any one of claims 2 to 4,
An image search system wherein the image search means retrieves, from the search target image storage means, the search target image whose feature vector has the smallest inter-vector distance from the feature vector of the search key image.
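As an illustration of the smallest inter-vector distance search in this claim, the following is a minimal sketch. It assumes Euclidean distance as the inter-vector distance and an in-memory dictionary in place of the search target image storage means; the claim itself fixes neither. The names `euclidean`, `nearest_image`, and the sample image IDs are hypothetical.

```python
import math

def euclidean(u, v):
    """Inter-vector distance between two equal-length feature vectors.

    Euclidean distance is an assumption; the claim only says "inter-vector distance".
    """
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nearest_image(key_vector, target_vectors):
    """Return the ID of the search target whose vector is closest to the key vector."""
    return min(
        target_vectors,
        key=lambda image_id: euclidean(key_vector, target_vectors[image_id]),
    )

# Hypothetical feature vectors for three stored search target images.
targets = {
    "img_a": [0.9, 0.1, 0.3],
    "img_b": [0.2, 0.8, 0.5],
    "img_c": [0.4, 0.4, 0.4],
}
print(nearest_image([0.25, 0.75, 0.5], targets))
```

A real system would likely return the top few candidates ranked by distance rather than a single image.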

In any one of claims 2 to 4,
An image search system wherein the image search means classifies the search target images into a plurality of groups based on the inter-vector distances between the feature vectors of the search target images, and retrieves from the search target image storage means all search target images belonging to the group, among the plurality of groups, to which the search key image belongs.
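The group-based search in this claim can be sketched as follows. The sketch assumes plain k-means with deterministic seeding as the grouping method and assigns the search key image to the group with the nearest centroid; the claim names neither choice, so both are stand-ins. All function names and sample vectors are hypothetical.

```python
import math

def dist(u, v):
    """Euclidean inter-vector distance (an assumption, not fixed by the claim)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def kmeans(vectors, k, iterations=20):
    """Plain k-means over {image_id: vector}; returns (centroids, labels)."""
    ids = list(vectors)
    centroids = [list(vectors[i]) for i in ids[:k]]  # seed with the first k vectors
    labels = {}
    for _ in range(iterations):
        new_labels = {
            i: min(range(k), key=lambda c: dist(vectors[i], centroids[c]))
            for i in ids
        }
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [vectors[i] for i in ids if labels[i] == c]
            if members:
                centroids[c] = mean(members)
    return centroids, labels

def images_in_key_group(key_vector, vectors, k=2):
    """Return every search target in the group whose centroid is nearest the key."""
    centroids, labels = kmeans(vectors, k)
    key_group = min(range(k), key=lambda c: dist(key_vector, centroids[c]))
    return sorted(i for i, g in labels.items() if g == key_group)

# Hypothetical two-dimensional feature vectors forming two visible groups.
targets = {
    "a": [0.0, 0.0], "b": [0.1, 0.2], "c": [5.0, 5.0],
    "d": [5.2, 4.9], "e": [0.2, 0.1],
}
print(images_in_key_group([4.8, 5.1], targets))
```

Unlike the nearest-vector search, this returns a whole group, which suits browsing a set of similar images rather than finding a single best match.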

In any one of claims 2 to 6,
An image search system wherein, when the aspect ratios of the search key image and a search target image, or of the search target images, differ from each other, the feature vector generation means superimposes a first image and a second image having different aspect ratios, generates the feature vector of the first image for the overlapping area within the first image, and generates the feature vector of the second image for the overlapping area within the second image.
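The overlapping-area idea in this claim can be sketched as a crop-box computation. The sketch assumes the two images are superimposed with their centers aligned and without rescaling; the claim does not state how the images are aligned, so this is only one possible reading. The function name `overlap_boxes` and the box layout `(left, top, right, bottom)` are hypothetical.

```python
def overlap_boxes(size_a, size_b):
    """Return the overlapping area expressed in each image's own coordinates.

    size_a and size_b are (width, height) in pixels. The images are assumed
    to be superimposed center-on-center without scaling. Each result is a
    (left, top, right, bottom) box; both boxes have the same dimensions.
    """
    width_a, height_a = size_a
    width_b, height_b = size_b
    width = min(width_a, width_b)
    height = min(height_a, height_b)

    def centered(image_width, image_height):
        left = (image_width - width) // 2
        top = (image_height - height) // 2
        return (left, top, left + width, top + height)

    return centered(width_a, height_a), centered(width_b, height_b)

# A portrait image and a landscape image of hypothetical sizes.
box_a, box_b = overlap_boxes((300, 400), (400, 300))
print(box_a, box_b)
```

Feature vectors would then be generated only from the pixels inside each box, so that both vectors describe regions of identical size.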

A system for classifying a plurality of classification target images,
For each of the classification target images, a region of interest is extracted from the classification target image, and a feature vector indicating a characteristic of the classification target image is generated based on the extracted region of interest,
An image classification system, wherein each of the classification target images is classified into a plurality of groups based on the generated feature vector.

A system for classifying a plurality of classification target images,
An image classification system comprising: classification target image storage means for storing the plurality of classification target images; attention area extraction means for extracting, for each classification target image in the classification target image storage means, an attention area from the classification target image; feature vector generation means for generating, for each of the classification target images, a feature vector indicating a feature of the classification target image based on the attention area extracted by the attention area extraction means; and image classification means for classifying the classification target images into a plurality of groups based on the feature vectors generated by the feature vector generation means.

In claim 9,
The system further comprises face information determination means for determining, for each of the classification target images, face information of a person image included in the classification target image,
An image classification system wherein the feature vector generation means generates, for each of the classification target images, a feature vector indicating a feature of the classification target image based on the face information determined by the face information determination means.

In claim 9 or 10,
The system further comprises similarity determination means for determining, for each of the classification target images, a similarity between a face of a person image included in the classification target image and a face of a person image included in a specific image,
An image classification system wherein the feature vector generation means generates, for each of the classification target images, a feature vector indicating a feature of the classification target image based on a determination result of the similarity determination means.

In any one of claims 9 to 11,
An image classification system wherein the image classification means classifies the classification target images into a plurality of groups based on the inter-vector distances between the feature vectors of the classification target images, and retrieves a predetermined number of classification target images for each group from the classification target image storage means.
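Retrieving a predetermined number of images per group could look like the sketch below. It assumes group labels are already available and that the images nearest each group's centroid are the ones picked; the claim does not say which images of a group are selected, so that ranking rule is an assumption. The name `pick_per_group` and the sample data are hypothetical.

```python
import math
from collections import defaultdict

def pick_per_group(vectors, labels, per_group):
    """Pick up to per_group image IDs from each group.

    vectors: {image_id: feature vector}
    labels:  {image_id: group number}
    Images closest to the group centroid are chosen first (an assumption).
    """
    groups = defaultdict(list)
    for image_id, group in labels.items():
        groups[group].append(image_id)

    picked = {}
    for group, ids in groups.items():
        centroid = [sum(col) / len(ids) for col in zip(*(vectors[i] for i in ids))]
        ranked = sorted(ids, key=lambda i: math.dist(vectors[i], centroid))
        picked[group] = ranked[:per_group]
    return picked

# Hypothetical feature vectors and a precomputed two-group labeling.
vectors = {
    "a": [0.0, 0.0], "b": [1.0, 0.0], "e": [4.0, 0.0],
    "c": [10.0, 10.0], "d": [11.0, 10.0], "f": [16.0, 10.0],
}
labels = {"a": 0, "b": 0, "e": 0, "c": 1, "d": 1, "f": 1}
print(pick_per_group(vectors, labels, per_group=2))
```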

In any one of claims 9 to 12,
An image classification system wherein, when the aspect ratios of the classification target images differ from each other, the feature vector generation means superimposes a first image and a second image having different aspect ratios, generates the feature vector of the first image for the overlapping area within the first image, and generates the feature vector of the second image for the overlapping area within the second image.

A program for searching for an image that matches the search key image from among a plurality of search target images based on a given search key image,
For a computer that can use the search target image storage means for storing the plurality of search target images, and a search key image input means for inputting the search key image,
An image search program for causing the computer to execute processing realized as: attention area extraction means for extracting an attention area from each of the search key image input by the search key image input means and the search target images in the search target image storage means; feature vector generation means for generating, for the search key image and each of the search target images, a feature vector indicating a feature of the image based on the attention area extracted by the attention area extraction means; and image search means for searching the search target image storage means for an image matching the search key image based on the feature vectors generated by the feature vector generation means.

A program for classifying a plurality of classification target images,
For a computer that can use the classification target image storage means for storing the plurality of classification target images,
An image classification program for causing the computer to execute processing realized as: attention area extraction means for extracting, for each classification target image in the classification target image storage means, an attention area from the classification target image; feature vector generation means for generating, for each of the classification target images, a feature vector indicating a feature of the classification target image based on the attention area extracted by the attention area extraction means; and image classification means for classifying the classification target images into a plurality of groups based on the feature vectors generated by the feature vector generation means.

Based on a given search key image, a method for searching for an image that matches the search key image from a search target image storage unit that stores a plurality of search target images,
A search key image input step of inputting the search key image,
A first region of interest extraction step of extracting a region of interest from the search key image input in the search key image input step;
A first feature vector generation step of generating a feature vector indicating a feature of the search key image based on the attention area extracted in the first attention area extraction step;
A second attention area extraction step of extracting an attention area from the search target image;
A second feature vector generation step of generating a feature vector indicating a feature of the search target image based on the attention area extracted in the second attention area extraction step;
A repetition step of repeatedly performing the second attention area extraction step and the second feature vector generation step for each search target image in the search target image storage means;
An image search step of searching the search target image storage means for an image that matches the search key image based on the feature vectors generated in the first feature vector generation step and the second feature vector generation step. The image search method is characterized by including the above steps.
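The sequence of steps in this method claim can be sketched end to end as follows. The attention area extraction and feature vector generation shown here are deliberately simple placeholders (a bounding box of brighter-than-average pixels, and a two-element vector of mean brightness and relative area); they are not the techniques of the embodiment and serve only to make the flow of steps concrete. Images are modeled as lists of rows of grayscale values, and every function name is hypothetical.

```python
import math

def extract_attention_area(image):
    """Placeholder extraction: bounding box of pixels brighter than the image mean.

    Returns (top, left, bottom, right); falls back to the whole image.
    """
    pixels = [p for row in image for p in row]
    threshold = sum(pixels) / len(pixels)
    coords = [
        (r, c)
        for r, row in enumerate(image)
        for c, p in enumerate(row)
        if p > threshold
    ]
    if not coords:
        return (0, 0, len(image), len(image[0]))
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows) + 1, max(cols) + 1)

def generate_feature_vector(image, area):
    """Placeholder vector: [mean brightness of the area, area size / image size]."""
    top, left, bottom, right = area
    region = [p for row in image[top:bottom] for p in row[left:right]]
    total = len(image) * len(image[0])
    return [sum(region) / len(region) / 255.0, len(region) / total]

def search(key_image, stored_images):
    """Run the claimed steps in order and return the best-matching image ID."""
    # First attention area extraction and first feature vector generation.
    key_vector = generate_feature_vector(key_image, extract_attention_area(key_image))
    # Repetition step: second extraction and generation for every stored image.
    vectors = {}
    for image_id, image in stored_images.items():
        area = extract_attention_area(image)
        vectors[image_id] = generate_feature_vector(image, area)
    # Image search step: smallest Euclidean distance (an assumption).
    return min(vectors, key=lambda i: math.dist(key_vector, vectors[i]))

key = [[0, 0, 0], [0, 200, 0], [0, 0, 0]]
stored = {
    "small_bright": [[0, 0, 0], [0, 0, 0], [0, 0, 210]],
    "large_dim": [[90, 90, 0], [90, 90, 0], [0, 0, 0]],
}
print(search(key, stored))
```

Because the vector is built only from the attention area, the small bright spot matches the key even though it sits at a different position in the frame.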

A method for classifying a plurality of classification target images,
An attention area extracting step of extracting an attention area from the classification target image;
A feature vector generation step of generating a feature vector indicating a feature of the classification target image based on the attention area extracted in the attention area extraction step;
A repetition step of repeatedly performing the attention area extraction step and the feature vector generation step for each of the classification target images,
An image classification step of classifying each of the classification target images into a plurality of groups based on the feature vectors generated in the feature vector generation step. The image classification method is characterized by including the above steps.