JP2004258749A - Method and device for clustering feature vector of image

Info

Publication number: JP2004258749A
Application number: JP2003046014A
Authority: JP
Inventors: Masahiko Yamada; 雅彦山田
Original assignee: Fuji Photo Film Co Ltd
Current assignee: Fujifilm Holdings Corp
Priority date: 2003-02-24
Filing date: 2003-02-24
Publication date: 2004-09-16

Abstract

PROBLEM TO BE SOLVED: To classify feature vectors extracted from image data by clustering, determining the optimal degree of classification for the input image data from the image data itself, without requiring input of any parameter that designates the desired degree of classification.

SOLUTION: Low-resolution images whose resolutions differ stepwise are derived from an original image. From the lowest-resolution image, the initial values of the components of the representative vectors of the respective clusters are calculated, together with the initial value of a distortion threshold that serves as the index of the optimal degree of classification in the image with the second-lowest resolution. Using these initial values, processing to classify the feature vectors extracted from the image with the second-lowest resolution, and processing to update the representative vectors and the distortion threshold for optimal classification in the subsequent image, are repeated for successively higher resolutions. Finally, the plurality of feature vectors extracted from the original image are classified by clustering at the optimal degree of classification.

COPYRIGHT: (C)2004, JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は、画像から抽出した複数の特徴ベクトルを分類する方法および装置に関し、特に、特徴ベクトル空間におけるクラスタリングの手法を用いて、画像から抽出した複数の特徴ベクトルを、互いに類似したもの同士が属する複数のクラスターに分類する方法および装置に関する。
【０００２】
【従来の技術】
観測データや画像データの特徴が該特徴を表すｎ個のパラメータを成分とする特徴ベクトルとして表現できる場合のデータ分類手法として、複数の特徴ベクトルをｎ次元の特徴ベクトル空間に写像して、該特徴ベクトル空間における分布をもとにこれらの特徴ベクトルを複数のクラスターに分類する、いわゆるクラスタリングの手法が従来から知られている。クラスターとは、文字通り、特徴ベクトル空間における特徴ベクトルの「塊」であり、特徴ベクトル空間におけるユークリッド距離等を指標として、互いに近い特徴ベクトルが同一のクラスターに分類される。すなわち、同一のクラスターに属する特徴ベクトルは、互いに類似した特徴を表しているものである。
【０００３】
通常、クラスタリング処理に際しては、どの程度まで分類を行うかを示す一定値のパラメータを、処理開始に先立って指定しなくてはならない。たとえば、最終的なクラスターの個数や、クラスター中心と該クラスターに属する特徴ベクトル間の許容可能な最大距離が、かかるパラメータとして指定され得る。クラスターの個数等を固定しない自己収束形のアルゴリズムを使用する場合であっても、クラスターの拡がりに関するパラメータやクラスター間の距離に関するパラメータ等、どの程度まで分類を行うかを示す何らかの一定値のパラメータを指定する必要性は解消されない（たとえば、非特許文献１参照）。
【０００４】
【非特許文献１】
長尾真著、「画像認識論」、初版、コロナ社、昭和５８年２月１５日
、ｐ．１２０−１２６
【０００５】
【発明が解決しようとする課題】
クラスタリング処理に際してどの程度まで分類を行うべきかは、データの内容に強く依存する。とりわけ、多様性の高い画像データを対象とする場合には、適当な分類の程度は、各画像データによって大きく異なる。たとえば、１つの画像を、該画像に含まれる「空」、「海」、「建物」等の撮影対象要素に対応する画像領域に分割する領域分割処理を行うために、該画像の各画素から抽出した特徴ベクトルをクラスタリング処理により分類する場合等には、各画像に含まれる撮影対象要素の数は多様であり、どの程度まで分類を行うべきかの画一的な基準を定めることは不可能である。すなわち、撮影対象要素として「空」と「海」のみを含む画像であれば、２つの画像領域に分割できればよいのであるから、分類の程度は粗くて足りるが、さらに「建物」、「木」、「土」等の多くの撮影対象要素を含む画像については、より細かい分類が必要である。この場合に、前者の画像に合わせた画一的な分類の程度の基準を採用すれば、後者の画像については十分な分類が行えないこととなり、後者の画像に合わせた画一的な基準を採用すれば、実際には「空」と「海」しか撮影されていない前者の画像を不必要に多くの画像領域に分割することとなる。
【０００６】
したがって、画像データを対象としたクラスタリング処理においては、分類対象としていかなる画像データが入力されても、その画像データ自体から該画像データに最適な分類の程度を特定し、安定した分類性能を実現する手法が強く望まれる。
【０００７】
本発明は、かかる事情に鑑み、好ましい分類の程度を指定するパラメータの入力を要さずに、入力された画像データ自体から最適な分類の程度を特定して、該画像データから抽出した特徴ベクトルの分類を行うクラスタリング方法および装置を提供することを目的とするものである。
【０００８】
【課題を解決するための手段】
すなわち、本発明に係る第１の画像の特徴ベクトルのクラスタリング方法は、原画像から、段階的に解像度の異なる複数の複数画素からなる低解像度画像を導出する工程と、該複数の低解像度画像のうち解像度が最も低い最低解像度画像から、代表ベクトルの各成分の初期値および歪み閾値の初期値を求める工程と、解像度が次に低い画像から複数の特徴ベクトルを抽出する工程と、上記の歪み閾値を最適な分類の程度の指標として、上記の代表ベクトルに基づいて、上記の複数の特徴ベクトルを分類する工程と、代表ベクトルおよび歪み閾値を更新する工程と、原画像から抽出された複数の特徴ベクトルの最適な分類の程度による分類が求められるまで、上記の抽出する工程から上記の更新する工程を繰り返す工程を含むことを特徴とする方法である。
【０００９】
また、本発明に係る第１の画像の特徴ベクトルのクラスタリング装置は、原画像から、段階的に解像度の異なる複数の複数画素からなる低解像度画像を導出する手段と、該複数の低解像度画像のうち解像度が最も低い最低解像度画像から、代表ベクトルの各成分の初期値および歪み閾値の初期値を求める手段と、解像度が次に低い画像から複数の特徴ベクトルを抽出する手段と、上記の歪み閾値を最適な分類の程度の指標として、上記の代表ベクトルに基づいて、上記の複数の特徴ベクトルを分類する手段と、代表ベクトルおよび歪み閾値を更新する手段と、原画像から抽出された複数の特徴ベクトルの最適な分類の程度による分類が求められるまで、上記の抽出する手段から上記の更新する手段を繰返し動作させる手段を備えていることを特徴とする装置である。
【００１０】
ここで、本発明において「低解像度画像」とは、原画像よりも画素数の少ない複数画素からなる画像であって、原画像を基として、ガウシアンピラミッド、線形補間、スプライン補間、ウェーブレット変換等を利用した画像縮小処理により順次求められるものを指す。画像の縦方向と横方向に対して、異なる縮小率を適用してもよい。
【００１１】
また、本発明において「特徴ベクトル」とは、画像から抽出されるベクトルであって、当該画像の特徴を示す複数のパラメータ（以下、「特徴量」と呼ぶ）を成分とするベクトルを指す。特徴ベクトルは、典型的には抽出対象の画像を構成する画素ごとに抽出されるが、いくつかの画素からなるブロックごとに抽出してもよいし、抽出対象の画像に線形補間やガウシアンピラミッド等による画像縮小処理を施した縮小画像の各画素から抽出してもよい。特徴ベクトルの成分である特徴量としては、たとえば色の特徴、輝度の特徴、テクスチャーの特徴、奥行情報、該画像に含まれるエッジの特徴等を示す特徴量が使用され得る。
【００１２】
さらに、本発明において「代表ベクトル」とは、特徴ベクトル空間において規定される各クラスターを代表するベクトルを指し、クラスターごとに１つの代表ベクトルが割り当てられる。ここで、「特徴ベクトル空間」とは、上記の特徴ベクトルの各成分を座標とする空間を指す。たとえば、特徴ベクトルがＹＣＣ表色系における各画素の輝度成分および２つの色差成分を成分とするベクトルである場合には、３次元のＹＣＣ表色系空間が特徴ベクトル空間となる。
【００１３】
また、本発明において「歪み閾値」とは、各低解像度画像または原画像においてどの程度まで分類を行うかの基準となる値である。
【００１４】
また、本発明に係る第２の画像の特徴ベクトルのクラスタリング方法は、原画像から、段階的に解像度の異なる複数の複数画素からなる低解像度画像を導出する工程と、該複数の低解像度画像のうち解像度が最も低い最低解像度画像から、代表ベクトルの各成分の初期値および歪み閾値の初期値を求める工程と、解像度が次に低い画像を現在画像として、該現在画像から複数の特徴ベクトルを抽出する工程と、該複数の特徴ベクトルを特徴ベクトル空間に写像する工程と、特徴ベクトル空間において、上記の複数の特徴ベクトルを、上記の代表ベクトルの各々が代表するクラスターに分類して、現在画像における暫定的な分類を求める工程と、上記のクラスターの各々に含まれる特徴ベクトルに基づいて、代表ベクトルを更新する工程と、上記の暫定的な分類の歪みを示す歪みパラメータを求める工程と、該歪みパラメータを歪み閾値と比較し、歪みパラメータが歪み閾値より大きい場合には、代表ベクトルに新たなベクトルを追加する工程と、歪みパラメータが歪み閾値より小さくなるまで、上記の暫定的な分類を求める工程から上記の新たなベクトルを追加する工程を繰り返し、現在画像における最終的な分類を求める工程と、現在画像における暫定的な分類のいずれかまたは最終的な分類に基づいて、歪み閾値を更新する工程と、原画像が現在画像とされ、原画像における最終的な分類が求められるまで、上記の複数の特徴ベクトルを抽出する工程から上記の歪み閾値を更新する工程を繰り返す工程を含むことを特徴とする方法である。
【００１５】
また、本発明に係る第２の画像の特徴ベクトルのクラスタリング装置は、原画像から、段階的に解像度の異なる複数の複数画素からなる低解像度画像を導出する手段と、該複数の低解像度画像のうち解像度が最も低い最低解像度画像から、代表ベクトルの各成分の初期値および歪み閾値の初期値を求める手段と、解像度が次に低い画像を現在画像として、該現在画像から複数の特徴ベクトルを抽出する手段と、該複数の特徴ベクトルを特徴ベクトル空間に写像する手段と、特徴ベクトル空間において、上記の複数の特徴ベクトルを、上記の代表ベクトルの各々が代表するクラスターに分類して、現在画像における暫定的な分類を求める手段と、上記のクラスターの各々に含まれる特徴ベクトルに基づいて、代表ベクトルを更新する手段と、上記の暫定的な分類の歪みを示す歪みパラメータを求める手段と、該歪みパラメータを歪み閾値と比較し、歪みパラメータが歪み閾値より大きい場合には、代表ベクトルに新たなベクトルを追加する手段と、歪みパラメータが歪み閾値より小さくなるまで、上記の暫定的な分類を求める手段から上記の新たなベクトルを追加する手段を繰返し動作させ、現在画像における最終的な分類を求める手段と、現在画像における暫定的な分類のいずれかまたは最終的な分類に基づいて、歪み閾値を更新する手段と、原画像が現在画像とされ、原画像における最終的な分類が求められるまで、上記の複数の特徴ベクトルを抽出する手段から上記の歪み閾値を更新する手段を繰返し動作させる手段を備えていることを特徴とする装置である。
【００１６】
ここで、本発明において「歪みパラメータ」とは、歪み閾値と比較されるパラメータであって、いわばクラスターによる特徴ベクトルの暫定的な分類の大まかさを表す指標である。クラスターの数が多くなり分類が細かくなるほど、歪みパラメータは小さくなる。なお、本発明に係る第２の画像の特徴ベクトルのクラスタリング方法および装置は、上記の歪みパラメータが歪み閾値よりも大きい場合には代表ベクトルに新たなベクトルを追加し、歪みパラメータが歪み閾値よりも小さくなるまで現在画像における分類処理を繰り返すものであるが、歪みパラメータと歪み閾値が等しい場合の扱いが定められている方法および装置も、本発明の範囲に属するものとする。すなわち、歪みパラメータが歪み閾値以上である場合には代表ベクトルに新たなベクトルを追加し、歪みパラメータが歪み閾値よりも小さくなるまで現在画像における分類処理を繰り返すものと、歪みパラメータが歪み閾値よりも大きい場合には代表ベクトルに新たなベクトルを追加し、歪みパラメータが歪み閾値以下となるまで現在画像における分類処理を繰り返すもののいずれも、本発明の範囲に属するものとする。
【００１７】
ここで、本発明による第２のクラスタリング方法は、代表ベクトルを更新する工程と歪みパラメータを求める工程との間に、代表ベクトルに基づいて暫定的な分類を更新する工程をさらに含んでいてもよい。同様に、本発明による第２のクラスタリング装置も、代表ベクトルを更新する手段により更新された代表ベクトルに基づいて、暫定的な分類を更新する手段をさらに備えていてもよい。
【００１８】
また、上記の暫定的な分類を求める工程または手段は、複数の特徴ベクトルの各々を、上記の代表ベクトルのうち該特徴ベクトルと特徴ベクトル空間における距離が最も近い代表ベクトルが代表するクラスターに分類するものであってもよい。この場合において、上記の代表ベクトルを更新する工程または手段は、特徴ベクトル空間において、クラスターの各々に含まれる特徴ベクトルの重心を表すベクトルを、新たな代表ベクトルとするものであってもよい。さらに、上記の歪みパラメータが、各クラスターのうち特徴ベクトル空間における拡がりが最も大きいクラスターの該拡がりを示す値であって、上記の新たなベクトルを追加する工程または手段が、所属するクラスターを代表する代表ベクトルとの距離が最も遠い特徴ベクトルと同一のベクトルを、新たなベクトルとして代表ベクトルに追加するものであってもよい。
【００１９】
ここで、特徴ベクトル空間における「距離」とは、典型的には特徴ベクトル空間における２つのベクトル間のユークリッド距離を指すが、特徴ベクトル空間における２つのベクトルの近接度合いを適当に表す指標であれば、ユークリッド距離に限られないものとする。また、特徴ベクトル空間におけるクラスターの「拡がりを示す値」としては、たとえば、そのクラスターに属する各々の特徴ベクトルと該クラスターの代表ベクトルとの距離の最大値や平均値等が使用され得る。
【００２０】
また、上記の歪み閾値を更新する工程または手段は、現在画像における最終的な分類を求めるために求められた、歪みパラメータのいずれかを新たな歪み閾値とするものであってもよい。
【００２１】
さらに、上記の代表ベクトルの各成分の初期値および歪み閾値の初期値を求めるに際しては、ｎ個の画素からなる上記の最低解像度画像から、１つの原代表ベクトルを導出し、該ｎ個の画素の各々から抽出されたｎ個の特徴ベクトルと、原代表ベクトルとの、特徴ベクトル空間における距離を求め、最大距離を示す特徴ベクトルを代表ベクトル候補とするとともに最も高い候補順位を付け、続いて残りの特徴ベクトルの各々と、原代表ベクトルおよび代表ベクトル候補のうち該特徴ベクトルに最も近いものとの、特徴ベクトル空間における距離を求め、最大距離を示す特徴ベクトルを代表ベクトル候補に追加するとともに次に高い候補順位を付け、かかる次に高い候補順位を付ける処理を複数回繰り返し、この候補順位に従って、代表ベクトル候補のうちの上位いくつかの特徴ベクトルの各成分の値を、代表ベクトルの各成分の初期値とし、上記のｎ個の特徴ベクトルの特徴ベクトル空間における拡がりを示す値を歪み閾値の初期値としてもよい。この場合において、上記の上位いくつかの特徴ベクトルとして、代表ベクトル候補に追加された際の上記の最大距離の変化量が最大であった特徴ベクトルまでを選択してもよい。
【００２２】
ここで、「原代表ベクトル」とは、最低解像度画像を代表する１つのベクトルであって、たとえば最低解像度画像の各画素から抽出される特徴量の各平均値を成分とするベクトル等が使用され得る。
【００２３】
【発明の効果】
本発明に係る画像の特徴ベクトルのクラスタリング方法および装置は、入力された原画像自体から導出した最低解像度画像に基づいて代表ベクトルの各成分の初期値および歪み閾値の初期値を求め、これらの初期値を利用して、解像度が次に低い画像における特徴ベクトルの最適な分類を順次求めていき、最終的には原画像から抽出した特徴ベクトルの分類を求めるものであるので、好ましい分類の程度を指定するパラメータの入力を要さずに、原画像の内容に応じた最適な分類の程度により、原画像から抽出した特徴ベクトルの分類を行うことができる。したがって、いかなる原画像が入力されても、安定した分類性能を実現することができ、様々な撮影対象要素を含む多数の画像に対して、連続して領域分割処理を施す場合等に極めて有効である。
【００２４】
また、本発明に係る画像の特徴ベクトルのクラスタリング方法および装置は、低解像度化された画像すなわち特徴ベクトル数を減じた画像において順次分類処理を行い、しかも各低解像度画像における分類処理は、歪みパラメータが歪み閾値よりも小さくなった時点で打ち切られ、解像度が次に低い画像における処理へと進むので、従来のクラスタリング処理と比較して、計算量が過大となることもない。
【００２５】
【発明の実施の形態】
以下、図面により、本発明の例示的な実施形態を詳細に説明する。
【００２６】
図１は、本発明の１つの実施形態であるクラスタリング処理の手順を示したフローチャートである。このクラスタリング処理は、最終的には原画像の各画素から抽出した特徴ベクトルを、複数のクラスターに分類することを目的とするものであり、原画像を「空」、「海」、「建物」等の撮影対象要素に対応する複数の画像領域に分割する領域分割処理等に利用できる。なお、本実施形態では、原画像は１０２４×１０２４画素のデジタル写真画像であるとし、特徴ベクトルは、ＹＣＣ表色系で表された輝度成分および２つの色差成分を成分とする３次元のベクトルであるとする。
【００２７】
原画像が入力されると、まず図１のステップ１０において、段階的に解像度の異なる複数の低解像度画像が導出される。本実施形態では、図２に示すように、原画像から出発して、縦横の画素数をそれぞれ２分の１とした低解像度画像を順次求めていき、最終的には第９低解像度画像すなわち画素数２×２の低解像度画像まで求めるものとする。この画素数２×２の低解像度画像が、本実施形態における最低解像度画像である。
【００２８】
これらの各低解像度画像を求めるための画像縮小処理の手法としてはいかなる手法を用いてもよいが、本実施形態ではガウシアンピラミッドによる縮小処理を用いるものとする。ガウシアンピラミッドによる縮小処理とは、縮小対象の画像にガウシアンフィルターを適用していわばぼやかした画像を導出するステップと、そのようにぼやかした画像の画素を１つおきに拾って縦横の画素数をそれぞれ２分の１とした画像を導出するステップを順次繰り返して、段階的に解像度の異なる低解像度画像を導出していく処理である。使用されるガウシアンフィルターは、たとえばフィルターの大きさが５×５であれば、図３のようになる。なお、画像端の処理に関しては、たとえば図３に示した５×５の大きさのフィルターを使用する場合には、縮小対象の画像の各辺に２行分または２列分の適当な画素値を有する画素を便宜的に追加することにより、もとの縮小対象の画像と同じ大きさのぼやかした画像を導出することができる。追加する画素の「適当な画素値」としては、ゼロ値や、画像端の画素値を繰り返した値等が使用され得る。
【００２９】
低解像度画像の導出が終了すると、次に、図１のステップ１２において、最低解像度画像（２×２画素）を基に、初期代表ベクトルすなわち代表ベクトルの各成分の初期値が導出される。この初期代表ベクトル導出処理を、以下、図４のフローチャートならびに図５の概念図に沿って詳細に説明する。
【００３０】
まず、図４のステップ４０において、最低解像度画像から１つの原代表ベクトルが導出される。本実施形態では、最低解像度画像をなす４つの画素から抽出した特徴ベクトルの各成分の平均値を成分とする１つのベクトルを、原代表ベクトルとする。たとえば、図５の（ａ）に示した特徴ベクトル空間において、４つのベクトルＯＢ_１、ＯＢ_２、ＯＢ_３およびＯＢ_４が最低解像度画像をなす４つの画素から抽出された特徴ベクトルであるとすると、原代表ベクトルはＯＡのようになる。
【００３１】
次に、図４のステップ４２において、上記の４つの特徴ベクトルの各々と、原代表ベクトルとの距離を算出する。本実施形態では、この距離は特徴ベクトル空間におけるユークリッド距離であるとする。図５の（ａ）の例では、算出される距離はそれぞれｄ_１、ｄ_２、ｄ_３およびｄ_４である。
【００３２】
続いて、ステップ４４において、ステップ４２で算出した距離のうち最大の距離を示す特徴ベクトルを、候補順位第１位の代表ベクトル候補とする。図５の（ａ）の例では、距離ｄ_１、ｄ_２、ｄ_３およびｄ_４のうち最大距離はｄ_１であり、したがって最大距離を示す特徴ベクトルはベクトルＯＢ_１であるので、ベクトルＯＢ_１が、候補順位第１位の代表ベクトル候補すなわち第１代表ベクトル候補とされる。
【００３３】
次に、ステップ４６において、残りの特徴ベクトルすなわち３つのベクトルＯＢ_２、ＯＢ_３およびＯＢ_４の各々と、原代表ベクトルＯＡおよび代表ベクトル候補ＯＢ_１のうちその特徴ベクトルとの距離が最も近いものとの距離が特定される。この例では、図５の（ｂ）に示すように、ベクトルＯＢ_３およびＯＢ_４については、原代表ベクトルＯＡとの距離の方が代表ベクトル候補ＯＢ_１との距離よりも小さいため、図示のように距離ｄ_１’およびｄ_２’が特定される。ベクトルＯＢ_２については、代表ベクトル候補ＯＢ_１との距離の方が原代表ベクトルＯＡとの距離よりも小さいため、距離ｄ_３’が特定される。
【００３４】
続いて、ステップ４８において、ステップ４６で算出した距離のうち最大の距離を示す特徴ベクトルを、次に高い候補順位の代表ベクトル候補とする。図５の（ｂ）の例では、最大距離ｄ_１’を示す特徴ベクトルはベクトルＯＢ_２であるので、ベクトルＯＢ_２が、候補順位第２位の第２代表ベクトル候補とされる。
【００３５】
次に、ステップ５０において、まだ候補順位を付けていない特徴ベクトルがあるか否かが確認される。ここでは、まだ特徴ベクトルＯＢ_２およびＯＢ_４に候補順位が付けられていないので、処理はステップ４６に戻る。ここで再び行われるステップ４６では、図５の（ｃ）に示すように、ベクトルＯＢ_４については、原代表ベクトルＯＡおよび代表ベクトル候補ＯＢ_１ならびにＯＢ_３のうちもっとも距離が近いベクトルは原代表ベクトルＯＡであるので、図示のように原代表ベクトルＯＡとの距離ｄ_１’’が特定される。ベクトルＯＢ_２については、代表ベクトル候補ＯＢ_１との距離ｄ_２’’が特定される。続くステップ４８においては、距離ｄ_１’’の方がｄ_２’’よりも大きいので、特徴ベクトルＯＢ_４が候補順位第３位の第３代表ベクトル候補とされる。
【００３６】
このようにして、４つ全ての特徴ベクトルに候補順位が付けられるまで、図４のステップ４６からステップ５０が繰り返され、最終的には各特徴ベクトルがＯＢ_１、ＯＢ_３、ＯＢ_４、ＯＢ_２の順に代表ベクトル候補に追加され、候補順位第１位から第４位が付けられることとなる。
【００３７】
続いて、図４のステップ５２において、上記の候補順位に従って、代表ベクトル候補のうち上位いくつかの特徴ベクトルを初期代表ベクトルとするのであるが、本実施形態では、代表ベクトル候補に追加された際の上記の最大距離の変化量が最大であった特徴ベクトルまでを、初期代表ベクトルとする。すなわち、図５に示すとおり、特徴ベクトルＯＢ_１が第１代表ベクトル候補として代表ベクトル候補に追加された際には、最大距離はｄ_１からｄ_１’に変化しており、次に特徴ベクトルＯＢ_３が代表ベクトル候補に追加された際には、最大距離はｄ_１’からｄ_１’’に変化している。この最大距離の変化の様子を図示すると、図６のようになる。この例では、最大距離の変化量は、候補順位第２位の特徴ベクトルＯＢ_３が代表ベクトル候補に追加されたときが最大であるので、候補順位第２位までの特徴ベクトル、すなわちベクトルＯＢ_１およびＯＢ_３の２つが、初期代表ベクトルＯＲ_１およびＯＲ_２とされる。
【００３８】
ここで、本実施形態では、上記のとおり特徴ベクトル空間における最大距離の変化量を基準として初期代表ベクトルを選択したが、平均距離の変化量等を基準としてもよい。あるいは、最低解像度画像では画素数は相当に減じられているので（本実施形態では４画素）、最低解像度画像の各画素から抽出した特徴ベクトルの全てを初期代表ベクトルとしてもよい。要するに、外部からの何らかのパラメータの入力を要さずに、原画像から導出した最低解像度画像自体から、所定の処理により該原画像に応じた代表ベクトルの各成分の初期値を求めるものであれば、本発明の範囲に属するものである。
【００３９】
図１に戻り、続いてステップ１４において、歪み閾値の初期値が、やはり最低解像度画像に基づいて導出される。この歪み閾値の初期値は、次の低解像度画像すなわち画素数４×４の第８低解像度画像において、どの程度まで分類を行うかの基準となる値である。この歪み閾値の初期値は、最低解像度をなすｎ個（本実施形態では４個）の画素から抽出された特徴ベクトルの、特徴ベクトル空間における拡がりを示す値であることが好ましいが、必ずしもそれに限られず、上記した代表ベクトルの各成分の初期値の場合と同様、外部からの何らかのパラメータの入力を要さずに、原画像から導出した最低解像度画像自体から、所定の処理により該原画像に応じた歪み閾値の初期値を求めるものであれば、本発明の範囲に属するものである。本実施形態では、最低解像度画像の各画素から抽出した４つの特徴ベクトルと、原代表ベクトルとの距離の最大値、すなわち図５の（ａ）における距離ｄ_１を、歪み閾値Ｄの初期値とする。これに代えて、図５のより下の階層における最大距離（すなわち、ｄ_１’、ｄ_１’’等）や、距離ｄ_１、ｄ_２、ｄ_３およびｄ_４の平均値等を使用してもよい。
【００４０】
次に、図１のステップ１６において、解像度が次に低い画像、すなわち画素数４×４の第８低解像度画像が現在画像とされ、該現在画像における分類処理が開始される。以下、図７から９も参照しながら、画素数４×４の現在画像における分類処理について順を追って説明する。
【００４１】
まず、図１のステップ１８において、現在画像をなす１６個の画素の各々から、ＹＣＣ表色系における輝度成分および２つの色差成分を成分とする特徴ベクトルが抽出される。
【００４２】
次に、ステップ２０において、上記の１６個の特徴ベクトルが特徴ベクトル空間に写像される。１６個の特徴ベクトルの終点をＣ_１からＣ_１６で表すこととすると、この例ではそれらの特徴ベクトル空間における分布は、図７の（ａ）に示すような分布であるとする。
【００４３】
続いて、ステップ２２において、上記の１６個の特徴ベクトルを、各代表ベクトルが代表するクラスターに暫定的に分類する。現在の代表ベクトルは、最低解像度画像から導出した初期代表ベクトルＯＲ_１およびＯＲ_２であるので、これらに代表される２つのクラスターに１６個の特徴ベクトルを分類する。本実施形態では、特徴ベクトル空間におけるユークリッド距離を基準として、距離が近い方の代表ベクトルが代表するクラスターに、各特徴ベクトルを分類するものとする。その結果、暫定的な分類は図７の（ａ）に示すようになる。
【００４４】
次に、ステップ２４において、上記の２つのクラスターの各々に含まれる特徴ベクトルに基づいて、各代表ベクトルを、これらのクラスターをより適切に代表するものに更新する。本実施形態では、図７の（ｂ）に示すように、各クラスターに属する特徴ベクトルの重心を指すベクトルに、代表ベクトルＯＲ_１およびＯＲ_２を更新する。
【００４５】
続いて、ステップ２６において、現在の暫定的な分類の歪みを示す歪みパラメータを算出する。この歪みパラメータは、現在の暫定的な分類において、各クラスターのうち特徴ベクトル空間における拡がりが最も大きいクラスターの該拡がりを示す値であることが好ましい。本実施形態では、代表ベクトルと、該代表ベクトルが代表するクラスターに含まれる特徴ベクトルとの、特徴ベクトル空間におけるユークリッド距離の最大値を歪みパラメータとする。この例では、図７の（ｃ）に示すように、代表ベクトルＯＲ_１と、そのクラスターに属する特徴ベクトルＯＣ_１との距離ｄ_ｍａｘが最大であるので、このｄ_ｍａｘが歪みパラメータとされる。なお、最大値に代えて平均値等を歪みパラメータとしてもよい。
【００４６】
次に、ステップ２８において、ステップ２６で求めた歪みパラメータが歪み閾値Ｄ以下であるか否かが調べられる。現在の歪み閾値Ｄは、最低解像度画像から求めた歪み閾値の初期値である。この例では、この段階での歪みパラメータｄ_ｍａｘは歪み閾値Ｄよりも大きく、したがって図１に示した処理はステップ３０へと進むものとする。
【００４７】
ステップ３０では、クラスターの数を１つ増やしてより細かい暫定的な分類を求めるために、新たなベクトルが代表ベクトルに追加される。本実施形態では、図８の（ａ）に示すように、所属するクラスターを代表する代表ベクトルとの距離が最も遠い特徴ベクトル、すなわち上記の最大距離ｄ_ｍａｘを示す特徴ベクトルＯＣ_１と同一のベクトルが、新たな代表ベクトルＯＲ_３として追加される。
【００４８】
以下、３つの代表ベクトルに基づいて、再びステップ２２から２８の処理が行われる。すなわち、ステップ２２では、特徴ベクトル空間におけるユークリッド距離を基準として、各代表ベクトルが代表する３つのクラスターに特徴ベクトルＯＣ_１からＯＣ_１６が分類され（図８の（ａ）参照）、ステップ２４では、代表ベクトルＯＲ_１からＯＲ_３が各クラスターの重心を指すベクトルに更新され（図８の（ｂ）参照）、ステップ２６では、代表ベクトルと、該代表ベクトルが代表するクラスターに含まれる特徴ベクトルとの距離の最大値ｄ’_ｍａｘが、新たな暫定的な分類の歪みを示す歪みパラメータとして導出される（図８の（ｃ）参照）。この例では、新たな歪みパラメータｄ’_ｍａｘも歪み閾値Ｄより大きいものとする。したがって、ステップ２８から再びステップ３０へと進み、上記の最大距離ｄ’_ｍａｘを示す特徴ベクトルＯＣ_２と同一のベクトルが、新たな代表ベクトルＯＲ_４として代表ベクトルに追加される（図９の（ａ）参照）。
【００４９】
続いて、図９の（ａ）から（ｃ）に示すように、４つの代表ベクトルに基づいて、再びステップ２２から２８の処理が行われる。ここで、３回目のステップ２６で求められた、４つのクラスターによる暫定的な分類の歪みを示す歪みパラメータｄ’’_ｍａｘ（図９の（ｃ）参照）は、歪み閾値Ｄより小さいものとする。すると、図１に示す処理は、３回目のステップ２８からステップ３２へと進む。このときの分類が、現在画像における最終的な分類である。
【００５０】
ステップ３２では、現在画像が原画像であるか否かが確認される。この例では、現段階での現在画像は画素数４×４の第８低解像度画像であって原画像ではないので、図１に示す処理はステップ３４を経てステップ１６に戻り、解像度が次に低い画像、すなわち画素数８×８の第７低解像度画像における分類処理へと進むことになる。
【００５１】
解像度が次に低い画像へと進む前に行われるステップ３４においては、現在画像における上記の暫定的な分類のいずれかまたは最終的な分類に基づいて、次の画像における分類をどの程度まで行うかの基準となる新たな歪み閾値が導出され、歪み閾値Ｄの更新が行われる。新たな歪み閾値Ｄとしては、たとえば、現在画像における分類処理において求められた歪みパラメータ（この例では、上記のｄ_ｍａｘ、ｄ’_ｍａｘおよびｄ’’_ｍａｘ）のいずれかを使用することができる。本実施形態では、最初に導出された歪みパラメータ、すなわちｄ_ｍａｘを新たな歪み閾値Ｄとして採用するものとする。これに代えて、最後に導出した歪みパラメータ（この例ではｄ’’_ｍａｘ）を新たな歪み閾値Ｄとして採用することとしてもよい。
【００５２】
以下、原画像が現在画像とされ、原画像における最終的な分類が求められるまで、図１に示したステップ１６から３４の処理を繰り返すことにより、最終的には、原画像の各画素から抽出された１０２４×１０２４個の特徴ベクトルの分類を求めることができる。かかる分類結果は、原画像を「空」、「海」、「建物」等の撮影対象要素に対応する複数の画像領域に分割する領域分割処理等に利用できる。
【００５３】
上記の実施形態によれば、入力された原画像自体から導出した画素数２×２の最低解像度画像に基づいて初期代表ベクトルおよび歪み閾値の初期値が求められ、これらを利用して解像度が次に低い画像における特徴ベクトルの最適な分類を順次求めていくことにより、最終的には原画像の各画素から抽出した特徴ベクトルの分類が求められるので、外部からのパラメータの入力を要さずに、原画像の内容に応じた最適な分類の程度により分類を行うことができる。したがって、いかなる原画像が入力されても、安定した分類性能を実現することができる。
【００５４】
なお、上記の実施形態の変更例として、図１のステップ２４と２６の間に、ステップ２４で更新された代表ベクトルに基づいて暫定的な分類も更新するステップをさらに含む形態を採用してもよい。たとえば、更新された各代表ベクトルと各特徴ベクトルとの特徴ベクトル空間におけるユークリッド距離を求め直して、より近い方の代表ベクトルが代表するクラスターに各特徴ベクトルを分類し直す処理等がこれにあたる。
【００５５】
また、上記の実施形態では、ＹＣＣ表色系で表された輝度成分および２つの色差成分を成分とする３次元のベクトルを特徴ベクトルとしたが、特徴ベクトルの各成分はこれらに限られず、また、特徴ベクトルの次元も３に限られない。
【００５６】
また、上記の実施形態では、説明の便宜上、原画像を１０２４×１０２４画素の正方形の画像としたが、縦横の画素数が異なる原画像に対しても本発明を適用できることは言うまでもない。各低解像度画像も、正方形画像に限られない。さらに、最低解像度画像の画素数も４画素に限られず、原画像の画素数より少ない複数画素であれば、いかなる画素数であってもよい。
【００５７】
さらに、上記の実施形態は、原画像の各画素から抽出した特徴ベクトルの分類を目的とするものであったが、原画像に予め画像縮小処理を施し、縮小画像の各画素から抽出した特徴ベクトルの分類を行う形態等も、本発明の範囲に属する。この場合、縮小画像の各画素から抽出した特徴ベクトルの分類を行った後、該縮小画像にあらためて画像拡大処理を施すこと等により、上記の実施形態と同様、分類結果を原画像の領域分割処理等に利用することができる。
【００５８】
なお、本発明の上記およびその他の実施形態によるクラスタリング処理は、コンピュータ・プログラムにより実行することもできる。また、上記の説明は、説明の便宜上、方法に関して行ったが、上記の各ステップを行う手段を備えた装置も、本発明の範囲に属するものとする。
【００５９】
以上、本発明の実施形態について詳細に述べたが、上記の実施形態は例示的なものに過ぎず、本発明の技術的範囲は、本明細書中の特許請求の範囲のみによって定められるべきものであることは言うまでもない。
【図面の簡単な説明】
【図１】本発明の１つの実施形態であるクラスタリング処理の手順を示したフローチャート
【図２】図１の実施形態における、各低解像度画像の導出方法を示した概念図
【図３】各低解像度画像の導出に使用するガウシアンフィルターの例を示した図
【図４】図１の実施形態における、初期代表ベクトルの導出工程を詳細に示したフローチャート
【図５】図４に示した初期代表ベクトルの導出工程の各段階を示した概念図
【図６】図４に示した初期代表ベクトルの導出工程において、初期代表ベクトルの選択の基準となる最大距離の変化を示した棒グラフ
【図７】４×４画素の低解像度画像における分類処理の各段階を示した概念図
【図８】４×４画素の低解像度画像における分類処理の各段階を示した概念図
【図９】４×４画素の低解像度画像における分類処理の各段階を示した概念図
[0001]
[Technical field to which the invention belongs]
The present invention relates to a method and apparatus for classifying a plurality of feature vectors extracted from an image, and in particular to a method and apparatus that use a clustering technique in a feature vector space to classify a plurality of feature vectors extracted from an image into a plurality of clusters, each containing feature vectors that are similar to one another.
[0002]
[Prior art]
As a data classification method for cases where the features of observation data or image data can be expressed as feature vectors whose components are n parameters representing those features, the so-called clustering technique has long been known: a plurality of feature vectors are mapped into an n-dimensional feature vector space and classified into a plurality of clusters based on their distribution in that space. A cluster is, literally, a “lump” of feature vectors in the feature vector space; feature vectors that are close to each other, as measured by an index such as the Euclidean distance in the feature vector space, are classified into the same cluster. That is, feature vectors belonging to the same cluster represent mutually similar features.
[0003]
Normally, a clustering process requires that a fixed-value parameter indicating how far the classification is to be carried out be specified before the process starts. For example, the final number of clusters, or the maximum allowable distance between a cluster center and the feature vectors belonging to that cluster, can be specified as such a parameter. Even when a self-converging algorithm that does not fix the number of clusters is used, the need to specify some fixed-value parameter indicating how far the classification is to be carried out, such as a parameter relating to the spread of clusters or to the distance between clusters, is not eliminated (see, for example, Non-Patent Document 1).
[0004]
[Non-Patent Document 1]
Makoto Nagao, "Image recognition theory", first edition, Corona Publishing, February 15, 1983, pp. 120-126
[0005]
[Problems to be solved by the invention]
How far classification should be carried out in a clustering process depends strongly on the content of the data. In particular, for highly diverse image data, the appropriate degree of classification differs greatly from image to image. For example, suppose that feature vectors extracted from the pixels of an image are classified by clustering in order to perform region division processing, which divides the image into image regions corresponding to the photographed elements it contains, such as “sky”, “sea”, and “building”. The number of photographed elements varies from image to image, so it is impossible to set a uniform standard for how far to classify. An image that contains only “sky” and “sea” needs to be divided into just two image regions, so a coarse classification suffices, whereas an image that also contains many other elements such as “building”, “tree”, and “soil” requires a finer classification. If a uniform standard suited to the former image is adopted, the latter image cannot be classified sufficiently; if a uniform standard suited to the latter image is adopted, the former image, in which only “sky” and “sea” were actually photographed, is divided into unnecessarily many image regions.
[0006]
Therefore, for clustering of image data, there is a strong demand for a technique that, whatever image data is input as the classification target, determines the degree of classification optimal for that image data from the image data itself and achieves stable classification performance.
[0007]
In view of these circumstances, an object of the present invention is to provide a clustering method and apparatus that determine the optimal degree of classification from the input image data itself, without requiring input of a parameter designating the preferred degree of classification, and classify the feature vectors extracted from that image data.
[0008]
[Means for Solving the Problems]
That is, the first method of clustering feature vectors of an image according to the present invention comprises: a step of deriving from an original image a plurality of low-resolution images, each consisting of a plurality of pixels, whose resolutions differ stepwise; a step of obtaining, from the lowest-resolution image among them, an initial value of each component of the representative vectors and an initial value of a distortion threshold; a step of extracting a plurality of feature vectors from the image with the next-lowest resolution; a step of classifying the plurality of feature vectors based on the representative vectors, using the distortion threshold as the index of the optimal degree of classification; a step of updating the representative vectors and the distortion threshold; and a step of repeating the extracting step through the updating step until a classification, at the optimal degree of classification, of the plurality of feature vectors extracted from the original image is obtained.
[0009]
Likewise, the first apparatus for clustering feature vectors of an image according to the present invention comprises: means for deriving from an original image a plurality of low-resolution images, each consisting of a plurality of pixels, whose resolutions differ stepwise; means for obtaining, from the lowest-resolution image among them, an initial value of each component of the representative vectors and an initial value of a distortion threshold; means for extracting a plurality of feature vectors from the image with the next-lowest resolution; means for classifying the plurality of feature vectors based on the representative vectors, using the distortion threshold as the index of the optimal degree of classification; means for updating the representative vectors and the distortion threshold; and means for repeatedly operating the extracting means through the updating means until a classification, at the optimal degree of classification, of the plurality of feature vectors extracted from the original image is obtained.
[0010]
Here, in the present invention, a “low-resolution image” is an image consisting of a plurality of pixels, fewer than in the original image, that is obtained successively from the original image by image reduction processing using a Gaussian pyramid, linear interpolation, spline interpolation, wavelet transform, or the like. Different reduction ratios may be applied in the vertical and horizontal directions of the image.
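The stepwise derivation of low-resolution images can be illustrated with a minimal sketch. It assumes a single-channel image stored as a list of rows and, for brevity, replaces the Gaussian pyramid with plain 2x2 block averaging; the function names `halve` and `build_pyramid` and the parameter `min_side` are hypothetical and do not come from the specification.

```python
def halve(image):
    """Reduce a 2-D grid (list of rows) to half size by averaging 2x2 blocks.

    Block averaging is a simple stand-in for Gaussian filtering followed by
    subsampling; it is an assumption of this sketch.
    """
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(image, min_side=2):
    """Return [original, half, quarter, ...], stopping once the shorter
    side would drop below min_side pixels."""
    levels = [image]
    while min(len(levels[-1]), len(levels[-1][0])) // 2 >= min_side:
        levels.append(halve(levels[-1]))
    return levels


# Usage: an 8x8 ramp image yields 8x8, 4x4 and 2x2 levels.
ramp = [[r * 8 + c for c in range(8)] for r in range(8)]
pyramid = build_pyramid(ramp)
```

For a color image, the same reduction would be applied to each channel separately; the last element of the returned list plays the role of the lowest-resolution image.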
[0011]
In the present invention, a “feature vector” is a vector extracted from an image whose components are a plurality of parameters (hereinafter “feature amounts”) indicating features of that image. A feature vector is typically extracted for each pixel of the image concerned, but it may instead be extracted for each block of several pixels, or from each pixel of a reduced image obtained by applying image reduction processing, such as linear interpolation or a Gaussian pyramid, to the image concerned. Feature amounts usable as components of the feature vector include, for example, those indicating color features, luminance features, texture features, depth information, and features of edges contained in the image.
[0012]
Further, in the present invention, a “representative vector” is a vector that represents a cluster defined in the feature vector space; one representative vector is assigned to each cluster. The “feature vector space” is the space whose coordinates are the components of the feature vector. For example, when the feature vector has as its components the luminance component and the two color-difference components of each pixel in the YCC color system, the three-dimensional YCC color space is the feature vector space.
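As an illustration of per-pixel feature vectors in a YCC space, the following sketch converts RGB pixels to a luminance component and two color-difference components. The specification does not give conversion coefficients; the ITU-R BT.601 weights used here, the zero-centered color-difference values, and the names `rgb_to_ycc` and `extract_features` are assumptions made for this example.

```python
def rgb_to_ycc(r, g, b):
    """Convert one RGB pixel to (Y, Cb, Cr).

    Uses BT.601 weights with color differences centered on zero; this choice
    is an assumption of the sketch, not taken from the specification.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b
    return (y, cb, cr)


def extract_features(rgb_image):
    """Return one 3-D feature vector per pixel, in row-major order."""
    return [rgb_to_ycc(*pixel) for row in rgb_image for pixel in row]


# Usage: a 1x2 image with a mid-gray pixel and a pure red pixel.
features = extract_features([[(128, 128, 128), (255, 0, 0)]])
```

A gray pixel maps to a point on the luminance axis (both color differences are approximately zero), so pixels of similar color and brightness end up close together in this space.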
[0013]
Further, in the present invention, the “distortion threshold” is a value that serves as the criterion for how far classification is carried out in each low-resolution image or in the original image.
[0014]
The second method of clustering feature vectors of an image according to the present invention comprises: a step of deriving from an original image a plurality of low-resolution images, each consisting of a plurality of pixels, whose resolutions differ stepwise; a step of obtaining, from the lowest-resolution image among them, an initial value of each component of the representative vectors and an initial value of a distortion threshold; a step of taking the image with the next-lowest resolution as the current image and extracting a plurality of feature vectors from it; a step of mapping the plurality of feature vectors into a feature vector space; a step of classifying, in the feature vector space, the plurality of feature vectors into the clusters represented by the respective representative vectors to obtain a provisional classification for the current image; a step of updating the representative vectors based on the feature vectors contained in each cluster; a step of obtaining a distortion parameter indicating the distortion of the provisional classification; a step of comparing the distortion parameter with the distortion threshold and, if the distortion parameter is larger than the distortion threshold, adding a new vector to the representative vectors; a step of repeating the step of obtaining the provisional classification through the step of adding a new vector until the distortion parameter becomes smaller than the distortion threshold, thereby obtaining a final classification for the current image; a step of updating the distortion threshold based on any of the provisional classifications or the final classification for the current image; and a step of repeating the step of extracting the plurality of feature vectors through the step of updating the distortion threshold until the original image has been taken as the current image and a final classification for the original image has been obtained.
[0015]
Likewise, the second apparatus for clustering feature vectors of an image according to the present invention comprises: means for deriving from an original image a plurality of low-resolution images, each consisting of a plurality of pixels, whose resolutions differ stepwise; means for obtaining, from the lowest-resolution image among them, an initial value of each component of the representative vectors and an initial value of a distortion threshold; means for taking the image with the next-lowest resolution as the current image and extracting a plurality of feature vectors from it; means for mapping the plurality of feature vectors into a feature vector space; means for classifying, in the feature vector space, the plurality of feature vectors into the clusters represented by the respective representative vectors to obtain a provisional classification for the current image; means for updating the representative vectors based on the feature vectors contained in each cluster; means for obtaining a distortion parameter indicating the distortion of the provisional classification; means for comparing the distortion parameter with the distortion threshold and, if the distortion parameter is larger than the distortion threshold, adding a new vector to the representative vectors; means for repeatedly operating the means for obtaining the provisional classification through the means for adding a new vector until the distortion parameter becomes smaller than the distortion threshold, thereby obtaining a final classification for the current image; means for updating the distortion threshold based on any of the provisional classifications or the final classification for the current image; and means for repeatedly operating the means for extracting the plurality of feature vectors through the means for updating the distortion threshold until the original image has been taken as the current image and a final classification for the original image has been obtained.
[0016]
Here, in the present invention, the “distortion parameter” is the parameter that is compared with the distortion threshold; it is an index of how coarse the provisional classification of the feature vectors into clusters is. The more clusters there are and the finer the classification, the smaller the distortion parameter. The second clustering method and apparatus according to the present invention add a new vector to the representative vectors when the distortion parameter is larger than the distortion threshold, and repeat the classification processing for the current image until the distortion parameter becomes smaller than the distortion threshold; methods and apparatus that additionally define how the case of equality is handled also fall within the scope of the present invention. That is, both of the following fall within the scope of the present invention: one that adds a new vector when the distortion parameter is equal to or greater than the distortion threshold and repeats the classification processing until the distortion parameter becomes smaller than the threshold, and one that adds a new vector when the distortion parameter is greater than the threshold and repeats the classification processing until the distortion parameter becomes equal to or less than the threshold.
[0017]
The second clustering method according to the present invention may further include, between the step of updating the representative vectors and the step of obtaining the distortion parameter, a step of updating the provisional classification based on the updated representative vectors. Similarly, the second clustering apparatus according to the present invention may further include means for updating the provisional classification based on the representative vectors updated by the means for updating the representative vectors.
[0018]
Further, the step or means for obtaining the provisional classification may classify each of the plurality of feature vectors into the cluster represented by the representative vector that is closest to that feature vector in the feature vector space. In this case, the step or means for updating the representative vectors may take, as the new representative vector of each cluster, the vector representing the centroid in the feature vector space of the feature vectors contained in that cluster. Furthermore, the distortion parameter may be a value indicating the spread of the cluster whose spread in the feature vector space is the largest among the clusters, and the step or means for adding a new vector may add to the representative vectors, as the new vector, a vector identical to the feature vector that is farthest from the representative vector of the cluster to which it belongs.
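The classification loop for a single image, with the optional choices listed in the paragraph above, might look like the following minimal sketch. The names `nearest`, `centroid`, and `cluster_level` are hypothetical. Two further choices are assumptions of the sketch: the loop stops when the distortion parameter is equal to or less than the threshold (one of the two equality conventions the specification allows), and it also stops once there are as many representative vectors as feature vectors, a safeguard that the specification does not mention.

```python
import math


def nearest(vec, reps):
    """Index of the representative vector closest to vec (Euclidean distance)."""
    return min(range(len(reps)), key=lambda k: math.dist(vec, reps[k]))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return tuple(sum(comp) / len(vectors) for comp in zip(*vectors))


def cluster_level(features, reps, threshold):
    """Classify the feature vectors of one image.

    Returns (labels, reps, history); history holds the distortion parameter
    computed on each pass, in order.
    """
    reps = [tuple(r) for r in reps]
    history = []
    while True:
        # Provisional classification: assign each vector to its nearest representative.
        labels = [nearest(f, reps) for f in features]
        # Move each representative to the centroid of its cluster.
        for k in range(len(reps)):
            members = [f for f, lab in zip(features, labels) if lab == k]
            if members:  # an empty cluster keeps its previous representative
                reps[k] = centroid(members)
        # Distortion parameter: largest member-to-representative distance.
        dists = [math.dist(f, reps[lab]) for f, lab in zip(features, labels)]
        worst = max(range(len(features)), key=dists.__getitem__)
        history.append(dists[worst])
        # Stop when fine enough; the second condition is an added safeguard.
        if dists[worst] <= threshold or len(reps) >= len(features):
            return labels, reps, history
        # Too coarse: add the farthest feature vector as a new representative.
        reps.append(tuple(features[worst]))


# Usage: two tight groups, one starting representative, threshold 1.0.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
labels, reps, history = cluster_level(points, [(0, 0)], 1.0)
```

In this usage example the first pass places all four points in one cluster, whose spread exceeds the threshold; a second representative is then added and the next pass separates the two groups.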
[0019]
Here, the “distance” in the feature vector space typically refers to the Euclidean distance between two vectors in the feature vector space, but it is not limited to the Euclidean distance and may be any index that appropriately represents how close two vectors are in the feature vector space. As the “value indicating the spread” of a cluster in the feature vector space, for example, the maximum value or the average value of the distances between the feature vectors belonging to the cluster and the representative vector of the cluster can be used.
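As a minimal sketch of the two spread measures mentioned above, the following Python function computes either the maximum or the average Euclidean distance between the feature vectors of one cluster and its representative vector. The function name `cluster_spread` and its `mode` parameter are illustrative assumptions and do not appear in the specification.

```python
import math

def cluster_spread(members, representative, mode="max"):
    """Spread of one cluster: max or mean Euclidean distance to its representative."""
    distances = [math.dist(m, representative) for m in members]
    if mode == "max":
        return max(distances)
    if mode == "mean":
        return sum(distances) / len(distances)
    raise ValueError("mode must be 'max' or 'mean'")

# Three hypothetical feature vectors around a representative at the origin.
members = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 1.0)]
print(cluster_spread(members, (0.0, 0.0, 0.0), "max"))
print(cluster_spread(members, (0.0, 0.0, 0.0), "mean"))
```

Any other proximity index could be substituted for `math.dist`, in line with the statement that the distance is not limited to the Euclidean distance.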
[0020]
In addition, the step or means for updating the distortion threshold value described above may use any one of the distortion parameters obtained for obtaining the final classification in the current image as a new distortion threshold value.
[0021]
Further, the initial value of each component of the representative vector and the initial value of the distortion threshold may be obtained as follows. One original representative vector is derived from the lowest resolution image consisting of n pixels. The distance in the feature vector space between each of the n feature vectors extracted from the n pixels and the original representative vector is obtained, and the feature vector showing the maximum distance is made a representative vector candidate and given the highest candidate rank. Next, for each of the remaining feature vectors, the distance in the feature vector space to whichever of the original representative vector and the representative vector candidates is closest to that feature vector is obtained, and the feature vector showing the maximum distance is added to the representative vector candidates and given the next highest candidate rank. This process of assigning the next highest candidate rank is repeated a plurality of times. Then, in accordance with the candidate ranks, the values of the components of the top several feature vectors among the representative vector candidates are set as the initial values of the components of the representative vectors, and a value indicating the spread of the n feature vectors in the feature vector space is set as the initial value of the distortion threshold. In this case, the top several feature vectors may be selected as those up to and including the feature vector whose addition to the representative vector candidates produced the largest change in the maximum distance.
[0022]
Here, the “original representative vector” is one vector representing the lowest resolution image; for example, a vector whose components are the average values of the feature amounts extracted from the pixels of the lowest resolution image can be used.
[0023]
[Effect of the Invention]
The image feature vector clustering method and apparatus according to the present invention obtain the initial values of the components of the representative vectors and the initial value of the distortion threshold from the lowest resolution image derived from the input original image itself, use these values to determine, in sequence, the optimal classification of feature vectors in each image of the next lowest resolution, and finally determine the classification of the feature vectors extracted from the original image. The feature vectors extracted from the original image can therefore be classified at the optimal classification level according to the content of the original image, without requiring input of any parameter designating that level. Consequently, stable classification performance can be achieved whatever original image is input, which is extremely effective when region segmentation is performed continuously on a large number of images containing various subject elements.
[0024]
The image feature vector clustering method and apparatus according to the present invention perform the classification process sequentially starting from low-resolution images, that is, images with a reduced number of feature vectors, and the classification process for each low-resolution image is terminated once the distortion parameter becomes smaller than the distortion threshold, whereupon processing moves on to the image with the next lowest resolution. The amount of calculation therefore does not become excessive compared with conventional clustering processes.
[0025]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the drawings.
[0026]
FIG. 1 is a flowchart showing the procedure of a clustering process according to one embodiment of the present invention. The purpose of this clustering process is to classify the feature vectors extracted from each pixel of an original image into a plurality of clusters. It can be used for region segmentation, which divides the original image into a plurality of image regions corresponding to imaged elements such as “sky”, “sea”, and “building”. In the present embodiment, the original image is assumed to be a digital photographic image of 1024 × 1024 pixels, and each feature vector is assumed to be a three-dimensional vector whose components are the luminance component and the two color difference components of the YCC color system.
[0027]
When the original image is input, first, in step 10 of FIG. 1, a plurality of low resolution images whose resolutions differ in stages are derived. In this embodiment, as shown in FIG. 2, starting from the original image, low resolution images with half the number of vertical and horizontal pixels are obtained one after another, until finally the ninth low resolution image, that is, a low resolution image of 2 × 2 pixels, is obtained. This 2 × 2 pixel image is the lowest resolution image in this embodiment.
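The level count in this embodiment follows from repeated halving: 1024 → 512 → ... → 2. The short Python sketch below lists the image sizes produced this way; the function name `pyramid_sizes` and the `min_side` parameter are illustrative assumptions.

```python
def pyramid_sizes(width, height, min_side=2):
    """List the (width, height) of every level, halving until min_side is reached."""
    sizes = [(width, height)]
    while min(width, height) > min_side:
        width, height = width // 2, height // 2
        sizes.append((width, height))
    return sizes

sizes = pyramid_sizes(1024, 1024)
print(len(sizes) - 1)  # number of low resolution images: 9
print(sizes[8])        # eighth low resolution image: (4, 4)
print(sizes[-1])       # lowest resolution image: (2, 2)
```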
[0028]
Any image reduction method may be used to obtain these low resolution images, but this embodiment uses reduction by a Gaussian pyramid. Reduction by a Gaussian pyramid derives low resolution images of successively different resolutions by repeating two steps in turn: applying a Gaussian filter to the image to be reduced to derive a blurred image, and then picking every other pixel of that blurred image to derive an image whose vertical and horizontal pixel counts are each reduced by half. For a 5 × 5 filter size, the Gaussian filter used is, for example, as shown in FIG. 3. As for the handling of image edges, when the 5 × 5 filter of FIG. 3 is used, for example, pixels with appropriate values can be added as two extra rows or columns on each side of the image to be reduced, so that a blurred image of the same size as the image to be reduced can be derived. The “appropriate pixel value” of the added pixels may be, for example, zero or a repetition of the pixel value at the image edge.
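The blur-then-subsample reduction described above can be sketched as follows for a single-channel image stored as a list of rows. The coefficients of FIG. 3 are not reproduced in this text, so the separable binomial kernel (1, 4, 6, 4, 1)/16 used here is an assumption, as are all function names. Edge pixels are handled by repeating the pixel value at the image edge, one of the two options mentioned above.

```python
# Assumed 5-tap kernel; the actual FIG. 3 coefficients may differ.
KERNEL_1D = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]

def _clamp(i, n):
    """Clamp an index into [0, n - 1], which repeats the edge pixel."""
    return max(0, min(n - 1, i))

def blur_5x5(image):
    """Apply the separable 5 x 5 filter; the output has the same size as the input."""
    h, w = len(image), len(image[0])
    horizontal = [[sum(KERNEL_1D[k] * image[y][_clamp(x + k - 2, w)] for k in range(5))
                   for x in range(w)] for y in range(h)]
    return [[sum(KERNEL_1D[k] * horizontal[_clamp(y + k - 2, h)][x] for k in range(5))
             for x in range(w)] for y in range(h)]

def reduce_once(image):
    """Blur, then keep every other pixel in both directions."""
    blurred = blur_5x5(image)
    return [row[::2] for row in blurred[::2]]

def build_pyramid(image, min_side=2):
    """Return [original, first low resolution image, ..., lowest resolution image]."""
    levels = [image]
    while min(len(levels[-1]), len(levels[-1][0])) > min_side:
        levels.append(reduce_once(levels[-1]))
    return levels

# A flat 8 x 8 test image reduces to 4 x 4 and then 2 x 2.
pyramid = build_pyramid([[10.0] * 8 for _ in range(8)])
print([(len(level), len(level[0])) for level in pyramid])
```

For a three-component image, the same reduction would be applied to each component separately.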
[0029]
When the derivation of the low resolution images is finished, next, in step 12 of FIG. 1, the initial representative vectors, that is, the initial values of the components of the representative vectors, are derived from the lowest resolution image (2 × 2 pixels). This initial representative vector derivation process is described in detail below with reference to the flowchart of FIG. 4 and the conceptual diagram of FIG. 5.
[0030]
First, in step 40 of FIG. 4, one original representative vector is derived from the lowest resolution image. In the present embodiment, the original representative vector is the single vector whose components are the averages of the corresponding components of the feature vectors extracted from the four pixels forming the lowest resolution image. For example, if OB₁, OB₂, OB₃ and OB₄ in the feature vector space shown in FIG. 5A are the feature vectors extracted from the four pixels forming the lowest resolution image, the original representative vector is OA.
[0031]
Next, in step 42 of FIG. 4, the distance between each of the above four feature vectors and the original representative vector is calculated. In the present embodiment, this distance is the Euclidean distance in the feature vector space. In the example of FIG. 5A, the calculated distances are d₁, d₂, d₃ and d₄.
[0032]
Subsequently, in step 44, the feature vector showing the maximum distance among the distances calculated in step 42 is made the representative vector candidate with the first candidate rank. In the example of FIG. 5A, d₁ is the largest of the distances d₁, d₂, d₃ and d₄, so the feature vector showing the maximum distance is OB₁, and OB₁ becomes the representative vector candidate of first candidate rank, that is, the first representative vector candidate.
[0033]
Next, in step 46, for each of the remaining feature vectors, that is, the three vectors OB₂, OB₃ and OB₄, the distance to whichever of the original representative vector OA and the representative vector candidate OB₁ is closest to that feature vector is determined. In this example, as shown in FIG. 5B, the vectors OB₃ and OB₄ are closer to the original representative vector OA than to the representative vector candidate OB₁, so the distances d₁' and d₂' to OA shown in the figure are determined for them. The vector OB₂ is closer to the representative vector candidate OB₁ than to the original representative vector OA, so the distance d₃' to OB₁ is determined for it.
[0034]
Subsequently, in step 48, the feature vector showing the maximum distance among the distances determined in step 46 is made the representative vector candidate of the next highest candidate rank. In the example of FIG. 5B, the feature vector showing the maximum distance d₁' is OB₃, so OB₃ becomes the representative vector candidate of second candidate rank, that is, the second representative vector candidate.
[0035]
Next, in step 50, it is checked whether any feature vector has not yet been given a candidate rank. Here, the feature vectors OB₂ and OB₄ have not yet been given candidate ranks, so the process returns to step 46. In this second execution of step 46, as shown in FIG. 5, for the vector OB₄ the closest of the original representative vector OA and the representative vector candidates OB₁ and OB₃ is OA, so the distance d₁'' to OA shown in the figure is determined. For the vector OB₂, the distance d₂'' to the representative vector candidate OB₁ is determined. In the following step 48, since d₁'' is larger than d₂'', the feature vector OB₄ becomes the representative vector candidate of third candidate rank, that is, the third representative vector candidate.
[0036]
In this way, steps 46 to 50 of FIG. 4 are repeated until all four feature vectors have been given candidate ranks. In this example, OB₁, OB₃, OB₄ and OB₂ are added to the representative vector candidates in this order and given the first to fourth candidate ranks.
[0037]
Subsequently, in step 52 of FIG. 4, the top several feature vectors among the representative vector candidates, in accordance with the candidate ranks above, are made the initial representative vectors. In this embodiment, the candidates up to and including the feature vector whose addition to the representative vector candidates produced the largest change in the maximum distance are made the initial representative vectors. That is, as shown in FIG. 5, when the feature vector OB₁ was added as the first representative vector candidate, the maximum distance changed from d₁ to d₁', and when the feature vector OB₃ was then added, the maximum distance changed from d₁' to d₁''. These changes in the maximum distance are illustrated in FIG. 6. In this example, the change in the maximum distance is largest when the second-ranked feature vector OB₃ is added to the representative vector candidates, so the feature vectors up to the second candidate rank, that is, OB₁ and OB₃, are made the initial representative vectors OR₁ and OR₂.
[0038]
Here, in the present embodiment, the initial representative vectors are selected on the basis of the change in the maximum distance in the feature vector space as described above. However, the change in the average distance or the like may be used as the criterion instead. Alternatively, since the number of pixels in the lowest resolution image is very small (four in this embodiment), all of the feature vectors extracted from the pixels of the lowest resolution image may be used as the initial representative vectors. In short, any method falls within the scope of the present invention as long as the initial value of each component of the representative vectors, suited to the original image, is obtained by predetermined processing from the lowest resolution image derived from the original image itself, without requiring any parameter to be input from outside.
[0039]
Returning to FIG. 1, in step 14 the initial value of the distortion threshold is likewise derived from the lowest resolution image. The initial value of the distortion threshold serves as the reference for how finely the classification is to be carried out in the image with the next lowest resolution, that is, the eighth low resolution image of 4 × 4 pixels. The initial value of the distortion threshold is preferably a value indicating the spread in the feature vector space of the feature vectors extracted from the n pixels (four in the present embodiment) of the lowest resolution image, but it is not necessarily limited to this. As with the initial values of the components of the representative vectors described above, any method falls within the scope of the present invention as long as an initial value of the distortion threshold suited to the original image is obtained by predetermined processing from the lowest resolution image derived from the original image itself, without requiring any parameter to be input from outside. In the present embodiment, the maximum of the distances between the four feature vectors extracted from the pixels of the lowest resolution image and the original representative vector, that is, the distance d₁ in FIG. 5A, is used as the initial value of the distortion threshold D. Alternatively, a maximum distance from a later stage (that is, d₁', d₁'', etc.) or the average of the distances d₁, d₂, d₃ and d₄ may be used.
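Steps 40 to 52 and the threshold derivation of step 14 can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the embodiment's implementation: the function name is hypothetical, ties are broken by input order, and the change in the maximum distance is evaluated only for candidates that are followed by another ranking stage, a detail the specification leaves open.

```python
import math

def initial_representatives(features):
    """Rank feature vectors by farthest-point order and pick the initial values."""
    n, dim = len(features), len(features[0])
    # Step 40: original representative vector = component-wise mean.
    original = tuple(sum(f[i] for f in features) / n for i in range(dim))
    references = [original]          # original vector plus candidates so far
    remaining = list(range(n))
    ranked, max_dists = [], []
    while remaining:                 # steps 42-50
        # Distance from each unranked vector to its closest reference.
        nearest = {j: min(math.dist(features[j], r) for r in references)
                   for j in remaining}
        best = max(remaining, key=lambda j: nearest[j])
        ranked.append(best)
        max_dists.append(nearest[best])
        references.append(features[best])
        remaining.remove(best)
    threshold = max_dists[0]         # step 14: d1, the largest distance to the mean
    if n == 1:
        return [features[0]], threshold, ranked
    # Step 52: keep candidates up to the largest drop in the maximum distance.
    drops = [max_dists[k] - max_dists[k + 1] for k in range(n - 1)]
    keep = drops.index(max(drops)) + 1
    return [features[j] for j in ranked[:keep]], threshold, ranked

# Hypothetical two-dimensional feature vectors of a 2 x 2 image.
feats = [(6.0, 0.0), (-5.0, 0.0), (-0.5, 0.2), (-0.5, -0.2)]
reps, threshold, order = initial_representatives(feats)
print(reps, threshold, order)
```

In this made-up example the maximum distance falls most sharply when the second candidate is added, so two initial representative vectors are kept, mirroring the OB₁ and OB₃ outcome described above.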
[0040]
Next, in step 16 of FIG. 1, the image with the next lowest resolution, that is, the eighth low resolution image of 4 × 4 pixels, is set as the current image, and the classification process for the current image is started. The classification process in the current image of 4 × 4 pixels is described below in order with reference to FIGS. 7 to 9.
[0041]
First, in step 18 in FIG. 1, a feature vector having a luminance component and two color difference components in the YCC color system as components is extracted from each of the 16 pixels forming the current image.
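The specification does not state how the YCC components are computed from the pixel data. As an illustration only, the sketch below assumes 8-bit RGB input and the full-range ITU-R BT.601 weights used by JFIF; the function names are hypothetical and another YCC definition may have been intended.

```python
def rgb_to_ycc(r, g, b):
    """Convert 8-bit RGB to (Y, Cb, Cr) with BT.601 full-range weights (assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (y, cb, cr)

def extract_features(rgb_image):
    """Step 18: one three-dimensional feature vector per pixel, in row-major order."""
    return [rgb_to_ycc(*pixel) for row in rgb_image for pixel in row]

# A hypothetical 1 x 2 image: one white pixel and one black pixel.
features = extract_features([[(255, 255, 255), (0, 0, 0)]])
print(features)
```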
[0042]
Next, in step 20, the 16 feature vectors are mapped into the feature vector space. Let the end points of the 16 feature vectors be C₁ to C₁₆; in this example, their distribution in the feature vector space is assumed to be as shown in FIG. 7.
[0043]
Subsequently, in step 22, the 16 feature vectors are provisionally classified into the clusters represented by the representative vectors. The current representative vectors are the initial representative vectors OR₁ and OR₂ derived from the lowest resolution image, so the 16 feature vectors are classified into the two clusters they represent. In this embodiment, each feature vector is classified into the cluster represented by the representative vector at the shorter Euclidean distance in the feature vector space. The resulting provisional classification is as shown in FIG. 7.
[0044]
Next, in step 24, based on the feature vectors included in each of the two clusters, the representative vectors are updated so that they represent these clusters more appropriately. In this embodiment, as shown in FIG. 7B, the representative vectors OR₁ and OR₂ are updated to the vectors indicating the centers of gravity of the feature vectors belonging to their respective clusters.
[0045]
Subsequently, in step 26, a distortion parameter indicating the distortion of the current provisional classification is calculated. This distortion parameter is preferably a value indicating the spread of the cluster having the largest spread in the feature vector space among the clusters of the current provisional classification. In this embodiment, the maximum Euclidean distance in the feature vector space between a representative vector and a feature vector included in the cluster it represents is used as the distortion parameter. In this example, as shown in FIG. 7, the distance d_max between the representative vector OR₁ and the feature vector OC₁ belonging to its cluster is the maximum, so this d_max is the distortion parameter. Note that an average value or the like may be used as the distortion parameter instead of the maximum value.
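One pass through steps 22 to 26 can be sketched as three small Python functions. The names are illustrative assumptions, and keeping the old representative vector for a cluster that receives no members is an assumption as well, since the specification does not discuss empty clusters.

```python
import math

def assign(features, reps):
    """Step 22: index of the closest representative vector for every feature vector."""
    return [min(range(len(reps)), key=lambda k: math.dist(f, reps[k]))
            for f in features]

def update_centroids(features, labels, reps):
    """Step 24: replace each representative vector by the centroid of its cluster."""
    updated = []
    for k, rep in enumerate(reps):
        members = [f for f, lab in zip(features, labels) if lab == k]
        if members:
            updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            updated.append(tuple(rep))  # empty cluster: keep as is (assumption)
    return updated

def distortion(features, labels, reps):
    """Step 26: (d_max, index of the feature vector that attains it)."""
    dists = [math.dist(f, reps[lab]) for f, lab in zip(features, labels)]
    worst = max(range(len(features)), key=lambda i: dists[i])
    return dists[worst], worst

# Hypothetical one-dimensional-like data embedded in two components.
feats = [(0, 0), (2, 0), (10, 0), (12, 0), (20, 0)]
labels = assign(feats, [(0, 0), (10, 0)])
reps = update_centroids(feats, labels, [(0, 0), (10, 0)])
print(labels, reps, distortion(feats, labels, reps))
```

Returning the index of the farthest feature vector alongside d_max is convenient because step 30 adds exactly that vector as a new representative vector.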
[0046]
Next, in step 28, it is checked whether or not the distortion parameter obtained in step 26 is equal to or less than the distortion threshold D. The current distortion threshold D is the initial value of the distortion threshold obtained from the lowest resolution image. In this example, the distortion parameter d_max at this stage is larger than the distortion threshold D, and therefore the process shown in FIG. 1 proceeds to step 30.
[0047]
In step 30, a new vector is added to the representative vectors in order to increase the number of clusters by one and obtain a finer provisional classification. In this embodiment, as shown in FIG. 8A, a vector identical to the feature vector farthest from the representative vector of the cluster to which it belongs, that is, the feature vector OC₁ showing the maximum distance d_max described above, is added as a new representative vector OR₃.
[0048]
Thereafter, the processing of steps 22 to 28 is performed again on the basis of the three representative vectors. That is, in step 22 the feature vectors OC₁ to OC₁₆ are classified, on the basis of the Euclidean distance in the feature vector space, into the three clusters represented by the respective representative vectors (see FIG. 8A); in step 24 the representative vectors OR₁ to OR₃ are updated to the vectors indicating the centers of gravity of the respective clusters (see FIG. 8B); and in step 26 the maximum distance d'_max between a representative vector and a feature vector included in the cluster it represents is derived as the distortion parameter indicating the distortion of the new provisional classification (see FIG. 8C). In this example, the new distortion parameter d'_max is also larger than the distortion threshold D. The process therefore proceeds from step 28 to step 30 again, and a vector identical to the feature vector OC₂ showing the maximum distance d'_max is added to the representative vectors as a new representative vector OR₄ (see FIG. 9A).
[0049]
Subsequently, as shown in (a) to (c) of FIG. 9, the processing of steps 22 to 28 is performed again on the basis of the four representative vectors. Here, the distortion parameter d''_max (see (c) of FIG. 9) obtained in the third execution of step 26, which indicates the distortion of the provisional classification into four clusters, is smaller than the distortion threshold D. The process shown in FIG. 1 therefore proceeds from the third execution of step 28 to step 32. The classification at this point is the final classification in the current image.
[0050]
In step 32, it is checked whether the current image is the original image. In this example, the current image at this stage is the eighth low resolution image of 4 × 4 pixels and not the original image, so the process shown in FIG. 1 returns to step 16 via step 34 and moves on to the classification process for the image with the next lowest resolution, that is, the seventh low resolution image of 8 × 8 pixels.
[0051]
In step 34, which is performed before moving on to the image with the next lowest resolution, a new distortion threshold, serving as the reference for how finely the next image is to be classified, is derived on the basis of any of the provisional classifications or the final classification in the current image, and the distortion threshold D is updated. As the new distortion threshold D, for example, any of the distortion parameters obtained during the classification process in the current image (in this example, d_max, d'_max and d''_max above) can be used. In this embodiment, the first distortion parameter derived, d_max, is adopted as the new distortion threshold D. Alternatively, the last distortion parameter derived (d''_max in this example) may be adopted as the new distortion threshold D.
[0052]
Thereafter, the processing of steps 16 to 34 shown in FIG. 1 is repeated until the original image becomes the current image and the final classification in the original image is obtained, which finally yields the classification of the 1024 × 1024 feature vectors extracted from the pixels of the original image. Such a classification result can be used for region segmentation that divides the original image into a plurality of image regions corresponding to imaged elements such as “sky”, “sea”, and “building”.
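The overall loop of steps 16 to 34 can be sketched as follows, taking the feature vectors of each image (coarsest first) and the initial values from the lowest resolution image as inputs. The function names are hypothetical. Two details are assumptions rather than part of the embodiment: an empty cluster keeps its representative vector, and refinement stops once there are as many representative vectors as feature vectors.

```python
import math

def classify_level(features, reps, threshold):
    """Steps 22-30 for one image: refine until the distortion drops below the threshold."""
    reps = [tuple(r) for r in reps]
    history = []  # distortion parameter of every provisional classification
    while True:
        # Step 22: provisional classification by nearest representative vector.
        labels = [min(range(len(reps)), key=lambda k: math.dist(f, reps[k]))
                  for f in features]
        # Step 24: move each representative vector to the centroid of its cluster.
        for k in range(len(reps)):
            members = [f for f, lab in zip(features, labels) if lab == k]
            if members:  # an empty cluster keeps its representative (assumption)
                reps[k] = tuple(sum(c) / len(members) for c in zip(*members))
        # Step 26: distortion parameter = largest feature-to-representative distance.
        dists = [math.dist(f, reps[lab]) for f, lab in zip(features, labels)]
        worst = max(range(len(features)), key=dists.__getitem__)
        history.append(dists[worst])
        # Step 28, plus a guard (assumption) so the loop cannot outgrow the data.
        if dists[worst] < threshold or len(reps) >= len(features):
            return labels, reps, history
        # Step 30: the farthest feature vector becomes a new representative vector.
        reps.append(tuple(features[worst]))

def cluster_coarse_to_fine(levels, init_reps, init_threshold):
    """Steps 16-34: `levels` holds lists of feature vectors, coarsest image first."""
    reps, threshold = list(init_reps), init_threshold
    for i, features in enumerate(levels):
        labels, reps, history = classify_level(features, reps, threshold)
        if i < len(levels) - 1:
            threshold = history[0]  # step 34: first distortion parameter, as in this embodiment
    return labels, reps

# Two hypothetical levels: a 2-vector image followed by a 4-vector image.
levels = [[(0, 0), (10, 0)], [(0, 0), (1, 0), (10, 0), (11, 0)]]
print(cluster_coarse_to_fine(levels, [(0, 0)], 4.0))
```

Using `history[-1]` instead of `history[0]` in the threshold update would correspond to the alternative of adopting the last derived distortion parameter.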
[0053]
According to the above embodiment, the initial representative vectors and the initial value of the distortion threshold are obtained from the 2 × 2 lowest resolution image derived from the input original image itself, and these are used to find, in sequence, the optimal classification of feature vectors in each image of the next lowest resolution, finally arriving at the classification of the feature vectors extracted from each pixel of the original image. The feature vectors can therefore be classified at the optimal classification level according to the content of the original image without any parameter being input from outside. Consequently, stable classification performance can be achieved whatever original image is input.
[0054]
In addition, as a modification of the above embodiment, a step of updating the provisional classification on the basis of the representative vectors updated in step 24 may further be included between steps 24 and 26 of FIG. 1. This is, for example, a process of recalculating the Euclidean distance in the feature vector space between each updated representative vector and each feature vector, and reclassifying each feature vector into the cluster represented by the closer representative vector.
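A minimal sketch of this modification is shown below: after the centroid update, every feature vector is reassigned to its closest updated representative vector before the distortion parameter is computed. The function names are hypothetical, and leaving the representative vector of an empty cluster unchanged is an assumption.

```python
import math

def nearest_index(f, reps):
    """Index of the representative vector closest to feature vector f."""
    return min(range(len(reps)), key=lambda k: math.dist(f, reps[k]))

def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def pass_with_reassignment(features, reps):
    """Steps 22 and 24, the added reclassification step, then step 26."""
    labels = [nearest_index(f, reps) for f in features]            # step 22
    updated = []
    for k, rep in enumerate(reps):                                 # step 24
        members = [f for f, lab in zip(features, labels) if lab == k]
        updated.append(centroid(members) if members else tuple(rep))
    labels = [nearest_index(f, updated) for f in features]         # added step
    d_max = max(math.dist(f, updated[lab])                         # step 26
                for f, lab in zip(features, labels))
    return labels, updated, d_max

# The vector (4, 0) starts in the second cluster and moves to the first after the update.
print(pass_with_reassignment([(0, 0), (4, 0), (6, 0), (20, 0)], [(0, 0), (7, 0)]))
```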
[0055]
In the above embodiment, a three-dimensional vector having the luminance component and the two color difference components of the YCC color system as its components is used as the feature vector. However, the components of the feature vector are not limited to these, and the dimension of the feature vector is not limited to three.
[0056]
In the above embodiment, for convenience of explanation, the original image is a square image of 1024 × 1024 pixels, but it goes without saying that the present invention can be applied to original images having different numbers of vertical and horizontal pixels. Each low-resolution image is not limited to a square image. Further, the number of pixels of the minimum resolution image is not limited to four, and any number of pixels may be used as long as it is a plurality of pixels smaller than the number of pixels of the original image.
[0057]
Furthermore, the above embodiment classifies the feature vectors extracted from each pixel of the original image. However, an embodiment in which the original image is first subjected to image reduction processing and the feature vectors extracted from each pixel of the reduced image are classified is also within the scope of the present invention. In this case, after the feature vectors extracted from each pixel of the reduced image have been classified, a technique such as applying image enlargement processing to the reduced image again can be used.
[0058]
Note that the clustering process according to the above and other embodiments of the present invention may be executed by a computer program. In addition, the above description has been made with respect to a method for convenience of description, but an apparatus including means for performing each of the above steps is also within the scope of the present invention.
[0059]
Although the embodiments of the present invention have been described in detail above, the above-described embodiments are merely illustrative, and it goes without saying that the technical scope of the present invention should be defined only by the claims in this specification.
[Brief description of the drawings]
FIG. 1 is a flowchart showing a procedure of clustering processing according to an embodiment of the present invention.
FIG. 2 is a conceptual diagram showing a method for deriving each low-resolution image in the embodiment of FIG.
FIG. 3 is a diagram showing an example of a Gaussian filter used for deriving each low-resolution image.
FIG. 4 is a flowchart showing in detail the initial representative vector derivation step in the embodiment of FIG. 1.
FIG. 5 is a conceptual diagram showing each stage of the initial representative vector derivation process shown in FIG. 4.
FIG. 6 is a bar graph showing the change in the maximum distance, which is the criterion for selecting the initial representative vectors in the initial representative vector derivation step shown in FIG. 4.
FIG. 7 is a conceptual diagram showing each stage of classification processing in a low-resolution image of 4 × 4 pixels.
FIG. 8 is a conceptual diagram illustrating each stage of classification processing in a low-resolution image of 4 × 4 pixels.
FIG. 9 is a conceptual diagram showing each stage of classification processing in a low-resolution image of 4 × 4 pixels.

Claims

Deriving from the original image a low resolution image composed of a plurality of pixels having different resolutions in stages;
Obtaining an initial value of each component of the representative vector and an initial value of a distortion threshold from the lowest resolution image having the lowest resolution among the plurality of low resolution images;
Extracting a plurality of feature vectors from the next lowest resolution image;
Classifying the plurality of feature vectors based on the representative vector, using the distortion threshold as an index of an optimal classification level;
Updating the representative vector and the distortion threshold;
A method for clustering feature vectors of an image, comprising the above steps and a step of repeating the steps from the extracting step through the updating step until a classification of the plurality of feature vectors extracted from the original image at the optimal classification level is obtained.

Deriving from the original image a low resolution image composed of a plurality of pixels having different resolutions in stages;
Obtaining an initial value of each component of the representative vector and an initial value of a distortion threshold from the lowest resolution image having the lowest resolution among the plurality of low resolution images;
Extracting a plurality of feature vectors from the current image, with the next lowest resolution image as the current image;
Mapping the plurality of feature vectors to a feature vector space;
Classifying the plurality of feature vectors into clusters represented by each of the representative vectors in the feature vector space to obtain a provisional classification in the current image;
Updating the representative vector based on the feature vectors included in each of the clusters;
Determining a distortion parameter indicative of the provisional classification distortion;
Comparing the distortion parameter with the distortion threshold and, if the distortion parameter is greater than the distortion threshold, adding a new vector to the representative vector;
Repeating the steps from the step of obtaining the provisional classification through the step of adding the new vector until the distortion parameter becomes smaller than the distortion threshold, thereby obtaining the final classification in the current image;
Updating the distortion threshold based on any of the provisional classifications or the final classification in the current image;
A method for clustering feature vectors of an image, comprising the above steps and a step of repeating the steps from the step of extracting the plurality of feature vectors through the step of updating the distortion threshold until the original image becomes the current image and the final classification in the original image is obtained.

3. The clustering method according to claim 2, further comprising a step of updating the provisional classification based on the representative vector between the step of updating the representative vector and the step of obtaining the distortion parameter.

The step of obtaining the provisional classification is a step of classifying each of the plurality of feature vectors into a cluster represented by a representative vector that is closest to the feature vector in the feature vector space among the representative vectors. The clustering method according to claim 2 or 3, wherein:

5. The step of updating the representative vector is a step of setting a vector representing a centroid of the feature vector included in each of the clusters as a new representative vector in the feature vector space. Clustering method.

The distortion parameter is a value indicating the spread of the cluster having the largest spread in the feature vector space among the clusters,
The step of adding the new vector is a step of adding the same vector as the feature vector having the longest distance from the representative vector representing the cluster to which the cluster belongs to the representative vector as the new vector. 6. The clustering method according to claim 4, wherein the clustering method is characterized.

3. The step of updating the distortion threshold value is a step of setting any one of the distortion parameters obtained to obtain the final classification in the current image as a new distortion threshold value. 7. The clustering method according to any one of 6 to 6.

Obtaining the initial value comprises:
deriving one original representative vector from the lowest resolution image consisting of n pixels;
A distance in the feature vector space between the n feature vectors extracted from each of the n pixels and the original representative vector is obtained, and a feature vector indicating the maximum distance is set as a representative vector candidate and the highest Assigning candidate ranks,
A distance in the feature vector space between each of the remaining feature vectors and the original representative vector and the representative vector candidate closest to the feature vector is obtained, and a feature vector indicating a maximum distance is determined as the representative vector candidate. And adding the next highest candidate ranking,
Repeating the step of assigning the next highest candidate ranking multiple times;
In accordance with the candidate rank, the value of each component of the top several feature vectors of the representative vector candidates is set as the initial value of each component of the representative vector;
The clustering method according to any one of claims 2 to 7, further comprising a step of setting a value indicating a spread of the n feature vectors in the feature vector space as the initial value of the distortion threshold value.

9. The clustering method according to claim 8, wherein, as the top several feature vectors, up to a feature vector having a maximum amount of change in the maximum distance when added to the representative vector candidate is selected.

10. An image feature vector clustering device comprising:
means for deriving, from an original image, a plurality of low resolution images, each composed of a plurality of pixels, whose resolutions differ in stages;
means for determining initial values of the components of representative vectors and an initial value of a distortion threshold from the lowest resolution image, which has the lowest resolution among the plurality of low resolution images;
means for extracting a plurality of feature vectors from the image having the next lowest resolution;
means for classifying the plurality of feature vectors based on the representative vectors, using the distortion threshold as an index of the optimal degree of classification;
means for updating the representative vectors and the distortion threshold; and
means for repeatedly operating the extracting means through the updating means until a classification at the optimal degree of classification is obtained for the plurality of feature vectors extracted from the original image.
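
The derivation of stepwise lower resolution images can be pictured with the following sketch. The claims do not state how the low resolution images are produced, so 2x2 block averaging of a single-channel image stored as a list of rows is an assumption made purely for illustration, and `halve`, `build_pyramid` and `min_size` are hypothetical names.

```python
def halve(image):
    """Halve the resolution by averaging each 2x2 block of pixels."""
    h, w = len(image), len(image[0])
    return [
        [
            (image[y][x] + image[y][x + 1] + image[y + 1][x] + image[y + 1][x + 1]) / 4.0
            for x in range(0, w - 1, 2)
        ]
        for y in range(0, h - 1, 2)
    ]


def build_pyramid(image, min_size):
    """Return images ordered from the lowest resolution up to the original."""
    levels = [image]
    # Keep halving while both dimensions stay at or above min_size.
    while len(levels[-1]) // 2 >= min_size and len(levels[-1][0]) // 2 >= min_size:
        levels.append(halve(levels[-1]))
    levels.reverse()
    return levels
```

The first element of the returned list plays the role of the lowest resolution image, and each following element is the "next lowest resolution" image processed in turn.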

11. An image feature vector clustering device comprising:
means for deriving, from an original image, a plurality of low resolution images, each composed of a plurality of pixels, whose resolutions differ in stages;
means for determining initial values of the components of representative vectors and an initial value of a distortion threshold from the lowest resolution image, which has the lowest resolution among the plurality of low resolution images;
means for taking the image having the next lowest resolution as a current image and extracting a plurality of feature vectors from the current image;
means for mapping the plurality of feature vectors to a feature vector space;
means for classifying the plurality of feature vectors into clusters, each represented by one of the representative vectors in the feature vector space, to obtain a provisional classification in the current image;
means for updating the representative vectors based on the feature vectors included in each of the clusters;
means for determining a distortion parameter indicating the distortion of the provisional classification;
means for comparing the distortion parameter with the distortion threshold and, if the distortion parameter is greater than the distortion threshold, adding a new vector to the representative vectors;
means for repeatedly operating the means for obtaining the provisional classification through the means for adding the new vector until the distortion parameter becomes smaller than the distortion threshold, thereby determining a final classification in the current image;
means for updating the distortion threshold based on any one of the provisional classifications or the final classification in the current image; and
means for repeatedly operating the means for extracting the plurality of feature vectors through the means for updating the distortion threshold until the original image becomes the current image and the final classification in the original image is obtained.
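
The control flow recited in claim 11 can be sketched as below. This is a minimal illustration under assumptions, not the patented implementation: all function names are hypothetical, classification is by nearest representative vector and representatives are updated to cluster centroids (as in the dependent claims), the distortion parameter is taken to be the largest distance between any feature vector and its representative, and the new threshold is taken to be the last distortion parameter computed for the current image.

```python
import math


def dist(a, b):
    """Euclidean distance in the feature vector space (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(features, reps):
    """Provisional classification: index of the nearest representative."""
    return [min(range(len(reps)), key=lambda k: dist(f, reps[k])) for f in features]


def centroids(features, labels, reps):
    """Move each representative to its cluster centroid (empty clusters stay put)."""
    new_reps = []
    for k, rep in enumerate(reps):
        members = [f for f, lab in zip(features, labels) if lab == k]
        if members:
            rep = tuple(sum(c) / len(members) for c in zip(*members))
        new_reps.append(rep)
    return new_reps


def cluster_image(features, reps, threshold):
    """Add representatives until the distortion parameter drops below the threshold."""
    reps = list(reps)
    history = []  # distortion parameter of every provisional classification
    while True:
        labels = assign(features, reps)
        reps = centroids(features, labels, reps)
        labels = assign(features, reps)  # reclassify with the updated representatives
        # Assumed distortion parameter: largest member-to-representative distance.
        worst = max(range(len(features)), key=lambda i: dist(features[i], reps[labels[i]]))
        distortion = dist(features[worst], reps[labels[worst]])
        history.append(distortion)
        # The second condition is a safety stop added for this sketch.
        if distortion < threshold or len(reps) >= len(features):
            return reps, labels, history
        reps.append(features[worst])  # new vector = farthest feature vector


def coarse_to_fine(levels, reps, threshold):
    """levels: feature-vector lists from the next-lowest resolution up to the original."""
    labels = []
    for features in levels:
        reps, labels, history = cluster_image(features, reps, threshold)
        threshold = history[-1]  # assumed threshold update rule
    return reps, labels, threshold
```

Because the claim allows the threshold to be updated from any of the provisional classifications or the final one, replacing `history[-1]` with another entry of `history` would be an equally valid reading.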

12. The clustering device according to claim 11, further comprising means for updating the provisional classification based on the representative vectors updated by the means for updating the representative vectors.

13. The clustering device according to claim 11 or 12, wherein the means for obtaining the provisional classification is means for classifying each of the plurality of feature vectors into the cluster represented by whichever of the representative vectors is closest to that feature vector in the feature vector space.

14. The clustering device, wherein the means for updating the representative vectors is means for setting, as the new representative vector of each cluster, the vector representing the centroid in the feature vector space of the feature vectors included in that cluster.

15. The clustering device according to claim 13 or 14, wherein the distortion parameter is a value indicating the spread of the cluster having the largest spread in the feature vector space among the clusters, and
the means for adding the new vector is means for adding to the representative vectors, as the new vector, a vector identical to the feature vector that lies farthest from the representative vector of the cluster to which that feature vector belongs.

16. The clustering device according to any one of claims 11 to 15, wherein the means for updating the distortion threshold is means for setting, as the new distortion threshold, any one of the distortion parameters obtained in the course of reaching the final classification in the current image.

17. The clustering device according to claim 11, wherein the means for determining the initial values comprises:
means for deriving one original representative vector from the lowest resolution image consisting of n pixels;
means for obtaining the distance in the feature vector space between each of the n feature vectors extracted from the n pixels and the original representative vector, setting the feature vector at the maximum distance as a representative vector candidate, and assigning it the highest candidate rank;
means for obtaining, for each of the remaining feature vectors, the distance in the feature vector space to whichever of the original representative vector and the representative vector candidates is closest to that feature vector, adding the feature vector at the maximum distance to the representative vector candidates, and assigning it the next highest candidate rank;
means for repeatedly operating the means for assigning the next highest candidate rank;
means for setting, in order of candidate rank, the component values of the top several feature vectors among the representative vector candidates as the initial values of the components of the representative vectors; and
means for setting a value indicating the spread of the n feature vectors in the feature vector space as the initial value of the distortion threshold.

18. The clustering device, wherein the top several feature vectors are selected up to and including the feature vector whose addition to the representative vector candidates produces the largest change in the maximum distance.