JP5385105B2 - Image search method and system

Info

Publication number: JP5385105B2
Application number: JP2009267620A
Authority: JP
Inventors: 智史上野; 真幸橋本; 亮一川田
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2009-11-25
Filing date: 2009-11-25
Publication date: 2014-01-08
Anticipated expiration: 2029-11-25
Also published as: JP2011113197A

Description

The present invention relates to an image search method and system for retrieving images similar to a query image from a large number of search target images, and in particular to an image search method and system that judge the similarity between images by comparing the local feature amounts of the images with each other.

Non-Patent Document 1 discloses SIFT (Scale Invariant Feature Transform) as one technique for robustly detecting images of rigid objects such as buildings. In SIFT, local feature amounts are extracted in advance from both the query image and the search target images, and a nearest neighbor search is performed based on the Euclidean distance L between the local feature amounts of the images. Local feature amounts that are close in distance are paired as corresponding points, and the search target image with the most corresponding point pairs is finally returned as the search result.

David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.

In images of architectural structures, particularly buildings in which straight-line-based components such as windows, columns, and walls are combined uniformly, the shapes of these components differ little from one building to another. Consequently, even when images of different buildings are compared, the local feature amounts of window portions and the like take similar values, and individual corresponding points do not always indicate truly corresponding locations. In other words, corresponding points are more likely to be misidentified when comparing building images than when comparing images of other subjects.

Even if some corresponding points are misidentified, as long as the misidentification occurs between images of the same building, a similar image containing the same building as the query image can sometimes still be retrieved from the search targets based on the corresponding points. However, when one further attempts to determine the shooting position of the query image, that is, the user position, from the retrieved similar image based on the individual corresponding points, the misidentified corresponding points can prevent the shooting position from being estimated accurately.

An object of the present invention is to solve the above problems of the prior art and to provide an image search method and system that can identify corresponding points with high accuracy even when comparing images of subjects that contain many similar shapes, such as images of buildings with many identically shaped elements such as windows and columns.

To achieve the above object, the present invention is characterized by providing the following means in an image search system that retrieves images similar to a query image from a set of search target images.

(1) The system comprises: means for geometrically transforming the query image and each search target image so that parallel components of the subject are also parallel in the image; local feature amount extraction means for extracting local feature amounts from feature points of the query image and each search target image; corresponding point candidate extraction means for comparing the local feature amounts extracted from the feature points of the query image and the search target images and extracting feature points with high similarity as corresponding point candidates; means for extracting edge components from the geometrically transformed query image and search target images; means for detecting appearance patterns of the edge components from the query image and each search target image; corresponding point extraction means for extracting corresponding points from the corresponding point candidates based on the appearance patterns of the edge components; and similar image determination means for determining a search target image similar to the query image based on the extracted corresponding points.

(2) The corresponding point candidate extraction means comprises: means for extracting, for each feature point of the query image, the N-best feature points with the highest similarity from each search target image; means for comparing the local feature amounts of the N-best feature points with the local feature amount of the corresponding feature point of the query image and calculating a distribution of plots defined by the differences in scale and orientation of the local regions; and means for extracting, based on the distribution of plots, a plurality of feature point pairs with a high likelihood of being corresponding points as corresponding point candidates. Each plot is assigned a weight value according to the similarity between the local feature amounts of the feature points of the query image and each search target image.

According to the present invention, the following effects are achieved.

(1) Appearance patterns of edge components in the query image and each search target image are detected from the edge components extracted from those images. Based on these appearance patterns, the amount of deviation between the position of each corresponding point candidate on the query image Iq side and its position on the search target image Iw(k) side is calculated, and only corresponding point candidates with a small deviation are accepted as true corresponding points. This prevents points that do not actually correspond from being judged to be corresponding points.

(2) Before the appearance patterns of edge components are detected in the query image Iq and each search target image Iw, these images are geometrically transformed so that parallel components of the subject are also parallel in the image, which improves the detection accuracy of the edge appearance patterns.

(3) When the N-best feature points whose local feature amounts are similar to each feature point of the query image Iq are extracted from each search target image, the similarity between the local feature amounts of feature points in the two images is represented as a distribution of plots defined by two parameters, the orientation angle difference and the scale ratio, and each plot is assigned a weight value that depends on the Euclidean distance representing the similarity between the local feature amounts. The similarity between local feature amounts can therefore be determined accurately from the distribution of plots.

Block diagram of a position guidance system to which the image search system of the present invention is applied.
Block diagram of the corresponding point extraction unit 304.
Diagram schematically showing the method of calculating the difference distribution.
Diagram showing an example of the difference distribution.
Diagram showing an example of a method for extracting corresponding point candidates.
Diagram showing another example of a method for extracting corresponding point candidates.
Diagram for explaining the geometric transformation performed by the image correction unit 407.
Diagram for explaining the region division performed by the region division unit 408.
Diagram showing the method of detecting the edge appearance pattern.
Diagram showing the method of evaluating corresponding point candidates based on the edge appearance pattern.
Main flow showing the operation of an embodiment of the present invention.
Flowchart showing the procedure of the corresponding point candidate extraction process.
Flowchart showing the procedure of the corresponding point extraction process.

Embodiments of the present invention are described in detail below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of the main part of a position guidance system to which the image search system of the present invention is applied; components not needed to explain the present invention are omitted from the figure.

The position guidance system mainly comprises: a photographing apparatus 1 that is mounted on a moving body such as a vehicle and photographs the streetscape with an omnidirectional camera; an image search unit 3 that generates a large number of street images Iw to be searched from the captured still images, extracts local feature amounts fw from each street image Iw and stores them in a database, and searches the database for street images Iw similar to a landmark image (query image Iq) captured by a user terminal 2; and a position estimation unit 4 that estimates the user's current position based on the street image Iw(best) most similar to the query image Iq and returns it to the user terminal 2.

In the photographing apparatus 1, the omnidirectional camera 101 has a large number of imaging units and captures still images in all directions. The position detection unit 102 measures the current position at the time of shooting (the shooting position) using, for example, a plurality of GPS signals. Each captured still image is associated with its shooting position information and stored successively in the image storage unit 103. If an external database built by photographing street images Iw, such as Street View in Google Maps, is available, the street images may be obtained from that existing external database instead of from the photographing apparatus 1.

In the image search unit 3, the street image generation unit 301 generates a panoramic image from the many still images captured by the omnidirectional camera 101 and cuts out a large number of images from the panoramic image, according to a predetermined rule, as street images Iw(k). The symbol k is the identifier of each street image Iw.

The local feature amount extraction unit 302 extracts local feature amounts fq(i) and fw(k,j) with reference to the feature points i and j of the query image Iq and each street image Iw(k). In this embodiment, SIFT is used to extract and match feature points. Local regions are identified by the feature point extraction disclosed in Non-Patent Document 1, which is based on extrema in the Difference of Gaussian (DoG) scale space. This feature point extraction yields the position of each feature point and the extent (scale) of its local region, and an orientation histogram of luminance gradients is used as the feature descriptor of the local region.

Such an orientation histogram is obtained as follows. The luminance gradient of each pixel in the feature region is calculated and weighted to generate a histogram; the feature region is rotated (oriented) to the direction of the most populated bin; an orientation histogram of the luminance gradients is created again; this histogram is further divided into blocks; an orientation histogram is calculated within each block; and the result is normalized and vectorized. In this embodiment, the gradient direction within each block is quantized into 8 directions and the target region is divided into 16 blocks, so one feature descriptor has 8 × 16 = 128 dimensions.
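The 8 × 16 = 128 layout can be illustrated with a deliberately simplified sketch. It is not the SIFT descriptor of Non-Patent Document 1: Gaussian weighting, interpolation between bins, and rotation to the dominant orientation are all omitted, and the function name `block_orientation_descriptor` and the 16 × 16 patch size are assumptions made only for this example.

```python
import math


def block_orientation_descriptor(patch, blocks=4, bins=8):
    """Simplified block-wise gradient orientation histogram (hypothetical helper).

    patch: square list of rows of luminance values, e.g. 16 x 16.
    Returns blocks * blocks * bins values (4 * 4 * 8 = 128 by default),
    L2-normalized. This is an illustration, not a full SIFT descriptor.
    """
    size = len(patch)
    if size % blocks != 0 or any(len(row) != size for row in patch):
        raise ValueError("patch must be square and divisible into blocks")
    cell = size // blocks
    hist = [0.0] * (blocks * blocks * bins)
    for y in range(size):
        for x in range(size):
            # Central differences, clamped at the patch border.
            gx = patch[y][min(x + 1, size - 1)] - patch[y][max(x - 1, 0)]
            gy = patch[min(y + 1, size - 1)][x] - patch[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            # Quantize the gradient direction into one of `bins` directions.
            angle = math.atan2(gy, gx) % (2 * math.pi)
            b = int(angle / (2 * math.pi) * bins) % bins
            block = (y // cell) * blocks + (x // cell)
            hist[block * bins + b] += magnitude
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist
```

For a 16 × 16 patch the result has 16 blocks of 8 bins each, matching the dimensionality stated above.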

These feature descriptors are robust to occlusion because they describe local regions, invariant to image size because a scale is determined for each feature point, and invariant to rotation in the image plane because orientation is normalized within the image plane based on the luminance gradient. They are also robust to luminance changes because they use edge components. Feature point detection is performed over all pixels of the image, but a point that takes an extremum is still excluded if it is unsuitable as a feature point.

In this embodiment, each local feature amount fq(i) of the query image Iq is expressed by the following equation (1), where pq(i) is the position of the feature point in homogeneous coordinates, op(i) is the orientation of the feature point, σq(i) is the scale at which the feature point was found, dq(i) is the feature descriptor, and the symbol i is the feature point identifier of the query image Iq.

fq(i) = {pq(i), op(i), σq(i), dq(i)}   ... (1)

Similarly, the local feature amount fw(k,j) of each street image Iw(k) is expressed by the following equation (2), where pw(k,j) is the position of the feature point in homogeneous coordinates, ow(k,j) is the orientation of the feature point, σw(k,j) is the scale at which the feature point was found, dw(k,j) is the feature descriptor, and the symbol j is the feature point identifier of the street image Iw(k).

fw(k,j) = {pw(k,j), ow(k,j), σw(k,j), dw(k,j)}   ... (2)
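As an illustration only, the four-element sets of equations (1) and (2) could be held in a small record type such as the one below. The class name, field names, the use of radians for the orientation, and the 128-element check are assumptions of this sketch and are not specified in the description.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LocalFeature:
    """Hypothetical container for one local feature amount f = {p, o, sigma, d}."""
    position: Tuple[float, float, float]  # p: homogeneous coordinates (x, y, 1)
    orientation: float                    # o: orientation (radians assumed here)
    scale: float                          # sigma: scale at which the point was found
    descriptor: Tuple[float, ...]         # d: 128-dimensional feature descriptor


def make_feature(x, y, orientation, scale, descriptor):
    """Build a LocalFeature from pixel coordinates and a 128-element descriptor."""
    if len(descriptor) != 128:
        raise ValueError("descriptor must have 128 dimensions")
    return LocalFeature(
        position=(float(x), float(y), 1.0),
        orientation=float(orientation),
        scale=float(scale),
        descriptor=tuple(float(v) for v in descriptor),
    )
```

The same record can serve for both fq(i) and fw(k,j); the identifiers i, j, and k would then be kept as dictionary keys or list indices outside the record.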

The local feature amounts fw(k,j) extracted in this way are stored in the image database 303 together with their street images Iw(k).

As described in detail later, the corresponding point extraction unit 304 compares each local feature amount fq(i) of the query image Iq with each local feature amount fw(k,j) of each street image Iw(k) to obtain corresponding point candidates, extracts similar image candidates Iwc(k) from the street images Iw(k) based on the number of corresponding point candidates, and then narrows the corresponding point candidates of each similar image candidate Iwc(k) down to true corresponding points with high likelihood based on geometric constraints between the query image Iq and each similar image candidate Iwc(k).

The similar image determination unit 305 extracts the similar image candidate Iwc(k) with the most corresponding points and outputs it as the search result Iw(best). Alternatively, a projective transformation may be performed using the extracted corresponding points, and the similar image candidate Iwc(k) with the largest number of corresponding points that could be projectively transformed may be taken as the search result Iw(best).

The position estimation unit 4 estimates the user's shooting position and shooting direction based on the query image Iq and the street image Iw(best) most similar to it, and returns them to the user terminal 2.

FIG. 2 is a block diagram showing the configuration of the main part of the corresponding point extraction unit 304; the same reference numerals as above denote the same or equivalent parts.

The search range narrowing unit 401 narrows down the search range of street images Iw(k) in advance based on the user's current position as measured by the GPS system of the user terminal 2. The N-best extraction unit 402 compares the local feature amount of each feature point of the query image Iq with the local feature amount of each feature point of every street image Iw(k) within the search range and, for each feature point of the query image Iq, extracts from each street image Iw(k) the top N feature points with the highest local feature similarity as N-best feature points. Thus, if m feature points are set in the query image Iq, (m × N) N-best feature points are extracted from a street image Iw(k) within the search range.
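The description does not say how the search range is narrowed from the GPS position. One plausible reading, sketched below, keeps only the street images whose shooting position lies within a fixed radius of the user's position, using the haversine great-circle distance. The function names, the dictionary layout, and the 200 m default radius are assumptions of this example.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def narrow_search_range(shooting_positions, user_lat, user_lon, radius_m=200.0):
    """Return identifiers k of street images shot within radius_m of the user.

    shooting_positions: dict mapping identifier k -> (latitude, longitude).
    The radius is an assumed tuning parameter, not a value from the description.
    """
    return [
        k
        for k, (lat, lon) in shooting_positions.items()
        if haversine_m(user_lat, user_lon, lat, lon) <= radius_m
    ]
```

Only the identifiers returned here would then be passed on to the N-best extraction, which keeps the nearest neighbor search small.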

In this embodiment, for each feature point of the query image Iq, a nearest neighbor search over local feature amounts is performed against the feature points of all street images Iw(k), and a similarity is calculated for each street image Iw(k) based on the distance between local feature amounts. The similarity between the local feature amounts fq(i) and fw(k,j) of the feature points is represented by the Euclidean distance L between the feature descriptors dq(i) and dw(k,j), given by the following equation (3).

L = |dq(i) − dw(k,j)|   ... (3)
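Equation (3) and the N-best selection for a single query feature point can be sketched as follows. This is a brute-force illustration with hypothetical function names; the description does not state which nearest neighbor search structure is actually used.

```python
import math


def euclidean_distance(dq, dw):
    """Equation (3): L = |dq - dw| for two descriptors of equal length."""
    if len(dq) != len(dw):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(dq, dw)))


def n_best(dq, street_descriptors, n):
    """Return the n feature points of one street image closest to dq.

    street_descriptors: list of descriptors dw(k, j), indexed by j.
    Result: list of (j, L) pairs sorted by increasing distance L.
    """
    scored = [(j, euclidean_distance(dq, dw)) for j, dw in enumerate(street_descriptors)]
    scored.sort(key=lambda pair: pair[1])
    return scored[:n]
```

Calling `n_best` once per query feature point and per street image gives the m × N candidate set described above; a smaller L means a higher similarity.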

The difference distribution calculation unit 403 compares the local feature amount of each N-best feature point with the local feature amount of the corresponding feature point of the query image Iq and calculates a distribution of the differences between the local regions. In this embodiment, distributions of the scale ratio (or difference) of the local regions and of the orientation angle difference (or angle ratio) are calculated first.

Next, as illustrated in FIG. 3, for each N-best feature point, the size ratio of the local regions (scales) and the orientation angle difference between its local feature amount and the corresponding local feature amount of the query image Iq are calculated and paired. Doing this for all N-best feature points and plotting the results in a virtual two-dimensional space yields the difference distribution, an example of which is shown in FIG. 4.

In this embodiment, each plot is given a weight value based on the Euclidean distance L: the shorter the Euclidean distance L of an N-best feature point, the larger its weight value. In the example of FIG. 4, the weight value of each plot is represented, for convenience, by the size of the plot.

Returning to FIG. 2, the corresponding point candidate extraction unit 404 extracts, based on the difference distribution, a plurality of corresponding point candidates that have a high likelihood of being corresponding points.
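A single plot of the difference distribution can be sketched as below. The description only requires that the weight grow as the Euclidean distance L shrinks; the particular form 1 / (1 + L), the use of radians, and the function names are assumptions of this example.

```python
import math


def angle_difference(a, b):
    """Signed difference a - b in radians, wrapped into [-pi, pi)."""
    return (a - b + math.pi) % (2 * math.pi) - math.pi


def difference_plot(q_scale, q_orientation, w_scale, w_orientation, distance):
    """Return (scale_ratio, angle_difference, weight) for one N-best feature point.

    q_*: scale and orientation of the query feature point.
    w_*: scale and orientation of the N-best feature point in the street image.
    distance: Euclidean distance L between the two descriptors.
    The weight 1 / (1 + L) is an assumed choice that decreases monotonically with L.
    """
    if q_scale <= 0 or w_scale <= 0:
        raise ValueError("scales must be positive")
    scale_ratio = w_scale / q_scale
    d_theta = angle_difference(w_orientation, q_orientation)
    weight = 1.0 / (1.0 + distance)
    return (scale_ratio, d_theta, weight)
```

Collecting one such triple for every N-best feature point gives the weighted two-dimensional distribution illustrated in FIG. 4.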

FIG. 5 shows an example of a method for extracting corresponding point candidates. In this embodiment, the N-best feature points lying within a predetermined range centered on the position of highest distribution density are taken as corresponding point candidates, and the remaining N-best feature points are removed as noise. The method of extracting corresponding point candidates based on the distribution density is not limited to this; as illustrated in FIG. 6, the difference distribution may instead be clustered, N-best feature points that deviate greatly from every cluster removed, and the rest taken as corresponding point candidates.
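The density-peak selection of FIG. 5 can be approximated with a weighted grid histogram, as in the sketch below. The grid cell sizes, the `reach` parameter standing in for the "predetermined range", and the function name are assumptions; angle wrap-around at ±π is ignored for brevity.

```python
import math
from collections import defaultdict


def select_candidates(plots, scale_bin=0.25, angle_bin=math.pi / 18, reach=1):
    """Keep plots near the densest cell of the weighted difference distribution.

    plots: list of (scale_ratio, angle_difference, weight) triples.
    Returns the indices of the plots kept as corresponding point candidates.
    Cell sizes and reach are assumed tuning parameters.
    """
    if not plots:
        return []

    def cell_of(plot):
        return (math.floor(plot[0] / scale_bin), math.floor(plot[1] / angle_bin))

    # Accumulate the weight of every plot into its grid cell.
    mass = defaultdict(float)
    for plot in plots:
        mass[cell_of(plot)] += plot[2]

    # The cell with the largest accumulated weight approximates the density peak.
    peak = max(mass, key=mass.get)

    kept = []
    for index, plot in enumerate(plots):
        cell = cell_of(plot)
        if abs(cell[0] - peak[0]) <= reach and abs(cell[1] - peak[1]) <= reach:
            kept.append(index)
    return kept
```

Plots outside the neighbourhood of the peak cell are treated as noise, which corresponds to the removal step described above. The clustering variant of FIG. 6 would replace the single peak with several cluster centres.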

Returning to FIG. 2, the similar image candidate extraction unit 405 extracts the top M street images Iw with the most corresponding point candidates as similar image candidates Iwc. The normalization unit 406 normalizes the size of each similar image candidate Iwc(k) based on, for example, the scale ratio of the local feature amounts fwc(k) of its corresponding point candidates.

The image correction unit 407 extracts horizontal and vertical components from each similar image candidate Iwc(k) and the query image Iq, determines the viewing direction at the time of shooting by extracting vanishing points from these components, and, based on this viewing direction, applies a geometric transformation to each image so that parallel lines of the subject are also parallel in the image. Hereinafter, an image in which parallel lines of the subject have been made parallel in the image by this geometric transformation may be referred to as a "planarized image".

FIG. 7 illustrates the geometric transformation that the image correction unit 407 applies to the query image Iq and each similar image candidate Iwc(k) so that parallel lines of the subject are also parallel in the image; FIG. 7(a) shows an original image to be transformed.

In this embodiment, edges are first detected in the query image Iq and each similar image candidate Iwc(k) by applying the Canny method [J. Canny, "A computational approach to edge detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(6), November 1986]. The Hough transform [R. O. Duda and P. E. Hart, "Use of the Hough Transformation to Detect Lines and Curves in Pictures," Comm. ACM, Vol. 15, pp. 11-15 (January 1972)] is then applied to the detected edges, and the vertical and horizontal edges are extracted as shown in FIG. 7(b).

Next, vanishing points are obtained, as shown in FIG. 7(c), using a vanishing point extraction method that maps the infinite plane onto a finite transformed region [Matsufuji and Saito, "Vanishing point extraction in the infinite plane," IPSJ SIG Technical Report, Vol. 99, No. 70, pp. 25-30, 1999]. The viewing direction at the time of shooting is then determined from the vanishing points, and each image is geometrically transformed based on this viewing direction, yielding a planarized image in which parallel lines of the subject are also parallel in the image, as shown in FIG. 7(d).

The region division unit 408 divides the query image Iq and each similar image candidate Iwc(k) into regions and excludes in advance, from the search, regions that are unsuitable for image recognition. FIG. 8 shows how the region division unit 408 divides an image. In this embodiment, regions of objects that adversely affect corresponding point detection, such as (1) white lines in road areas and (2) moving objects or objects whose appearance changes, such as cars and trees, are excluded in advance from the original image of FIG. 8(a).

In this embodiment, regions in training street images are first labeled manually, and each region in an image is classified into one of several categories such as building, sky, road, car, person, and plant. Learning is then performed using the SIFT feature points in each region, and using the learned data, the query image Iq and each similar image candidate Iw(k) are divided into a plurality of blocks and each block is classified into one of the categories. An SVM (Support Vector Machine), for example, can be used for this learning. As shown in FIG. 8(b), this region division excludes in advance from the search the regions of objects that adversely affect corresponding point detection, such as moving objects, objects that change with the seasons, and road white lines, many of which have similar shapes, so that only the regions indicated by rectangular frames are searched.

The edge detection unit 409 detects edge components from the query image Iq and each similar image candidate Iwc(k). The vertical edge extraction unit 410 extracts vertical edges from the edge components. The edge appearance pattern detection unit 411 detects the appearance pattern of vertical edges in the query image Iq and each similar image candidate Iwc(k).

The corresponding point extraction unit 412 includes a deviation calculation unit 412a. For each image pair consisting of the query image Iq and a similar image candidate Iwc(k) from which corresponding point candidates have been extracted, this unit calculates the amount of deviation between the position of each corresponding point candidate on the query image side and its position on the similar image candidate side, based on the vertical edge appearance patterns of the two images. Based on the calculated deviations, true corresponding points are extracted from the corresponding point candidates.

FIG. 9 shows an example of vertical edges obtained by the edge detection and vertical edge extraction; many vertical edges are detected in regions where windows are arranged. Because such vertical edge appearance patterns are unique to each building, in this embodiment the edge appearance pattern detection unit 411 counts the number of vertical edges Nx at predetermined unit intervals for the query image Iq and each similar image candidate Iwc(k), and thereby detects the vertical edge appearance pattern [N1-N2-N3-N4…].
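Counting vertical edges per unit interval to form the pattern [N1-N2-N3-N4…] can be sketched as follows. The input is assumed to be a list of edge positions measured along the axis on which the unit intervals are laid out; the function name and the treatment of out-of-range positions are choices made for this example only.

```python
import math


def edge_appearance_pattern(edge_positions, extent, interval):
    """Count vertical edges in consecutive unit intervals.

    edge_positions: positions (in pixels) of detected vertical edges along
        the axis on which the intervals are laid out (assumed input format).
    extent: image size in pixels along that axis.
    interval: width of one unit interval in pixels.
    Returns [N1, N2, N3, ...]; positions outside [0, extent) are ignored.
    """
    if extent <= 0 or interval <= 0:
        raise ValueError("extent and interval must be positive")
    counts = [0] * math.ceil(extent / interval)
    for position in edge_positions:
        if 0 <= position < extent:
            counts[int(position // interval)] += 1
    return counts
```

Applied to the query image Iq and to each similar image candidate Iwc(k), this produces the two sequences whose relative alignment is examined in the next step.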

The corresponding point extraction unit 412 determines the relative positional relationship between the images based on the edge appearance patterns of the query image Iq and each similar image candidate Iwc(k), and detects the shift amount of each corresponding point candidate. Corresponding point candidates with a small shift amount are taken as true corresponding points, while candidates with a large shift amount are discarded as incorrect correspondences.

FIG. 10 illustrates a method of evaluating corresponding point candidates based on the vertical edge appearance pattern and extracting true corresponding points. FIG. 10(a) shows an example of the query image Iq, and FIG. 10(b) shows an example of a similar image candidate Iwc(k) for which corresponding point candidates with the query image Iq have been extracted. The vertical edges are omitted here to keep the drawing legible.

In FIG. 10, two pairs of corresponding point candidates, [q1, w1] and [q2, w2], have been extracted between the query image Iq and the similar image candidate Iwc (this pair of images is hereinafter also referred to as a corresponding image pair). For the candidate [q1, w1], the vertical position of the point q1 on the query image Iq side is at N1 of the appearance pattern [N1-N2-N3-N4…], whereas the vertical position of the point w1 on the similar image candidate Iwc side is between N2 and N3 of the appearance pattern [N1-N2-N3-N4…]. Because the shift between the two is large, this candidate is discarded.

In contrast, for the other candidate [q2, w2], the vertical position of the point q2 on the query image Iq side is at N2 of the appearance pattern [N1-N2-N3-N4…], and the vertical position of the point w2 on the similar image candidate Iwc side is likewise at N2 of the appearance pattern [N1-N2-N3-N4…]. Because the two positions are close, this candidate is accepted as a corresponding point.
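The evaluation described for FIG. 10 can be sketched as below. The patent does not state how the relative positional relationship between the two patterns is obtained; this sketch assumes a brute-force search for the band offset that minimizes the mean absolute difference between the patterns. The names `best_offset`, `filter_candidates`, `band_height`, `tolerance`, and `max_shift` are hypothetical.

```python
def best_offset(pattern_q, pattern_w, max_shift):
    """Offset d (in bands) such that pattern_w[i + d] best matches pattern_q[i]."""
    min_overlap = max(1, len(pattern_q) // 2)
    best_d, best_cost = 0, float("inf")
    # Try small offsets first so that ties favor the smallest displacement.
    for d in sorted(range(-max_shift, max_shift + 1), key=abs):
        diffs = [abs(pattern_q[i] - pattern_w[i + d])
                 for i in range(len(pattern_q))
                 if 0 <= i + d < len(pattern_w)]
        if len(diffs) < min_overlap:
            continue
        cost = sum(diffs) / len(diffs)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

def filter_candidates(candidates, pattern_q, pattern_w, band_height,
                      tolerance=0, max_shift=2):
    """Keep (yq, yw) pairs whose band positions agree after pattern alignment.

    candidates: list of (yq, yw) vertical pixel positions of each candidate pair.
    """
    d = best_offset(pattern_q, pattern_w, max_shift)
    kept = []
    for yq, yw in candidates:
        # Map the band of w into the query pattern's frame, then compare.
        shift = abs((yw // band_height - d) - yq // band_height)
        if shift <= tolerance:
            kept.append((yq, yw))
    return kept
```

With identical patterns and `band_height=10`, a pair at vertical positions (5, 25) spans two bands and is dropped, while a pair at (15, 17) falls in the same band and is kept, mirroring [q1, w1] and [q2, w2] in FIG. 10.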

Returning to FIG. 1, the similar image determination unit 305 extracts the similar image candidate Iwc(k) with the most true corresponding points extracted as described above, and outputs it as the search result Iw(best). Alternatively, a projective transformation may additionally be performed using the extracted corresponding points, and the similar image candidate Iwc(k) with the largest number of corresponding points that could be projectively transformed may be taken as the search result Iw(best).
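One way to read "the number of corresponding points that could be projectively transformed" is as the number of point pairs consistent with a single homography. The sketch below illustrates that reading only; the patent does not specify the estimation method. It assumes a plain direct linear transform (no RANSAC or coordinate normalization) and a pixel reprojection threshold. `fit_homography`, `count_inliers`, and `threshold` are hypothetical names.

```python
import numpy as np

def fit_homography(src, dst):
    """Direct linear transform: H such that dst ~ H @ src in homogeneous coordinates.

    src, dst: sequences of at least four (x, y) points in general position.
    """
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)

def count_inliers(H, src, dst, threshold=3.0):
    """Number of pairs whose reprojection error under H is below threshold (pixels)."""
    src_h = np.column_stack([np.asarray(src, dtype=float), np.ones(len(src))])
    proj = src_h @ H.T
    proj = proj[:, :2] / proj[:, 2:3]
    err = np.linalg.norm(proj - np.asarray(dst, dtype=float), axis=1)
    return int((err < threshold).sum())
```

Under this reading, the search result Iw(best) would be the similar image candidate with the largest inlier count. In practice a robust estimator would normally replace the plain fit, since a single bad pair can distort H.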

The position estimation unit 4 extracts from the database 303 a streetscape image similar to the search result streetscape image Iw(best). In this embodiment, the streetscape images to be searched are captured at discrete positions along the road, so of the two streetscape images Iw(best-1) and Iw(best+1) adjacent to Iw(best), the one more similar to Iw(best) (here, for example, Iw(best-1)) is extracted from the database 303. Next, corresponding points between the two images Iw(best) and Iw(best-1) are detected in the same manner as above, the epipolar constraint is calculated, and the three-dimensional positions of the feature points are estimated from these two images, whose shooting positions are known.

The point x = [u v 1]^T obtained by projecting a point X in the unknown space onto the image Iw(best) and the point x' = [u' v' 1]^T obtained by projecting it onto the image Iw(best-1) satisfy the following equation (4).

x^T F x' = 0 … (4)

This is called the epipolar constraint, and F is the fundamental matrix. After the three-dimensional positions of the corresponding points within the streetscape image database have been estimated based on this relationship, correspondences are likewise calculated between the query image Iq and the streetscape image Iw(best), and the user position is estimated from them.
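Equation (4) can be checked numerically with a small sketch. The assumptions here are not from the patent: both cameras use identity intrinsics (normalized image coordinates), and the second camera sees a point X as R X + t. Under those assumptions F is the transpose of the essential matrix [t]x R, which matches the ordering x^T F x' = 0 used above. The helper names are hypothetical.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]x, so that skew(t) @ a == np.cross(t, a)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def fundamental_from_pose(R, t):
    """F satisfying x^T F x' = 0 for x in camera 1 and x' in camera 2.

    Assumes identity intrinsics and camera-2 coordinates X' = R @ X + t.
    """
    return (skew(t) @ R).T

def epipolar_residual(F, x, x_prime):
    """Value of x^T F x'; close to zero for a true correspondence."""
    return float(x @ F @ x_prime)
```

For a point projected into both views the residual vanishes up to rounding, while a displaced x' generally gives a clearly nonzero value, which is what makes the constraint usable for rejecting mismatches.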

Next, the operation of an embodiment of the present invention is described in detail with reference to flowcharts. FIG. 11 is the main flow showing the operation of the embodiment, and mainly shows the operation of the corresponding point extraction unit 304.

In step S1, the search range narrowing unit 401 narrows the search range of the streetscape images Iw in advance to the vicinity of the user position, based on the current position of the user terminal 2 received from the user terminal 2 together with the query image Iq. In step S2, the N-best extraction unit 402 compares the local feature amount of each feature point of the query image Iq with the local feature amount of each feature point of each streetscape image Iw(k), for each combination of the query image Iq and a streetscape image Iw(k), and for each feature point of the query image Iq extracts from each streetscape image Iw(k) the N-best feature points with similar local feature amounts. In step S3, corresponding point candidates are extracted from the N-best feature points.
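Step S2 can be sketched as a brute-force nearest neighbor search. The assumption here is that similarity between local feature amounts is measured by Euclidean distance between descriptor vectors, as in the SIFT matching described in the background. The function name `n_best_matches` is hypothetical, and a real system with many images would likely use an approximate index instead of the full distance matrix.

```python
import numpy as np

def n_best_matches(query_desc, target_desc, n):
    """For each query descriptor, return the n closest target descriptors.

    query_desc: (Q, D) array, target_desc: (T, D) array, with n <= T.
    Returns (indices, distances), each of shape (Q, n), nearest first.
    """
    q = np.asarray(query_desc, dtype=float)
    t = np.asarray(target_desc, dtype=float)
    # Full (Q, T) matrix of Euclidean distances between descriptors.
    dists = np.linalg.norm(q[:, None, :] - t[None, :, :], axis=2)
    order = np.argsort(dists, axis=1)[:, :n]
    return order, np.take_along_axis(dists, order, axis=1)
```

The function would be called once per streetscape image Iw(k) within the narrowed search range.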

FIG. 12 is a flowchart showing the procedure for extracting the corresponding point candidates, and mainly shows the operations of the difference distribution calculation unit 403 and the corresponding point candidate extraction unit 404.

In step S101, one of the streetscape images Iw(k) within the search range is selected as the current image of interest. In step S102, one of the N-best feature points extracted from the current streetscape image Iw is selected, and the corresponding feature point of the query image Iq is acquired. In step S103, the local feature amounts of the two acquired feature points are compared, and their size ratio and orientation angle difference are calculated. In step S104, as described with reference to FIG. 3, the current N-best feature point is plotted in a virtual two-dimensional space based on the size ratio and the orientation angle difference.

In step S105, it is determined whether the plotting has been completed for all N-best feature points extracted from the current streetscape image Iw. If not, the process returns to step S102 and the same procedure is repeated for the remaining N-best feature points.

When plotting has been completed for all N-best feature points and the difference distribution of FIG. 4 is complete, the process proceeds to step S106, where the position of maximum distribution density is detected (FIG. 5) or clustering is performed (FIG. 6). In step S107, the N-best feature points within a predetermined range determined from the distribution density are taken as corresponding point candidates, and the other N-best feature points are discarded as noise arising from incorrect correspondences.
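Steps S103 to S107 can be sketched with a two-dimensional histogram over size ratio and orientation difference. The details are assumptions, not taken from the patent: the size ratio is binned on a log2 scale over [-2, 2], the angle difference is binned in degrees modulo 360, and the "predetermined range" is taken to be the single densest bin. The optional `weights` argument reflects the similarity-based weight per plot mentioned in the claims. All names are hypothetical.

```python
import numpy as np

def select_by_peak(scale_q, scale_w, angle_q, angle_w, weights=None, n_r=8, n_a=12):
    """Boolean mask of feature point pairs that fall in the densest (ratio, angle) bin."""
    ratio = np.log2(np.asarray(scale_w, dtype=float) / np.asarray(scale_q, dtype=float))
    dtheta = (np.asarray(angle_w, dtype=float) - np.asarray(angle_q, dtype=float)) % 360.0
    if weights is None:
        weights = np.ones_like(ratio)
    weights = np.asarray(weights, dtype=float)
    # Bin indices: log2 size ratio clipped to [-2, 2], angle wrapped to [0, 360).
    ri = np.clip(((ratio + 2.0) / 4.0 * n_r).astype(int), 0, n_r - 1)
    ai = (dtheta / 360.0 * n_a).astype(int) % n_a
    # Accumulate the (optionally weighted) plots into the 2D distribution.
    hist = np.zeros((n_r, n_a))
    np.add.at(hist, (ri, ai), weights)
    pi, pj = np.unravel_index(np.argmax(hist), hist.shape)
    return (ri == pi) & (ai == pj)
```

Pairs related by the true image-to-image transformation share roughly the same scale change and rotation, so they pile up in one bin, while incorrect matches scatter across the histogram. Clustering, the alternative named in step S106, would replace the fixed binning.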

Returning to FIG. 11, in step S4 the similar image candidate determination unit 405 extracts the top M streetscape images Iw(k) with the most corresponding point candidates and outputs them as similar image candidates Iwc. In step S5, the corresponding point search is performed again on the M similar image candidates Iwc, and true corresponding points are extracted.
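Step S4 amounts to a top-M selection by candidate count. A minimal sketch, assuming the counts are held in a dictionary keyed by image identifier (a hypothetical data layout):

```python
import heapq

def top_m_images(candidate_counts, m):
    """Return the m image ids with the most corresponding point candidates.

    candidate_counts: dict mapping image id -> number of candidates.
    """
    return heapq.nlargest(m, candidate_counts, key=candidate_counts.get)
```

Only these M images go through the more expensive edge-pattern check of step S5.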

FIG. 13 is a flowchart showing the procedure for extracting true corresponding points from the M similar image candidates Iwc, and mainly shows the operation of the corresponding point extraction unit 412.

In step S201, one of the similar image candidates Iwc is selected as the current image of interest. In step S202, one of the corresponding point candidates extracted from the current image of interest Iwc is selected as the current candidate of interest. In step S203, the shift amount between this corresponding point candidate of the image of interest Iwc and the corresponding feature point on the query image Iq side is detected based on the edge appearance patterns detected separately by the edge appearance pattern detection unit 411. In step S204, it is determined whether the calculated shift amount is within a predetermined allowable range. If it is outside the allowable range, the process proceeds to step S205, where the candidate is judged to be an incorrect correspondence and discarded.

In step S206, it is determined whether the above processing has been completed for all corresponding point candidates extracted from the current image of interest Iwc. If not, the process returns to step S202, and the processing is repeated while switching the corresponding point candidate of interest. When the processing has been completed for all corresponding point candidates of the current image of interest Iwc, the process proceeds to step S207, where all corresponding point candidates that remain without having been discarded are taken as true corresponding points. In step S208, it is determined whether the above processing has been completed for all similar image candidates Iwc. If not, the process returns to step S201, and the processing is repeated while switching the similar image candidate Iwc of interest.
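The control flow of steps S201 to S208 can be sketched as two nested loops. The shift computation itself is passed in as a callable, because the patent leaves its details to the shift amount calculation unit 412a. `extract_true_points`, `shift_of`, and `tolerance` are hypothetical names.

```python
def extract_true_points(candidates_by_image, shift_of, tolerance):
    """Filter the corresponding point candidates of each similar image candidate.

    candidates_by_image: dict mapping image id -> list of candidate pairs.
    shift_of: callable (image_id, candidate) -> shift amount (assumed supplied).
    tolerance: allowable shift; candidates above it are discarded.
    """
    true_points = {}
    for image_id, candidates in candidates_by_image.items():   # S201, S208
        kept = []
        for cand in candidates:                                # S202, S206
            shift = shift_of(image_id, cand)                   # S203
            if abs(shift) <= tolerance:                        # S204
                kept.append(cand)
            # Otherwise the candidate is dropped as a mismatch (S205).
        true_points[image_id] = kept                           # S207
    return true_points
```

The length of each resulting list is the per-image count that the similar image determination unit 305 compares.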

The similar image determination unit 305 extracts the similar image candidate Iwc(k) with the most corresponding points and outputs it as the search result Iw(best). Alternatively, a projective transformation is additionally performed using the extracted corresponding points, and the similar image candidate Iwc(k) with the largest number of corresponding points that could be projectively transformed is output as the search result Iw(best).

Description of Symbols: 1 … imaging device, 2 … user terminal, 3 … image search unit, 4 … position estimation unit, 101 … omnidirectional camera, 102 … position detection unit, 103 … image storage unit, 301 … streetscape image generation unit, 302 … local feature amount extraction unit, 303 … database, 304 … corresponding point extraction unit, 305 … similar image determination unit, 401 … search range narrowing unit, 402 … N-best extraction unit, 403 … difference distribution calculation unit, 404 … corresponding point candidate extraction unit, 405 … similar image candidate extraction unit, 406 … normalization unit, 407 … image correction unit, 408 … region division unit, 409 … edge detection unit, 410 … vertical edge extraction unit, 411 … edge appearance pattern detection unit, 412 … corresponding point extraction unit

Claims

In an image search system that searches a set of search target images for an image similar to a query image, the system comprising:
Local feature amount extraction means for extracting local feature amounts from feature points of the query image and of each search target image;
Means for geometrically transforming the query image and each search target image so that parallel components of the subject are also parallel in the image;
Corresponding point candidate extraction means for comparing the local feature amounts extracted from the feature points of the query image and the search target image, and extracting feature points with higher similarity as corresponding point candidates;
Means for detecting an appearance pattern of edge components from the geometrically transformed query image and search target image;
Means for calculating, for each pair of a query image and a search target image from which corresponding point candidates have been extracted, the amount of deviation between the position of each corresponding point candidate on the query image side and its position on the search target image side, based on the appearance pattern of the edge components of each image;
Corresponding point extraction means for extracting corresponding points from the corresponding point candidates based on the calculated amount of deviation; and
Similar image determination means for determining a search target image similar to the query image based on the extracted corresponding points,
wherein the corresponding point candidate extraction means includes:
Means for extracting, for each feature point of the query image, the N-best feature points with the highest similarity from each search target image;
Means for comparing the local feature amounts of the N-best feature points with the local feature amount of the corresponding feature point of the query image, and calculating a distribution of plots defined by the differences in scale and orientation of the local regions; and
Means for extracting, as corresponding point candidates, a plurality of feature point pairs having a high likelihood of being corresponding points based on the distribution of the plots,
and wherein a weight value corresponding to the similarity between the local feature amounts of the feature points of the query image and each search target image is assigned to each plot.

The image search system according to claim 1, wherein the means for geometrically transforming includes:
Means for extracting linear components from the query image and each search target image;
Means for detecting a vanishing point based on the linear components; and
Means for calculating a viewpoint direction of each image based on the vanishing point,
and wherein each image is geometrically transformed, based on that direction, so that parallel components of the subject are also parallel in the image.

The image search system according to claim 1 or 2, wherein the search target images are streetscape images whose subjects are buildings.

In an image search method for searching a set of search target images for an image similar to a query image, the method comprising:
A procedure in which a computer geometrically transforms the query image and each search target image so that parallel components of the subject are also parallel in the image;
A procedure in which the computer extracts local feature amounts from feature points of the query image and of each search target image;
A procedure in which the computer compares the local feature amounts extracted from the feature points of the query image and the search target image, and extracts feature points with higher similarity as corresponding point candidates;
A procedure in which the computer detects an appearance pattern of edge components from the geometrically transformed query image and search target image;
A procedure in which the computer calculates, for each pair of a query image and a search target image from which corresponding point candidates have been extracted, the amount of deviation between the position of each corresponding point candidate on the query image side and its position on the search target image side, based on the appearance pattern of the edge components of each image;
A procedure in which the computer extracts corresponding points from the corresponding point candidates based on the calculated amount of deviation; and
A procedure in which the computer determines a search target image similar to the query image based on the extracted corresponding points,
wherein the procedure for extracting the corresponding point candidates includes:
Extracting, for each feature point of the query image, the N-best feature points with the highest similarity from each search target image;
Comparing the local feature amounts of the N-best feature points with the local feature amount of the corresponding feature point of the query image, and calculating a distribution of plots defined by the differences in scale and orientation of the local regions; and
Extracting, as corresponding point candidates, a plurality of feature point pairs having a high likelihood of being corresponding points based on the distribution of the plots,
and wherein a weight value corresponding to the similarity between the local feature amounts of the feature points of the query image and each search target image is assigned to each plot.