JP2013016073A

JP2013016073A - Image collation device, image collation method and computer program

Info

Publication number: JP2013016073A
Application number: JP2011149428A
Authority: JP
Inventors: Shinichi Sato; 真一佐藤
Original assignee: Research Organization of Information and Systems
Current assignee: Research Organization of Information and Systems
Priority date: 2011-07-05
Filing date: 2011-07-05
Publication date: 2013-01-24
Anticipated expiration: 2031-07-05
Also published as: JP5713398B2

Abstract

PROBLEM TO BE SOLVED: To provide an image collation device and an image collation method that perform image collation at high speed and determine the similarity of images with high accuracy. SOLUTION: An image collation device 1 has: image dividing sections 22, 25 that divide a first image and a second image into a plurality of blocks; feature vector calculation sections 23, 26 that calculate a feature vector of the first image and a feature vector of the second image on the basis of the sum and the sum of squares of the normalized pixel values of the pixels included in each block of the first image and the second image; an upper limit value calculation section 27 that calculates an upper limit value of the normalized cross-correlation of the first image and the second image from the two feature vectors, using a formula for the upper limit value of the normalized cross-correlation derived with Lagrange's method of undetermined multipliers; and a first collation section 28 that collates the first image and the second image on the basis of whether or not the upper limit value is equal to or greater than a first threshold value.

Description

The present invention relates to an image collation device, an image collation method, and a computer program, and more particularly to an image collation device, an image collation method, and a computer program that determine whether two images are similar.

Image matching is one of the key technologies in many image-based applications, such as template matching, block motion compensation, image compression, and stereo vision. Recently, research has also begun on new applications of image matching in large-scale image and video databases, such as near-duplicate image detection, image mining, and its application to image annotation.

In general, the sum of absolute differences of pixel values (SAD: Sum of Absolute Difference), the sum of squared differences of pixel values (SSD: Sum of Squared Difference), normalized cross-correlation (NCC: Normalized Cross-Correlation), and the like are used as image matching methods. In particular, NCC is little affected by luminance shifts and contrast differences between images, and is therefore also suitable for collating images that have undergone luminance correction or the like. For example, the NCC value for an image I¹ and an image I², with the horizontal direction as the x-axis and the vertical direction as the y-axis, is defined by Expression (1).

Here, φ_{x,y} is the pixel value of the pixel at coordinates (x, y) of the image I¹, ψ_{x,y} is the pixel value of the pixel at coordinates (x, y) of the image I², φ_ave is the average of the pixel values of all pixels of the image I¹, and ψ_ave is the average of the pixel values of all pixels of the image I². As Expression (1) shows, calculating the NCC value requires computing the correlation of pixel values for every pair of corresponding pixels. Therefore, particularly when the images to be collated are large, the amount of computation grows and the calculation takes a long time.
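The image of Expression (1) is not reproduced in this text. Given the symbol definitions above, it presumably takes the standard NCC form below. This is a reconstruction from those definitions, not a copy of the original expression.

```latex
\mathrm{NCC}(I^1, I^2) =
\frac{\sum_{x,y} (\phi_{x,y} - \phi_{ave})(\psi_{x,y} - \psi_{ave})}
     {\sqrt{\sum_{x,y} (\phi_{x,y} - \phi_{ave})^2 \; \sum_{x,y} (\psi_{x,y} - \psi_{ave})^2}}
\tag{1}
```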

Non-Patent Document 1, for example, discloses a technique for speeding up NCC-based template matching. When the template is scanned across an entire image, the technique obtains the collation result at each position with a convolution operation. Because a convolution in the spatial domain becomes a simple multiplication in the frequency domain, the processing can be accelerated by transforming the spatial-domain data into the frequency domain with the fast Fourier transform (FFT). However, although this technique is effective when a template is scanned over a single image, collating a plurality of images requires an FFT and an inverse FFT for each image, which may instead slow the processing down.

Non-Patent Document 2, on the other hand, discloses the multilevel successive elimination algorithm (MSEA) as a technique for making NCC computation more efficient. In the technique disclosed in Non-Patent Document 2, an upper limit value of the NCC is calculated on the basis of the Cauchy-Schwarz inequality shown in Expression (2).

When the image I¹ and the image I² are each divided into n blocks of m pixels, the NCC value of Expression (1) is expressed by Expression (3) using the block number i (1 ≦ i ≦ n) and the pixel number j within a block (1 ≦ j ≦ m).

Here, x_{i,j} and y_{i,j} are the normalized pixel values of each pixel and are defined by Expression (4).

From Expressions (2) and (3), the upper limit value UB^MSEA of the NCC is expressed by Expression (5).

Here, XX_i and YY_i are defined by Expression (6).

Since the NCC value never exceeds the upper limit value UB^MSEA, the similarity of two images can be evaluated with UB^MSEA, which is easier to calculate than the NCC value, in place of the NCC value itself.
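Expressions (2) to (6) are equation images that are not reproduced in this text. A reconstruction consistent with the surrounding definitions is sketched below. It is an assumption rather than the original notation; in particular, the names a_j and b_j in (2) and the primed summation indices in (4) are placeholders.

```latex
\begin{align}
\sum_{j} a_j b_j &\le \sqrt{\sum_{j} a_j^2}\;\sqrt{\sum_{j} b_j^2} \tag{2} \\
\mathrm{NCC} &= \sum_{i=1}^{n} \sum_{j=1}^{m} x_{i,j}\, y_{i,j} \tag{3} \\
x_{i,j} &= \frac{\phi_{i,j} - \phi_{ave}}
               {\sqrt{\sum_{i'=1}^{n}\sum_{j'=1}^{m} (\phi_{i',j'} - \phi_{ave})^2}},
\qquad
y_{i,j} = \frac{\psi_{i,j} - \psi_{ave}}
               {\sqrt{\sum_{i'=1}^{n}\sum_{j'=1}^{m} (\psi_{i',j'} - \psi_{ave})^2}} \tag{4} \\
UB^{MSEA} &= \sum_{i=1}^{n} \sqrt{XX_i}\;\sqrt{YY_i} \tag{5} \\
XX_i &= \sum_{j=1}^{m} x_{i,j}^2, \qquad YY_i = \sum_{j=1}^{m} y_{i,j}^2 \tag{6}
\end{align}
```

Under this reading, (5) follows by applying (2) to the inner sum over j in (3), one block at a time.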

Non-Patent Document 1: Tsai, D.-M., Lin, C.-T., 2003. Fast normalized cross correlation for defect detection. Pattern Recognition Letters 24 (15), 2625-2631.
Non-Patent Document 2: Wei, S.-D., Lai, S.-H., 2007. Efficient normalized cross correlation based on adaptive multilevel successive elimination. In: Proceedings of the 8th Asian Conference on Computer Vision - Volume Part I. ACCV'07. Springer-Verlag, Berlin, Heidelberg, pp. 638-646.

With the technique disclosed in Non-Patent Document 2, image collation can be performed at high speed. However, the upper limit value of the NCC disclosed in Non-Patent Document 2 may be far larger than the NCC value. In that case, two images may be judged to be highly similar even though their NCC value is low.
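How loose a block-wise Cauchy-Schwarz bound can be is easy to check numerically. The following is a minimal sketch, assuming the bound has the form "sum over blocks of sqrt(XX_i) * sqrt(YY_i)" with XX_i and YY_i the per-block sums of squared normalized pixel values. The function names, block size, and random test images are illustrative and do not come from the source.

```python
import numpy as np


def normalize(img):
    """Zero-mean, unit-norm pixel values, so that NCC is a plain dot product."""
    d = img.astype(np.float64) - img.mean()
    return d / np.sqrt((d ** 2).sum())


def to_blocks(a, bh, bw):
    """Reshape an (H, W) array into (n_blocks, bh * bw); H, W must be multiples of bh, bw."""
    h, w = a.shape
    return a.reshape(h // bh, bh, w // bw, bw).swapaxes(1, 2).reshape(-1, bh * bw)


def ncc(img1, img2):
    """Exact normalized cross-correlation of two equally sized images."""
    return float((normalize(img1) * normalize(img2)).sum())


def ub_msea(img1, img2, bh, bw):
    """Assumed block-wise Cauchy-Schwarz bound: sum_i sqrt(XX_i) * sqrt(YY_i)."""
    xx = (to_blocks(normalize(img1), bh, bw) ** 2).sum(axis=1)
    yy = (to_blocks(normalize(img2), bh, bw) ** 2).sum(axis=1)
    return float(np.sqrt(xx * yy).sum())


rng = np.random.default_rng(0)
a = rng.random((64, 64))
b = rng.random((64, 64))  # generated independently of a
print(ncc(a, b), ub_msea(a, b, 16, 16))
```

For two independent noise images such as these, the NCC is close to zero while this bound stays close to 1, which illustrates the kind of overestimate described above.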

An object of the present invention is therefore to provide an image collation device and an image collation method capable of performing image collation at high speed and determining the similarity of images with high accuracy, as well as a computer program that causes a computer to execute such an image collation method.

An image collation device according to the present invention includes: an image dividing unit that divides a first image into a plurality of blocks and divides a second image into the same number of blocks as the first image; a feature vector calculation unit that, for each of the first image and the second image, calculates a feature vector of the first image and a feature vector of the second image on the basis of the sum of the normalized pixel values of the pixels included in each block and the sum of squares of the normalized pixel values of the pixels included in each block; an upper limit value calculation unit that calculates an upper limit value of the normalized cross-correlation between the first image and the second image from the calculated feature vectors, using a formula for the upper limit value of the normalized cross-correlation of two images that is derived, by Lagrange's method of undetermined multipliers, from the formula for the normalized cross-correlation value of two images; and a first collation unit that collates the first image and the second image on the basis of whether or not the upper limit value is equal to or greater than a first threshold.

Further, in the image collation device according to the present invention, it is preferable that the feature vector calculation unit calculates the following vector ξ_x as the feature vector of the first image, where m is the number of pixels in a block, n is the number of blocks, φ_{i,j} is the pixel value of the j-th pixel in the i-th block of the first image, and φ_ave is the average of the pixel values of all pixels of the first image,

calculates the following vector ξ_y as the feature vector of the second image, where ψ_{i,j} is the pixel value of the j-th pixel in the i-th block of the second image and ψ_ave is the average of the pixel values of all pixels of the second image,

and that the upper limit value calculation unit calculates the inner product of the feature vector of the first image and the feature vector of the second image as the upper limit value.
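The vectors ξ_x and ξ_y referred to above are given as equation images that are not reproduced in this text. One form consistent with the rest of the description (2n elements per vector, built from per-block sums and sums of squares of normalized pixel values, with an inner product that serves as the upper limit value) is sketched below. The helper symbols x_{i,j}, X_i, and XX_i and the ordering of the elements are assumptions.

```latex
\begin{align}
x_{i,j} &= \frac{\phi_{i,j} - \phi_{ave}}
               {\sqrt{\sum_{i'=1}^{n}\sum_{j'=1}^{m} (\phi_{i',j'} - \phi_{ave})^2}},
\qquad
X_i = \sum_{j=1}^{m} x_{i,j},
\qquad
XX_i = \sum_{j=1}^{m} x_{i,j}^2 \\
\xi_x &= \left( \frac{X_1}{\sqrt{m}}, \ldots, \frac{X_n}{\sqrt{m}},\;
\sqrt{XX_1 - \frac{X_1^2}{m}}, \ldots, \sqrt{XX_n - \frac{X_n^2}{m}} \right)
\end{align}
```

Under the same assumption, ξ_y is formed in the same way from ψ_{i,j} and ψ_ave.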

Further, the image collation device according to the present invention preferably further includes: a normalized cross-correlation value calculation unit that calculates the normalized cross-correlation value between a first image and a second image whose upper limit value is equal to or greater than the first threshold; and a second collation unit that collates the first image and the second image in detail on the basis of whether or not the normalized cross-correlation value is equal to or greater than a second threshold, the second threshold being equal to or less than the first threshold.
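The two-stage collation described above can be sketched as follows. This is a minimal illustration, not the device's implementation: the function name `two_stage_match` and its parameters are hypothetical, and the exact NCC is passed in as a callable so that it is only evaluated for pairs that survive the first stage.

```python
from typing import Callable


def two_stage_match(upper_bound: float,
                    exact_ncc: Callable[[], float],
                    t1: float,
                    t2: float) -> bool:
    """Return True if an image pair passes both collation stages.

    upper_bound: precomputed upper limit value of the NCC for the pair.
    exact_ncc:   callable that computes the true NCC; called only when needed.
    t1, t2:      first and second thresholds, with t2 <= t1 as in the text.
    """
    if t2 > t1:
        raise ValueError("second threshold must not exceed the first threshold")
    if upper_bound < t1:       # first collation: cheap rejection
        return False
    return exact_ncc() >= t2   # second collation: detailed check
```

Because the costly NCC computation is skipped whenever the upper limit value is already below the first threshold, most dissimilar pairs are rejected with only a comparison.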

An image collation method according to the present invention includes the steps of: dividing a first image into a plurality of blocks; calculating a feature vector of the first image on the basis of the sum of the normalized pixel values of the pixels included in each block of the first image and the sum of squares of those normalized pixel values; dividing a second image into the same number of blocks as the first image; calculating a feature vector of the second image on the basis of the sum of the normalized pixel values of the pixels included in each block of the second image and the sum of squares of those normalized pixel values; calculating an upper limit value of the normalized cross-correlation between the first image and the second image from the calculated feature vectors, using a formula for the upper limit value of the normalized cross-correlation of two images that is derived, by Lagrange's method of undetermined multipliers, from the formula for the normalized cross-correlation value of two images; and collating the first image and the second image on the basis of whether or not the upper limit value is equal to or greater than a first threshold.

Further, in the image collation method according to the present invention, it is preferable that, in the step of calculating the feature vector of the first image, the following vector ξ_x is calculated as the feature vector of the first image, where m is the number of pixels in a block, n is the number of blocks, φ_{i,j} is the pixel value of the j-th pixel in the i-th block of the first image, and φ_ave is the average of the pixel values of all pixels of the first image,

in the step of calculating the feature vector of the second image, the following vector ξ_y is calculated as the feature vector of the second image, where m is the number of pixels in a block, n is the number of blocks, ψ_{i,j} is the pixel value of the j-th pixel in the i-th block of the second image, and ψ_ave is the average of the pixel values of all pixels of the second image,

and, in the step of calculating the upper limit value, the inner product of the feature vector of the first image and the feature vector of the second image is calculated as the upper limit value.

Further, the image collation method according to the present invention preferably further includes the steps of: calculating the normalized cross-correlation value between a first image and a second image whose upper limit value is equal to or greater than the first threshold; and collating the first image and the second image in detail on the basis of whether or not the normalized cross-correlation value is equal to or greater than a second threshold, the second threshold being equal to or less than the first threshold.

A computer program according to the present invention causes a computer to execute the steps of: dividing a first image into a plurality of blocks; calculating a feature vector of the first image on the basis of the sum of the normalized pixel values of the pixels included in each block of the first image and the sum of squares of those normalized pixel values; dividing a second image into the same number of blocks as the first image; calculating a feature vector of the second image on the basis of the sum of the normalized pixel values of the pixels included in each block of the second image and the sum of squares of those normalized pixel values; calculating an upper limit value of the normalized cross-correlation between the first image and the second image from the calculated feature vectors, using a formula for the upper limit value of the normalized cross-correlation of two images that is derived, by Lagrange's method of undetermined multipliers, from the formula for the normalized cross-correlation value of two images; and collating the first image and the second image on the basis of whether or not the upper limit value is equal to or greater than a first threshold.

According to the present invention, it is possible to provide an image collation device and an image collation method capable of performing image collation at high speed and determining the similarity of images with high accuracy, as well as a computer program that causes a computer to execute such an image collation method.

A schematic configuration diagram of an image collation device to which the present invention is applied.
A flowchart showing the operation of the image collation process.
An example of an image divided into blocks.
A schematic configuration diagram showing another example of the control unit.
A flowchart showing the operation of the detailed image collation process.
A schematic configuration diagram showing another example of the control unit.
A flowchart showing the operation of the image acquisition process.
A flowchart showing the operation of the image collation process.
A graph showing the precision for each number of feature dimensions.
A schematic diagram for explaining low-dimensional features using DCT.
(a) and (b) are graphs of the relationship between recall and precision.
A schematic diagram for explaining low-dimensional features using NIH.
(a) and (b) are graphs of the relationship between recall and precision.
(a) to (d) are graphs showing the precision for each number of feature dimensions.
(a) to (d) are graphs showing the precision for each number of feature dimensions.
A table showing the collation processing time for each number of feature dimensions.
A schematic configuration diagram showing another example of the image collation device.
A flowchart showing the operation of the image encoding process.
A schematic diagram for explaining an integral image.

An image collation device, an image collation method, and a computer program according to the present invention are described below with reference to the drawings. Note, however, that the technical scope of the present invention is not limited to these embodiments and extends to the invention described in the claims and its equivalents.

FIG. 1 is a diagram showing a schematic configuration of an image collation device to which the present invention is applied. As shown in FIG. 1, the image collation device 1 includes an interface unit 11, a storage unit 12, a display unit 13, and a control unit 20. Each part of the image collation device 1 is described in detail below.

The interface unit 11 is a communication interface that transmits and receives image data and various other data to and from other computers and the like via a network such as the Internet, a telephone network (including mobile and fixed-line telephone networks), or an intranet, and has a communication interface circuit for the network to which it connects. The interface unit 11 may also have an interface circuit conforming to a serial bus such as USB, so that a flash memory or the like can be connected and image data and various other data can be acquired from it. The interface unit 11 is connected to the control unit 20 and is controlled by the control unit 20.

The storage unit 12 includes a memory device such as a RAM or a ROM, a fixed disk device such as a hard disk, or a portable storage device such as a flexible disk or an optical disk. The storage unit 12 stores computer programs, databases, tables, and the like used in the various processes of the image collation device 1. The storage unit 12 is connected to the control unit 20, and stores the image data acquired via the interface unit 11 as well as the results of the computations performed on the image data by the control unit 20.

The control unit 20 calculates feature vectors for a plurality of images and determines whether the images are similar. To this end, the control unit 20 includes a first image acquisition unit 21, a first image dividing unit 22, a first feature vector calculation unit 23, a second image acquisition unit 24, a second image dividing unit 25, a second feature vector calculation unit 26, an upper limit value calculation unit 27, and a first collation unit 28. The control unit 20 is connected to the interface unit 11, the storage unit 12, and the display unit 13, and performs data transmission and reception control of the interface unit 11, control of the storage unit 12, display control of the display unit 13, and the like. The control unit 20 operates on the basis of a program stored in advance in the storage unit 12. Alternatively, the control unit 20 may be implemented with an integrated circuit, a microprocessor, firmware, or the like.

FIG. 2 is a flowchart showing the operation of the image collation process performed by the image collation device 1. The operation of the image collation process is described below with reference to the flowchart shown in FIG. 2. The operation flow described below is executed mainly by the control unit 20, in cooperation with the other elements of the image collation device 1, on the basis of a program stored in advance in the storage unit 12.

First, the first image acquisition unit 21 acquires an image from an external computer, a flash memory, or the like via the interface unit 11 (the image acquired by the first image acquisition unit 21 is hereinafter referred to as the first image) and stores it in the storage unit 12 (step S201).

Next, the first image dividing unit 22 reads the first image stored in the storage unit 12 and divides it into a plurality of blocks of a predetermined size (step S202). For example, the first image dividing unit 22 divides an image of 352 × 240 pixels into 256 blocks of 22 × 15 pixels each.
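The block division in step S202 can be sketched with NumPy as follows. This is an illustrative sketch only: the function name `split_into_blocks`, the raster ordering of the blocks, and the requirement that the image size be an exact multiple of the block size are assumptions, not details taken from the source.

```python
import numpy as np


def split_into_blocks(image: np.ndarray, block_h: int, block_w: int) -> np.ndarray:
    """Split an (H, W) image into non-overlapping blocks.

    Returns an array of shape (n, m) with n = (H // block_h) * (W // block_w)
    blocks in raster order and m = block_h * block_w pixels per block.
    """
    h, w = image.shape
    if h % block_h or w % block_w:
        raise ValueError("image size must be a multiple of the block size")
    return (image.reshape(h // block_h, block_h, w // block_w, block_w)
                 .swapaxes(1, 2)
                 .reshape(-1, block_h * block_w))


# 352 x 240 pixels (width x height) -> 16 x 16 = 256 blocks of 22 x 15 pixels
image = np.arange(240 * 352, dtype=np.float64).reshape(240, 352)
blocks = split_into_blocks(image, block_h=15, block_w=22)
print(blocks.shape)  # (256, 330)
```

Here n = 256 and m = 22 × 15 = 330, matching the example in the text.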

FIG. 3 is a schematic diagram showing an example of an image divided into blocks. The example in FIG. 3 shows an image 310 obtained by dividing an image 300 into n blocks of m pixels each.

Next, for the first image, the first feature vector calculation unit 23 calculates a feature vector based on the sum of the normalized pixel values of the pixels included in each divided block and the sum of squares of those normalized pixel values, and stores the feature vector in the storage unit 12 in association with the first image (step S203).

The NCC value of an image x and an image y, each divided into n blocks of m pixels, is defined by Expression (11) using the block number i (1 ≦ i ≦ n) and the pixel number j within a block (1 ≦ j ≦ m).

Here, φ_{i,j} and ψ_{i,j} are the pixel values of the respective pixels, φ_ave is the average of the pixel values of all pixels of the image x, and ψ_ave is the average of the pixel values of all pixels of the image y.

The normalized pixel values x_{i,j} and y_{i,j} of the pixels of the image x and the image y are defined by Expression (12).

Strictly speaking, the normalized pixel value should be given by Expression (13), but in the present embodiment it is defined by Expression (12) for simplicity.

Using the normalized pixel values x_{i,j} and y_{i,j} of Expression (12), the NCC value of Expression (11) is expressed by Expression (14).
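Expressions (11), (12), and (14) are equation images that are not reproduced in this text. From the definitions given above, they presumably have the following form. This is a reconstruction under that assumption; the primed summation indices are placeholders, and Expression (13) is not reconstructed here because the text does not determine it.

```latex
\begin{align}
\mathrm{NCC}(x, y) &=
\frac{\sum_{i=1}^{n}\sum_{j=1}^{m} (\phi_{i,j} - \phi_{ave})(\psi_{i,j} - \psi_{ave})}
     {\sqrt{\sum_{i=1}^{n}\sum_{j=1}^{m} (\phi_{i,j} - \phi_{ave})^2 \;
            \sum_{i=1}^{n}\sum_{j=1}^{m} (\psi_{i,j} - \psi_{ave})^2}} \tag{11} \\
x_{i,j} &= \frac{\phi_{i,j} - \phi_{ave}}
               {\sqrt{\sum_{i'=1}^{n}\sum_{j'=1}^{m} (\phi_{i',j'} - \phi_{ave})^2}},
\qquad
y_{i,j} = \frac{\psi_{i,j} - \psi_{ave}}
               {\sqrt{\sum_{i'=1}^{n}\sum_{j'=1}^{m} (\psi_{i',j'} - \psi_{ave})^2}} \tag{12} \\
\mathrm{NCC}(x, y) &= \sum_{i=1}^{n}\sum_{j=1}^{m} x_{i,j}\, y_{i,j} \tag{14}
\end{align}
```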

Further, the sum and the sum of squares of the normalized pixel values x_{i,j} in the i-th block of the image x, and the sum and the sum of squares of the normalized pixel values y_{i,j} in the i-th block of the image y, are defined by Expressions (15) to (18).

Here, by the nature of the normalized pixel values, Expressions (19) and (20) hold.
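The per-block quantities can be computed directly. The sketch below assumes that Expressions (15) to (18) define X_i = Σ_j x_{i,j}, XX_i = Σ_j x_{i,j}², and the analogous Y_i and YY_i, and that Expressions (19) and (20) state that the block sums add up to 0 and the block sums of squares add up to 1 over all blocks. These readings, and all function names, are assumptions, since the equation images are not reproduced in this text.

```python
import numpy as np


def normalized_blocks(image, block_h, block_w):
    """Normalized pixel values x_{i,j}, arranged as an (n, m) array of blocks."""
    d = image.astype(np.float64) - image.mean()
    x = d / np.sqrt((d ** 2).sum())  # zero mean, unit sum of squares
    h, w = x.shape
    return (x.reshape(h // block_h, block_h, w // block_w, block_w)
             .swapaxes(1, 2)
             .reshape(-1, block_h * block_w))


def block_sums(image, block_h, block_w):
    """Per-block sum X_i and sum of squares XX_i of the normalized pixel values."""
    x = normalized_blocks(image, block_h, block_w)
    return x.sum(axis=1), (x ** 2).sum(axis=1)


rng = np.random.default_rng(0)
X, XX = block_sums(rng.random((240, 352)), 15, 22)
print(X.shape, XX.shape)  # (256,) (256,)
print(np.isclose(X.sum(), 0.0, atol=1e-9), np.isclose(XX.sum(), 1.0))
```

Both checks on the last line follow from the normalization itself: subtracting the global mean makes the total sum zero, and dividing by the global norm makes the total sum of squares one.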

Next, Lagrange's method of undetermined multipliers is used to derive, from Expression (14), an expression for a strict upper limit value of the NCC of two images. In the present embodiment, the extremum problem of the function of Expression (21) is considered.

Here, λ_{1,i}, λ_{2,i}, λ_{3,i}, and λ_{4,i} are 4n Lagrange undetermined multipliers. Setting the differential dΛ = 0 in Expression (21) (that is, ∂Λ/∂x_{i,j} = 0 and ∂Λ/∂y_{i,j} = 0 for all i and j) yields the 2mn equations of Expressions (22) and (23).

Expression (24) is obtained from Expressions (22) and (23).
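Expressions (21) to (23) are equation images that are not reproduced in this text. A Lagrangian consistent with the description (the NCC of Expression (14) as the objective, the four per-block sums and sums of squares as constraints, hence 4n multipliers) is sketched below. The sign convention and the arrangement of terms are assumptions.

```latex
\begin{align}
\Lambda &= \sum_{i=1}^{n}\sum_{j=1}^{m} x_{i,j}\, y_{i,j}
 - \sum_{i=1}^{n} \lambda_{1,i} \Bigl( \sum_{j=1}^{m} x_{i,j} - X_i \Bigr)
 - \sum_{i=1}^{n} \lambda_{2,i} \Bigl( \sum_{j=1}^{m} x_{i,j}^2 - XX_i \Bigr) \notag \\
 &\quad - \sum_{i=1}^{n} \lambda_{3,i} \Bigl( \sum_{j=1}^{m} y_{i,j} - Y_i \Bigr)
 - \sum_{i=1}^{n} \lambda_{4,i} \Bigl( \sum_{j=1}^{m} y_{i,j}^2 - YY_i \Bigr) \tag{21} \\
\frac{\partial \Lambda}{\partial x_{i,j}} &= y_{i,j} - \lambda_{1,i} - 2\lambda_{2,i}\, x_{i,j} = 0 \tag{22} \\
\frac{\partial \Lambda}{\partial y_{i,j}} &= x_{i,j} - \lambda_{3,i} - 2\lambda_{4,i}\, y_{i,j} = 0 \tag{23}
\end{align}
```

With i running over n blocks and j over m pixels, (22) and (23) each stand for mn equations, which matches the count of 2mn given in the text.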

Assuming that Expression (24) holds, x_{i,j} and y_{i,j} are given by Expression (25).

In this case, the right side of Expression (25) is determined by i (the block number) alone, so x_{i,j} and y_{i,j} would take the same value for every j (every pixel in the block), which is too strong a condition. Assuming instead that Expression (24) does not hold, Expression (26) is obtained from the determinant of the matrix on the left side of Expression (24).

Further, Expression (27) is obtained from Expressions (22), (23), and (26).

In this case, Expression (22) and Expression (23) express the same condition. Therefore, in the present embodiment, the expression for the upper limit value of the NCC is derived using Expression (22) alone.
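Expressions (24) to (27) are not reproduced in this text. One reading of the passage above, offered strictly as an assumption, is that the stationarity conditions form a 2 × 2 linear system for each pixel: if its coefficient matrix is invertible, x_{i,j} and y_{i,j} depend on i only; otherwise its determinant vanishes. Assuming stationarity conditions of the form y_{i,j} = λ_{1,i} + 2λ_{2,i} x_{i,j} and x_{i,j} = λ_{3,i} + 2λ_{4,i} y_{i,j} for (22) and (23), this reads:

```latex
\begin{align}
\begin{pmatrix} 2\lambda_{2,i} & -1 \\ -1 & 2\lambda_{4,i} \end{pmatrix}
\begin{pmatrix} x_{i,j} \\ y_{i,j} \end{pmatrix}
&= \begin{pmatrix} -\lambda_{1,i} \\ -\lambda_{3,i} \end{pmatrix} \tag{24} \\
\begin{pmatrix} x_{i,j} \\ y_{i,j} \end{pmatrix}
&= \begin{pmatrix} 2\lambda_{2,i} & -1 \\ -1 & 2\lambda_{4,i} \end{pmatrix}^{-1}
\begin{pmatrix} -\lambda_{1,i} \\ -\lambda_{3,i} \end{pmatrix} \tag{25} \\
4\lambda_{2,i}\lambda_{4,i} - 1 &= 0 \tag{26} \\
\lambda_{3,i} &= -2\lambda_{4,i}\,\lambda_{1,i} \tag{27}
\end{align}
```

Under (26) and (27), multiplying the first condition by 2λ_{4,i} reproduces the second, which is consistent with the statement that the two conditions become equivalent.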

First, Expression (28) is obtained from Expression (22).

Substituting Expression (28) into Expression (17) yields Expression (29).

Similarly, substituting Expression (28) into Expression (18) yields Expression (30).

Expression (31) is obtained from Expression (29).

Substituting Expression (31) into Expression (30) yields Expression (32).

Substituting Expressions (28), (31), and (32) into Expression (14) yields Expression (33).
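Expressions (28) to (33) are not reproduced in this text, but the chain of substitutions can be worked through under two assumptions: that Expression (28) has the form y_{i,j} = λ_{1,i} + 2λ_{2,i} x_{i,j}, and that Expressions (17) and (18) define Y_i = Σ_j y_{i,j} and YY_i = Σ_j y_{i,j}². The following is a sketch under those assumptions, not the original notation.

```latex
\begin{align}
y_{i,j} &= \lambda_{1,i} + 2\lambda_{2,i}\, x_{i,j} \tag{28} \\
Y_i &= m\,\lambda_{1,i} + 2\lambda_{2,i}\, X_i \tag{29} \\
YY_i &= m\,\lambda_{1,i}^2 + 4\lambda_{1,i}\lambda_{2,i}\, X_i + 4\lambda_{2,i}^2\, XX_i \tag{30} \\
\lambda_{1,i} &= \frac{Y_i - 2\lambda_{2,i}\, X_i}{m} \tag{31} \\
2\lambda_{2,i} &= \pm \sqrt{\frac{YY_i - Y_i^2/m}{XX_i - X_i^2/m}} \tag{32} \\
\mathrm{NCC} &= \sum_{i=1}^{n} \bigl( \lambda_{1,i} X_i + 2\lambda_{2,i}\, XX_i \bigr)
 = \sum_{i=1}^{n} \left( \frac{X_i Y_i}{m}
   \pm \sqrt{\Bigl( XX_i - \frac{X_i^2}{m} \Bigr)\Bigl( YY_i - \frac{Y_i^2}{m} \Bigr)} \right) \tag{33}
\end{align}
```

In this sketch, (32) follows because substituting (31) into (30) leaves YY_i − Y_i²/m = 4λ_{2,i}² (XX_i − X_i²/m). Taking the plus sign in every block gives the largest stationary value and the minus sign the smallest.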

That is, the upper limit value UB of the NCC is given by Expression (34), and the lower limit value LB of the NCC is given by Expression (35).

Since the NCC value never exceeds the upper limit value UB, the similarity between the image x and the image y can be evaluated using the upper limit value UB in place of the NCC value.
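The relationship LB ≤ NCC ≤ UB can be checked numerically. The sketch below assumes that Expressions (34) and (35), which are not reproduced in this text, take the form Σ_i (X_i Y_i / m ± sqrt((XX_i − X_i²/m)(YY_i − Y_i²/m))). The function names, block size, and test images are illustrative.

```python
import numpy as np


def block_stats(image, bh, bw):
    """Return (x, X, XX): normalized pixels per block, their sums, sums of squares."""
    d = image.astype(np.float64) - image.mean()
    x = d / np.sqrt((d ** 2).sum())
    h, w = x.shape
    x = x.reshape(h // bh, bh, w // bw, bw).swapaxes(1, 2).reshape(-1, bh * bw)
    return x, x.sum(axis=1), (x ** 2).sum(axis=1)


def ncc_with_bounds(img1, img2, bh, bw):
    """Exact NCC together with the assumed lower and upper limit values."""
    m = bh * bw
    x, X, XX = block_stats(img1, bh, bw)
    y, Y, YY = block_stats(img2, bh, bw)
    mean_term = float((X * Y / m).sum())
    # Clip at 0 to guard against tiny negative values caused by rounding.
    spread = float(np.sqrt(np.clip(XX - X ** 2 / m, 0, None) *
                           np.clip(YY - Y ** 2 / m, 0, None)).sum())
    return mean_term - spread, float((x * y).sum()), mean_term + spread


rng = np.random.default_rng(1)
base = rng.random((64, 64))
noisy = base + 0.3 * rng.random((64, 64))
lb, ncc, ub = ncc_with_bounds(base, noisy, 8, 8)
print(lb <= ncc <= ub)
```

For an image collated with itself, the assumed upper limit value evaluates to 1, the same as the NCC.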

On the other hand, as shown in Expression (36), the upper limit value UB of Expression (34) is the inner product of the feature vectors ξ_x and ξ_y given by Expressions (37) and (38).

As shown in Expressions (37) and (38), the feature vectors ξ_x and ξ_y can be calculated from image x alone and from image y alone, respectively. In other words, by converting image x and image y in advance into feature vectors ξ_x and ξ_y, each having 2n elements, the image collation apparatus 1 can obtain the upper limit value UB simply by calculating their inner product. This makes collation particularly fast when one image is collated against a plurality of images. Accordingly, the first feature vector calculation unit 23 calculates the feature vector ξ_x of Expression (37) for the first image as the feature vector of the first image.
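The following is a minimal Python sketch of this idea, not the patented implementation. Expressions (37) and (38) are not reproduced in this text, so the feature layout used here (per block, the sum divided by √m and the square root of the sum of squares minus the squared sum over m) is an assumption chosen to be consistent with the stated properties: 2n elements, computed from the block sums and sums of squares of normalized pixel values. The function names and the flat, consecutive block layout are hypothetical.

```python
import math


def normalize(pixels):
    """Shift a flat pixel list to zero mean and scale it to unit Euclidean norm."""
    mean = sum(pixels) / len(pixels)
    centered = [p - mean for p in pixels]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0.0:
        raise ValueError("a constant image cannot be normalized")
    return [c / norm for c in centered]


def feature_vector(pixels, m):
    """Assumed 2n-element feature: for each block of m pixels, S/sqrt(m) and sqrt(Q - S^2/m)."""
    values = normalize(pixels)
    if len(values) % m != 0:
        raise ValueError("pixel count must be a multiple of the block size m")
    features = []
    for start in range(0, len(values), m):
        block = values[start:start + m]
        s = sum(block)                      # sum of normalized pixel values
        q = sum(v * v for v in block)       # sum of squares of normalized pixel values
        features.append(s / math.sqrt(m))
        features.append(math.sqrt(max(q - s * s / m, 0.0)))  # clamp rounding noise
    return features


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(u * v for u, v in zip(a, b))


def ncc(x, y):
    """Exact NCC of two images given as flat pixel lists."""
    return dot(normalize(x), normalize(y))


# Toy 12-pixel "images" split into n = 3 blocks of m = 4 pixels.
image_x = [10, 12, 11, 13, 50, 52, 49, 51, 90, 91, 89, 92]
image_y = [11, 13, 10, 12, 48, 55, 47, 50, 85, 95, 88, 90]
xi_x = feature_vector(image_x, m=4)
xi_y = feature_vector(image_y, m=4)
upper_bound = dot(xi_x, xi_y)   # never smaller than ncc(image_x, image_y)
```

Because each feature vector depends on a single image, it can be computed once and reused for every comparison, which is the source of the speed-up when one image is collated against many.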

Next, the second image acquisition unit 24 acquires an image from an external computer, a flash memory, or the like via the interface unit 11 (hereinafter, the image acquired by the second image acquisition unit 24 is referred to as the second image) and stores it in the storage unit 12 (step S204).

Next, the second image dividing unit 25 reads the second image stored in the storage unit 12 and divides it into a plurality of blocks equal in size and number to the blocks into which the first image was divided (step S205).

Next, for the second image, the second feature vector calculation unit 26 calculates a feature vector based on the sum of the normalized pixel values of the pixels in each block and the sum of squares of the normalized pixel values of the pixels in each block, that is, the feature vector ξ_y of Expression (38), and stores it in the storage unit 12 in association with the second image (step S206).

Next, the upper limit value calculation unit 27 calculates the inner product of the feature vectors ξ_x and ξ_y as shown in Expression (36) to obtain the upper limit value UB (step S207).

Next, the first collation unit 28 collates the first image and the second image based on whether the upper limit value UB is equal to or greater than a predetermined threshold θ₁. First, the first collation unit 28 determines whether the upper limit value UB is equal to or greater than the threshold θ₁ (step S208). The threshold θ₁ has the same value as the threshold θ₀ (−1 ≤ θ₀ < 1) that is compared with the NCC value when NCC is used to determine whether two images are similar. Alternatively, the threshold θ₁ may be set larger than the threshold θ₀.

If the upper limit value UB is equal to or greater than the threshold θ₁, the first collation unit 28 determines that the first image is similar to the second image (step S209) and ends the series of steps. If the upper limit value UB is less than the threshold θ₁, the first collation unit 28 determines that the first image is not similar to the second image (step S210) and ends the series of steps.

For example, the image collation apparatus 1 displays the image pairs determined to be similar on the display unit 13 so that the user can check them. Alternatively, the image collation apparatus 1 may notify an external computer (not shown) of information on the image pairs determined to be similar via the interface unit 11.

The narrower the range between the upper limit value UB and the lower limit value LB, the more accurately the upper limit value UB approximates the NCC value. From Expressions (34) and (35), the range between the upper limit value UB and the lower limit value LB is given by Expression (39).

On the other hand, Expression (40) holds. Here, letting G_{j,k} = x_{i,j} x_{i,j} − x_{i,j} x_{i,k}, Expression (41) holds.

Therefore, Expression (40) can be written as Expression (42).

Here, E_k is the expected value over k, and Var_i(x_{i,j}) is the variance of x_{i,j} within the i-th block. Similarly, Expression (43) holds, and Expression (39) can be written as Expression (44).

That is, to narrow the range between the upper limit value UB and the lower limit value LB, it is necessary either to reduce the variance of the normalized pixel values within each block or to reduce the number of pixels m in one block. However, reducing m increases the number of blocks n and hence the number of dimensions of the feature vectors ξ_x and ξ_y, which is not practical. It is therefore preferable to form each block so that the differences between the normalized pixel values of pixels in the same block are as small as possible, thereby reducing the variance of the normalized pixel values within each block. In general, it is known that, owing to the Markov property of images, the (normalized) pixel values of neighboring pixels are very close to one another. Therefore, for example, by making each block rectangular, and in particular close to square, the pixel values within each block can be brought closer together and the upper limit value UB can be made to approximate the NCC value more closely.
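The effect of within-block variance on the gap between the two bounds can be illustrated with a small Python sketch. Expressions (34), (35), and (39) are not reproduced in this text, so the bound form used here (a per-block mean term plus or minus a per-block residual term) is an assumption, and the function names are hypothetical. The toy values are unnormalized; the inequality being illustrated holds for any sum of products.

```python
import math


def block_terms(block):
    """Return (S, R) for one block: the sum S and the residual energy R = Q - S^2/m."""
    m = len(block)
    s = sum(block)
    q = sum(v * v for v in block)
    return s, max(q - s * s / m, 0.0)   # R is m times the within-block variance


def bounds(blocks_x, blocks_y):
    """Assumed (LB, UB): sum_i S_x*S_y/m minus/plus sum_i sqrt(R_x*R_y)."""
    mean_part = 0.0
    residual_part = 0.0
    for block_x, block_y in zip(blocks_x, blocks_y):
        m = len(block_x)
        s_x, r_x = block_terms(block_x)
        s_y, r_y = block_terms(block_y)
        mean_part += s_x * s_y / m
        residual_part += math.sqrt(r_x * r_y)
    return mean_part - residual_part, mean_part + residual_part


# The same pixel values grouped two ways.
# Grouping A puts equal values in the same block (zero within-block variance).
lb_a, ub_a = bounds([[1, 1], [5, 5]], [[2, 2], [6, 6]])
# Grouping B mixes unlike values in each block (large within-block variance).
lb_b, ub_b = bounds([[1, 5], [1, 5]], [[2, 6], [2, 6]])

gap_a = ub_a - lb_a   # 0.0: the bounds coincide with the exact sum of products
gap_b = ub_b - lb_b   # 32.0: the same data, but a much looser bracket
```

Both groupings bracket the same exact value (64 for these toy numbers), but only the grouping with uniform blocks pins it down, which matches the preference for blocks whose pixels have similar values.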

Note that the upper limit value UB^MSEA of the NCC calculated by MSEA is expressed by Expression (45), which follows from Expression (5).

For the NCC upper limit value UB calculated in the present embodiment and the NCC upper limit value UB^MSEA calculated by MSEA, comparing the expressions under the square roots in Expression (34) and Expression (45) shows that Expression (46) holds.

Therefore, Expression (47) holds, and the NCC upper limit value UB calculated in the present embodiment is equal to or less than the NCC upper limit value UB^MSEA calculated by MSEA.

That is, it is shown theoretically that the NCC upper limit value UB calculated in the present embodiment approximates the NCC value more accurately than the NCC upper limit value UB^MSEA calculated by MSEA.

As described in detail above, by operating according to the flowchart shown in FIG. 2, the image collation apparatus 1 can determine the similarity of images at high speed and with high accuracy using feature vectors (that is, low-dimensional features) of several tens to about a hundred dimensions, greatly compressed from the original amount of information (several tens of thousands of pixels).

Note that when the two images to be collated differ in size, the image collation apparatus 1 can normalize them to the same size and thereby perform image collation using low-dimensional features on images of different sizes as well.

FIG. 4 is a schematic configuration diagram showing another example of the control unit. The control unit 30 shown in FIG. 4 can be used in place of the control unit 20 shown in FIG. 1. In addition to the units of the control unit 20 shown in FIG. 1, the control unit 30 shown in FIG. 4 includes an NCC value calculation unit 31 and a second collation unit 32.

FIG. 5 is a flowchart showing the operation of the detailed image collation process performed by the image collation apparatus 1 using the control unit 30 shown in FIG. 4. The operation of the detailed image collation process is described below with reference to the flowchart shown in FIG. 5. The operation flow described below is executed mainly by the control unit 30 in cooperation with each element of the image collation apparatus 1, based on a program stored in advance in the storage unit 12. The flowchart shown in FIG. 5 is executed for image pairs determined to be similar by the flowchart shown in FIG. 2.

First, the NCC value calculation unit 31 calculates the NCC value of the first image and the second image determined to be similar, using Expression (11) (step S501).

Next, the second collation unit 32 collates the first image and the second image in detail based on whether the NCC value calculated by the NCC value calculation unit 31 is equal to or greater than a predetermined threshold θ₀ (for example, 0.9). First, the second collation unit 32 determines whether the NCC value is equal to or greater than the threshold θ₀ (step S502). As described above, the threshold θ₀ is a value equal to or smaller than the threshold θ₁.

If the NCC value is equal to or greater than the threshold θ₀, the second collation unit 32 determines that the first image and the second image are similar (step S503) and ends the series of steps. If the NCC value is less than the threshold θ₀, the second collation unit 32 determines that the first image and the second image are not similar (step S504) and ends the series of steps.

For example, the image collation apparatus 1 displays on the display unit 13 the image pairs determined to be similar by the flowchart shown in FIG. 5. Alternatively, the image collation apparatus 1 may notify an external computer (not shown) of information on the image pairs determined to be similar by the flowchart shown in FIG. 5 via the interface unit 11.

Note that, instead of the first image acquisition unit 21 and the second image acquisition unit 24, the control unit 20 may include a single image acquisition unit that can acquire both the first image and the second image. Likewise, instead of the first image dividing unit 22 and the second image dividing unit 25, it may include a single image dividing unit that can divide both the first image and the second image into predetermined blocks. Furthermore, instead of the first feature vector calculation unit 23 and the second feature vector calculation unit 26, it may include a single feature vector calculation unit that can calculate both the feature vector of the first image and the feature vector of the second image.

The processing of the second image in steps S204 to S206 may be performed before the processing of the first image in steps S201 to S203. When parallel processing is possible, for example because the image collation apparatus 1 includes a plurality of CPUs, the processing of the second image in steps S204 to S206 and the processing of the first image in steps S201 to S203 may be performed in parallel.

As described in detail above, by operating according to the flowchart shown in FIG. 5, the image collation apparatus 1 can remove (cleanse), from the image pairs with high similarity in the low-dimensional space (NCC upper limit value UB equal to or greater than the threshold θ₁), those pairs with low similarity in the original space (NCC value less than the threshold θ₀). The image collation apparatus 1 thus first determines similarity for an image pair at low computational cost, and then determines similarity at high cost and with high accuracy only for the pairs judged highly similar in that first determination, so that images can be collated efficiently.
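The two-stage flow (FIG. 2 followed by FIG. 5) can be sketched in Python as below. This is an illustration under stated assumptions, not the apparatus itself: the function name, the argument layout, and the returned stage labels are hypothetical, the feature vectors and normalized images are assumed to have been computed beforehand, and the default thresholds of 0.9 follow the example value given in the text.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(u * v for u, v in zip(a, b))


def two_stage_match(xi_x, xi_y, norm_x, norm_y, theta1=0.9, theta0=0.9):
    """Return (is_similar, stage) for one image pair.

    xi_x, xi_y     : low-dimensional feature vectors (cheap to compare)
    norm_x, norm_y : full normalized images as flat lists (expensive to compare)
    """
    # Stage 1 (steps S207 to S210): compare the upper limit value UB with theta1.
    upper_bound = dot(xi_x, xi_y)
    if upper_bound < theta1:
        return False, "rejected by upper bound"
    # Stage 2 (steps S501 to S504): compute the exact NCC only for surviving pairs.
    ncc_value = dot(norm_x, norm_y)
    return ncc_value >= theta0, "decided by NCC"


# Toy inputs: unit-norm vectors chosen by hand for illustration.
rejected = two_stage_match([1.0, 0.0], [0.0, 1.0], [0.6, 0.8], [0.8, 0.6])
accepted = two_stage_match([1.0, 0.0], [1.0, 0.0], [0.6, 0.8], [0.8, 0.6])
cleansed = two_stage_match([1.0, 0.0], [1.0, 0.0], [0.6, 0.8], [0.8, -0.6])
```

The expensive inner product over all pixels is evaluated only when the cheap comparison of feature vectors has not already ruled the pair out.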

FIG. 6 is a schematic configuration diagram of the control unit when the image collation apparatus is applied to an image search server. The control unit 40 shown in FIG. 6 can be used in place of the control unit 20 shown in FIG. 1 or the control unit 30 shown in FIG. 4. In addition to the units of the control unit 30 shown in FIG. 4, the control unit 40 shown in FIG. 6 includes an index determination unit 41. In the following description, the first image is a collation source image stored in advance in the storage unit 12, and the second image is a query image for which a search has been requested by an external computer (not shown) or the like.

FIG. 7 is a flowchart showing the operation of the image acquisition process performed by the image collation apparatus 1 using the control unit 40 shown in FIG. 6. The operation of the image acquisition process is described below with reference to the flowchart shown in FIG. 7. The operation flow described below is executed mainly by the control unit 40 in cooperation with each element of the image collation apparatus 1, based on a program stored in advance in the storage unit 12.

In the flowchart shown in FIG. 7, the image collation apparatus 1 only acquires the first image; the collation of the first image with the second image is described with reference to a later flowchart. The processing in steps S701 to S703 shown in FIG. 7 is the same as the processing in steps S201 to S203 shown in FIG. 2, so its description is omitted.

After calculating the feature vector of the first image, the first feature vector calculation unit 23 creates a multidimensional index based on the calculated feature vector (step S704).

As shown in Expressions (48) and (49), the norms of the feature vectors ξ_x and ξ_y obtained by Expressions (37) and (38) are 1.

That is, as shown in Expression (50), the inner product of the feature vectors ξ_x and ξ_y can be converted into the Euclidean distance between ξ_x and ξ_y.

That is, when the Euclidean distance between the feature vectors ξ_x and ξ_y satisfies the condition of Expression (51), the inner product of ξ_x and ξ_y is equal to or greater than the threshold θ₁, and the NCC upper limit value UB is equal to or greater than the threshold θ₁.

Therefore, instead of calculating the inner product of the feature vectors ξ_x and ξ_y, the image collation apparatus 1 can extract the image pairs that satisfy the condition of Expression (52) by performing a distance search over the feature vectors ξ_x and ξ_y in Euclidean space.
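The equivalence between the inner-product test and a distance test can be checked with a short Python sketch. It relies only on a standard identity for unit-norm vectors: the squared Euclidean distance equals 2 minus twice the inner product. Expressions (50) to (52) are not reproduced in this text, so whether they are written in terms of the distance or the squared distance is not visible here; the sketch uses the squared distance, and the function names are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(u * v for u, v in zip(a, b))


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def passes_by_inner_product(xi_x, xi_y, theta1):
    """Screening test expressed with the inner product (the upper limit value)."""
    return dot(xi_x, xi_y) >= theta1


def passes_by_distance(xi_x, xi_y, theta1):
    """The same test expressed as a radius search, valid for unit-norm vectors."""
    return squared_distance(xi_x, xi_y) <= 2.0 - 2.0 * theta1


# Hand-picked unit vectors for illustration.
query = [0.6, 0.8]
near = [0.8, 0.6]    # inner product with query: 0.96
far = [-0.8, 0.6]    # inner product with query: 0.0

near_ok = (passes_by_inner_product(query, near, 0.9),
           passes_by_distance(query, near, 0.9))
far_ok = (passes_by_inner_product(query, far, 0.9),
          passes_by_distance(query, far, 0.9))
```

Expressing the test as a radius search is what allows standard Euclidean index structures to be used for the screening step.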

In this case, the image collation apparatus 1 can apply indexing techniques for multidimensional data that support distance search based on Euclidean distance (Bohm, C., Berchtold, S., Keim, D. A., September 2001. Searching in high-dimensional spaces: Index structures for improving the performance of multimedia databases. ACM Computing Surveys 33 (3), 322-373.), which further speeds up image collation. As the multidimensional indexing technique, the image collation apparatus 1 can use techniques based on the k-nearest neighbor method (k-NN search) in high-dimensional space, for example ANN (Approximate Nearest Neighbors), which uses a tree structure (Arya, S., Mount, D. M., Netanyahu, N. S., Silverman, R., Wu, A. Y., November 1998. An optimal algorithm for approximate nearest neighbor searching fixed dimensions. J. ACM 45, 891-923.), LSH (Locality Sensitive Hashing), which uses hashing (Andoni, A., Indyk, P., 2008. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. Commun. ACM 51 (1), 117-122.), hierarchical vector quantization (Nister, D., Stewenius, H., 2006. Scalable recognition with a vocabulary tree. In: Proc. of CVPR. Vol. 2. pp. 2161-2168.), and vector quantization with scalar quantization (Hamming embedding) (Douze, M., Jégou, H., Sandhawalia, H., Amsaleg, L., Schmid, C., 2009. Evaluation of gist descriptors for web-scale image search. In: Proceeding of the ACM International Conference on Image and Video Retrieval. CIVR '09. ACM, New York, NY, USA, pp. 19:1-19:8.).

For example, when LSH is used as the multidimensional indexing technique, the first feature vector calculation unit 23 calculates a plurality of hash values for the feature vector using a plurality of predetermined hash functions and uses them as the multidimensional index.

Alternatively, when ANN is used as the multidimensional indexing technique, the image collation apparatus 1 divides the feature space of the feature vectors into predetermined regions in advance. The first feature vector calculation unit 23 then uses the region of the feature space to which the feature vector belongs as the multidimensional index.

After creating the multidimensional index, the first feature vector calculation unit 23 stores it in the storage unit 12 in association with the first image (step S705) and ends the series of steps.

FIG. 8 is a flowchart showing the operation of the image collation process when the image collation apparatus 1 operates as an image search server. The operation of the image collation process is described below with reference to the flowchart shown in FIG. 8. The operation flow described below is executed mainly by the control unit 40 in cooperation with each element of the image collation apparatus 1, based on a program stored in advance in the storage unit 12.

Note that the processing in steps S801 to S803 shown in FIG. 8 is the same as the processing in steps S204 to S206 shown in FIG. 2, so its description is omitted. However, when LSH is used as the multidimensional indexing technique, after calculating the feature vector of the second image in step S803, the second feature vector calculation unit 26 calculates a plurality of hash values for that feature vector using the same plurality of hash functions that were used to calculate the hash values for the feature vectors of the first images.

The processing in steps S804 to S808 is performed for each of the first images stored in the storage unit 12.

In step S804, once the second feature vector calculation unit 26 has calculated the feature vector of the second image, the index determination unit 41 reads the feature vector and the multidimensional index of a first image from the storage unit 12 (step S804).

Next, the index determination unit 41 determines whether the feature vector of the first image and the feature vector of the second image satisfy the condition based on the multidimensional index (step S805).

For example, when LSH is used as the multidimensional indexing technique, the index determination unit 41 determines whether the condition based on the multidimensional index is satisfied according to whether at least a predetermined number (for example, half of all the hash values) of the hash values of the feature vector of the first image read from the storage unit 12 match the corresponding hash values of the feature vector of the second image.
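The LSH-based index and its matching rule can be sketched as follows. The text only says that a plurality of predetermined hash functions is used; the choice of sign-of-random-projection hashes here is an assumption made for illustration, as are the function names, the seed, and the number of hash functions.

```python
import random


def make_hash_functions(dim, count, seed=0):
    """Create `count` fixed random hyperplanes in `dim` dimensions (assumed hash family)."""
    rng = random.Random(seed)   # fixed seed so index and query use identical functions
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


def hash_values(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side, else 0."""
    return [1 if sum(w * v for w, v in zip(plane, vector)) >= 0.0 else 0
            for plane in hyperplanes]


def index_condition(hashes_a, hashes_b, min_matches):
    """True when at least `min_matches` corresponding hash values agree."""
    matches = sum(1 for a, b in zip(hashes_a, hashes_b) if a == b)
    return matches >= min_matches


# Eight hash functions over 4-dimensional feature vectors; require half to match.
planes = make_hash_functions(dim=4, count=8, seed=1)
stored = hash_values([0.5, 0.5, 0.5, 0.5], planes)       # index of a first image
same_query = hash_values([0.5, 0.5, 0.5, 0.5], planes)   # identical query vector
opposite_query = hash_values([-0.5, -0.5, -0.5, -0.5], planes)

same_passes = index_condition(stored, same_query, min_matches=4)
opposite_passes = index_condition(stored, opposite_query, min_matches=4)
```

Vectors pointing in similar directions tend to share most bits under this hash family, so counting matching hash values acts as a cheap pre-filter before any distance is computed.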

Alternatively, when ANN is used as the multidimensional indexing technique, the index determination unit 41 determines whether the condition based on the multidimensional index is satisfied according to whether the region of the feature space to which the feature vector of the first image belongs includes a portion whose Euclidean distance from the feature vector of the second image is (2 − 2θ₁) or less.

When the index determination unit 41 determines that the condition based on the multidimensional index is satisfied, the first collation unit 28 determines whether the Euclidean distance between the feature vector of the first image and the feature vector of the second image satisfies the condition of Expression (51) (step S806).

If the condition of Expression (51) is satisfied, the first collation unit 28 extracts that first image as a candidate for an image similar to the second image (hereinafter referred to as a similar image candidate) (step S807).

On the other hand, when the index determination unit 41 determines in step S805 that the condition based on the multidimensional index is not satisfied, when the first collation unit 28 determines in step S806 that the condition of Expression (51) is not satisfied, or when the first collation unit 28 has extracted a similar image candidate in step S807, the control unit 40 determines whether the second image has been compared with all the first images (step S808).

If there is a first image that has not yet been compared with the second image, the control unit 40 repeats the processing of steps S804 to S807. When the second image has been compared with all the first images, the control unit 40 ends the series of steps.
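The loop of steps S804 to S808 can be sketched in Python as below. The database layout, the `index_ok` predicate, and the function names are hypothetical. Expression (51) is not reproduced in this text, so the distance test used here (squared Euclidean distance at most 2 − 2θ₁, which for unit-norm vectors is equivalent to an inner product of at least θ₁) is an assumption.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def search_candidates(query_vector, database, index_ok, theta1):
    """Return the ids of stored images that survive both the index test and the distance test.

    database : list of (image_id, feature_vector, index_entry) tuples
    index_ok : callable taking an index_entry and returning True if the
               multidimensional-index condition holds for the query
    """
    limit = 2.0 - 2.0 * theta1
    candidates = []
    for image_id, vector, entry in database:              # S804: read next stored image
        if not index_ok(entry):                           # S805: index-based condition
            continue
        if squared_distance(vector, query_vector) <= limit:   # S806: distance condition
            candidates.append(image_id)                   # S807: similar image candidate
    return candidates                                     # S808: all stored images compared


# Toy database of unit-norm feature vectors with a bucket label as the index entry.
database = [
    ("a", [0.6, 0.8], "bucket-1"),
    ("b", [0.8, 0.6], "bucket-2"),
    ("c", [-0.8, 0.6], "bucket-1"),
]
found = search_candidates([0.6, 0.8], database,
                          index_ok=lambda entry: entry == "bucket-1",
                          theta1=0.9)
```

In this toy run, image "b" is skipped by the index test without any distance computation, and image "c" passes the index test but fails the distance test.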

After completing the flowchart of FIG. 8, the control unit 40 further executes the flowchart shown in FIG. 5 and performs detailed image collation, using the NCC value, on the first images extracted as similar image candidates for the second image. The control unit 40 then notifies the computer that requested the search, via the interface unit 11, of information on the first images determined to be similar to the second image. Alternatively, the control unit 40 may notify the computer that requested the search, via the interface unit 11, of information on the first images extracted as similar image candidates by the flowchart of FIG. 8.

As described in detail above, by operating according to the flowchart shown in FIG. 8, the image collation apparatus 1 operates as an image search server and can perform image collation against a plurality of images at high speed and with high accuracy. In addition, when a given image is collated against a plurality of images, the collation process can be accelerated by using a multidimensional index structure.

The accuracy with which similar images are extracted by the low-dimensional conversion method of the present embodiment is described below. For an image set IS¹ (IS¹ = {I_i¹}) having a plurality of images I_i¹ and an image set IS² (IS² = {I_i²}) having a plurality of images I_i², the set IIPS of image pairs whose NCC value is greater than the threshold θ₀ is expressed by Expression (53).

This set IIPS can be approximated by the set IIPS′ of image pairs for which the NCC upper limit value UB calculated in the present embodiment is greater than the threshold θ₀, that is, the set expressed by Expression (54).

As described above, the NCC value is never greater than the upper limit value UB. Therefore, when the upper limit value UB is θ₀ or less, the NCC value is certainly θ₀ or less, and the corresponding image pair can be judged with certainty not to belong to IIPS. That is, the relationship of Expression (55) holds, and no image pair in IIPS is missed from IIPS′.

On the other hand, if IIPS′ is too large relative to IIPS, images that are not similar are also being extracted as similar images, which means that the extraction accuracy is low. The effect of the low-dimensional conversion and the similarity of the extracted images can therefore be evaluated using the extraction accuracy of similar images (hereinafter referred to as precision) and the rate at which similar images are not missed (hereinafter referred to as recall). Expression (56) is the formula for precision, and Expression (57) is the formula for recall.

For example, by making IIPS′ very large (in the extreme case, taking IIPS′ to be the set of all image pairs), recall can easily be made 100%, but precision then becomes low. In low-dimensional conversion, it is preferable to make precision as high as possible while keeping recall at 100%.
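The two measures can be illustrated with a small Python sketch. Expressions (56) and (57) are not reproduced in this text, so the standard set-based definitions of precision and recall are assumed here; the function name and the conventions for empty sets are hypothetical.

```python
def precision_recall(true_pairs, extracted_pairs):
    """Return (precision, recall) for a set of extracted image pairs.

    true_pairs      : pairs whose exact NCC exceeds the threshold (the role of IIPS)
    extracted_pairs : pairs selected by the low-dimensional test (the role of IIPS')
    """
    true_set = set(true_pairs)
    extracted_set = set(extracted_pairs)
    hits = len(true_set & extracted_set)
    # Conventions for empty sets are a choice made for this sketch.
    precision = hits / len(extracted_set) if extracted_set else 0.0
    recall = hits / len(true_set) if true_set else 1.0
    return precision, recall


# Two truly similar pairs; the screening step extracts them plus two false candidates.
iips = {(1, 10), (2, 20)}
iips_prime = {(1, 10), (2, 20), (3, 30), (4, 40)}
precision, recall = precision_recall(iips, iips_prime)   # (0.5, 1.0)
```

Because the screening quantity is an upper bound on the NCC, every truly similar pair is extracted and recall stays at 1.0; the tightness of the bound shows up only in precision.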

FIG. 9 is a graph showing the precision obtained when recall is set to 100% for the present embodiment and for other low-dimensional conversion methods. In FIG. 9, MSEA and the discrete cosine transform (DCT) are used as the other low-dimensional conversion methods.

In the graph 900 shown in FIG. 9, graph 901 shows the precision of the present embodiment, graph 902 shows the precision of MSEA, and graph 903 shows the precision of DCT. The horizontal axis of the graph 900 indicates the number of dimensions of the feature (feature vector) in each low-dimensional conversion, and the vertical axis indicates precision. FIG. 9 shows the precision for 10,000 frames selected at random from a one-hour video and 10,000 frames selected at random from another one-hour video. The two one-hour videos were broadcast on the same channel in the same time slot on different days and contain several identical images, such as opening and ending images. Each image is 352 × 240 pixels, and the thresholds θ₀ and θ₁ are set to 0.9.

式（５）の上限値ＵＢ^MSEAは、式（５８）に示すように、式（５９）、（６０）で示される特徴ベクトルの内積で表される。

そこで、ＭＳＥＡを用いた低次元変換方法では、低次元特徴量として式（５９）、（６０）に示す特徴ベクトルを使用し、その内積が閾値θ₁以上となる二画像を類似画像として抽出する。 The upper limit value UB ^MSEA of Expression (5) is represented by the inner product of the feature vectors represented by Expressions (59) and (60) as shown in Expression (58).

Therefore, in the low-dimensional conversion method using MSEA, the feature vectors shown in Expressions (59) and (60) are used as low-dimensional feature values, and two images whose inner product is equal to or greater than the threshold θ₁ are extracted as similar images.

図１０は、ＤＣＴを用いた低次元変換方法を説明するための概略図である。ＤＣＴを用いた低次元変換方法では、まず、各画像の正規化画像を所定のブロックに分割し、二次元ＤＣＴにより周波数成分に変換する。図１０のブロック１０００内の各要素（ａ₀、ａ₁、ａ₂、...）は、二次元ＤＣＴにより変換されたＤＣＴ係数を示す。図１０に示すように、二次元ＤＣＴにより変換されたＤＣＴ係数は、ＭＰＥＧ、ＪＰＥＧ等で用いられるジグザグスキャンにより低周波数成分側から順に並べられる。そして、式（６１）に示すように、低周波数成分側から順に選択されたＤＣＴ係数による特徴ベクトルがＤＣＴによる低次元特徴量となる。

なお、ＤＣＴ係数は正規化画像から生成され、ＤＣ成分は常に０となるため省略される。 FIG. 10 is a schematic diagram for explaining a low-dimensional conversion method using DCT. In the low-dimensional conversion method using DCT, first, the normalized image of each image is divided into predetermined blocks and converted into frequency components by two-dimensional DCT. Each element (a ₀ , a ₁ , a ₂ ,...) In the block 1000 of FIG. 10 represents a DCT coefficient transformed by the two-dimensional DCT. As shown in FIG. 10, the DCT coefficients converted by the two-dimensional DCT are arranged in order from the low frequency component side by zigzag scanning used in MPEG, JPEG, and the like. Then, as shown in the equation (61), the feature vector based on the DCT coefficients sequentially selected from the low frequency component side becomes the low-dimensional feature value by DCT.

Note that the DCT coefficient is generated from the normalized image, and the DC component is always 0, so it is omitted.
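The DCT-based feature can be sketched as follows. Expression (61) is not reproduced in this text, so this is an illustration under stated assumptions: the whole normalized image is treated as a single block, "normalized" is taken to mean zero mean and unit norm, and all function names are hypothetical.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m


def zigzag_indices(h, w):
    """(row, col) pairs in JPEG-style zigzag order, low frequencies first."""
    return sorted(
        ((r, c) for r in range(h) for c in range(w)),
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def normalize(img):
    """Zero-mean, unit-norm copy of img (assumed normalization; fails on a flat image)."""
    z = img.astype(np.float64) - img.mean()
    return z / np.linalg.norm(z)


def dct_feature(img, dims):
    """First `dims` zigzag-ordered DCT coefficients, skipping the DC term."""
    z = normalize(img)
    h, w = z.shape
    coeffs = dct_matrix(h) @ z @ dct_matrix(w).T   # separable 2-D DCT
    order = zigzag_indices(h, w)[1:dims + 1]       # index 0 is DC, which is 0 here
    return np.array([coeffs[r, c] for r, c in order])
```

Because the transform is orthonormal and the DC term of a zero-mean image vanishes, the full set of remaining coefficients has unit norm, just like the normalized image itself.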

ＤＣＴにより変換された周波数成分の全パワーは元の画像のパワーと一致する。つまり、全ＤＣＴ係数を用いて計算したユークリッド距離がＮＣＣ値に対応するので、ＤＣＴ係数の一部（又は全部）を用いて計算したユークリッド距離が（２−２θ₀）より大きい場合、ＮＣＣ値がθ₀以上となることはない。そこで、この方法では、式（６２）に示すように、それぞれのＤＣＴ成分がａ_i、ｂ_iである二つの画像の特徴ベクトルのユークリッド距離を求め、ユークリッド距離が（２−２θ₁）（θ₁≧θ₀）以下となる二画像を類似画像として抽出する。

The total power of the frequency components obtained by the DCT equals the power of the original image. That is, the Euclidean distance calculated using all DCT coefficients corresponds to the NCC value, and a distance calculated using only some of the coefficients can never exceed it. Consequently, if the Euclidean distance calculated using some (or all) of the DCT coefficients is greater than (2−2θ₀), the NCC value cannot be θ₀ or more. Therefore, in this method, as shown in Expression (62), the Euclidean distance between the feature vectors of two images whose DCT components are a_i and b_i is obtained, and two images for which the Euclidean distance is (2−2θ₁) or less (θ₁ ≥ θ₀) are extracted as similar images.
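The relation between distance and NCC used here can be checked numerically. The sketch below assumes that normalized images have zero mean and unit norm, so that the NCC is a plain dot product and the squared Euclidean distance equals 2 − 2·NCC. It also assumes that the quantity compared with (2−2θ) is the squared distance, since Expression (62) is not reproduced in this text. A random orthonormal matrix stands in for the DCT, and all names are hypothetical.

```python
import numpy as np


def normalize(v):
    """Zero-mean, unit-norm copy of v, so that NCC(a, b) is the dot product a @ b."""
    z = np.asarray(v, dtype=np.float64) - np.mean(v)
    return z / np.linalg.norm(z)


def can_reject(fa, fb, theta0):
    """True if the partial squared distance already rules out NCC >= theta0."""
    return float(np.sum((fa - fb) ** 2)) > 2.0 - 2.0 * theta0


rng = np.random.default_rng(0)
x = rng.random(64)
a = normalize(x)
b = normalize(x + 0.2 * rng.random(64))

ncc = float(a @ b)
full = float(np.sum((a - b) ** 2))        # equals 2 - 2 * ncc

# Any orthonormal transform (the DCT is one) preserves this distance.
q, _ = np.linalg.qr(rng.standard_normal((64, 64)))
ta, tb = q @ a, q @ b
partial = np.cumsum((ta - tb) ** 2)       # distance using the first k coefficients
```

Since `partial` never decreases and ends at `full`, a pair whose partial distance exceeds 2 − 2θ₀ can be discarded without evaluating the remaining coefficients, while a pair below that value still has to be checked.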

なお、図９に示すグラフ９００では、式（６２）に示すユークリッド距離が（２−２θ₁）以下となる画像のペアの集合をＩＩＰＳ'とし、ＮＣＣ値がθ₀以上となる画像のペアの集合をＩＩＰＳとしてプレシジョン及びリコールを算出している。 In the graph 900 shown in FIG. 9, the precision and recall are calculated by taking the set of image pairs for which the Euclidean distance shown in Expression (62) is (2−2θ₁) or less as IIPS′, and the set of image pairs whose NCC value is θ₀ or more as IIPS.

図９に示すように、（特に次元数が低い場合）プレシジョンは必ずしも高くない。プレシジョンは、最も高くても１５％未満であり（次元数が１０２４の場合）、次元数が６４の場合は５〜６％まで落ちる。また、ＤＣＴによるプレシジョンは、本実施形態によるプレシジョンとほとんど同じ値であるが、ＭＳＥＡによるプレシジョンは、極めて低い値となる。 As shown in FIG. 9, the precision is not necessarily high (especially when the number of dimensions is low). The precision is at most less than 15% (when the number of dimensions is 1024), and when the number of dimensions is 64, the precision drops to 5 to 6%. The precision based on DCT is almost the same value as the precision according to the present embodiment, but the precision based on MSEA is extremely low.

一方、例えば、ＮＣＣの上限値ＵＢについての閾値θ₁をＮＣＣ値についての閾値θ₀より大きい値にすると（つまり、類似画像を正確にフィルタリングするのではなく、近似的にフィルタリングすると）、リコールは１００％にならなくなるが、プレシジョンを高くすることができる。 On the other hand, if, for example, the threshold θ₁ for the NCC upper limit value UB is set to a value larger than the threshold θ₀ for the NCC value (that is, if similar images are filtered approximately rather than exactly), the recall no longer reaches 100%, but the precision can be increased.
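The effect of raising θ₁ above θ₀ can be illustrated with a small threshold sweep. This is a sketch with made-up upper limit values and NCC values (not data from the experiments), chosen only so that each upper limit is at least the corresponding NCC value. The function name is hypothetical.

```python
import numpy as np


def sweep(ub, ncc, theta0, thetas1):
    """Precision and recall of the filter `ub >= theta1` for each theta1."""
    relevant = ncc >= theta0                      # pairs belonging to IIPS
    rows = []
    for t in thetas1:
        extracted = ub >= t                       # pairs belonging to IIPS'
        hits = np.count_nonzero(relevant & extracted)
        precision = hits / max(np.count_nonzero(extracted), 1)
        recall = hits / max(np.count_nonzero(relevant), 1)
        rows.append((t, precision, recall))
    return rows


# Hypothetical values for six image pairs.
ncc = np.array([0.95, 0.92, 0.91, 0.60, 0.40, 0.85])
ub = np.array([0.99, 0.93, 0.97, 0.94, 0.70, 0.96])

for theta1, p, r in sweep(ub, ncc, theta0=0.9, thetas1=[0.90, 0.95]):
    print(f"theta1={theta1:.2f} precision={p:.2f} recall={r:.2f}")
```

With θ₁ = θ₀ = 0.9 no true pair is lost, because the upper limit never falls below the NCC value. With θ₁ = 0.95 one true pair is missed, but a larger share of the extracted pairs is correct.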

図１１（ａ）は、本実施形態及び他の低次元変換方法においてリコールを１００％以下にしたときのリコールとプレシジョンの関係のグラフを示し、図１１（ｂ）は、その拡大図を示す。図１１（ａ）、（ｂ）に示すグラフ１１００、１１１０では、他の低次元変換方法として、ＭＳＥＡと、ＤＣＴと、正規化輝度ヒストグラム（以下、ＮＩＨ（Normalized intensity histograms）と称する）が用いられる。 FIG. 11A shows a graph of the relationship between recall and precision when the recall is made 100% or less in this embodiment and other low-dimensional conversion methods, and FIG. 11B shows an enlarged view thereof. In graphs 1100 and 1110 shown in FIGS. 11A and 11B, MSEA, DCT, and normalized intensity histogram (hereinafter referred to as NIH (Normalized intensity histograms)) are used as other low-dimensional conversion methods. .

図１２は、ＮＩＨを用いた低次元変換方法を説明するための概略図である。ＮＩＨを用いた低次元変換方法では、まず、各画像の正規化画像１２００を（２×２、３×３等の）サブ領域１２１０に分割し、各サブ領域について、一定範囲毎に量子化した正規化輝度のヒストグラム１２２０を算出する。そして、式（６３）に示すように、全てのサブ領域のヒストグラムの分布値ａ_i,j（ｉはサブ領域の番号、ｊは各ヒストグラムの量子化レベルの番号）を順番に並べたベクトルがＮＩＨによる低次元特徴量となる。

この方法では、式（６４）に示すように、各ヒストグラムの分布値がａ_i,j、ｂ_i,jである二つの画像の特徴ベクトルのユークリッド距離を求め、ユークリッド距離が（２−２θ₁）以下となる二画像を類似画像として抽出する。

FIG. 12 is a schematic diagram for explaining a low-dimensional conversion method using NIH. In the low-dimensional conversion method using NIH, first, the normalized image 1200 of each image is divided into sub-regions 1210 (2 × 2, 3 × 3, etc.), and for each sub-region a histogram 1220 of the normalized luminance, quantized into fixed ranges, is calculated. Then, as shown in Expression (63), the vector obtained by arranging in order the histogram distribution values a_{i,j} of all sub-regions (i is the sub-region number and j is the quantization level number of each histogram) becomes the low-dimensional feature value by NIH.

In this method, as shown in Expression (64), the Euclidean distance between the feature vectors of two images whose histogram distribution values are a_{i,j} and b_{i,j} is obtained, and two images for which the Euclidean distance is (2−2θ₁) or less are extracted as similar images.
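A possible form of the NIH feature is sketched below. Expression (63) is not reproduced in this text, so several details are assumptions of this sketch rather than the described method: the luminance is normalized to zero mean and unit variance, the quantization range is fixed to [−3, 3], and each histogram is divided by the number of pixels in its sub-region. The function name and parameters are hypothetical.

```python
import numpy as np


def nih_feature(img, grid=(2, 2), levels=4, value_range=(-3.0, 3.0)):
    """Concatenated per-sub-region histograms of normalized luminance.

    The result has grid[0] * grid[1] * levels dimensions.
    """
    z = img.astype(np.float64)
    z = (z - z.mean()) / z.std()          # assumed normalization (fails on a flat image)
    z = np.clip(z, *value_range)          # keep every pixel inside the binned range
    feats = []
    for band in np.array_split(z, grid[0], axis=0):
        for cell in np.array_split(band, grid[1], axis=1):
            hist, _ = np.histogram(cell, bins=levels, range=value_range)
            feats.append(hist / cell.size)
    return np.concatenate(feats)
```

For example, `nih_feature(img, grid=(4, 4), levels=8)` yields a 4 × 4 × 8 = 128-dimensional vector.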

また、この方法では、サブ領域の数と正規化輝度の量子化レベルを変更することにより、特徴量の次元数を変更することができる。例えば、サブ領域を２×２にし、正規化輝度の量子化を４レベルにした場合、特徴量の次元数は、２×２×４となる。ＮＩＨを用いて類似画像を抽出する場合、ＮＣＣの上限値は算出されないのでＮＣＣ値が閾値θ₀以上になる画像のペアの集合ＩＩＰＳに対するリコールは１００％とならないが、リコール及びプレシジョンが非常に高くなることが知られている。 In this method, the number of dimensions of the feature value can be changed by changing the number of sub-regions and the number of quantization levels of the normalized luminance. For example, if the sub-regions are 2 × 2 and the normalized luminance is quantized into 4 levels, the number of dimensions of the feature value is 2 × 2 × 4. When similar images are extracted using NIH, no upper limit of the NCC is calculated, so the recall with respect to the set IIPS of image pairs whose NCC value is equal to or greater than the threshold θ₀ does not reach 100%; however, both the recall and the precision are known to become very high.

なお、図１１に示すグラフ１１００では、式（６４）に示すユークリッド距離が（２−２θ₁）以下となる画像のペアの集合をＩＩＰＳ'とし、ＮＣＣ値がθ₀以上の画像のペアの集合をＩＩＰＳとしてプレシジョン及びリコールを算出している。 In the graph 1100 shown in FIG. 11, a set of image pairs in which the Euclidean distance shown in Expression (64) is (2-2θ ₁ ) or less is defined as IIPS ′, and a set of image pairs whose NCC value is θ ₀ or more. IIPS is used to calculate the precision and recall.

図１１（ａ）、（ｂ）に示すグラフ１１００において、「ｐｒｏｐ１６Ｄ」、「ｐｒｏｐ６４Ｄ」、「ｐｒｏｐ２５６Ｄ」、「ｐｒｏｐ１０２４Ｄ」のグラフは、本実施形態の低次元変換方法で特徴量次元数がそれぞれ１６、６４、２５６、１０２４のときのリコールとプレシジョンの関係を示す。また、「ＤＣＴ５Ｄ」、「ＤＣＴ４４Ｄ」、「ＤＣＴ２０９Ｄ」のグラフは、ＤＣＴによる低次元変換方法で特徴量次元数がそれぞれ５、４４、２０９のときのリコールとプレシジョンの関係を示す。また、「ＮＩＨ２×２×４」、「ＮＩＨ４×４×４」、「ＮＩＨ４×４×８」のグラフは、ＮＩＨによる低次元変換方法で特徴量次元数がそれぞれ２×２×４、４×４×４、４×４×８のときのリコールとプレシジョンの関係を示す。また、「ＭＳＥＡ１６Ｄ」、「ＭＳＥＡ６４Ｄ」、「ＭＳＥＡ５１２Ｄ」のグラフは、ＭＳＥＡによる低次元変換方法で特徴量次元数がそれぞれ１６、６４、５１２のときのリコールとプレシジョンの関係を示す。グラフ１１００、１１１０の横軸はリコールの値を示し、縦軸はプレシジョンの値を示す。 In the graphs 1100 and 1110 shown in FIGS. 11A and 11B, the curves "prop16D", "prop64D", "prop256D", and "prop1024D" show the relationship between recall and precision for the low-dimensional conversion method of this embodiment when the number of feature dimensions is 16, 64, 256, and 1024, respectively. The curves "DCT5D", "DCT44D", and "DCT209D" show the relationship for the low-dimensional conversion method using DCT when the number of feature dimensions is 5, 44, and 209, respectively. The curves "NIH2×2×4", "NIH4×4×4", and "NIH4×4×8" show the relationship for the low-dimensional conversion method using NIH when the number of feature dimensions is 2 × 2 × 4, 4 × 4 × 4, and 4 × 4 × 8, respectively. The curves "MSEA16D", "MSEA64D", and "MSEA512D" show the relationship for the low-dimensional conversion method using MSEA when the number of feature dimensions is 16, 64, and 512, respectively. In the graphs 1100 and 1110, the horizontal axis indicates the recall and the vertical axis indicates the precision.

図１１（ａ）、（ｂ）に示すように、本実施形態の低次元変換方法では、特徴量次元数が６４と２５６の場合、リコールを９７％に保ちつつ、プレシジョンが７０％以上となる。特に、リコールが非常に高い（９５％以上）場合、ＤＣＴ、ＮＩＨ、ＭＳＥＡよりプレシジョンが高くなる。 As shown in FIGS. 11A and 11B, in the low-dimensional conversion method of the present embodiment, when the number of feature dimensions is 64 or 256, the precision is 70% or more while the recall is kept at 97%. In particular, when the recall is very high (95% or more), the precision is higher than with DCT, NIH, and MSEA.

図１３（ａ）は、図１１（ａ）とは異なる画像セットのペア（それぞれ、類似する放送映像から取得した１万画像）についてのリコールとプレシジョンの関係のグラフを示し、図１３（ｂ）は、その拡大図を示す。図１３（ａ）、（ｂ）のグラフ１３００、１３１０では、ほとんど全てのタイプで、図１１（ａ）、（ｂ）のグラフ１１００、１１１０より高いリコールとプレシジョンの組合せが得られる。図１３（ｂ）に示すように、本実施形態の低次元変換方法で特徴量次元数が１０２４又は２５６のときに最も高いパフォーマンスが得られ、次いでＭＳＥＡによる低次元変換方法で特徴量次元数が５１２のとき、本実施形態の低次元変換方法で特徴量次元数が６４又は１６のとき、ＭＳＥＡによる低次元変換方法で特徴量次元数が６４のときの順に高いパフォーマンスが得られる。一方、ＤＣＴによる低次元変換方法では、あまり高いパフォーマンスが得られない。また、ＭＳＥＡによる低次元変換方法で特徴量次元数が１６の場合のパフォーマンスは、本実施形態の低次元変換方法で特徴量次元数が１６の場合より非常に低くなる。つまり、本実施形態の低次元変換方法は、全ての特徴量次元数において、他の低次元変換方法より高いパフォーマンスを得ることができる。 FIG. 13A shows a graph of the relationship between recall and precision for a pair of image sets different from FIG. 11A (10,000 images respectively obtained from similar broadcast images), and FIG. Shows an enlarged view thereof. In graphs 1300 and 1310 in FIGS. 13A and 13B, almost all types can obtain a higher combination of recall and precision than graphs 1100 and 1110 in FIGS. 11A and 11B. As shown in FIG. 13B, the highest performance is obtained when the number of feature dimensions is 1024 or 256 in the low-dimensional conversion method of this embodiment, and then the feature quantity dimension is obtained by the low-dimensional conversion method using MSEA. In the case of 512, when the number of feature dimensions is 64 or 16 in the low-dimensional conversion method of the present embodiment, the high performance is obtained in the order in which the number of feature dimensions is 64 in the low-dimensional conversion method by MSEA. On the other hand, the low-dimensional conversion method using DCT does not provide very high performance. Further, the performance when the feature dimension dimension is 16 in the low-dimensional conversion method by MSEA is much lower than that when the feature dimension dimension is 16 in the low-dimensional conversion method of the present embodiment. 
That is, the low-dimensional conversion method of the present embodiment can obtain higher performance than other low-dimensional conversion methods in all feature quantity dimensions.

図１４（ａ）、（ｂ）、（ｃ）、（ｄ）は、それぞれリコールを０．９５、０．９７、０．９８、０．９９に固定したときの各低次元変換方法での特徴量次元数とプレシジョンの関係のグラフを示す。図１４（ａ）、（ｂ）、（ｃ）、（ｄ）に示すグラフ１４００、１４１０、１４２０、１４３０において、それぞれ、グラフ１４０１、１４１１、１４２１、１４３１は本実施形態の低次元変換方法でのプレシジョンを示し、グラフ１４０２、１４１２、１４２２、１４３２はＭＳＥＡによる低次元変換方法でのプレシジョンを示し、グラフ１４０３、１４１３、１４２３、１４３３はＤＣＴによる低次元変換方法でのプレシジョンを示し、グラフ１４０４、１４１４、１４２４、１４３４はＮＩＨによる低次元変換方法でのプレシジョンを示す。グラフ１４００〜１４３０の横軸は特徴量次元数を示し、縦軸はプレシジョンを示す。 14 (a), (b), (c), and (d) are features of each low-dimensional conversion method when the recall is fixed at 0.95, 0.97, 0.98, and 0.99, respectively. The graph of the relation between quantity dimension number and precision is shown. In graphs 1400, 1410, 1420, and 1430 shown in FIGS. 14A, 14B, 14C, and 14D, graphs 1401, 1411, 1421, and 1431 are obtained by the low-dimensional conversion method of the present embodiment, respectively. The graphs 1402, 1412, 1422, and 1432 show the precision in the MSEA low-dimensional conversion method, the graphs 1403, 1413, 1423, and 1433 show the precision in the DCT low-dimensional conversion method, and the graphs 1404, 1414 , 1424 and 1434 show the precision in the low-dimensional conversion method by NIH. In the graphs 1400 to 1430, the horizontal axis indicates the number of feature dimensions, and the vertical axis indicates the precision.

図１４（ａ）〜（ｄ）に示すように、リコールを非常に高い値にした場合、本実施形態の低次元変換方法でのプレシジョンは、他の低次元変換方法でのプレシジョンより高くなることが明らかである。図１４（ｄ）に示すように、リコールが０．９９の場合、本実施形態の低次元変換方法では、ＤＣＴによる低次元変換方法よりわずかにパフォーマンスが低くなるが、この場合、そもそもプレシジョンが０．１４以下であり、極めて低い。また、例えば、図１４（ａ）に示すように、リコールが０．９５の場合、本実施形態の低次元変換方法において、特徴量次元数が低い方が高いパフォーマンスを示している（特徴量次元数が６４、１２８、２５６の方が、特徴量次元数が５１２のときよりもプレシジョンが高い）。類似した現象がＭＳＥＡでもみられる（特徴量次元数が１６、３２、６４の方が、特徴量次元数が１２８のときよりもプレシジョンが高い）。これは、これらの方法が複数ブロックへ分割するものであるためと考えられる。そのため、リコールを１００％としない場合、適切なサイズのブロックに分割する必要がある。図１４（ａ）〜（ｄ）から、本実施形態の低次元変換方法でのプレシジョンは、特徴量次元数が６４のとき、すなわち、分割するブロック数が３２のとき、最も高くなると考えられる。 As shown in FIGS. 14A to 14D, when the recall is set to a very high value, the precision in the low-dimensional conversion method of the present embodiment is higher than the precision in other low-dimensional conversion methods. Is clear. As shown in FIG. 14D, when the recall is 0.99, the low-dimensional conversion method of the present embodiment is slightly lower in performance than the low-dimensional conversion method using DCT. In this case, however, the precision is zero in the first place. .14 or less, which is extremely low. Further, for example, as shown in FIG. 14A, when the recall is 0.95, in the low-dimensional conversion method of the present embodiment, the lower the feature quantity dimension, the higher the performance (feature quantity dimension). Numbers of 64, 128, and 256 have higher precision than when the number of feature dimensions is 512). A similar phenomenon is also observed in MSEA (the number of feature quantity dimensions is 16, 32, and 64 is higher than that when the feature quantity dimension is 128). This is considered because these methods divide into a plurality of blocks. Therefore, when the recall is not 100%, it is necessary to divide the block into appropriate size blocks. 
From FIGS. 14A to 14D, the precision of the low-dimensional conversion method of this embodiment is considered to be highest when the number of feature dimensions is 64, that is, when the image is divided into 32 blocks.

図１５（ａ）〜（ｄ）は、図１４（ａ）〜（ｄ）とは異なる画像セットのペアについて、それぞれリコールを０．９５、０．９７、０．９８、０．９９に固定したときの各低次元変換方法での特徴量次元数とプレシジョンの関係のグラフを示す。図１５（ａ）、（ｂ）、（ｃ）、（ｄ）に示すグラフ１５００、１５１０、１５２０、１５３０において、それぞれ、グラフ１５０１、１５１１、１５２１、１５３１は本実施形態の低次元変換方法でのプレシジョンを示し、グラフ１５０２、１５１２、１５２２、１５３２はＭＳＥＡによる低次元変換方法でのプレシジョンを示し、グラフ１５０３、１５１３、１５２３、１５３３はＤＣＴによる低次元変換方法でのプレシジョンを示し、グラフ１５０４、１５１４、１５２４、１５３４はＮＩＨによる低次元変換方法でのプレシジョンを示す。グラフ１５００〜１５３０の横軸は特徴量次元数を示し、縦軸はプレシジョンを示す。 15 (a) to 15 (d), recalls are fixed at 0.95, 0.97, 0.98, and 0.99, respectively, for pairs of image sets different from FIGS. 14 (a) to (d). The graph of the relationship between the number of feature dimensions and the precision in each low-dimensional conversion method is shown. In graphs 1500, 1510, 1520, and 1530 shown in FIGS. 15A, 15B, 15C, and 15D, graphs 1501, 1511, 1521, and 1531 are obtained by the low-dimensional conversion method of the present embodiment, respectively. Graphs 1502, 1512, 1522, and 1532 show the precision in the low-dimensional conversion method using MSEA, graphs 1503, 1513, 1523, and 1533 show the precision in the low-dimensional conversion method using DCT, and graphs 1504 and 1514. , 1524 and 1534 show the precision in the low-dimensional conversion method by NIH. In the graphs 1500 to 1530, the horizontal axis indicates the number of feature dimensions, and the vertical axis indicates the precision.

図１５（ａ）〜（ｄ）に示すグラフ１５００〜１５３０では、プレシジョンが比較的高くなっている。例えば、図１５（ｄ）では、リコールが０．９９に固定されているにも関わらず、プレシジョンは略０．６となる。図１５（ａ）〜（ｃ）に示すように、リコールが比較的低い（０．９５〜０．９８）場合、本実施形態の低次元変換方法とＭＳＥＡによる低次元変換方法はＤＣＴ及びＮＩＨによる低次元変換方法よりパフォーマンスがよい。一方、図１５（ｄ）に示すように、リコールが高い（０．９９）場合、本実施形態の低次元変換方法とＤＣＴ及びＮＩＨによる低次元変換方法はＭＳＥＡによる低次元変換方法よりパフォーマンスが高い。つまり、全体として、本実施形態の低次元変換方法は他の低次元変換方法より高いパフォーマンスを得ることができる。 In the graphs 1500 to 1530 shown in FIGS. 15A to 15D, the precision is relatively high. For example, in FIG. 15D, the precision is approximately 0.6 even though the recall is fixed at 0.99. As shown in FIGS. 15A to 15C, when the recall is relatively low (0.95 to 0.98), the low-dimensional conversion method of this embodiment and the low-dimensional conversion method by MSEA are based on DCT and NIH. Better performance than low-dimensional transformation methods. On the other hand, as shown in FIG. 15D, when the recall is high (0.99), the low-dimensional conversion method of this embodiment and the low-dimensional conversion method by DCT and NIH have higher performance than the low-dimensional conversion method by MSEA. . That is, as a whole, the low-dimensional conversion method of this embodiment can obtain higher performance than other low-dimensional conversion methods.

以上、図９、図１１（ａ）、（ｂ）、図１３（ａ）、（ｂ）、図１４（ａ）〜（ｄ）、図１５（ａ）〜（ｄ）を用いて説明したように、本実施形態の低次元変換方法を用いることにより、他の低次元変換方法より高精度に類似性の高い画像を抽出することが可能となる。 As described above, it has been described with reference to FIGS. 9, 11A, 11B, 13A, 13B, 14A to 14D, and 15A to 15D. In addition, by using the low-dimensional conversion method of the present embodiment, it is possible to extract images having high similarity with higher accuracy than other low-dimensional conversion methods.

本実施形態の低次元変換方法では、ラグランジュの未定乗数決定法を用いてＮＣＣの上限値を算出する。これにより、ＮＣＣ値が上限値に近いとき、そのＮＣＣの微分係数は非常に０に近くなり、上限値から離れているとき、微分係数は０から離れる。そのため、ＮＣＣ値が上限値に近い（リコールが高い）場合、上限値をわずかに大きくしても、微分係数が０に近いのでリコールは大きく変化しない。一方、ＮＣＣ値が上限値から離れている場合、微分係数は０から離れているので、プレシジョンは大幅に増加する。これにより、本実施形態の低次元変換方法では、リコールを高く保ちつつ、プレシジョンを高くできると考えられる。 In the low-dimensional conversion method of this embodiment, the upper limit value of NCC is calculated using Lagrange's undetermined multiplier determination method. Thereby, when the NCC value is close to the upper limit value, the differential coefficient of the NCC is very close to 0, and when the NCC value is far from the upper limit value, the differential coefficient is away from 0. Therefore, when the NCC value is close to the upper limit value (recall is high), even if the upper limit value is slightly increased, the recall does not change greatly because the differential coefficient is close to 0. On the other hand, when the NCC value is far from the upper limit value, the differential coefficient is far from 0, so that the precision is greatly increased. Thereby, in the low-dimensional conversion method of this embodiment, it is considered that the precision can be increased while keeping the recall high.

図１６は、本実施形態による照合処理の時間を示す表である。図１６の表１６００は、それぞれ１万画像（３５２×２４０画素）からなる二つの画像セットにおいて、全ての画像ペアの組合せについて特徴ベクトルのユークリッド距離により全数照合したときの時間を示す。なお、この時間には、特徴ベクトルの算出時間は含まれない。 FIG. 16 is a table showing the verification processing time according to the present embodiment. A table 1600 in FIG. 16 shows the time when all the combinations of all image pairs in the two image sets each including 10,000 images (352 × 240 pixels) are collated by the Euclidean distance of the feature vector. This time does not include the time for calculating the feature vector.

表１６００の行１６１０と行１６３０は低次元特徴量による照合処理を、行１６２０と行１６４０は低次元特徴量による照合処理及びそれにより抽出した類似画像のペアに対するＮＣＣによる照合処理を示す。また、行１６１０と行１６２０ではリコールが１００％になるようにし（低次元特徴量による照合処理における閾値θ₁＝ＮＣＣによる照合処理における閾値θ₀＝０．９）、行１６３０と行１６４０ではリコールが９５％になるようにしている（θ₁＞θ₀）。各行において、さらに特徴量の次元数毎に処理結果が示される。一方、列１６５０は特徴量次元数を、列１６６０はプレシジョンを、列１６７０はリコールを、列１６８０は処理時間（秒）をそれぞれ示す。 Lines 1610 and 1630 of the table 1600 indicate collation processing using low-dimensional feature values, and rows 1620 and 1640 indicate collation processing using low-dimensional feature values and matching processing by NCC for pairs of similar images extracted thereby. In addition, the recall is set to 100% in the row 1610 and the row 1620 (threshold value θ ₁ in the collation processing using the low-dimensional feature amount = threshold value θ ₀ = 0.9 in the collation processing using the NCC), and the recall is performed in the rows 1630 and 1640. Is 95% (θ ₁ > θ ₀ ). In each row, the processing result is shown for each dimension number of the feature amount. On the other hand, column 1650 shows the number of feature dimensions, column 1660 shows the precision, column 1670 shows the recall, and column 1680 shows the processing time (seconds).

行１６１０と行１６３０に示すように、照合処理時間は、概ね特徴量の次元数に比例している。これに基づくと、全ての画像ペアの組合せについてＮＣＣによる照合処理（つまり、３５２×２４０次元の演算）を実施した場合、約４０時間かかると考えられる。リコールが１００％の場合、行１６１０に示すようにプレシジョン値は比較的低くなり、類似画像として多くの画像が抽出されるため、行１６２０に示すようにＮＣＣによる照合処理時間は非常に長くなる。一方、リコールが９５％の場合、行１６３０に示すようにプレシジョンが大幅に高くなるため、類似画像の抽出数が抑制され、行１６４０に示すようにＮＣＣによる照合処理時間はあまり長くならない。従って、わずかにリコールを減少させる（９５％）のがリーズナブルであると考えられる。 As shown in rows 1610 and 1630, the matching processing time is approximately proportional to the number of dimensions of the feature amount. Based on this, when collation processing by NCC (that is, 352 × 240-dimensional calculation) is performed on all image pair combinations, it is considered that it takes about 40 hours. When the recall is 100%, the precision value is relatively low as shown in the row 1610, and many images are extracted as similar images. Therefore, the verification processing time by the NCC is very long as shown in the row 1620. On the other hand, when the recall is 95%, the precision is significantly increased as shown in the row 1630, so that the number of extracted similar images is suppressed, and the verification processing time by the NCC is not so long as shown in the row 1640. Therefore, it is considered reasonable to reduce the recall slightly (95%).

なお、上述したように、行１６１０と行１６３０の照合処理時間は、特徴量の次元数に応じて異なるものであり、本実施形態とＭＳＥＡとで差は生じない。また、１万画像（３５２×２４０画素）について、本実施形態に基づく低次元特徴量の算出時間とＭＳＥＡに基づく低次元特徴量の算出時間は、その次元数に関わらず、それぞれ、約３５秒と約３２秒であり、ほとんど差がなかった。つまり、次元数が同じ場合、本実施形態の低次元変換方法とＭＳＥＡによる低次元変換方法とで類似画像を抽出するのにかかる時間は、ほとんど同じとなる。一方、本実施形態の低次元変換方法では、ＭＳＥＡによる低次元変換方法より、類似画像として抽出する画像のペアが少なくなるので、その後のＮＣＣによる照合処理時間は短くなり、トータルの照合時間は短くなる。 Note that, as described above, the matching processing times in rows 1610 and 1630 depend on the number of dimensions of the feature value, and do not differ between the present embodiment and MSEA. In addition, for 10,000 images (352 × 240 pixels), the time for calculating the low-dimensional feature values was about 35 seconds for the present embodiment and about 32 seconds for MSEA regardless of the number of dimensions, so there was almost no difference. That is, when the number of dimensions is the same, the time taken to extract similar images is almost the same for the low-dimensional conversion method of this embodiment and for the low-dimensional conversion method using MSEA. On the other hand, the low-dimensional conversion method of the present embodiment extracts fewer image pairs as similar images than the low-dimensional conversion method using MSEA, so the subsequent matching processing by NCC takes less time and the total matching time is shorter.

図１７は、画像照合装置を画像符号化装置に適用する場合の例を示す概略構成図である。図１７に示す画像照合装置２は、図１に示す画像照合装置１の各部に加えて画像入力部５３を有する。また、画像照合装置２の制御部６０は、インテグラル画像生成部６１、符号化処理部６２、画像分割部６３、特徴ベクトル算出部６４、上限値算出部６５、第１照合部６６、ＮＣＣ値算出部６７及び第２照合部６８を有する。なお、画像分割部６３は、図１に示す第１画像分割部２２と第２画像分割部２５の機能を備え、特徴ベクトル算出部６４は、図１に示す第１特徴ベクトル算出部２３と第２特徴ベクトル算出部２６の機能を備える。 FIG. 17 is a schematic configuration diagram illustrating an example in which the image matching device is applied to an image encoding device. An image collation apparatus 2 shown in FIG. 17 has an image input unit 53 in addition to each part of the image collation apparatus 1 shown in FIG. Further, the control unit 60 of the image collation apparatus 2 includes an integral image generation unit 61, an encoding processing unit 62, an image division unit 63, a feature vector calculation unit 64, an upper limit value calculation unit 65, a first collation unit 66, an NCC value. A calculation unit 67 and a second verification unit 68 are included. The image dividing unit 63 includes the functions of the first image dividing unit 22 and the second image dividing unit 25 shown in FIG. 1, and the feature vector calculating unit 64 includes the first feature vector calculating unit 23 shown in FIG. The function of the 2 feature vector calculation part 26 is provided.

画像入力部５３は、ＣＣＤ、ＣＭＯＳ等の光電変換器で構成された２次元検出器と、その２次元検出器上に撮影対象の像を結像する結像光学系等を有する。画像入力部５３は、一定の時間間隔（例えば１／３０秒）毎に撮影を行い、撮影画像を、例えば３５２×２４０画素のデジタル画像に変換し、そのデジタル画像を記憶部５２に記憶する。 The image input unit 53 includes a two-dimensional detector composed of a photoelectric converter such as a CCD or a CMOS, and an imaging optical system that forms an image to be photographed on the two-dimensional detector. The image input unit 53 captures images at regular time intervals (for example, 1/30 seconds), converts the captured image into a digital image of 352 × 240 pixels, for example, and stores the digital image in the storage unit 52.

図１８は、図１７に示す画像照合装置２による画像の符号化処理の動作を示すフローチャートである。以下、図１８に示したフローチャートを参照しつつ、画像符号化処理の動作を説明する。なお、以下に説明する動作のフローは、予め記憶部５２に記憶されているプログラムに基づき主に制御部６０により画像照合装置２の各要素と協同して実行される。 FIG. 18 is a flowchart showing the operation of the image encoding process by the image collating apparatus 2 shown in FIG. Hereinafter, the operation of the image encoding process will be described with reference to the flowchart shown in FIG. The operation flow described below is mainly executed by the control unit 60 in cooperation with each element of the image collating apparatus 2 based on a program stored in the storage unit 52 in advance.

最初に、画像入力部５３は、撮影対象を撮影した画像をデジタル画像に変換し、そのデジタル画像を記憶部５２に記憶する（ステップＳ１８０１）。 First, the image input unit 53 converts an image obtained by shooting the shooting target into a digital image, and stores the digital image in the storage unit 52 (step S1801).

次に、インテグラル画像生成部６１は、記憶部５２に保存されたデジタル画像を読み出し、インテグラル画像を生成し、記憶部５２に記憶する（ステップＳ１８０２）。 Next, the integral image generation unit 61 reads the digital image stored in the storage unit 52, generates integral images, and stores them in the storage unit 52 (step S1802).

デジタル画像及びインテグラル画像の水平方向をｘ軸、垂直方向をｙ軸とし、デジタル画像の座標（ａ、ｂ）における正規化画素値をＩ（ａ、ｂ）とする。インテグラル画像生成部６１は、各画素が式（６５）、（６６）からなるインテグラル画像（Viola, P., Jones, M., May 2004. Robust real-time face detection. International Journal of Computer Vision 57 (2), 137-154.）をそれぞれ生成する。

つまり、式（６５）により算出されるインテグラル画像の各画素値は、原点からその画素までの全ての画素の正規化画素値の総和であり、式（６６）により算出されるインテグラル画像の各画素値は、原点からその画素までの全ての画素の正規化画素値の二乗和である。 Let the horizontal direction of the digital image and the integral images be the x-axis, the vertical direction be the y-axis, and the normalized pixel value at the coordinates (a, b) of the digital image be I(a, b). The integral image generation unit 61 generates two integral images whose pixels are given by Expressions (65) and (66), respectively (Viola, P., Jones, M., May 2004. Robust real-time face detection. International Journal of Computer Vision 57 (2), 137-154.).

That is, each pixel value of the integral image calculated by Expression (65) is the sum of normalized pixel values of all pixels from the origin to the pixel, and the integral image calculated by Expression (66) Each pixel value is the sum of squares of normalized pixel values of all pixels from the origin to that pixel.

図１９は、インテグラル画像を説明するための概略図である。例えば、インテグラル画像１９００において、座標（ｘ₁、ｙ₁）、（ｘ₁、ｙ₂）、（ｘ₂、ｙ₁）、（ｘ₂、ｙ₂）で囲まれた領域Ｄの正規化画素値の総和と正規化画素値の二乗和は、それぞれ、式（６７）、（６８）により算出することができる。

従って、予めインテグラル画像を作成しておくことにより、所定領域内の正規化画素値の総和と二乗和の算出処理を高速化できるので、特徴ベクトルの算出を高速化できる。 FIG. 19 is a schematic diagram for explaining an integral image. For example, in the integral image 1900, the normalized pixel in the region D surrounded by the coordinates (x ₁ , y ₁ ), (x ₁ , y ₂ ), (x ₂ , y ₁ ), (x ₂ , y ₂ ) The sum of the values and the square sum of the normalized pixel values can be calculated by equations (67) and (68), respectively.

Therefore, since the integral image is created in advance, the calculation process of the sum of the normalized pixel values and the sum of squares in the predetermined area can be speeded up, so that the calculation of the feature vector can be speeded up.
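Expressions (65) to (68) are not reproduced in this text. The sketch below follows the standard summed-area-table construction of the cited Viola and Jones paper: one table for the normalized pixel values and one for their squares, with any rectangular sum obtained from four lookups. The zero padding, the half-open index convention, and the function names are choices of this sketch.

```python
import numpy as np


def integral_images(z):
    """Zero-padded integral images of z and of z**2.

    Entry [y, x] holds the sum over rows 0..y-1 and columns 0..x-1.
    """
    s = np.zeros((z.shape[0] + 1, z.shape[1] + 1))
    q = np.zeros_like(s)
    s[1:, 1:] = z.cumsum(axis=0).cumsum(axis=1)
    q[1:, 1:] = (z ** 2).cumsum(axis=0).cumsum(axis=1)
    return s, q


def region_sum(ii, y1, x1, y2, x2):
    """Sum over rows y1..y2-1 and columns x1..x2-1 using four lookups."""
    return ii[y2, x2] - ii[y1, x2] - ii[y2, x1] + ii[y1, x1]
```

Once the two tables exist, the sum and the square sum of any block cost four lookups each, independent of the block size, which is what makes the repeated feature-vector computation cheap.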

次に、符号化処理部６２は、記憶部５２に保存されたデジタル画像を読み出し、デジタル画像のフォーマット（画素数）変換、８×８画素、１６×１６画素等の符号化ブロックへの分割等の符号化前処理を実施する（ステップＳ１８０３）。 Next, the encoding processing unit 62 reads the digital image stored in the storage unit 52 and performs pre-encoding processing such as converting the format (number of pixels) of the digital image and dividing it into encoded blocks of 8 × 8 pixels, 16 × 16 pixels, or the like (step S1803).

以下のステップＳ１８０４〜Ｓ１８１４の処理は、符号化ブロック毎に実施される。まず、画像分割部６３は、符号化ブロックをさらに所定サイズの複数のブロックに分割する（ステップＳ１８０４）。例えば、画像分割部６３は、１６×１６画素の画像を４×４画素の１６ブロックに分割する。 The following steps S1804 to S1814 are performed for each coding block. First, the image dividing unit 63 further divides the encoded block into a plurality of blocks of a predetermined size (step S1804). For example, the image dividing unit 63 divides an image of 16 × 16 pixels into 16 blocks of 4 × 4 pixels.
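The division in step S1804 can be sketched with a reshape. The function name is hypothetical, and the sketch assumes that the block dimensions are exact multiples of the sub-block size, as in the 16 × 16 to 4 × 4 example above.

```python
import numpy as np


def split_blocks(block, size):
    """Split a 2-D array into non-overlapping size x size sub-blocks (row-major order)."""
    h, w = block.shape
    return (block.reshape(h // size, size, w // size, size)
                 .swapaxes(1, 2)
                 .reshape(-1, size, size))


coding_block = np.arange(256).reshape(16, 16)
sub_blocks = split_blocks(coding_block, 4)
print(sub_blocks.shape)  # (16, 4, 4)
```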

次に、特徴ベクトル算出部６４は、インテグラル画像を用いて、符号化ブロックについて、式（３７）に示す特徴ベクトルを算出し、符号化ブロックと関連付けて記憶部５２に保存する（ステップＳ１８０５）。 Next, the feature vector calculation unit 64 uses the integral images to calculate the feature vector shown in Expression (37) for the encoded block, and stores it in the storage unit 52 in association with the encoded block (step S1805).

次に、上限値算出部６５は、記憶部５２から照合元ブロックの特徴ベクトルを取得する（ステップＳ１８０６）。 Next, the upper limit calculation unit 65 acquires the feature vector of a verification source block from the storage unit 52 (step S1806).

この照合元ブロックは、例えば、１フレーム前のデジタル画像における各符号化ブロックである。なお、前方向予測だけでなく、後方向予測も用いる場合、上限値算出部６５は、１フレーム後のデジタル画像における各符号化ブロックも照合元ブロックとして用いる。 This verification source block is, for example, each encoded block in the digital image one frame before. When using not only forward prediction but also backward prediction, the upper limit calculation unit 65 also uses each encoded block in the digital image after one frame as a verification source block.

次に、上限値算出部６５は、符号化ブロックの特徴ベクトルと照合元ブロックの特徴ベクトルの内積を算出して、符号化ブロックと照合元ブロックについて、ＮＣＣの上限値ＵＢを求める（ステップＳ１８０７）。 Next, the upper limit calculation unit 65 calculates the inner product of the feature vector of the encoded block and the feature vector of the verification source block, and obtains the upper limit value UB of the NCC for the encoding block and the verification source block (step S1807). .

次に、第１照合部６６は、符号化ブロックと照合元ブロックについての上限値ＵＢが閾値θ₁以上であるか否かを判定する（ステップＳ１８０８）。 Next, the first verification unit 66 determines whether or not the upper limit value UB for the encoded block and the verification source block is equal to or greater than the threshold value θ ₁ (step S1808).

上限値ＵＢが閾値θ₁以上である場合、第１照合部６６は、その照合元ブロックを符号化ブロックと類似すると判定し、ＮＣＣ値算出部６７は、その符号化ブロックと照合元ブロックについてＮＣＣ値を算出する（ステップＳ１８０９）。 When the upper limit value UB is equal to or greater than the threshold θ₁, the first collation unit 66 determines that the verification source block is similar to the encoded block, and the NCC value calculation unit 67 calculates the NCC value for the encoded block and the verification source block (step S1809).

次に、第２照合部６８は、ＮＣＣ値算出部６７が算出したＮＣＣ値がその符号化ブロックについて算出されたＮＣＣ値のうち最大であるか否かを判定する（ステップＳ１８１０）。 Next, the second collation unit 68 determines whether or not the NCC value calculated by the NCC value calculation unit 67 is the maximum among the NCC values calculated for the encoded block (step S1810).

ＮＣＣ値算出部６７が算出したＮＣＣ値がその符号化ブロックについて算出されたＮＣＣ値のうち最大である場合、第２照合部６８は、その照合元ブロックを動き検出用ブロックとして記憶部５２に記憶（すでに記憶されている場合、更新）する（ステップＳ１８１１）。 When the NCC value calculated by the NCC value calculation unit 67 is the maximum among the NCC values calculated for the encoded block, the second verification unit 68 stores the verification source block in the storage unit 52 as a motion detection block. (Updated if already stored) (step S1811).

On the other hand, when the upper limit value UB is less than the threshold θ₁, when the NCC value calculated by the NCC value calculation unit 67 is not the maximum among the NCC values calculated for the encoded block, or when the second collation unit 68 has stored the collation source block in the storage unit 52 as the motion detection block, the control unit 60 determines whether the encoded block has been compared with all the collation source blocks (step S1812).

When there is a collation source block that has not yet been compared, the control unit 60 repeats the processing of steps S1807 to S1811. When the comparison with all the collation source blocks is complete, the encoding processing unit 62 performs motion compensation using the motion detection block candidate stored in the storage unit 52. The encoding processing unit 62 also performs encoding processing such as DCT and quantization (step S1813).

Next, the control unit 60 determines whether the encoding processing has been completed for all the encoded blocks (step S1814). When it has not been completed for all the encoded blocks, the processing of steps S1803 to S1813 is repeated. When it has been completed for all the encoded blocks, the control unit 60 ends the series of steps.

The data encoded in this way is transmitted to an external decoding device (not shown) via the interface unit 51.

Note that when the upper limit value UB is less than the threshold θ₁ for the encoded block and every collation source block, the second collation unit 68 determines that no collation source block similar to the encoded block exists and selects an arbitrary collation source block as the motion detection block. Alternatively, in that case, the collation source block with the highest upper limit value UB may be selected as the motion detection block.
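The candidate loop of steps S1807 to S1812, together with the fallback described above, can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation of the embodiment: the function name `select_motion_block`, the `(block_id, ub)` candidate tuples, and the `ncc_of` callback are hypothetical, and the fallback uses the "highest UB" alternative rather than an arbitrary block.

```python
def select_motion_block(candidates, theta1, ncc_of):
    """Pick the collation source block with the largest NCC, skipping the
    NCC computation for every candidate whose upper bound is below theta1.

    candidates: list of (block_id, ub) pairs, ub being the precomputed UB
    ncc_of:     hypothetical callback returning the exact NCC for a block_id
    Returns (best_id, best_ncc, number_of_ncc_evaluations).
    """
    best_id, best_ncc = None, None
    evaluated = 0
    for block_id, ub in candidates:
        if ub < theta1:
            # S1808 "no" branch: pruned, the exact NCC is never computed
            continue
        value = ncc_of(block_id)          # S1809
        evaluated += 1
        if best_ncc is None or value > best_ncc:   # S1810
            best_id, best_ncc = block_id, value    # S1811: store or update
    if best_id is None and candidates:
        # Every UB was below theta1: fall back to the highest-UB candidate
        best_id = max(candidates, key=lambda c: c[1])[0]
    return best_id, best_ncc, evaluated
```

The third return value makes the saving visible: it counts how many exact NCC computations were actually needed, which is the cost that the upper-bound test is meant to reduce.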

As described above in detail, by operating according to the flowchart shown in FIG. 18, the image collation apparatus 2 operates as an image encoding apparatus and can perform motion detection processing at high speed. The use of integral images also speeds up the feature calculation processing. In addition, the method according to the present embodiment is highly compatible with sliding-window-based search (Wei, S.-D., Lai, S.-H., 2007. Efficient normalized cross correlation based on adaptive multilevel successive elimination. In: Proceedings of the 8th Asian Conference on Computer Vision - Volume Part I. ACCV'07. Springer-Verlag, Berlin, Heidelberg, pp. 638-646).
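The role of the integral image can be illustrated with a short Python sketch. It shows the standard summed-area table technique, in which the sum over any rectangular block is obtained from four table lookups. The function names and the list-of-rows image layout are assumptions for illustration; the embodiment's own data layout is not specified here.

```python
def integral_image(image):
    """Return an (h+1) x (w+1) summed-area table whose first row and first
    column are zero, so that table[y][x] is the sum of image[0:y][0:x]."""
    h, w = len(image), len(image[0])
    table = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += image[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table

def block_sum(table, top, left, height, width):
    """Sum of the height x width block whose top-left pixel is (top, left),
    computed from four lookups regardless of the block size."""
    bottom, right = top + height, left + width
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])

# Square sums use a second table built over the squared pixel values.
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
sums = integral_image(image)
squares = integral_image([[v * v for v in row] for row in image])
```

Because normalization is an affine transform of the pixel values, the block sum and block square sum of normalized values can be derived from these raw sums, the block size, and the normalization mean and scale, without revisiting the pixels.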

DESCRIPTION OF SYMBOLS
1, 2 Image collation apparatus
11, 51 Interface unit
12, 52 Storage unit
20, 30, 40, 60 Control unit
21 First image acquisition unit
22 First image dividing unit
23 First feature vector calculation unit
24 Second image acquisition unit
25 Second image dividing unit
26 Second feature vector calculation unit
27, 65 Upper limit calculation unit
28, 66 First collation unit
31, 67 NCC value calculation unit
32, 68 Second collation unit
41 Index determination unit
53 Image input unit
61 Integral image generation unit
62 Encoding processing unit
63 Image dividing unit
64 Feature vector calculation unit

Claims

An image collation apparatus comprising:
an image dividing unit that divides a first image into a plurality of blocks and divides a second image into the same number of blocks as the first image;
a feature vector calculation unit that calculates, for each of the first image and the second image, a feature vector of the first image and a feature vector of the second image based on the sum of the normalized pixel values of the pixels included in each block and the sum of squares of the normalized pixel values of the pixels included in each block;
an upper limit calculation unit that calculates an upper limit value of the normalized cross-correlation between the first image and the second image, based on the feature vector of the first image and the feature vector of the second image, using an equation for calculating the upper limit value of the normalized cross-correlation between two images, the equation being derived from an equation for obtaining the normalized cross-correlation value between two images by Lagrange's method of undetermined multipliers; and
a first collation unit that collates the first image with the second image based on whether the upper limit value is equal to or greater than a first threshold.

With m denoting the number of pixels in a block, n the number of blocks, φ_{i,j} the pixel value of the j-th pixel in the i-th block of the first image, and φ_ave the average of the pixel values of all the pixels of the first image, the feature vector calculation unit calculates the following vector ξ_x as the feature vector of the first image;

and, with ψ_{i,j} denoting the pixel value of the j-th pixel in the i-th block of the second image and ψ_ave the average of the pixel values of all the pixels of the second image, calculates the following vector ξ_y as the feature vector of the second image,

The image collation apparatus according to claim 1, wherein the upper limit calculation unit calculates the inner product of the feature vector of the first image and the feature vector of the second image as the upper limit value.

The image collation apparatus according to claim 1, further comprising:
a normalized cross-correlation value calculation unit that calculates a normalized cross-correlation value between the first image and the second image for which the upper limit value is equal to or greater than the first threshold; and
a second collation unit that collates the first image with the second image in detail based on whether the normalized cross-correlation value is equal to or greater than a second threshold that is equal to or less than the first threshold.

An image collation method comprising:
dividing a first image into a plurality of blocks;
calculating a feature vector of the first image based on the sum of the normalized pixel values of the pixels included in each block and the sum of squares of the normalized pixel values of the pixels included in each block of the first image;
dividing a second image into the same number of blocks as the first image;
calculating a feature vector of the second image based on the sum of the normalized pixel values of the pixels included in each block and the sum of squares of the normalized pixel values of the pixels included in each block of the second image;
calculating an upper limit value of the normalized cross-correlation between the first image and the second image, based on the feature vector of the first image and the feature vector of the second image, using an equation for calculating the upper limit value of the normalized cross-correlation between two images, the equation being derived from an equation for obtaining the normalized cross-correlation value between two images by Lagrange's method of undetermined multipliers; and
collating the first image with the second image based on whether the upper limit value is equal to or greater than a first threshold.

In the step of calculating the feature vector of the first image, with m denoting the number of pixels in a block, n the number of blocks, φ_{i,j} the pixel value of the j-th pixel in the i-th block of the first image, and φ_ave the average of the pixel values of all the pixels of the first image, the following vector ξ_x is calculated as the feature vector of the first image;

In the step of calculating the feature vector of the second image, with m denoting the number of pixels in a block, n the number of blocks, ψ_{i,j} the pixel value of the j-th pixel in the i-th block of the second image, and ψ_ave the average of the pixel values of all the pixels of the second image, the following vector ξ_y is calculated as the feature vector of the second image,

The image collation method according to claim 4, wherein, in the step of calculating the upper limit value, the inner product of the feature vector of the first image and the feature vector of the second image is calculated as the upper limit value.

The image collation method according to claim 4 or 5, further comprising:
calculating a normalized cross-correlation value between the first image and the second image for which the upper limit value is equal to or greater than the first threshold; and
collating the first image with the second image in detail based on whether the normalized cross-correlation value is equal to or greater than a second threshold that is equal to or less than the first threshold.

A computer program for causing a computer to execute:
dividing a first image into a plurality of blocks;
calculating a feature vector of the first image based on the sum of the normalized pixel values of the pixels included in each block and the sum of squares of the normalized pixel values of the pixels included in each block of the first image;
dividing a second image into the same number of blocks as the first image;
calculating a feature vector of the second image based on the sum of the normalized pixel values of the pixels included in each block and the sum of squares of the normalized pixel values of the pixels included in each block of the second image;
calculating an upper limit value of the normalized cross-correlation between the first image and the second image, based on the feature vector of the first image and the feature vector of the second image, using an equation for calculating the upper limit value of the normalized cross-correlation between two images, the equation being derived from an equation for obtaining the normalized cross-correlation value between two images by Lagrange's method of undetermined multipliers; and
collating the first image with the second image based on whether the upper limit value is equal to or greater than a first threshold.