JP2016139355A

JP2016139355A - Image search device, image search method and image search program

Info

Publication number: JP2016139355A
Application number: JP2015015031A
Authority: JP
Inventors: Masahiko Sugimura (杉村 昌彦); Takayuki Baba (馬場 孝之); Yusuke Uehara (上原 祐介)
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2015-01-29
Filing date: 2015-01-29
Publication date: 2016-08-04
Anticipated expiration: 2035-01-29
Also published as: JP6485072B2

Abstract

PROBLEM TO BE SOLVED: To reduce the amount of calculation when searching for similar feature areas between images. SOLUTION: A storage unit 2 stores bit strings representing the characteristics of a plurality of feature areas set in each of a first image and a second image. A calculation unit 3 identifies, among the bit positions of the bit strings, a specific bit position at which the number of "1" bits is more than half the number of all feature areas, and inverts the value at the specific bit position in the bit strings of all feature areas of both images to generate converted bit strings. The calculation unit 3 performs search processing in which a similar feature area that is similar to each feature area of the first image is searched for among the feature areas of the second image on the basis of the Hamming distance between the converted bit strings of the feature areas. In the search processing, for each feature area of the first image, the calculation unit 3 restricts the feature areas of the second image for which the Hamming distance is calculated to those whose converted bit string has a norm within a fixed range of the norm of the converted bit string of that feature area of the first image. SELECTED DRAWING: Figure 1

Description

The present invention relates to an image search device, an image search method, and an image search program.

In recent years, image matching technology has been widely used in various fields. One method of matching images compares feature amounts at feature points of a first image (hereinafter referred to as "local feature amounts") with local feature amounts at feature points of a second image, and searches for the feature points of the second image that correspond to the feature points of the first image (hereinafter referred to as "corresponding points"). By statistically processing the set of corresponding points found by the search, the presence and the position of the first image within the second image can be recognized.

There is also a method of expressing the local feature amounts used in such a search for corresponding points as binary codes. A typical example is BRIEF (Binary Robust Independent Elementary Features). BRIEF is expressed as a local feature amount based on the luminance differences between pixels, calculated for each of a plurality of pixel pairs set around a feature point. For example, a set of bit values corresponding to the signs (positive or negative) of the luminance differences is used as the local feature amount. Expressing local feature amounts as binary codes in this way has the advantage that the similarity between feature points can be calculated quickly using the Hamming distance.
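As a minimal sketch of the idea described above, the following Python code builds a BRIEF-like bit string from the signs of luminance differences over pixel pairs and compares two bit strings by Hamming distance. The function names, the 3x3 patches, and the pixel-pair list are illustrative assumptions; they are not taken from this publication or from the sampling pattern of the BRIEF paper.

```python
from typing import Sequence, Tuple

Pixel = Tuple[int, int]  # (row, column) inside the patch


def brief_like_descriptor(patch: Sequence[Sequence[int]],
                          pairs: Sequence[Tuple[Pixel, Pixel]]) -> int:
    """Pack one bit per pixel pair into an int: 1 if the luminance difference is positive."""
    bits = 0
    for (r1, c1), (r2, c2) in pairs:
        diff = patch[r1][c1] - patch[r2][c2]
        bits = (bits << 1) | (1 if diff > 0 else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions at which a and b differ (XOR, then count the 1 bits)."""
    return bin(a ^ b).count("1")


# Hypothetical pixel pairs and luminance patches, chosen only for illustration.
PAIRS = [((0, 0), (2, 2)), ((0, 2), (2, 0)), ((1, 1), (0, 1)), ((1, 0), (1, 2))]

patch_a = [[10, 20, 30],
           [40, 50, 60],
           [70, 80, 90]]
patch_b = [[90, 20, 30],
           [40, 50, 60],
           [70, 80, 10]]

desc_a = brief_like_descriptor(patch_a, PAIRS)  # bits 0010
desc_b = brief_like_descriptor(patch_b, PAIRS)  # bits 1010
distance = hamming_distance(desc_a, desc_b)     # the two strings differ in one bit
```

Packing the bits into an integer lets the comparison reduce to one XOR and a bit count, which is the source of the speed advantage mentioned above.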

Meanwhile, as a method of performing a fast nearest neighbor search over multidimensional feature amounts, a method called "NOM (Norm Ordering Matching)" has been proposed, which sorts the feature amounts in order of norm and limits the search range to those with close norms.
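The norm-ordering idea summarized above can be sketched as follows for bit strings, taking the norm to be the number of 1 bits. This is only an illustration of "sort by norm, then search a window of nearby norms"; it is not the algorithm of the cited NOM paper, and the function names and the `tolerance` parameter are assumptions introduced here.

```python
from bisect import bisect_left, bisect_right
from typing import List, Tuple


def norm(bits: int) -> int:
    """Norm of a bit string: the number of 1 bits."""
    return bin(bits).count("1")


def build_norm_index(descriptors: List[int]) -> Tuple[List[int], List[int]]:
    """Sort descriptors by norm; return (sorted norms, descriptors in the same order)."""
    ordered = sorted(descriptors, key=norm)
    return [norm(d) for d in ordered], ordered


def candidates_by_norm(query: int, norms: List[int], ordered: List[int],
                       tolerance: int) -> List[int]:
    """Return only the descriptors whose norm is within `tolerance` of the query's norm."""
    lo = bisect_left(norms, norm(query) - tolerance)
    hi = bisect_right(norms, norm(query) + tolerance)
    return ordered[lo:hi]


# Hypothetical 4-bit database and query.
database = [0b0000, 0b0001, 0b0011, 0b0111, 0b1111, 0b1100]
norms, ordered = build_norm_index(database)
narrow = candidates_by_norm(0b0101, norms, ordered, tolerance=0)
wide = candidates_by_norm(0b0101, norms, ordered, tolerance=1)
```

Because the index is sorted, the candidate window is found with two binary searches instead of a scan over every descriptor.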

As an example of an image coding technique based on vector quantization, an image encoding method has been proposed in which the vectors in a codebook created for each bitmap pattern of an edge portion are rearranged in order of norm, and only the neighborhood of the norm of the vector in question is searched when the vector quantization matching processing is performed.

Japanese Patent Laid-Open No. 11-8848

M. Calonder, V. Lepetit, C. Strecha, and P. Fua, "BRIEF: Binary Robust Independent Elementary Features", In Proceedings of the European Conference on Computer Vision (ECCV), 2010
Mohamed Yousef and Khaled F. Hussain, "Fast exhaustive-search equivalent pattern matching through norm ordering", Journal of Visual Communication and Image Representation, vol. 24, no. 5, pp. 592-601, 2013

Consider the case where the above-described NOM is applied to image matching processing that uses local feature amounts represented by binary codes such as BRIEF. In this case, the local feature amounts of each of the first image and the second image are classified by norm. In both images, however, the number of feature points whose local feature amount has a norm near the median of the range the norm can take (for example, near a norm of 64 when the local feature amount has 128 bits) tends to be extremely large. As a result, the number of combinations between the first image and the second image of feature points whose norm is near the median becomes large, and so does the amount of calculation of the Hamming distances between these feature points. Consequently, there is a problem that the improvement in calculation efficiency is small even though NOM is applied.
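The concentration around the median can be illustrated with a simple calculation. The sketch below assumes, purely for illustration, that every bit of a 128-bit string is 0 or 1 with equal probability and independently of the other bits; real BRIEF bits need not behave this way, and the function name is hypothetical.

```python
from math import comb


def fraction_with_norm_in(n_bits: int, lo: int, hi: int) -> float:
    """Fraction of all n_bits-long bit strings whose norm (count of 1s) lies in [lo, hi]."""
    total = 2 ** n_bits
    return sum(comb(n_bits, k) for k in range(lo, hi + 1)) / total


# Share of 128-bit strings whose norm is within 4 of the median value 64.
near_median = fraction_with_norm_in(128, 60, 68)
```

Under this idealized assumption, more than half of all bit strings fall into just 9 of the 129 possible norm values, so a norm window centered on 64 excludes comparatively few candidates.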

In one aspect, an object of the present invention is to provide an image search device, an image search method, and an image search program capable of reducing the amount of calculation when searching for similar feature regions between images.

In one proposal, the following image processing apparatus is provided. The image processing apparatus includes a storage unit and a calculation unit. The storage unit stores bit strings indicating the features of a plurality of feature regions set in each of a first image and a second image. The calculation unit identifies, from among the bit positions of the bit strings, a specific bit position at which the number of bit strings holding a predetermined value is equal to or greater than a predetermined threshold that is greater than 1/2 of the total number of feature regions in the first image and the second image. The calculation unit also inverts the value at the specific bit position in the bit strings of all the feature regions in the first image and the second image, thereby generating a converted bit string for each of these feature regions. The calculation unit further executes search processing that searches the feature regions of the second image for a similar feature region resembling each feature region of the first image, on the basis of the Hamming distance between the converted bit strings of the feature regions. In this search processing, for each feature region of the first image, the calculation unit limits the feature regions of the second image for which the Hamming distance is calculated to those whose converted bit string has a norm within a certain range of the norm of the converted bit string of that feature region of the first image.

In one proposal, an image search method is provided in which processing similar to that of the above image processing apparatus is executed.
Furthermore, in one proposal, an image search program is provided that causes a computer to execute processing similar to that of the above image processing apparatus.

In one aspect, the amount of calculation when searching for similar feature regions between images can be reduced.

FIG. 1 is a diagram illustrating a configuration example and a processing example of an image processing apparatus according to a first embodiment.
FIG. 2 is a diagram illustrating a hardware configuration example of an image processing apparatus according to a second embodiment.
FIG. 3 is a flowchart illustrating a first comparative example of image search processing.
FIG. 4 is a diagram illustrating a configuration example of a pixel pair management table.
FIG. 5 is a diagram illustrating an example of processing for calculating local feature amounts.
FIG. 6 is a diagram for explaining voting processing.
FIG. 7 is a diagram for explaining processing for determining similar images based on voting results.
FIG. 8 is a flowchart illustrating a second comparative example of image search processing.
FIG. 9 is a diagram illustrating a configuration example of a feature amount management table.
FIG. 10 is a diagram illustrating an example of corresponding point search processing in the second comparative example.
FIG. 11 is a diagram illustrating an example of a histogram of norms.
FIG. 12 is a diagram illustrating an example of bit inversion processing of local feature amounts.
FIG. 13 is a diagram illustrating an example of a change in the norm distribution caused by bit inversion processing.
FIG. 14 is a block diagram illustrating a configuration example of the processing functions of the image processing apparatus.
FIG. 15 is a flowchart illustrating an example of feature amount calculation processing.
FIG. 16 is a flowchart (part 1) illustrating an example of image search processing.
FIG. 17 is a flowchart (part 2) illustrating an example of image search processing.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[First Embodiment]
FIG. 1 is a diagram illustrating a configuration example and a processing example of the image processing apparatus according to the first embodiment. The image processing apparatus 1 illustrated in FIG. 1 includes a storage unit 2 and a calculation unit 3. The storage unit 2 is realized, for example, as a storage device such as a RAM (Random Access Memory) or an HDD (Hard Disk Drive). The calculation unit 3 is realized, for example, as a processor.

The storage unit 2 stores bit strings indicating the features of a plurality of feature regions set in each of the first image and the second image. Feature amounts that can be represented by bit strings in this way include, for example, BRIEF, ORB (Oriented FAST and Rotated BRIEF), and CARD (Compact And Real-time Descriptors).

In the example of FIG. 1, the storage unit 2 stores a feature amount 10a of the first image and a feature amount 20a of the second image. The feature amount 10a of the first image includes a bit string for each feature region of the first image, and the feature amount 20a of the second image includes a bit string for each feature region of the second image.

The calculation unit 3 identifies, from among the feature regions of the second image, a similar feature region that resembles each feature region of the first image. To do so, the calculation unit 3 executes the following processing while referring to the bit strings stored in the storage unit 2.

The calculation unit 3 identifies, from among the bit positions of the bit strings, a specific bit position at which the number of bit strings holding a predetermined value is equal to or greater than a predetermined threshold. The predetermined value is either 1 or 0. In the following description of the first embodiment, the predetermined value is 1. The threshold is set to a value greater than 1/2 of the total number of feature regions in the first image and the second image.

The calculation unit 3 inverts the value at the specific bit position in the bit strings of all the feature regions in the first image and the second image. As a result, a converted bit string is generated for each of these feature regions.
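The identification of specific bit positions and the inversion step can be sketched as below. The function names, the default threshold (the smallest integer greater than half the total number of bit strings), and the 4-bit sample data are assumptions made for illustration; the publication only requires the threshold to be greater than 1/2 of the total.

```python
from typing import List, Optional, Tuple


def find_flip_mask(descriptors: List[int], n_bits: int, threshold: int) -> int:
    """Mask with a 1 at every bit position where at least `threshold` descriptors hold a 1."""
    mask = 0
    for pos in range(n_bits):
        ones = sum((d >> pos) & 1 for d in descriptors)
        if ones >= threshold:
            mask |= 1 << pos
    return mask


def convert_all(first: List[int], second: List[int], n_bits: int,
                threshold: Optional[int] = None) -> Tuple[List[int], List[int], int]:
    """Invert the specific bit positions in the bit strings of both images."""
    everything = first + second
    if threshold is None:
        # Assumed default: smallest integer greater than half of all bit strings.
        threshold = len(everything) // 2 + 1
    mask = find_flip_mask(everything, n_bits, threshold)
    return [d ^ mask for d in first], [d ^ mask for d in second], mask


# Hypothetical 4-bit strings for the first and second images.
first_image = [0b1100, 0b0101, 0b0110]
second_image = [0b0100, 0b1101, 0b0010]
conv_first, conv_second, mask = convert_all(first_image, second_image, n_bits=4)
```

In this sample only the second bit from the top is 1 in more than half of the six strings, so only that position is inverted, and the total number of 1 bits over all strings drops from 11 to 7.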

In the example of FIG. 1, assume that, among the bit positions of all the bit strings of the first image and the second image, the number of 1s at the second bit from the top exceeds half of the total. In this case, the calculation unit 3 inverts the value of the second bit in all the bit strings (step S1). Each bit string is thereby converted into a converted bit string. In FIG. 1, the feature amount 10b of the first image includes the converted bit strings corresponding to the bit strings included in the feature amount 10a of the first image. Likewise, the feature amount 20b of the second image includes the converted bit strings corresponding to the bit strings included in the feature amount 20a of the second image.

Next, the calculation unit 3 executes "search processing" that searches the feature regions of the second image for a similar feature region resembling each feature region of the first image, on the basis of the Hamming distance between the converted bit strings of the feature regions. For example, the calculation unit 3 calculates the Hamming distance between the converted bit string of a certain feature region of the first image and the converted bit string of each of one or more feature regions included in the second image. The calculation unit 3 then identifies the similar feature region on the basis of the calculated Hamming distances.

In this search processing, for each feature region of the first image, the calculation unit 3 limits the feature regions of the second image for which the Hamming distance is calculated to those whose converted bit string has a norm within a certain range of the norm of the converted bit string of that feature region of the first image. Here, the norm of a converted bit string is the number of 1s included in the converted bit string.

For example, in FIG. 1, assume that the norm of the converted bit string 11 for a certain feature region of the first image (hereinafter referred to as the "target feature region") is 2. The calculation unit 3 searches the feature regions of the second image for a similar feature region resembling the target feature region by calculating the Hamming distance between the converted bit string 11 and the converted bit strings corresponding to one or more feature regions of the second image. At this time, the calculation unit 3 limits the feature regions of the second image for which the Hamming distance is calculated to those whose converted bit string has a norm within a certain range of 2.

As an example, let this certain range be plus or minus 1, and assume that in the example of FIG. 1 only the converted bit strings 21 and 22, among the converted bit strings corresponding to the feature regions of the second image, have a norm of 1 to 3. In this case, the calculation unit 3 does not calculate the Hamming distance between the converted bit string 11 and every converted bit string of the second image. It calculates only the Hamming distance between the converted bit string 11 and the converted bit string 21 and the Hamming distance between the converted bit string 11 and the converted bit string 22 (step S2).
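The restricted search in this example can be sketched as follows. Choosing the candidate with the smallest Hamming distance is an assumption made here for illustration (the text only says that the similar feature region is identified based on the calculated distances), and the function name and 5-bit sample values are hypothetical.

```python
from typing import List, Optional, Tuple


def norm(bits: int) -> int:
    """Number of 1 bits in the bit string."""
    return bin(bits).count("1")


def hamming(a: int, b: int) -> int:
    """Number of bit positions at which a and b differ."""
    return bin(a ^ b).count("1")


def search_similar(query: int, targets: List[int],
                   tolerance: int) -> Tuple[Optional[int], Optional[int], int]:
    """Return (index of best target, its distance, number of distances computed)."""
    best_index, best_dist, computed = None, None, 0
    q_norm = norm(query)
    for i, t in enumerate(targets):
        if abs(norm(t) - q_norm) > tolerance:
            continue  # norm too far from the query's norm: skip the distance calculation
        computed += 1
        d = hamming(query, t)
        if best_dist is None or d < best_dist:
            best_index, best_dist = i, d
    return best_index, best_dist, computed


# Hypothetical converted bit strings: the query has norm 2, as in the example above.
query = 0b00110
targets = [0b11111, 0b00111, 0b00000, 0b01010, 0b11110]
result = search_similar(query, targets, tolerance=1)
```

With a range of plus or minus 1, only the two targets with norms 3 and 2 are compared against the query, so two distances are computed instead of five.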

Here, the Hamming distance between two bit strings is the number of bits whose values differ between the bit strings, while the norm of a bit string is the number of 1s it contains. Bit strings with close norms contain similar numbers of 1s, so their Hamming distance is likely to be small. Conversely, bit strings with different norms contain different numbers of 1s, so their Hamming distance is likely to be large. Therefore, even when the bit strings for which the Hamming distance is calculated are limited according to the norm, as in the search processing above, the accuracy of the Hamming-distance-based search for similar feature regions is unlikely to fall. In other words, the search processing described above reduces the number of Hamming distance calculations and shortens the time required for the processing as a whole, while maintaining the search accuracy for similar feature regions.
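The relationship between norm and Hamming distance described above can be made precise. For two bit strings a and b of equal length, let n_{10} be the number of positions where a holds 1 and b holds 0, and n_{01} the number of positions where a holds 0 and b holds 1 (these symbols are introduced here for illustration and do not appear in the publication). Then:

```latex
\begin{align*}
d_H(a, b) &= n_{10} + n_{01}, \\
\|a\| - \|b\| &= n_{10} - n_{01}, \\
\bigl|\,\|a\| - \|b\|\,\bigr| &\le d_H(a, b).
\end{align*}
```

A norm difference of k therefore guarantees a Hamming distance of at least k, which is why candidates with distant norms can be skipped safely. The converse does not hold: two bit strings with equal norms can still differ in many positions, so a close norm only makes a small distance possible.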

The distribution of norm values, however, tends to be extremely concentrated near the median of the range the norm can take. Because of this property, if, for example, the search processing described above is executed using the bit strings before the bit inversion processing, the number of combinations between the first image and the second image of feature regions whose bit string has a norm near the median becomes large. The number of Hamming distance calculations for those combinations therefore also becomes large. In this case, the improvement in calculation efficiency is small even though the bit strings for which the Hamming distance is calculated are limited according to the norm.

To address this problem, the image processing apparatus 1 according to the first embodiment performs the search processing using the converted bit strings obtained by the bit inversion processing described above. In the bit inversion processing, a bit position at which the number of 1s exceeds at least half of the total is identified as a specific bit position. The value at the specific bit position is then inverted in the bit strings of all the feature regions in the first image and the second image, and the converted bit strings are generated.

This bit inversion processing reduces the number of 1s at the specific bit position and, as a result, the number of 1s over all the converted bit strings. The distribution of norms based on the converted bit strings therefore spreads from the median of the norm range toward smaller values, and the peak frequency of the norm falls. Consequently, in the search processing, the number of combinations of feature regions whose bit string has a norm near the median decreases, and the number of Hamming distance calculations for those combinations decreases.

The bit inversion processing does increase the number of combinations of feature regions in the range away from the median of the norm. However, because the bit inversion processing spreads the norm distribution, in the histogram of norms based on the converted bit strings the decrease in frequency at each norm whose frequency falls tends to be larger than the increase in frequency at each norm whose frequency rises. Moreover, this change in the norm distribution occurs in both the first image and the second image. As a whole, therefore, the number of Hamming distance calculations is likely to decrease greatly.
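The effect of spreading the norm distribution on the number of Hamming distance calculations can be illustrated by counting candidate pairs directly from the norms. The function name and both sets of norm values below are invented for illustration; they are not measurements from the publication.

```python
from collections import Counter
from typing import List


def count_distance_computations(norms_first: List[int], norms_second: List[int],
                                tolerance: int) -> int:
    """Count the pairs whose norms differ by at most `tolerance`, i.e. the distances computed."""
    hist_second = Counter(norms_second)  # norm histogram of the second image
    total = 0
    for n in norms_first:
        for m in range(n - tolerance, n + tolerance + 1):
            total += hist_second[m]  # Counter returns 0 for norms that do not occur
    return total


# Hypothetical norms of 10 feature regions per image.
concentrated = [8] * 10                      # every norm sits on the same peak
spread = [4, 4, 5, 5, 6, 6, 7, 7, 8, 8]      # same number of regions, flatter histogram

pairs_concentrated = count_distance_computations(concentrated, concentrated, tolerance=1)
pairs_spread = count_distance_computations(spread, spread, tolerance=1)
```

With the same number of feature regions and the same tolerance, the concentrated histograms require 100 distance calculations and the spread histograms 52, because the count grows with the product of the frequencies at nearby norms in the two images.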

Therefore, according to the first embodiment, the amount of calculation in the processing that searches the feature regions of the second image for a similar feature region resembling each feature region of the first image can be reduced, which shortens the time required for that processing and improves processing efficiency.

Furthermore, even when the value at the same bit position is inverted in the bit strings corresponding to all the feature regions of the first image and the second image, the Hamming distance calculated from the converted bit strings after the inversion is the same as the Hamming distance calculated from the bit strings before the inversion. For this reason, even when the search processing is performed using the converted bit strings as described above, the accuracy of identifying similar feature regions does not change.
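This invariance follows from the fact that inverting the same bit positions in both strings is an XOR with a common mask m, and (a XOR m) XOR (b XOR m) equals a XOR b. A short check, with hypothetical sample values and function names:

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions at which a and b differ."""
    return bin(a ^ b).count("1")


def invariant_for_all(n_bits: int) -> bool:
    """Exhaustively check that a common inversion mask never changes the Hamming distance."""
    values = range(1 << n_bits)
    return all(hamming(a ^ m, b ^ m) == hamming(a, b)
               for a in values for b in values for m in values)


# Hypothetical 6-bit strings and inversion mask.
a, b, mask = 0b101101, 0b011100, 0b110010
before = hamming(a, b)
after = hamming(a ^ mask, b ^ mask)
```

The norms of the individual strings do change under the inversion, which is what reshapes the norm histogram, while every pairwise distance stays the same.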

In the first embodiment described above, the predetermined value counted when identifying the specific bit position is 1, but it may instead be 0. In that case, the bit inversion processing described above reduces the number of 0s at the specific bit position and, as a result, the number of 0s over all the converted bit strings compared with before the bit inversion processing. The distribution of norms based on the converted bit strings then spreads from the median of the norm range toward larger values, and the peak frequency of the norm falls. The same effect as when the predetermined value is 1 is therefore obtained.

[Second Embodiment]
Next, as a second embodiment, an image processing apparatus will be described in which a key image is selected from a plurality of captured images and captured images of scenes similar to the key image are retrieved from the captured images other than the key image. The second embodiment uses BRIEF as the image feature amount, but other types of binary feature amounts such as ORB and CARD can also be used.

FIG. 2 is a diagram illustrating a hardware configuration example of the image processing apparatus according to the second embodiment. The image processing apparatus 100 according to the second embodiment is realized, for example, as a computer such as that illustrated in FIG. 2.

The image processing apparatus 100 as a whole is controlled by a processor 101. The processor 101 may be a multiprocessor. The processor 101 is, for example, a CPU (Central Processing Unit), an MPU (Micro Processing Unit), a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), or a PLD (Programmable Logic Device). The processor 101 may also be a combination of two or more of a CPU, an MPU, a DSP, an ASIC, and a PLD.

A RAM 102 and a plurality of peripheral devices are connected to the processor 101 via a bus 108.
The RAM 102 is used as the main storage device of the image processing apparatus 100. The RAM 102 temporarily stores at least part of the OS (Operating System) program and the application programs to be executed by the processor 101. The RAM 102 also stores various data needed for processing by the processor 101.

The peripheral devices connected to the bus 108 include an HDD 103, a graphic processing device 104, an input interface 105, a reading device 106, and a communication interface 107.

The HDD 103 is used as an auxiliary storage device of the image processing apparatus 100. The HDD 103 stores the OS program, application programs, and various data. Another type of nonvolatile storage device, such as an SSD (Solid State Drive), can also be used as the auxiliary storage device.

A display device 104a is connected to the graphic processing device 104. The graphic processing device 104 displays images on the display device 104a in accordance with instructions from the processor 101. Examples of the display device include a display device using a CRT (Cathode Ray Tube) and a liquid crystal display device.

An input device 105a is connected to the input interface 105. The input interface 105 transmits signals output from the input device 105a to the processor 101. Examples of the input device 105a include a keyboard and a pointing device. Examples of pointing devices include a mouse, a touch panel, a tablet, a touch pad, and a trackball.

A portable recording medium 106a can be attached to and detached from the reading device 106. The reading device 106 reads data recorded on the portable recording medium 106a and transmits it to the processor 101. Examples of the portable recording medium 106a include an optical disk, a magneto-optical disk, and a semiconductor memory.

The communication interface 107 transmits and receives data to and from other devices via a network 107a.
The processing functions of the image processing apparatus 100 can be realized with the hardware configuration described above.

A storage device of the image processing apparatus 100 (for example, the HDD 103) stores the data of a plurality of captured images. These captured images are images captured by an imaging device. The data of these captured images may be stored in the storage device of the image processing apparatus 100 using, for example, the portable recording medium 106a, or may be stored there via the network 107a.

In the image processing apparatus 100, the following processing is performed by executing photo management software. A key image is selected from the plurality of captured images in the storage device by a user input operation. The image processing apparatus 100 then extracts captured images of scenes similar to the key image from the captured images in the storage device other than the key image (hereinafter referred to as "target images"). For example, a target image estimated to show the same object as an object included in the key image is extracted as a captured image of a scene similar to the key image. This allows the user, for example, to search the image processing apparatus 100 for an image needed as material, or to gather photos taken at the same event and organize them automatically. Convenience and entertainment can thus be provided to the user.

Such an image processing apparatus 100 is realized, for example, as a terminal device operated by a user, such as a personal computer or a smartphone. The image processing apparatus 100 may also be realized as a server apparatus on a network. In this case, the captured image data is uploaded, for example, from the user's terminal device to the image processing apparatus 100 via the network.

Note that the image search function of the image processing apparatus 100 can be used not only for managing captured images as described above but also, for example, for managing document content such as presentation materials. For example, the data of a plurality of documents is stored in the storage device of the image processing apparatus 100, and a key document is selected from among them. The image processing apparatus 100 can then extract, from the other documents, documents containing sentences whose appearance when displayed resembles that of the key document, or documents containing the same image, table, graph, or the like as the key document. This reduces the time spent searching for documents. It also promotes the reuse of past document assets and can make work more efficient.

Next, comparative examples of the image search process and their problems will be described. After that, details of the image search process in the second embodiment will be described.
FIG. 3 is a flowchart illustrating a first comparative example of the image search process. In the comparative example shown in FIG. 3, BRIEF local feature amounts are calculated in steps S101 and S102, and in the processing from step S103 onward, target images similar to the key image are extracted on the basis of the local feature amounts.

[Step S101] The image processing apparatus sets a plurality of feature points on each captured image. Here, as an example, dense sampling is used, which sets feature points on the captured image at equal intervals (for example, every 24 pixels).
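The equal-interval placement of step S101 can be sketched in Python as follows. This is a minimal illustration only: the function name `dense_sampling` and the choice to start the grid half a step in from the image edge are assumptions, since the description specifies only the spacing.

```python
def dense_sampling(width, height, step=24):
    """Return (x, y) feature points on a regular grid over the image.

    The half-step margin from the image edge is an assumption of this sketch.
    """
    offset = step // 2
    return [(x, y)
            for y in range(offset, height, step)
            for x in range(offset, width, step)]


# A 96 x 48 image sampled every 24 pixels.
points = dense_sampling(96, 48)
```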

[Step S102] The image processing apparatus calculates a local feature amount for each feature point of each captured image.
BRIEF is calculated as a local feature amount for each fixed region centered on a feature point (hereinafter referred to as a “feature region”). The feature region is, for example, a 48-pixel-square rectangular region centered on the feature point. A plurality of pixel pairs are set in advance inside the feature region. The local feature amount of a feature point is calculated as a bit string formed by combining the signs of the luminance differences of the pixel pairs in the corresponding feature region.

Here, FIG. 4 is a diagram illustrating a configuration example of the pixel pair management table. The coordinates of the pixels constituting each pixel pair are registered in advance in the pixel pair management table 112. As shown in FIG. 4, the pixel pair management table 112 holds, for each pixel pair, an ID identifying the pair and the coordinates of the first pixel and the second pixel that constitute it. The pixel pairs are set, for example, at random. The pixel pair information registered in the pixel pair management table 112 is applied in common to all feature regions.

FIG. 5 is a diagram illustrating an example of processing for calculating local feature amounts. FIG. 5 shows an example of processing for registering the local feature amount of each feature point in the captured image 200 in the feature amount management table 131. Note that a feature amount management table 131 is created for each captured image.

For example, the local feature amount of the feature point 201 set in the captured image 200 is calculated as follows. The image processing apparatus calculates the luminance difference of each pixel pair for the feature region 202 corresponding to the feature point 201 (step S102a). The luminance difference of a pixel pair is obtained, for example, by subtracting the luminance value of the second pixel from the luminance value of the first pixel registered in the pixel pair management table 112.

The image processing apparatus generates the bit string 203 by combining bit values that correspond to the signs of the calculated luminance differences (step S102b). For example, in pixel pair order, the image processing apparatus appends a bit value of “1” to the bit string when the luminance difference is positive, and a bit value of “0” when the luminance difference is 0 or less. When M pixel pairs are set as in FIG. 4, an M-bit bit string is generated. The image processing apparatus registers the generated bit string 203 in the feature amount management table 131 as the local feature amount of the feature point 201 (step S102c).
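Steps S102a and S102b can be sketched in Python as follows. This is an illustration under stated assumptions, not the apparatus's implementation: the function name `brief_descriptor`, the representation of the image as a nested list of luminance values, and the storage of pixel pair coordinates as offsets relative to the feature point are all hypothetical.

```python
def brief_descriptor(image, center, pixel_pairs):
    """Return the bit list describing one feature region.

    image: 2-D list of luminance values, indexed as image[y][x].
    center: (x, y) of the feature point.
    pixel_pairs: list of ((x1, y1), (x2, y2)) offsets from the center,
        shared by all feature regions (hypothetical table layout).
    """
    cx, cy = center
    bits = []
    for (x1, y1), (x2, y2) in pixel_pairs:
        # Luminance of the first pixel minus luminance of the second pixel.
        diff = image[cy + y1][cx + x1] - image[cy + y2][cx + x2]
        # Positive difference -> 1, zero or negative difference -> 0.
        bits.append(1 if diff > 0 else 0)
    return bits


image = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]
pairs = [((-1, -1), (1, 1)), ((1, 1), (-1, -1)), ((0, 0), (0, 0))]
descriptor = brief_descriptor(image, (1, 1), pairs)
```

With M pixel pairs the returned list has M entries, matching the M-bit bit string described above.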

In this way, the local feature amount of each feature point set in the captured image 200 is registered in the feature amount management table 131 corresponding to the captured image 200.
The description now returns to FIG. 3.

[Step S103] The image processing apparatus selects a key image from the captured images in response to a user operation input.
[Step S104] The image processing apparatus selects one of the captured images other than the key image (the target images).

[Step S105] The image processing apparatus selects one feature point of the key image.
[Step S106] The image processing apparatus searches the target image selected in step S104 for a feature point (corresponding point) similar to the feature point selected from the key image in step S105. In this processing, the image processing apparatus calculates the Hamming distance between the local feature amount of the feature point selected from the key image and the local feature amount of each feature point of the target image, and extracts the feature point of the target image with the smallest Hamming distance as the corresponding point with the highest similarity.

[Step S107] The image processing apparatus estimates the center position of the key image in the target image for the case where the key image is overlaid on the target image so that the feature point selected in step S105 coincides with the corresponding point found in step S106. The image processing apparatus votes for the pixel at the estimated center position among the pixels of the target image. In practice, the image processing apparatus may instead vote for each pixel included in a predetermined region (for example, a 10-pixel-square rectangular region) centered on the estimated center position.

[Step S108] The image processing apparatus determines whether all feature points of the key image have been processed. If any feature point has not been processed, the process returns to step S105. If all feature points have been processed, the process of step S109 is executed.

[Step S109] If the maximum number of votes among the pixels of the target image selected in step S104 exceeds a predetermined threshold, the image processing apparatus determines that this target image is similar to the key image. If the maximum number of votes is equal to or less than the threshold, the image processing apparatus determines that this target image is not similar to the key image.

[Step S110] The image processing apparatus determines whether all target images have been processed. If any target image has not been processed, the process returns to step S104. If all target images have been processed, the image processing apparatus outputs the identification information of the target images determined in step S109 to be similar to the key image, and ends the image search process.

Here, FIG. 6 is a diagram for explaining the voting process. FIG. 6 shows an example of processing for searching the target image 210 for the corresponding point similar to the feature point 201 of the key image 200a. The image processing apparatus searches for the corresponding point, for example, by calculating the Hamming distance between the local feature amount of the feature point 201 of the key image 200a and the local feature amount of each feature point of the target image 210 (step S106a).

Assume that the feature point 211 of the target image 210 is extracted as the corresponding point similar to the feature point 201 of the key image 200a. The image processing apparatus then estimates the center position 204 of the key image 200a in the target image 210 for the case where the key image 200a is overlaid on the target image 210 so that the feature point 201 coincides with the feature point 211 (the corresponding point) (step S107a).

Here, let wi and hi be the width and height of the target image in pixels, and let wr and hr be the width and height of the key image in pixels. If the feature point (xi, yi) of the target image is found as the corresponding point for the feature point (xr, yr) of the key image, the position (xv, yv) of the center point of the key image in the target image is calculated using the following equations (1-1) and (1-2).
xv = xi - xr + (wr / 2) ... (1-1)
yv = yi - yr + (hr / 2) ... (1-2)
If the pixel 214 is estimated as the center position of the key image 200a in the target image 210 on the basis of the correspondence between the feature point 201 and the feature point 211 in FIG. 6, the image processing apparatus votes for the pixel 214 among the pixels of the target image 210. This voting process uses, for example, a voting map 114 that has an entry for each pixel of the target image 210. The initial value of each entry in the voting map 114 is 0. In the process of FIG. 6, 1 is added to the entry corresponding to the pixel 214 in the voting map 114 (step S107b).
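The center estimation and voting of steps S107a and S107b can be sketched in Python as follows. The sketch reads the operator between xi and xr (and between yi and yr) in equations (1-1) and (1-2) as a subtraction, which is the reading consistent with overlaying the key image so that the two feature points coincide. The function names, the integer division of wr and hr, and the decision to discard votes that fall outside the target image are assumptions of this sketch.

```python
def estimate_center(xi, yi, xr, yr, wr, hr):
    """Center of the key image in target-image coordinates, eqs. (1-1), (1-2)."""
    return xi - xr + wr // 2, yi - yr + hr // 2


def vote(vote_map, center):
    """Add one vote at the estimated center if it lies inside the map."""
    xv, yv = center
    if 0 <= yv < len(vote_map) and 0 <= xv < len(vote_map[0]):
        vote_map[yv][xv] += 1


wi, hi = 8, 6                                   # target image size
vote_map = [[0] * wi for _ in range(hi)]        # every entry starts at 0

# ((xr, yr), (xi, yi)): key-image feature point and its corresponding point.
matches = [((1, 1), (3, 2)), ((2, 3), (4, 4)), ((0, 0), (7, 5))]
for (xr, yr), (xi, yi) in matches:
    vote(vote_map, estimate_center(xi, yi, xr, yr, wr=4, hr=4))

max_votes = max(max(row) for row in vote_map)
```

The first two matches agree on the same center and accumulate two votes in one entry; the third points outside the target image and is dropped in this sketch.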

In practice, the image processing apparatus may instead vote for each pixel included in a predetermined region (for example, a 10-pixel-square rectangular region) centered on the pixel 214. This makes the recognition process robust, to some extent, against differences between the key image 200a and the target image 210.

FIG. 7 is a diagram for explaining the similar image determination process based on the voting result. The voting map 114a shown in FIG. 7 represents the state of the voting map 114 after the processing shown in FIG. 6 has been executed for each feature point of the key image 200a. The image processing apparatus extracts the maximum number of votes among the pixels in the voting map 114a and determines whether this maximum value exceeds a predetermined threshold.

Here, when the same object appears in both the key image 200a and the target image 210, the positional relationship between a feature point of the key image 200a and its corresponding point in the target image 210 is often the same across the feature points of the key image. In this case, the votes concentrate on entries corresponding to the same pixel in the voting map 114a. On the other hand, when the key image 200a and the target image 210 are only weakly related, the positional relationship between a feature point of the key image 200a and its corresponding point in the target image 210 often differs from one feature point of the key image to another. In this case, the votes are dispersed across the voting map 114a.

Therefore, when the maximum number of votes in the voting map 114a exceeds the threshold, the votes are presumed to be concentrated on the same pixel, and it can be judged that the same object is likely to appear in both the key image 200a and the target image 210. For this reason, the image processing apparatus determines that the target image 210 is similar to the key image 200a when the maximum number of votes exceeds the threshold.

In practice, because the maximum number of votes is affected by the number of feature points in the target image 210, it is desirable to apply a normalization, for example dividing the number of votes by the number of feature points in the target image 210, before comparing with the threshold.
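The determination of step S109 with the normalization mentioned above can be sketched in Python as follows. The function name `is_similar` and the threshold value 0.5 are hypothetical and chosen only for illustration.

```python
def is_similar(vote_map, num_feature_points, threshold):
    """True if the normalized maximum vote count exceeds the threshold."""
    max_votes = max(max(row) for row in vote_map)
    # Normalize by the number of feature points in the target image.
    return max_votes / num_feature_points > threshold


concentrated = [[0, 1], [6, 0]]   # votes pile up on one pixel
dispersed = [[1, 1], [2, 0]]      # votes are spread out

similar = is_similar(concentrated, num_feature_points=10, threshold=0.5)
not_similar = is_similar(dispersed, num_feature_points=10, threshold=0.5)
```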

The first comparative example has the problem that the time needed to calculate the Hamming distances between feature points in step S106 of FIG. 3 is enormous. This is because the Hamming distance is calculated for every combination of a feature point in the key image and a feature point in the target image. For example, when each image has 1,000 feature points, the Hamming distance is calculated 1,000,000 times. This calculation can account for 80% or more of the time required for the entire image search process, so shortening it would greatly reduce the time required for the image search process as a whole.

Therefore, the second comparative example described next uses the above-described NOM to shorten the time required for the Hamming distance calculation. Specifically, the local feature amounts of the feature points of the key image and the target image are classified by their norms. The search range for the corresponding point of a feature point of the key image is then limited to feature points with a close norm. This reduces the number of feature point combinations for which the Hamming distance is calculated and shortens the calculation time.

FIG. 8 is a flowchart illustrating a second comparative example of the image search process. The process of FIG. 8 is obtained by modifying the process of the first comparative example shown in FIG. 3 as follows. In the process of FIG. 8, step S121 is executed between step S102 and step S103 of FIG. 3, and step S122 is executed instead of step S106 of FIG. 3. Only steps S121 and S122 are described below; description of the processing steps that are the same as in FIG. 3 is omitted.

[Step S121] The image processing apparatus performs the following processing for each captured image. The image processing apparatus calculates the norm of the local feature amount for each feature point in the captured image. The norm of a binary bit string is calculated as the number of 1s contained in the bit string. The image processing apparatus then rearranges the feature points in the captured image in ascending order of the norm of their local feature amounts.

[Step S122] From the feature points of the target image selected in step S104, the image processing apparatus identifies the feature points whose norm falls within a certain range centered on the norm of the key-image feature point selected in step S105. The range is, for example, plus or minus 1. The image processing apparatus searches for the corresponding point while limiting the search range to the identified feature points. That is, the image processing apparatus calculates the Hamming distance between the local feature amount of the feature point of the key image and the local feature amount of each feature point identified in the target image. The image processing apparatus then extracts the feature point with the smallest calculated Hamming distance as the corresponding point.

FIG. 9 is a diagram illustrating a configuration example of the feature amount management table. In the second comparative example, for example, a feature amount management table 113 as shown in FIG. 9 is used. A feature amount management table 113 is prepared for each captured image.

In the feature amount management table 113, a record is registered for each feature point in the captured image. Each record holds an ID, feature point coordinates, a local feature amount, and a norm. The ID is an identification number for identifying a feature point in the captured image. The feature point coordinates are the coordinates of the feature point. The local feature amount field holds the bit string representing the local feature amount of the feature point. The norm field holds the norm calculated from the local feature amount.

In step S102 of FIG. 8, each calculated local feature amount is registered in the corresponding record of the corresponding feature amount management table 113. Then, in step S121, for example, the records in the feature amount management table 113 are rearranged according to the calculated norm values.

FIG. 10 is a diagram illustrating an example of the corresponding point search process in the second comparative example. FIG. 10 illustrates a feature amount management table 113a in which the local feature amounts of the feature points in the key image are registered, and a feature amount management table 113b in which the local feature amounts of the feature points in the target image are registered.

The image processing apparatus calculates the norm of each local feature amount registered in the feature amount management table 113a and rearranges the local feature amounts registered in that table, for example, in ascending order of norm. Similarly, the image processing apparatus calculates the norm of each local feature amount registered in the feature amount management table 113b and rearranges the local feature amounts registered in that table, for example, in ascending order of norm.

Next, the image processing apparatus searches for the feature region of the target image that is similar to each feature region of the key image by calculating Hamming distances between local feature amounts of the key image and local feature amounts of the target image. Here, the search range for each feature region of the key image is limited to those feature regions of the target image whose local feature amount has a norm close to the norm of the local feature amount of that feature region of the key image. In other words, the local feature amounts for which the Hamming distance to a local feature amount selected from the key image is calculated are limited to those local feature amounts of the target image whose norm falls within a certain range centered on the norm of the selected local feature amount.

For example, assume that in FIG. 10 the norm calculated from the local feature amount 251 of a feature region of the key image is 3. If the norm range that determines the search range is plus or minus 1, the Hamming distance to the local feature amount 251 of the key image is calculated only for those local feature amounts of the target image whose norm is between 2 and 4.
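The norm-limited search of steps S121 and S122 can be sketched in Python as follows, using the same example of a key feature with norm 3 and a range of plus or minus 1. Local feature amounts are again assumed to be integers; the function names and the use of binary search over the sorted norms are choices made for this sketch, not details given in the description.

```python
from bisect import bisect_left, bisect_right


def norm(bits):
    """Number of 1s in the bit string."""
    return bin(bits).count("1")


def hamming(a, b):
    return bin(a ^ b).count("1")


def build_table(features):
    """Step S121: pair each feature with its norm, sorted by ascending norm."""
    return sorted((norm(f), f) for f in features)


def search_limited(key_feature, table, width=1):
    """Step S122: compare only against features whose norm is within +/- width.

    Returns (best matching feature or None, number of Hamming calculations).
    """
    norms = [n for n, _ in table]
    k = norm(key_feature)
    lo = bisect_left(norms, k - width)
    hi = bisect_right(norms, k + width)
    candidates = table[lo:hi]
    if not candidates:
        return None, 0
    best = min(candidates, key=lambda entry: hamming(key_feature, entry[1]))
    return best[1], len(candidates)


target_table = build_table(
    [0b00000001, 0b00000111, 0b00001100, 0b01111110, 0b11111111])
match, n_compared = search_limited(0b00001110, target_table)  # key norm is 3
```

Only the two target features with norms 2 and 3 are compared here, instead of all five.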

Limiting the calculation targets of the Hamming distance according to the norm in this way reduces the amount of Hamming distance calculation compared with calculating the Hamming distance for every combination of a local feature amount of the key image and a local feature amount of the target image.

The Hamming distance between two bit strings is the number of bit positions at which their values differ, whereas the norm of a bit string is the number of 1s it contains. Bit strings with close norms contain similar numbers of 1s, so their Hamming distance is likely to be small. Bit strings with different norms contain different numbers of 1s, so their Hamming distance is likely to be large. Therefore, even when the calculation targets of the Hamming distance are limited according to the norm as described above, the accuracy of the feature region similarity determination based on the Hamming distance is unlikely to be degraded.

Next, problems in the second comparative example will be described.
The norm of a local feature amount is an integer from 0 up to the number of dimensions of the local feature amount (the maximum value). For example, when the local feature amount is expressed as a 128-bit bit string, the norm can take values from 0 to 128. In addition, as in the example of FIG. 11 described next, norm values tend to be concentrated near the median of the range the norm can take.

FIG. 11 is a diagram illustrating an example of a norm histogram. For example, when the norm of each local feature amount in a captured image is calculated, the number of occurrences of each norm value is distributed as shown in FIG. 11. As FIG. 11 shows, the occurrences are often extremely concentrated near the median of the norm range (“64” in the example of FIG. 11).

The reason is as follows. A bit string with a small norm contains more 0s than 1s, and a bit string with a large norm contains more 1s than 0s. In a bit string whose norm is near the median, the numbers of 1s and 0s are almost equal. The number of bit string patterns that can be formed from 1s and 0s is larger in that case than when the numbers of 1s and 0s differ greatly. For this reason, bit strings whose norm is near the median are more numerous than bit strings whose norm is relatively small or relatively large.
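The pattern-count argument above can be checked numerically with binomial coefficients: the number of M-bit strings with norm k is "M choose k". The following Python sketch compares a norm at the median with a norm far from it for M = 128. It counts possible patterns only; how often each pattern actually occurs in real images is not modeled, and the choice of k = 16 as the off-center example is arbitrary.

```python
from math import comb

M = 128  # number of bits in the local feature amount

patterns_at_median = comb(M, 64)   # bit strings with exactly 64 ones
patterns_off_center = comb(M, 16)  # bit strings with exactly 16 ones

# How many times more patterns exist at the median than at norm 16.
ratio = patterns_at_median // patterns_off_center
```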

This tendency in the norm distribution is usually seen in both the key image and the target image. Consequently, even when the calculation targets of the Hamming distance are limited according to the norm as in the second comparative example, there are many combinations of local feature amounts of the key image and the target image whose norms are both near the median, and the number of Hamming distance calculations for those combinations becomes large.

The number of Hamming distance calculations is given by the product of the numbers of local feature amounts with close norms in the key image and in the target image. Thus, if the number of local feature amounts with a given norm doubles in both images, the number of combinations of local feature amounts with that norm quadruples. When norm values are concentrated near the median of the norm range, the number of combinations of local feature amounts whose norms are near the median therefore grows rapidly (quadratically in the number of such local feature amounts), and the number of Hamming distance calculations for those combinations becomes enormous. As a result, the improvement in calculation efficiency is small even though the calculation targets of the Hamming distance are limited according to the norm.
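The effect of concentration on the number of Hamming distance calculations can be illustrated with the following Python sketch, which counts the comparisons performed by a plus-or-minus-1 norm-limited search given the norms of the key-image and target-image features. The function name and the toy norm lists are hypothetical.

```python
from collections import Counter


def count_comparisons(key_norms, target_norms, width=1):
    """Number of Hamming calculations when targets are limited to +/- width."""
    target_hist = Counter(target_norms)   # missing norms count as 0
    return sum(target_hist[k + d]
               for k in key_norms
               for d in range(-width, width + 1))


# All norms sit at the median: every key feature meets every target feature.
concentrated = count_comparisons([64] * 4, [64] * 4)
# Doubling the features at that norm in both images quadruples the count.
doubled = count_comparisons([64] * 8, [64] * 8)
# The same four features spread over distant norms need far fewer comparisons.
spread = count_comparisons([10, 30, 50, 64], [10, 30, 50, 64])
```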

In addition, as described above, there are many bit string patterns whose norm is near the median. The combinations of local feature amounts whose norms are near the median can therefore include combinations with a large Hamming distance. This runs counter to the purpose of limiting the Hamming distance calculation to pairs whose Hamming distance is small. In that sense, calculating the Hamming distance for combinations of local feature amounts whose norms are near the median involves a great deal of wasted work.

このような問題に対し、第２の実施の形態では、ハミング距離の計算の対象となるすべての局所特徴量において同じ位置のビット値が反転されても、ハミング距離の計算結果は変化しない、という性質を利用して、第２の比較例の処理が次のように変形される。第２の実施の形態に係る画像処理装置１００は、上記の性質に基づき、ハミング距離計算に用いるすべての局所特徴量における適切な位置のビット値をあらかじめ反転することより、局所特徴量に含まれる１のビット数を減少させる。これにより、ノルムの分布がノルムの中央値から小さい方向に分散するように変化させ、ノルムの中央値付近に対するノルムの分布の集中度合いを軽減する。その結果、ノルムが中央値付近をとる局所特徴量の組み合わせ数を減少させ、それらの組み合わせによるハミング距離の計算回数を減少させる。 To address this problem, the second embodiment modifies the processing of the second comparative example by exploiting the property that the calculated Hamming distance does not change even if the bit values at the same position are inverted in all the local feature amounts subject to the Hamming distance calculation. Based on this property, the image processing apparatus 100 according to the second embodiment inverts in advance the bit values at appropriate positions in all the local feature amounts used for the Hamming distance calculation, thereby reducing the number of 1 bits contained in the local feature amounts. This changes the norm distribution so that it spreads from the median of the norm toward smaller values, reducing the concentration of the norm distribution around the median. As a result, the number of combinations of local feature amounts whose norms are near the median decreases, and so does the number of Hamming distance calculations for those combinations.
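
The invariance property used here can be checked directly. The following is a minimal sketch, assuming each local feature amount is held as a Python integer; the names `hamming`, `M` and `mask` are illustrative and do not come from the patent. Inverting the same bit positions in two bit strings is an XOR with a common mask, and the mask cancels when the two results are XORed again.

```python
import random


def hamming(a: int, b: int) -> int:
    """Number of bit positions at which a and b differ."""
    return bin(a ^ b).count("1")


M = 16                      # assumed length of a local feature amount in bits
rng = random.Random(0)
x = rng.getrandbits(M)      # two arbitrary local feature amounts
y = rng.getrandbits(M)
mask = rng.getrandbits(M)   # 1 marks a position inverted in every feature

# (x ^ mask) ^ (y ^ mask) == x ^ y, so the distance is unchanged.
assert hamming(x ^ mask, y ^ mask) == hamming(x, y)
```

Because the distance is preserved for any mask, the positions to invert can be chosen purely to reshape the norm distribution.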

図１２は、局所特徴量のビット反転処理の例を示す図である。図１２では、１番目からＮ番目までの各撮像画像に対応する特徴量管理テーブル１１３＿１，１１３＿２，・・・，１１３＿Ｎの例を示す。特徴量管理テーブル１１３＿１，１１３＿２，・・・，１１３＿Ｎでは、説明をわかりやすくするため、局所特徴量の値がビットごとに表されている。 FIG. 12 is a diagram illustrating an example of local feature amount bit inversion processing. FIG. 12 shows an example of the feature amount management tables 113_1, 113_2,..., 113_N corresponding to the first to Nth captured images. In the feature quantity management tables 113_1, 113_2,..., 113_N, the value of the local feature quantity is represented for each bit for easy understanding.

画像処理装置１００は、全撮像画像における全局所特徴量について、ビットごとに１の数を計数する。図１２の例では、ビットごとの１の計数値が集計テーブル１１５に登録されている。画像処理装置１００は、１の数が全局所特徴量の数（総特徴点数）の１／２を超えるビットを特定し、全局所特徴量における特定したビットのビット値を反転する。図１２の例では、１の数が全局所特徴量の数の１／２を超えたビットとして、上位から２番目のビットが特定されたものとする。この場合、画像処理装置１００は、全局所特徴量における上位から２番目のビットのビット値を反転する。 The image processing apparatus 100 counts the number of 1s at each bit position over all local feature amounts in all captured images. In the example of FIG. 12, the count of 1s for each bit position is registered in the aggregation table 115. The image processing apparatus 100 identifies each bit position at which the number of 1s exceeds 1/2 of the number of all local feature amounts (the total number of feature points), and inverts the bit value at the identified position in all local feature amounts. In the example of FIG. 12, the second bit from the top is assumed to have been identified as a bit position at which the number of 1s exceeds 1/2 of the number of all local feature amounts. In this case, the image processing apparatus 100 inverts the bit value of the second bit from the top in all local feature amounts.

このようなビット反転が施された局所特徴量では、ビット反転前と比較して０の数が増加している。このため、これらの局所特徴量のノルムの分布は、ビット反転前と比較して、ノルムの中央値から小さい方向に分散し、ノルムの度数のピーク値も減少する。 In the local feature amount subjected to such bit inversion, the number of 0 is increased as compared with that before the bit inversion. Therefore, the norm distribution of these local feature amounts is dispersed in a smaller direction from the median value of the norm, and the peak value of the norm frequency is also reduced, compared to before the bit inversion.

図１３は、ビット反転処理によるノルムの分布の変化の例を示す図である。なお、図１３では、説明をわかりやすくするため、局所特徴量のビット数や特徴点数が少ない場合の例を示している。 FIG. 13 is a diagram illustrating an example of a change in the norm distribution by the bit inversion process. Note that FIG. 13 shows an example in which the number of bits and the number of feature points of the local feature amount are small for easy understanding.

グラフ２２１ａは、あるキー画像の局所特徴量に基づくノルムのヒストグラムの例を示す。また、グラフ２２２ａは、ある対象画像の局所特徴量に基づくノルムのヒストグラムの例を示す。グラフ２２１ａ，２２２ａでは、いずれもノルムの中央値付近に分布が集中している。このようなキー画像と対象画像との間で類似特徴領域の探索が行われた場合、ハミング距離の計算回数は、４×５＋８×９＋６×４＝１１６（回）となる。 The graph 221a shows an example of a norm histogram based on a local feature amount of a certain key image. The graph 222a shows an example of a norm histogram based on the local feature amount of a certain target image. In each of the graphs 221a and 222a, the distribution is concentrated near the median value of the norm. When a similar feature region is searched between the key image and the target image, the number of Hamming distance calculations is 4 × 5 + 8 × 9 + 6 × 4 = 116 (times).

一方、グラフ２２１ｂ，２２２ｂは、上記手順でキー画像および対象画像の全局所特徴量についてビット反転処理が施された後におけるヒストグラムの例を示す。すなわち、グラフ２２１ｂは、キー画像についてのビット反転処理後の局所特徴量に基づくノルムのヒストグラムの例を示し、グラフ２２２ｂは、対象画像についてのビット反転処理後の局所特徴量に基づくノルムのヒストグラムの例を示す。 On the other hand, graphs 221b and 222b show examples of histograms after the bit inversion processing has been applied to all the local feature amounts of the key image and the target image by the above procedure. That is, the graph 221b shows an example of the norm histogram based on the local feature amounts of the key image after bit inversion processing, and the graph 222b shows an example of the norm histogram based on the local feature amounts of the target image after bit inversion processing.

グラフ２２１ｂでは、グラフ２２１ａと比較して、ノルムが中央値となる局所特徴量の数が８から５に大きく減少し、その分だけ、ノルムが中央値より小さい領域に分散して分布している。グラフ２２２ｂでも、グラフ２２２ａと比較して、ノルムが中央値となる局所特徴量の数が９から６に大きく減少し、その分だけ、ノルムが中央値より小さい領域に分散して分布している。 In the graph 221b, compared with the graph 221a, the number of local feature amounts whose norm equals the median drops sharply from 8 to 5, and the norms are correspondingly spread over the region below the median. Likewise, in the graph 222b, compared with the graph 222a, the number of local feature amounts whose norm equals the median drops sharply from 9 to 6, and the norms are correspondingly spread over the region below the median.

このようにビット反転後の局所特徴量を用いた場合のハミング距離の計算回数は、１×０＋２×２＋３×３＋３×４＋５×６＋４×３＝６７（回）となり、ビット反転前より大幅に減少する。すなわち、ビット反転により局所特徴量同士の組み合わせ数が減少し、それによってハミング距離の計算回数が減少する。したがって、ハミング距離の計算に要する時間が短縮され、その計算効率が向上する。 When the local feature amounts after bit inversion are used in this way, the number of Hamming distance calculations is 1 × 0 + 2 × 2 + 3 × 3 + 3 × 4 + 5 × 6 + 4 × 3 = 67, which is significantly lower than before bit inversion. That is, bit inversion reduces the number of combinations of local feature amounts, which in turn reduces the number of Hamming distance calculations. Therefore, the time required for calculating the Hamming distance is shortened and the calculation efficiency is improved.
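
The two calculation counts quoted for FIG. 13 can be reproduced with a short sketch. The bin counts below are read off the products in the text (4 × 5 + 8 × 9 + 6 × 4 and 1 × 0 + 2 × 2 + ...); treating the list index as a norm bin, the function name `count_distance_calcs`, and the window parameter `d` are assumptions made for illustration. The quoted figures correspond to `d = 0`, where only features with equal norms are compared.

```python
def count_distance_calcs(key_hist, target_hist, d=0):
    """Count key/target feature pairs whose norm bins differ by at most d."""
    total = 0
    for n, key_count in enumerate(key_hist):
        lo = max(0, n - d)
        hi = n + d + 1
        total += key_count * sum(target_hist[lo:hi])
    return total


# Histogram bin counts inferred from the arithmetic in the text.
key_before, target_before = [4, 8, 6], [5, 9, 4]
key_after, target_after = [1, 2, 3, 3, 5, 4], [0, 2, 3, 4, 6, 3]

print(count_distance_calcs(key_before, target_before))  # 116
print(count_distance_calcs(key_after, target_after))    # 67
```

Both images keep the same total of 18 feature points before and after inversion; only the spread over norm bins changes, and with it the sum of products.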

なお、図１３，図１４に示したように、本実施の形態の画像処理装置１００は、ビットごとに１の数を計数するものとするが、１の数の代わりに０の数を計数してもよい。この場合、画像処理装置１００は、０の数が全局所特徴量の数（総特徴点数）の１／２を超えるビットを特定し、全局所特徴量における特定したビットのビット値を反転する。このようにしてビット反転が施された後の局所特徴量に基づくノルムは、その中央値より大きい領域に分散して分布する。これにより、ビット反転後の局所特徴量を用いた場合のハミング距離の計算回数は、１の数を計数した場合と同様に減少する。 As shown in FIGS. 13 and 14, the image processing apparatus 100 according to the present embodiment counts the number of 1 for each bit, but counts the number of 0 instead of the number of 1. May be. In this case, the image processing apparatus 100 specifies bits whose number of 0 exceeds 1/2 of the total number of local feature values (total number of feature points), and inverts the bit values of the specified bits in all local feature values. The norm based on the local feature amount after the bit inversion is performed in this manner is distributed and distributed in a region larger than the median value. As a result, the number of Hamming distance calculations when the local feature after bit inversion is used is reduced in the same way as when the number of 1 is counted.

図１４は、画像処理装置が備える処理機能の構成例を示すブロック図である。画像処理装置１００は、記憶部１１０、画像取得部１２１、特徴量算出部１２２、特徴量変更部１２３および画像認識部１２４を有する。 FIG. 14 is a block diagram illustrating a configuration example of processing functions included in the image processing apparatus. The image processing apparatus 100 includes a storage unit 110, an image acquisition unit 121, a feature amount calculation unit 122, a feature amount change unit 123, and an image recognition unit 124.

記憶部１１０は、画像処理装置１００が備える記憶装置（例えば、ＲＡＭ１０２またはＨＤＤ１０３）の記憶領域として実装される。記憶部１１０には、画像データ１１１、画素ペア管理テーブル１１２および特徴量管理テーブル１１３が記憶される。画像データ１１１は、撮像画像のデータを示す。画素ペア管理テーブル１１２には、図４に示したように、各画素ペアを構成する第１画素および第２画素の座標が登録される。特徴量管理テーブル１１３は、撮像画像ごとに用意される。特徴量管理テーブル１１３には、図９に示したように、撮像画像内の各特徴点に対応するＩＤ、特徴点座標、局所特徴量およびノルムが登録される。 The storage unit 110 is implemented as a storage area of a storage device (for example, the RAM 102 or the HDD 103) included in the image processing apparatus 100. The storage unit 110 stores the image data 111, the pixel pair management table 112, and the feature amount management table 113. The image data 111 is the data of a captured image. As shown in FIG. 4, the coordinates of the first pixel and the second pixel constituting each pixel pair are registered in the pixel pair management table 112. A feature amount management table 113 is prepared for each captured image. As shown in FIG. 9, the ID, feature point coordinates, local feature amount, and norm corresponding to each feature point in the captured image are registered in the feature amount management table 113.

なお、記憶部１１０には、他に、図６に示した投票マップ１１４や、図１２に示した集計テーブル１１５が記憶されてもよい。 In addition, the voting map 114 shown in FIG. 6 and the aggregation table 115 shown in FIG. 12 may also be stored in the storage unit 110.
画像取得部１２１、特徴量算出部１２２、特徴量変更部１２３および画像認識部１２４の処理は、例えば、所定のプログラムがプロセッサ１０１に実行されることによって実現される。 The processing of the image acquisition unit 121, the feature amount calculation unit 122, the feature amount change unit 123, and the image recognition unit 124 is realized, for example, by the processor 101 executing a predetermined program.

画像取得部１２１は、撮像画像の画像データ１１１を取得して記憶部１１０に格納する。例えば、画像取得部１２１は、撮像画像の画像データ１１１を可搬型記録媒体１０６ａを介して、あるいはネットワーク１０７ａを介して取得する。 The image acquisition unit 121 acquires the image data 111 of the captured image and stores it in the storage unit 110. For example, the image acquisition unit 121 acquires the image data 111 of the captured image via the portable recording medium 106a or the network 107a.

特徴量算出部１２２は、画像データ１１１および画素ペア管理テーブル１１２を参照しながら、撮像画像内の各特徴点についての局所特徴量を算出し、算出した局所特徴量を対応する特徴量管理テーブル１１３に登録する。 The feature amount calculation unit 122 calculates a local feature amount for each feature point in the captured image while referring to the image data 111 and the pixel pair management table 112, and registers the calculated local feature amount in the corresponding feature amount management table 113.

特徴量変更部１２３は、全撮像画像の全特徴点に対応する局所特徴量についてビットごとに１の数を計数し、１の数が総特徴点数の１／２を超えるビットを特定する。特徴量変更部１２３は、全局所特徴量における特定したビットのビット値を反転する。さらに、特徴量変更部１２３は、ビット反転処理後の各局所特徴量のノルムを計算し、撮像画像ごとに、局所特徴量をノルムが小さい順に並び替える。 The feature amount changing unit 123 counts the number of 1 for each bit with respect to the local feature amounts corresponding to all feature points of all captured images, and identifies the bit whose number exceeds 1/2 of the total number of feature points. The feature amount changing unit 123 inverts the bit value of the specified bit in all the local feature amounts. Furthermore, the feature amount changing unit 123 calculates the norm of each local feature amount after the bit inversion processing, and rearranges the local feature amounts in ascending order of the norm for each captured image.

画像認識部１２４は、キー画像の選択操作を受け付け、選択されたキー画像以外の撮像画像の中からキー画像と類似する類似画像を検索する。 The image recognition unit 124 receives a key image selection operation and searches the captured images other than the selected key image for similar images that resemble the key image.
次に、画像処理装置１００の処理についてフローチャートを用いて説明する。 Next, the processing of the image processing apparatus 100 will be described using flowcharts.

図１５は、特徴量算出処理の例を示すフローチャートである。 FIG. 15 is a flowchart illustrating an example of the feature amount calculation process.
［ステップＳ１１］特徴量算出部１２２は、各撮像画像上に複数の特徴点を設定する。例えば、撮像画像上に等間隔（例えば、２４画素間隔）で特徴点を設定するＤｅｎｓｅＳａｍｐｌｉｎｇが用いられる。特徴量算出部１２２は、各撮像画像に対応する特徴量管理テーブル１１３に、設定した各特徴点についてのレコードを作成し、作成した各レコードにＩＤおよび特徴点座標を登録する。 [Step S11] The feature amount calculation unit 122 sets a plurality of feature points on each captured image. For example, Dense Sampling is used, in which feature points are set on a captured image at regular intervals (for example, every 24 pixels). The feature amount calculation unit 122 creates a record for each set feature point in the feature amount management table 113 corresponding to each captured image, and registers an ID and feature point coordinates in each created record.

［ステップＳ１２］特徴量算出部１２２は、撮像画像を１つ選択する。 [Step S12] The feature amount calculation unit 122 selects one captured image.
［ステップＳ１３］特徴量算出部１２２は、ステップＳ１２で選択した撮像画像から特徴点を１つ選択する。 [Step S13] The feature amount calculation unit 122 selects one feature point from the captured image selected in step S12.

［ステップＳ１４］特徴量算出部１２２は、ステップＳ１３で選択した特徴点を中心とした一定範囲の特徴領域において、画素ペア管理テーブル１１２に基づく画素ペアごとに輝度差を計算する。輝度差は、画素ペアを構成する画素のうち、第１画素の輝度値から第２画素の輝度値を減算することで算出される。 [Step S14] The feature amount calculation unit 122 calculates a luminance difference for each pixel pair based on the pixel pair management table 112 in a certain range of feature regions centered on the feature point selected in step S13. The luminance difference is calculated by subtracting the luminance value of the second pixel from the luminance value of the first pixel among the pixels constituting the pixel pair.

［ステップＳ１５］特徴量算出部１２２は、算出された各画素ペアの輝度差の符号に応じた値を画素ペアの順にビット列に付加する。例えば、輝度差が正値の場合はビット値“１”が付加され、輝度差が０以下の場合はビット値“０”が付加される。これにより、ステップＳ１３で選択した特徴点に対応する局所特徴量を示すビット列が算出される。特徴量算出部１２２は、算出したビット列を特徴量管理テーブル１１３における対応するレコードに登録する。 [Step S15] The feature amount calculation unit 122 adds a value corresponding to the calculated sign of the luminance difference of each pixel pair to the bit string in the order of the pixel pair. For example, when the luminance difference is a positive value, a bit value “1” is added, and when the luminance difference is 0 or less, a bit value “0” is added. Thereby, a bit string indicating the local feature amount corresponding to the feature point selected in step S13 is calculated. The feature amount calculation unit 122 registers the calculated bit string in a corresponding record in the feature amount management table 113.

［ステップＳ１６］特徴量算出部１２２は、撮像画像内の全特徴点について処理済みかを判定する。処理済みでない特徴点がある場合、ステップＳ１３に戻り、他の特徴点が選択される。一方、全特徴点について処理済みの場合、ステップＳ１７の処理が実行される。 [Step S16] The feature quantity calculation unit 122 determines whether all feature points in the captured image have been processed. If there is a feature point that has not been processed, the process returns to step S13, and another feature point is selected. On the other hand, when all the feature points have been processed, the process of step S17 is executed.

［ステップＳ１７］特徴量算出部１２２は、全撮像画像について処理済みかを判定する。処理済みでない撮像画像がある場合、ステップＳ１２に戻り、他の撮像画像が選択される。一方、全撮像画像について処理済みの場合、図１５の処理は終了される。 [Step S17] The feature amount calculation unit 122 determines whether all captured images have been processed. If there is a captured image that has not been processed, the process returns to step S12, and another captured image is selected. On the other hand, if all captured images have been processed, the processing in FIG. 15 is terminated.
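
Steps S14 and S15 can be sketched as follows. This is an illustration only: the function name `binary_descriptor`, the representation of the image as a nested list of luminance values, and the choice to store pixel pairs as offsets relative to the feature point are assumptions, and the feature region is assumed to lie entirely inside the image.

```python
def binary_descriptor(image, center, pixel_pairs):
    """Build a local feature amount from the signs of luminance differences.

    image:       2D list of luminance values, indexed as image[y][x]
    center:      (x, y) coordinates of the feature point
    pixel_pairs: list of ((dx1, dy1), (dx2, dy2)) offsets from the center
    """
    cx, cy = center
    bits = 0
    for (dx1, dy1), (dx2, dy2) in pixel_pairs:
        # Step S14: first-pixel luminance minus second-pixel luminance.
        diff = image[cy + dy1][cx + dx1] - image[cy + dy2][cx + dx2]
        # Step S15: append 1 for a positive difference, 0 otherwise.
        # The first pixel pair ends up as the top bit.
        bits = (bits << 1) | (1 if diff > 0 else 0)
    return bits


image = [[10, 20, 30],
         [40, 50, 60],
         [70, 80, 90]]
pairs = [((1, 0), (-1, 0)), ((-1, -1), (1, 1)), ((0, 0), (0, 0))]
print(bin(binary_descriptor(image, (1, 1), pairs)))  # 0b100
```

A difference of exactly zero maps to bit 0, matching the rule stated in step S15.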

以上の図１５の処理により、各撮像画像に対応する特徴量管理テーブル１１３に、各特徴点に対応する局所特徴量が登録される。 Through the processing of FIG. 15 described above, the local feature amount corresponding to each feature point is registered in the feature amount management table 113 corresponding to each captured image.
なお、図１５の処理は、次の図１６，図１７の処理が実行される画像処理装置１００とは別の装置において実行されてもよい。この場合、画像処理装置１００は、図１５の処理が実行された装置から、特徴量管理テーブル１１３の内容を取得する。 Note that the processing of FIG. 15 may be executed in an apparatus different from the image processing apparatus 100 in which the processing of FIGS. 16 and 17 is executed. In this case, the image processing apparatus 100 acquires the contents of the feature amount management table 113 from the apparatus that executed the processing of FIG. 15.

図１６，図１７は、画像検索処理の例を示すフローチャートである。 FIGS. 16 and 17 are flowcharts illustrating an example of the image search process.
まず、ステップＳ２１〜Ｓ２６において、局所特徴量のビット反転処理が実行される。 First, in steps S21 to S26, the bit inversion processing of the local feature amounts is executed.
［ステップＳ２１］特徴量変更部１２３は、ビット列におけるビットの位置を示す変数ｂを０に初期化する。 [Step S21] The feature amount changing unit 123 initializes a variable b indicating a bit position in the bit string to 0.

［ステップＳ２２］全撮像画像の総特徴点数をＬとする。特徴量変更部１２３は、Ｌ個の特徴点のそれぞれに対応する局所特徴量のビットのうち、上位からｂ番目のビットを参照する。特徴量変更部１２３は、全局所特徴量における上位からｂ番目のビットにセットされた１の個数Ｓ（ｂ）を算出する。 [Step S22] Let L be the total number of feature points of all captured images. The feature amount changing unit 123 refers to the b-th bit from the top among the bits of the local feature amount corresponding to each of the L feature points. The feature amount changing unit 123 calculates the number S (b) of 1 set in the b-th bit from the top in all the local feature amounts.

［ステップＳ２３］特徴量変更部１２３は、算出された１の個数Ｓ（ｂ）が、Ｌ／２より大きいかを判定する。Ｓ（ｂ）がＬ／２より大きい場合、ステップＳ２４の処理が実行され、Ｓ（ｂ）がＬ／２以下の場合、ステップＳ２５の処理が実行される。なお、ステップＳ２３での判定閾値は、Ｌ／２より大きい値とされてもよい。 [Step S23] The feature amount changing unit 123 determines whether the calculated number S (b) of 1 is larger than L / 2. When S (b) is greater than L / 2, the process of step S24 is executed, and when S (b) is equal to or less than L / 2, the process of step S25 is executed. Note that the determination threshold value in step S23 may be a value larger than L / 2.

［ステップＳ２４］特徴量変更部１２３は、特徴量管理テーブル１１３において、Ｌ個のすべての局所特徴量におけるｂ番目のビットを反転する。 [Step S24] The feature amount changing unit 123 inverts the b-th bit in all L local feature amounts in the feature amount management table 113.
［ステップＳ２５］特徴量変更部１２３は、変数ｂを１だけインクリメントする。 [Step S25] The feature amount changing unit 123 increments the variable b by 1.

［ステップＳ２６］特徴量変更部１２３は、変数ｂの値が局所特徴量のビット数（特徴領域内の画素ペア数）Ｍと一致するかを判定する。変数ｂの値がビット数より小さい場合、すなわち、処理済みでないビットが残っている場合には、ステップＳ２２の処理が実行される。一方、変数ｂの値がビット数と一致する場合、すなわち、全ビットについて処理済みの場合には、ステップＳ２７の処理が実行される。 [Step S26] The feature amount changing unit 123 determines whether or not the value of the variable b matches the number of bits (the number of pixel pairs in the feature region) M of the local feature amount. If the value of the variable b is smaller than the number of bits, that is, if there are unprocessed bits, the process of step S22 is executed. On the other hand, when the value of the variable b matches the number of bits, that is, when all the bits have been processed, the process of step S27 is executed.
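
The loop of steps S21 to S26 can be sketched as follows, assuming the local feature amounts of all captured images are pooled into one list of M-bit integers. The function name `invert_majority_bits` and the optional `threshold` argument are illustrative; the latter reflects the remark in step S23 that the decision threshold may be set above L/2. Instead of rewriting every feature once per bit as in step S24, the sketch collects the identified positions in a mask and applies one XOR at the end, which gives the same result.

```python
def invert_majority_bits(features, M, threshold=None):
    """Invert, in every feature, each bit position where 1s are in the majority.

    features: list of M-bit local feature amounts (ints) from all images
    Returns (converted features, mask of inverted positions).
    """
    L = len(features)                      # total number of feature points
    if threshold is None:
        threshold = L / 2
    mask = 0
    for b in range(M):                     # b = 0 is the top bit
        shift = M - 1 - b
        S_b = sum((f >> shift) & 1 for f in features)   # step S22
        if S_b > threshold:                # step S23
            mask |= 1 << shift             # step S24, deferred
    return [f ^ mask for f in features], mask


features = [0b1101, 0b0110, 0b0100, 0b1111]
converted, mask = invert_majority_bits(features, M=4)
print(bin(mask))                           # 0b100
print([format(f, "04b") for f in converted])
```

In this small example only the second bit from the top holds a 1 in more than half of the features, so only that position is inverted, as in the FIG. 12 example.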

次に、ステップＳ２７〜Ｓ３０では、ステップＳ２４のビット反転が施された特徴量管理テーブル１１３を参照しながら、ノルムに応じた局所特徴量の並び替え処理が実行される。 Next, in steps S27 to S30, the local feature quantity rearrangement process according to the norm is executed while referring to the feature quantity management table 113 subjected to the bit inversion in step S24.

［ステップＳ２７］特徴量変更部１２３は、撮像画像を１つ選択する。 [Step S27] The feature amount changing unit 123 selects one captured image.
［ステップＳ２８］特徴量変更部１２３は、選択した撮像画像内の各特徴点について、局所特徴量のノルムを算出する。特徴量変更部１２３は、算出したノルムを、選択した撮像画像に対応する特徴量管理テーブル１１３に登録する。 [Step S28] The feature amount changing unit 123 calculates the norm of the local feature amount for each feature point in the selected captured image. The feature amount changing unit 123 registers the calculated norm in the feature amount management table 113 corresponding to the selected captured image.

［ステップＳ２９］特徴量変更部１２３は、選択した撮像画像に含まれる特徴点を、算出したノルムの大きさ順に並び替える。ここでは、特徴点は、ノルムが小さい順に並び替えられるものとする。また、ここでは、特徴量変更部１２３は、選択した撮像画像に対応する特徴量管理テーブル１１３のレコードを、算出したノルムが小さい順に並び替えるものとする。 [Step S29] The feature amount changing unit 123 rearranges the feature points included in the selected captured image in order of the calculated norm size. Here, it is assumed that the feature points are rearranged in ascending order of norm. Here, it is assumed that the feature amount changing unit 123 rearranges the records of the feature amount management table 113 corresponding to the selected captured image in ascending order of the calculated norm.

［ステップＳ３０］特徴量変更部１２３は、全撮像画像について処理済みかを判定する。処理済みでない撮像画像がある場合、ステップＳ２７に戻り、他の撮像画像が選択される。一方、全撮像画像について処理済みの場合、図１７のステップＳ３１の処理が実行される。 [Step S30] The feature amount changing unit 123 determines whether all captured images have been processed. If there is a captured image that has not been processed, the process returns to step S27, and another captured image is selected. On the other hand, if all captured images have been processed, the process of step S31 in FIG. 17 is executed.
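
Steps S28 and S29 for one captured image can be sketched as follows. The sketch assumes that the norm is the number of 1 bits in the bit string, as the discussion of bit inversion implies, and models each record of the feature amount management table as a dictionary; the keys `id`, `xy`, `feature` and `norm` are illustrative names.

```python
def norm(bits: int) -> int:
    """Norm of a local feature amount, taken here as its number of 1 bits."""
    return bin(bits).count("1")


def sort_records_by_norm(records):
    """Return copies of the records with 'norm' added, in ascending norm order."""
    with_norm = [dict(r, norm=norm(r["feature"])) for r in records]   # step S28
    return sorted(with_norm, key=lambda r: r["norm"])                 # step S29


table = [
    {"id": 1, "xy": (0, 0), "feature": 0b111},
    {"id": 2, "xy": (24, 0), "feature": 0b001},
    {"id": 3, "xy": (48, 0), "feature": 0b011},
]
print([(r["id"], r["norm"]) for r in sort_records_by_norm(table)])
# [(2, 1), (3, 2), (1, 3)]
```

Keeping each table sorted by norm lets the later search locate all records in a given norm range as one contiguous slice.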

次に、ステップＳ３１〜Ｓ４０では、ステップＳ２９でレコードが並び替えられた特徴量管理テーブル１１３を参照しながら、キー画像に類似する類似画像をキー画像以外の撮像画像の中から特定する処理が実行される。 Next, in steps S31 to S40, processing for specifying a similar image similar to the key image from captured images other than the key image is performed while referring to the feature amount management table 113 in which the records are rearranged in step S29. Is done.

［ステップＳ３１］画像認識部１２４は、ユーザからのキー画像の選択入力操作を受け付ける。 [Step S31] The image recognition unit 124 receives a key image selection input operation from the user.
［ステップＳ３２］画像認識部１２４は、選択されたキー画像以外の撮像画像（対象画像）の中から、対象画像を１つ選択する。 [Step S32] The image recognition unit 124 selects one target image from among the captured images other than the selected key image (the target images).

［ステップＳ３３］画像認識部１２４は、キー画像の特徴点を１つ選択する。このステップＳ３３では、具体的には、キー画像に対応する特徴量管理テーブル１１３の先頭から順に、１つの特徴点に対応するレコードが選択される。 [Step S33] The image recognition unit 124 selects one feature point of the key image. In this step S33, specifically, a record corresponding to one feature point is selected in order from the top of the feature amount management table 113 corresponding to the key image.

［ステップＳ３４］画像認識部１２４は、ハミング距離の計算対象とするノルムの範囲を特定する。具体的には、画像認識部１２４は、ステップＳ３３で選択した特徴点のレコードからノルムの値を取得する。ここで、取得したノルムの値をｎとする。画像認識部１２４は、ｎ−ｄからｎ＋ｄまでの範囲をハミング距離の計算対象とするノルムの範囲とする。なお、ｄは０以上の整数であり、例えば１とされる。 [Step S34] The image recognizing unit 124 specifies a norm range to be calculated for the Hamming distance. Specifically, the image recognition unit 124 acquires the norm value from the record of the feature point selected in step S33. Here, the acquired norm value is n. The image recognizing unit 124 sets the range from n−d to n + d as the norm range for which the Hamming distance is calculated. Note that d is an integer greater than or equal to 0, for example, 1.

［ステップＳ３５］画像認識部１２４は、対象画像に対応する特徴量管理テーブル１１３のレコードのうち、登録されたノルムがｎ−ｄからｎ＋ｄまでの値であるレコードを１つずつ選択する。画像認識部１２４は、対象画像に対応する特徴量管理テーブル１１３から選択したレコード内の局所特徴量と、ステップＳ３３で選択したレコード内の局所特徴量とのハミング距離を計算する。画像認識部１２４は、対象画像に対応する特徴量管理テーブル１１３から選択したレコードのうち、ハミング距離が最小のレコードに対応する特徴点を、類似度が最も高い対応点として抽出する。 [Step S35] The image recognizing unit 124 selects, from the records in the feature amount management table 113 corresponding to the target image, records whose registered norms are values from nd to n + d one by one. The image recognition unit 124 calculates the Hamming distance between the local feature amount in the record selected from the feature amount management table 113 corresponding to the target image and the local feature amount in the record selected in step S33. The image recognition unit 124 extracts the feature point corresponding to the record with the smallest Hamming distance among the records selected from the feature amount management table 113 corresponding to the target image as the corresponding point with the highest similarity.

［ステップＳ３６］画像認識部１２４は、ステップＳ３３で選択した特徴点と、ステップＳ３５で抽出された対応点とが一致するように対象画像にキー画像を重ねた場合の、対象画像におけるキー画像の中心点の位置を推定する。この処理では、前述した式（１−１），（１−２）を用いて中心点の位置が算出される。 [Step S36] The image recognizing unit 124 overlays the key image on the target image so that the feature point selected in step S33 matches the corresponding point extracted in step S35. Estimate the position of the center point. In this process, the position of the center point is calculated using the above-described equations (1-1) and (1-2).

画像認識部１２４は、対象画像の画素のうち、算出された中心点の位置に対応する画素に投票する。例えば、画像認識部１２４は、対象画像の各画素をマッピングした投票マップ１１４のエントリのうち、算出された中心点の位置に対応するエントリの投票数を１だけインクリメントする。なお、投票先の画素は、中心点の位置に対応する画素だけでなく、その画素を中心とした一定範囲内の各画素とされてもよい。 The image recognition unit 124 votes for the pixel corresponding to the calculated center point position among the pixels of the target image. For example, the image recognition unit 124 increments the number of votes of an entry corresponding to the calculated position of the center point among the entries of the voting map 114 mapping each pixel of the target image by one. Note that the voting destination pixel may be not only the pixel corresponding to the position of the center point but also each pixel within a certain range centered on the pixel.

［ステップＳ３７］画像認識部１２４は、キー画像内の全特徴点について処理済みかを判定する。処理済みでない特徴点がある場合、ステップＳ３３に戻り、他の特徴点が選択される。一方、全特徴点について処理済みの場合、ステップＳ３８の処理が実行される。 [Step S37] The image recognition unit 124 determines whether all feature points in the key image have been processed. If there is a feature point that has not been processed, the process returns to step S33, and another feature point is selected. On the other hand, if all feature points have been processed, the process of step S38 is executed.

［ステップＳ３８］画像認識部１２４は、対象画像の各画素に対する投票数の最大値が所定の閾値を超えたかを判定する。画像認識部１２４は、投票数の最大値が閾値を超えた場合に、対象画像を類似画像であると判定し、投票数の最大値が閾値以下の場合に、対象画像を類似画像でないと判定する。 [Step S38] The image recognition unit 124 determines whether the maximum number of votes over the pixels of the target image exceeds a predetermined threshold. The image recognition unit 124 determines that the target image is a similar image when the maximum number of votes exceeds the threshold, and determines that the target image is not a similar image when the maximum number of votes is equal to or less than the threshold.

なお、このステップＳ３８では、画像認識部１２４は、例えば、投票数に基づいてキー画像と対象画像との間の類似度を算出することもできる。 Note that, in step S38, the image recognition unit 124 can also calculate, for example, the similarity between the key image and the target image based on the number of votes.
［ステップＳ３９］画像認識部１２４は、全対象画像について処理済みかを判定する。処理済みでない対象画像がある場合、ステップＳ３２に戻り、他の対象画像が選択される。一方、全対象画像について処理済みの場合、ステップＳ４０の処理が実行される。 [Step S39] The image recognition unit 124 determines whether all target images have been processed. If there is a target image that has not been processed, the process returns to step S32 and another target image is selected. On the other hand, if all the target images have been processed, the process of step S40 is executed.

［ステップＳ４０］画像認識部１２４は、類似画像の検索結果を出力する。例えば、画像認識部１２４は、画面上に検索された類似画像のファイル名やサムネイル画像を表示させる。 [Step S40] The image recognition unit 124 outputs a search result of similar images. For example, the image recognition unit 124 displays the file name and thumbnail image of the similar image searched for on the screen.
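
Steps S33 to S38 for one target image can be sketched as follows. The sketch assumes that records are dictionaries with keys `xy`, `feature` and `norm` and that the target records are already sorted by ascending norm; these names are illustrative. Restricting candidates to norms in [n − d, n + d] is consistent with the fact that the Hamming distance between two bit strings is never smaller than the difference of their 1-bit counts. Equations (1-1) and (1-2) are not reproduced in this section, so the center estimate below simply assumes a pure translation between the two images and may differ from them.

```python
from bisect import bisect_left, bisect_right
from collections import Counter


def hamming(a, b):
    return bin(a ^ b).count("1")


def find_corresponding(key_rec, target_recs, target_norms, d=1):
    """Steps S34-S35: nearest target record among norms in [n - d, n + d]."""
    n = key_rec["norm"]
    lo = bisect_left(target_norms, n - d)
    hi = bisect_right(target_norms, n + d)
    candidates = target_recs[lo:hi]
    if not candidates:
        return None
    return min(candidates, key=lambda t: hamming(key_rec["feature"], t["feature"]))


def is_similar(key_recs, key_center, target_recs, vote_threshold, d=1):
    """Steps S33-S38: vote for the key-image center, then threshold the peak."""
    target_norms = [t["norm"] for t in target_recs]   # ascending by assumption
    votes = Counter()                                 # stands in for voting map 114
    for k in key_recs:
        t = find_corresponding(k, target_recs, target_norms, d)
        if t is None:
            continue
        # Assumed translation-only estimate of the key-image center (step S36).
        cx = t["xy"][0] + key_center[0] - k["xy"][0]
        cy = t["xy"][1] + key_center[1] - k["xy"][1]
        votes[(cx, cy)] += 1
    return max(votes.values(), default=0) > vote_threshold
```

With `d = 0` only features of equal norm are compared, which is the setting behind the calculation counts given for FIG. 13.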

以上説明した第２の実施の形態では、ハミング距離計算に用いるすべての局所特徴量について、１の数が全特徴点数の半数を超えるビットのビット値が反転される。これにより、局所特徴量のビット列における１の数が減少し、０の数が増加する。このようにビット反転が施された局所特徴量のノルムの分布は、ビット反転が施されていない場合と比較して、ノルムがとり得る範囲の中央値における度数が減少し、その中央値から小さい範囲に分散する。その結果、キー画像の特徴点の対応点の探索範囲を、対象画像の特徴点のうちノルムが近い特徴点に限定したとき、ノルムが中央値付近をとる特徴点同士の組み合わせ数が減少し、その分だけハミング距離の計算回数が減少する。 In the second embodiment described above, for all the local feature amounts used for the Hamming distance calculation, the bit values are inverted at each bit position where the number of 1s exceeds half of the total number of feature points. This decreases the number of 1s and increases the number of 0s in the bit strings of the local feature amounts. Compared with the case without bit inversion, the norm distribution of the bit-inverted local feature amounts has a lower frequency at the median of the range the norm can take, and spreads from that median toward smaller values. As a result, when the search range for the point corresponding to a feature point of the key image is limited to those feature points of the target image whose norms are close, the number of combinations of feature points whose norms are near the median decreases, and the number of Hamming distance calculations decreases accordingly.

ここで、ビット反転処理により、ノルムの中央値付近以外の範囲では特徴点同士の組み合わせ数は増加する。しかしながら、ビット反転を行わない場合にはノルムの中央値付近での集中度合いが極端に高かったことから、ビット反転処理後の特徴量に基づくノルムのヒストグラムでは、特徴点の度数が減少した各ノルムでの度数の減少数より、特徴点の度数が増加した各ノルムでの度数の増加数の方が大きくなりやすい。しかも、このようなノルムの分布の変化が、キー画像と対象画像の両方において発生する。このため、全体としてはハミング距離の計算回数が大きく減少する可能性が高い。したがって、第２の実施の形態によれば、対象画像がキー画像と類似するかを判定するための処理に要する時間が短縮され、処理効率が向上する。 Here, the bit inversion processing increases the number of combinations of feature points in the ranges away from the median of the norm. However, because the concentration near the median of the norm is extremely high when bit inversion is not performed, in the norm histogram based on the feature amounts after bit inversion processing, the increase in frequency at each norm where the frequency of feature points has increased tends to be larger than the decrease in frequency at each norm where it has decreased. Moreover, this change in the norm distribution occurs in both the key image and the target image. For this reason, the number of Hamming distance calculations is, as a whole, highly likely to decrease greatly. Therefore, according to the second embodiment, the time required for determining whether a target image is similar to the key image is shortened, and the processing efficiency is improved.

また、検索処理に利用するすべての撮像画像について事前にビット反転処理を施し、その後にキー画像に類似する類似画像を他の撮像画像から検索する手順としたことにより、処理効率をさらに向上させることができる。 Further, the processing efficiency is further improved by performing a bit inversion process on all captured images used for the search process in advance, and then searching for similar images similar to the key image from other captured images. Can do.

なお、例えば、撮像画像の数が少ない場合には、１枚のキー画像の局所特徴量と１枚の対象画像の局所特徴量との間で、次のようにしてビット反転処理が行われてもよい。画像処理装置１００は、キー画像の各局所特徴量に基づいて、１（または０）の数がキー画像内の総特徴点数の１／２を超えるビットを特定する。また、画像処理装置１００は、対象画像の各局所特徴量に基づいて、１（または０）の数が対象画像内の総特徴点数の１／２を超えるビットを特定する。そして、画像処理装置１００は、キー画像と対象画像の両方において、１（または０）の数が各画像内の総特徴点数の１／２を超えたビットを特定し、両画像の全局所特徴量における特定したビットの値を反転する。これにより、キー画像と対象画像の両方について、ノルムの分布を、その中央値より小さい（または大きい）領域の方向に確実に分散させることができるようになり、ハミング距離の計算数を確実に低減することが可能になる。 For example, when the number of captured images is small, bit inversion processing is performed between the local feature amount of one key image and the local feature amount of one target image as follows. Also good. The image processing apparatus 100 specifies bits whose number of 1 (or 0) exceeds 1/2 of the total number of feature points in the key image based on each local feature amount of the key image. In addition, the image processing apparatus 100 specifies bits whose number of 1 (or 0) exceeds 1/2 of the total number of feature points in the target image based on each local feature amount of the target image. Then, the image processing apparatus 100 identifies a bit in which the number of 1 (or 0) exceeds 1/2 of the total number of feature points in each image in both the key image and the target image, and all local features of both images. Inverts the value of the specified bit in the quantity. As a result, for both the key image and the target image, the norm distribution can be reliably distributed in the direction of the area smaller (or larger) than its median value, and the number of Hamming distance calculations can be reliably reduced. It becomes possible to do.
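
The per-pair variant described above can be sketched as follows, for the case where 1s are counted. The function names `majority_mask` and `invert_for_pair` are illustrative, and features are again assumed to be M-bit integers. A bit position is inverted only if 1s are in the majority in the key image and also in the target image, and the same mask is applied to both images so that Hamming distances between them are preserved.

```python
def majority_mask(features, M):
    """Mask with 1s at positions where more than half of the features hold a 1."""
    half = len(features) / 2
    mask = 0
    for shift in range(M):
        if sum((f >> shift) & 1 for f in features) > half:
            mask |= 1 << shift
    return mask


def invert_for_pair(key_features, target_features, M):
    """Invert positions where 1s are the majority in BOTH images."""
    mask = majority_mask(key_features, M) & majority_mask(target_features, M)
    return ([f ^ mask for f in key_features],
            [f ^ mask for f in target_features],
            mask)


key = [0b110, 0b111, 0b100]
target = [0b101, 0b100, 0b011]
new_key, new_target, mask = invert_for_pair(key, target, M=3)
print(bin(mask))  # 0b100
```

Because the mask depends on the specific pair of images, it has to be recomputed for every key/target combination, which is the overhead discussed in the following paragraph.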

上記のような１枚のキー画像の局所特徴量と１枚の対象画像の局所特徴量との間でのビット反転処理は、撮像画像の数が少ない場合に有効である。しかし、撮像画像の数が多くなるほど、画像の組み合わせごとに、１（または０）の数が各画像内の総特徴点数の１／２を超えたビットを特定してビット反転するという処理の負荷が相対的に大きくなり、処理効率が低下する。このため、撮像画像が多いほど、図１６，図１７の処理のように全撮像画像内の局所特徴点について一度にビット反転処理を行う方が、処理効率が高くなる。 The bit inversion processing between the local feature amount of one key image and the local feature amount of one target image as described above is effective when the number of captured images is small. However, as the number of captured images increases, for each combination of images, the processing load of specifying and inverting bits where the number of 1 (or 0) exceeds 1/2 of the total number of feature points in each image Becomes relatively large, and processing efficiency decreases. For this reason, as the number of captured images increases, the processing efficiency increases when the bit inversion processing is performed on local feature points in all captured images at once as in the processing of FIGS.

なお、上記の各実施の形態に示した装置（画像処理装置１，１００）の処理機能は、コンピュータによって実現することができる。その場合、各装置が有すべき機能の処理内容を記述したプログラムが提供され、そのプログラムをコンピュータで実行することにより、上記処理機能がコンピュータ上で実現される。処理内容を記述したプログラムは、コンピュータで読み取り可能な記録媒体に記録しておくことができる。コンピュータで読み取り可能な記録媒体としては、磁気記憶装置、光ディスク、光磁気記録媒体、半導体メモリなどがある。磁気記憶装置には、ハードディスク装置（ＨＤＤ）、フレキシブルディスク（ＦＤ）、磁気テープなどがある。光ディスクには、ＤＶＤ（Digital Versatile Disc）、ＤＶＤ−ＲＡＭ、ＣＤ−ＲＯＭ（Compact Disc-Read Only Memory）、ＣＤ−Ｒ（Recordable）／ＲＷ（ReWritable）などがある。光磁気記録媒体には、ＭＯ（Magneto-Optical disk）などがある。 The processing functions of the apparatuses (image processing apparatuses 1 and 100) described in the above embodiments can be realized by a computer. In that case, a program describing the processing contents of the functions that each device should have is provided, and the processing functions are realized on the computer by executing the program on the computer. The program describing the processing contents can be recorded on a computer-readable recording medium. Examples of the computer-readable recording medium include a magnetic storage device, an optical disk, a magneto-optical recording medium, and a semiconductor memory. Examples of the magnetic storage device include a hard disk device (HDD), a flexible disk (FD), and a magnetic tape. Optical disks include DVD (Digital Versatile Disc), DVD-RAM, CD-ROM (Compact Disc-Read Only Memory), CD-R (Recordable) / RW (ReWritable), and the like. Magneto-optical recording media include MO (Magneto-Optical disk).

When the program is distributed, for example, portable recording media such as DVDs and CD-ROMs on which the program is recorded are sold. The program can also be stored in a storage device of a server computer and transferred from the server computer to other computers via a network.

The computer that executes the program stores, for example, the program recorded on the portable recording medium or the program transferred from the server computer in its own storage device. The computer then reads the program from its own storage device and executes processing according to the program. The computer can also read the program directly from the portable recording medium and execute processing according to it. In addition, each time a program is transferred from a server computer connected via a network, the computer can execute processing according to the received program.

DESCRIPTION OF SYMBOLS
1 Image processing apparatus
2 Storage unit
3 Arithmetic unit
10a, 10b Feature amounts of the first image
11, 21, 22 Converted bit strings
20a, 20b Feature amounts of the second image

Claims

An image processing apparatus comprising:
a storage unit that stores bit strings indicating features of a plurality of feature regions set in each of a first image and a second image; and
an arithmetic unit,
wherein the arithmetic unit:
identifies, from among the bit positions of the bit strings, a specific bit position at which the number of feature regions having a predetermined value set is equal to or greater than a predetermined determination threshold that is greater than 1/2 of the total number of all the feature regions in the first image and the second image;
generates converted bit strings for all the feature regions by inverting the value at the specific bit position in the bit strings of all the feature regions in the first image and the second image; and
executes a search process that searches, from among the feature regions of the second image, for a similar feature region similar to each feature region of the first image based on the Hamming distance between the converted bit strings of the feature regions, wherein, in the search process, the feature regions of the second image for which the Hamming distance is calculated for each feature region of the first image are limited to feature regions whose converted bit string has a norm within a fixed range from the norm of the converted bit string of that feature region of the first image.

The image processing apparatus according to claim 1, wherein:
the storage unit stores the bit strings indicating features of a plurality of feature regions set in each of a plurality of second images; and
the arithmetic unit:
in identifying the specific bit position, identifies as the specific bit position a bit position at which the number of feature regions having the predetermined value set is equal to or greater than a predetermined determination threshold that is greater than 1/2 of the total number of all the feature regions in the first image and the plurality of second images;
in generating the converted bit strings, generates the converted bit strings for all the feature regions by inverting the value at the specific bit position in the bit strings of all the feature regions in the first image and the plurality of second images; and
in the search process, searches the feature regions in each of the plurality of second images for the similar feature region similar to each feature region of the first image.

The image processing apparatus according to claim 1, wherein the arithmetic unit further:
identifies the position of the first image within the second image based on, for each feature region of the first image, the positional relationship between that feature region and its corresponding similar feature region; and
outputs, based on the result of identifying the position, information based on the similarity between the first image and the second image.

An image search method in which an image processing apparatus, capable of acquiring bit strings from a storage unit that stores the bit strings indicating features of a plurality of feature regions set in each of a first image and a second image, performs:
identifying, from among the bit positions of the bit strings, a specific bit position at which the number of feature regions having a predetermined value set is equal to or greater than a predetermined determination threshold that is greater than 1/2 of the total number of all the feature regions in the first image and the second image;
generating converted bit strings for all the feature regions by inverting the value at the specific bit position in the bit strings of all the feature regions in the first image and the second image; and
executing a search process that searches, from among the feature regions of the second image, for a similar feature region similar to each feature region of the first image based on the Hamming distance between the converted bit strings of the feature regions, wherein, in the search process, the feature regions of the second image for which the Hamming distance is calculated for each feature region of the first image are limited to feature regions whose converted bit string has a norm within a fixed range from the norm of the converted bit string of that feature region of the first image.

An image search program that causes a computer, capable of acquiring bit strings from a storage unit that stores the bit strings indicating features of a plurality of feature regions set in each of a first image and a second image, to execute processing comprising:
identifying, from among the bit positions of the bit strings, a specific bit position at which the number of feature regions having a predetermined value set is equal to or greater than a predetermined determination threshold that is greater than 1/2 of the total number of all the feature regions in the first image and the second image;
generating converted bit strings for all the feature regions by inverting the value at the specific bit position in the bit strings of all the feature regions in the first image and the second image; and
executing a search process that searches, from among the feature regions of the second image, for a similar feature region similar to each feature region of the first image based on the Hamming distance between the converted bit strings of the feature regions, wherein, in the search process, the feature regions of the second image for which the Hamming distance is calculated for each feature region of the first image are limited to feature regions whose converted bit string has a norm within a fixed range from the norm of the converted bit string of that feature region of the first image.
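
The norm-limited search process recited in the claims can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the names `popcount` and `search_similar` are hypothetical, bit strings are represented as Python integers, the norm is taken to be the number of 1 bits, and the "fixed range" is assumed to equal the maximum Hamming distance accepted as a match. That last choice relies on the fact that the difference between the numbers of 1 bits in two bit strings never exceeds their Hamming distance, so a candidate outside the range cannot be a match.

```python
from typing import List, Optional, Tuple


def popcount(x: int) -> int:
    """Number of 1 bits in x (used here as the norm of a bit string)."""
    return bin(x).count("1")


def search_similar(query: int, candidates: List[int],
                   max_dist: int) -> Optional[Tuple[int, int]]:
    """Return (index, distance) of the nearest candidate whose Hamming
    distance to query is at most max_dist, or None if there is none."""
    q_norm = popcount(query)
    best: Optional[Tuple[int, int]] = None
    for i, cand in enumerate(candidates):
        # Skip the Hamming computation when the norms differ too much.
        if abs(popcount(cand) - q_norm) > max_dist:
            continue
        dist = popcount(query ^ cand)
        if dist <= max_dist and (best is None or dist < best[1]):
            best = (i, dist)
    return best
```

In a real system the candidate norms would presumably be computed once and stored, so that the range check costs only a comparison per candidate.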