JP2012079186A

JP2012079186A - Image retrieval device, image retrieval method and program

Info

Publication number: JP2012079186A
Application number: JP2010225303A
Authority: JP
Inventors: Stejic Zoran; ゾランステイチ
Original assignee: Yahoo Japan Corp
Current assignee: Yahoo Japan Corp
Priority date: 2010-10-05
Filing date: 2010-10-05
Publication date: 2012-04-19
Anticipated expiration: 2030-10-05
Also published as: JP5014479B2

Abstract

PROBLEM TO BE SOLVED: To realize highly accurate similar image retrieval that focuses on the characteristic region of a query image. SOLUTION: A region containing partial images of a retrieval target image that are similar to a partial image of a query image is cut out, and the similarity between this region and the query image is calculated. Because the similarity is weighted by the ratio between the number of partial images contained in the region and the number of those partial images that are similar to a partial image extracted from the query image, the similarity is calculated with a focus on the partial images similar to the query image.

Description

The present invention relates to an image search device and the like that searches for images similar to an image serving as a search key.

A known technique searches for images similar to an image given as a search key (hereinafter the "query image") by comparing image feature amounts, that is, numerical representations of image features such as color scheme, texture, and shape. When the user inputs a query image, a feature amount is extracted from it and its similarity to the feature amounts of the search target images is calculated in order to find similar images (see, for example, Patent Document 1).

A feature amount extracted from a single image describes the image as a whole, so it is effective when searching for images that are similar overall. In contrast, when searching for images whose partial regions (hereinafter "partial images") are similar, one image can be regarded as being composed of a plurality of regions, and the image can be described by the feature amount of each partial image. This enables a similar image search that gives weight to the partial images.

However, if an image is divided into partial images and the feature amounts calculated from each partial image are simply compared, the amount of computation becomes enormous. For example, for two images each divided into 100 parts, the feature amounts must be compared and the similarity calculated for 100 × 100 combinations.

A technique called visual keywords was therefore devised, which allows the similarity to be calculated from partial images while keeping the amount of computation down. With visual keywords, one image is regarded as being composed of a plurality of partial images. Partial images are extracted from the image and classified, based on their feature amounts, into clusters formed in advance by clustering images, and a feature vector is generated from the number of partial images belonging to each cluster.

Using visual keywords in this way enables an accurate image search based on feature amounts that capture the image as a set of fine regions, rather than on a feature amount extracted from the image as a whole.

Patent Document 1: Japanese Unexamined Patent Application Publication No. 2001-52175 (JP 2001-52175 A)

An image that a user inputs as a query contains a characteristic region that carries the user's search intent, and it is desirable to judge similarity with a focus on this "characteristic region". For example, when the query image contains a person, it is desirable to judge similarity with a focus on the features of that person.

However, conventional visual keywords and the technique of Patent Document 1 calculate the similarity by treating the entire image evenly, so a search specialized for the characteristic region of an image was not possible.

The present invention has been made in view of the above problem, and its object is to realize a highly accurate similar image search that focuses on the characteristic region of a query image.

To achieve the above object, the first invention is an image search device that calculates a similarity to a query image and searches the search target images for images with a high similarity, the device comprising: first extraction means for extracting a plurality of partial images from the query image; second extraction means for extracting a plurality of partial images from the search target image; region cutout means for selecting partial images in the search target image that are similar to partial images extracted from the query image and cutting out, from the search target image, a region containing the selected partial images; and similarity calculation means for, when calculating the similarity between the query image and the image in the cut-out region, applying as a weight the ratio between the number of partial images contained in the cut-out region and the number of those partial images that are similar to a partial image extracted from the query image.

According to the first invention, a region containing partial images of the search target image that are similar to partial images of the query image is cut out, and the similarity between this region and the query image is calculated. The similarity is weighted by the ratio between the number of partial images contained in the region and the number of those partial images that are similar to a partial image extracted from the query image, so the calculated similarity focuses on the partial images similar to the query image. A highly accurate similar image search that focuses on the characteristic region of the query image can therefore be realized.
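
The weighting of the first invention can be sketched as follows. This is only an illustration under assumptions the text does not fix: two partial images count as "similar" when they share a visual keyword ID, the ratio is taken as matched / total, and the weight is applied by multiplication. The function name and parameters are hypothetical.

```python
def region_weighted_similarity(base_similarity, region_vkids, query_vkids):
    """Scale a base similarity by the share of region partial images
    that match the query (assumed: match = same visual keyword ID)."""
    if not region_vkids:
        return 0.0
    query_set = set(query_vkids)
    # Count partial images in the cut-out region that match the query.
    matched = sum(1 for vk in region_vkids if vk in query_set)
    # Assumed direction of the ratio: matched / total in the region.
    weight = matched / len(region_vkids)
    return base_similarity * weight


# Region with 4 partial images, 2 of which share a keyword with the query.
print(region_weighted_similarity(0.8, [1, 1, 2, 3], [1, 4]))
```

With this choice, a region that consists mostly of partial images unrelated to the query has its similarity reduced.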

The similarity calculation means in the second invention further applies, as a weight on the similarity, the ratio between the number of partial images contained in the query image and the number of those partial images that are similar to a partial image in the search target image.

According to the second invention, the similarity is weighted by the ratio between the number of partial images contained in the query image and the number of those partial images that are similar to a partial image in the search target image, so the similarity can be calculated with a focus on the part of the query image that resembles the search target image. A highly accurate similar image search that focuses on the characteristic region of the query image can therefore be realized.
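
A minimal sketch of the query-side weight of the second invention follows. As before, "similar" is assumed to mean "same visual keyword ID", the ratio is assumed to be matched / total, and combining the two weights by multiplication is an assumption; all names are hypothetical.

```python
def query_weight(query_vkids, target_vkids):
    """Share of query partial images that have a match in the target
    (assumed: match = same visual keyword ID)."""
    if not query_vkids:
        return 0.0
    target_set = set(target_vkids)
    matched = sum(1 for vk in query_vkids if vk in target_set)
    return matched / len(query_vkids)


def doubly_weighted_similarity(base_similarity, region_weight, q_weight):
    """Apply both the region-side and the query-side weight (assumed product)."""
    return base_similarity * region_weight * q_weight


w = query_weight([1, 1, 2, 5], [1, 3])  # 2 of 4 query partial images match
print(doubly_weighted_similarity(0.8, 0.5, w))
```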

The region cutout means in the third invention, when selecting partial images in the search target image that are similar to partial images extracted from the query image, excludes from the selection the partial images located on the outside of the set of those partial images, and cuts out a region containing the partial images that remain after the exclusion.

According to the third invention, the region is cut out after excluding, starting from the outside, some of the partial images in the search target image that are similar to partial images extracted from the query image, so the region for which the similarity is calculated can be narrowed down to the characteristic region. A highly accurate similar image search that focuses on the characteristic region of the query image can therefore be realized.
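
One possible reading of this exclusion step is sketched below. The text does not say how "outside" is determined; here it is assumed to mean the matched cells farthest from the centroid of all matched cells, and the region is taken to be the bounding box of the remaining cells. The function name, the `n_drop` parameter, and the (row, column) cell representation are hypothetical.

```python
def crop_region(matched_cells, n_drop=1):
    """Return the bounding box (top, left, bottom, right) of the matched
    cells after dropping the n_drop cells farthest from their centroid."""
    if not matched_cells:
        return None
    cy = sum(r for r, _ in matched_cells) / len(matched_cells)
    cx = sum(c for _, c in matched_cells) / len(matched_cells)
    # Sort from innermost to outermost by squared distance to the centroid.
    by_distance = sorted(
        matched_cells, key=lambda rc: (rc[0] - cy) ** 2 + (rc[1] - cx) ** 2
    )
    kept = by_distance[: max(1, len(by_distance) - n_drop)]
    rows = [r for r, _ in kept]
    cols = [c for _, c in kept]
    return (min(rows), min(cols), max(rows), max(cols))


cells = [(2, 2), (2, 3), (3, 2), (3, 3), (9, 9)]
print(crop_region(cells, n_drop=0))  # box around every matched cell
print(crop_region(cells, n_drop=1))  # box after dropping the outlying cell
```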

According to the present invention, a highly accurate similar image search that focuses on the characteristic region of a query image can be realized.

FIG. 1 is a block diagram showing the functional configuration of the image search device according to the present invention.
FIG. 2 is a flowchart of the feature vector generation process.
FIG. 3 is a diagram showing how region images are extracted from image data and mapped to visual keywords.
FIG. 4 is a flowchart of the process of cutting out a comparison area and calculating the similarity.
FIG. 5 is a first conceptual diagram for explaining the cutting out of a comparison area and the calculation of the similarity.
FIG. 6 is a second conceptual diagram for explaining the cutting out of a comparison area and the calculation of the similarity.
FIG. 7 is a third conceptual diagram for explaining the cutting out of a comparison area and the calculation of the similarity.
FIG. 8 is a conceptual diagram showing another example of cutting out a comparison area.

[Configuration of the image search device]
Embodiments of the present invention are described below with reference to the drawings.
FIG. 1 is a functional block diagram of an image search device 1 to which the present invention is applied. The image search device 1 is connected to the Internet via a communication network and can collect image data from the web over the Internet. The collected data is accumulated in a database (DB) to build the set of search target images.

The image search device 1 receives, as a search request, a query image transmitted from a client terminal such as a personal computer or a mobile terminal connected via the communication network. It then performs a similar image search in response to the request and returns the search results, ranked in order of similarity, to the client terminal.

The image search device 1 of the present embodiment indexes images using the visual keyword technique. Image search by visual keywords represents an image as a set of image regions and generates an index (feature vector) for the image from the feature amounts obtained from the image regions that make up each image (hereinafter referred to as "partial images" where appropriate). It can be regarded as an application of text search technology, which derives the feature amount of a document from the keywords in its text.

Consequently, in image search by visual keywords, treating the image regions in an image as keywords allows techniques from text search (inverted indexes, the vector space model, word frequency, and so on) to be applied to image region search, making a large-scale, high-speed search possible.

Reference literature on image search using visual keywords includes the following:
・Sivic and Zisserman: “Efficient visual search for objects in videos”, Proceedings of the IEEE, Vol. 96, No. 4, pp. 548-566, Apr. 2008.
・Yang and Hauptmann: “A text categorization approach to video scene classification using keypoint features”, Carnegie Mellon University Technical Report, pp. 25, Oct. 2006.
・Jiang and Ngo: “Bag-of-visual-words expansion using visual relatedness for video indexing”, Proc. 31st ACM SIGIR Conf., pp. 769-770, Jul. 2008.
・Jiang, Ngo, and Yang: “Towards optimal bag-of-features for object categorization and semantic video retrieval”, Proc. 6th ACM CIVR Conf., pp. 494-501, Jul. 2007.
・Yang, Jiang, Hauptmann, and Ngo: “Evaluating bag-of-visual-words representations in scene classification”, Proc. 15th ACM MM Conf., Workshop on MMIR, pp. 197-206, Sep. 2007.

In addition, representing one image as a set of partial images makes it possible, unlike in general similar image search, to search with a query image obtained by cutting out part of an image at an arbitrary size and position. To obtain the desired search results, the user can therefore express the query more directly and accurately by an operation such as designating part of the image, as indicated by the broken line in image G1 of FIG. 5.

As shown in FIG. 1, the image search device 1 comprises a query image reception unit 10, a feature vector generation unit 20, a comparison area cutout unit 30, a similarity calculation unit 40, a search result output unit 50, a visual keyword generation unit 60, a visual keyword DB 65, an indexing unit 70, an index DB 75, an area management DB 80, and a search target image DB 90.

These functional units are implemented on a computer through the cooperation of a CPU (Central Processing Unit) serving as the arithmetic and control device, RAM (Random Access Memory) and ROM (Read Only Memory) serving as storage media, a communication interface, and the like.

The query image reception unit 10 receives and accepts a query image, transmitted from a client terminal, that serves as the search key for the similar image search. The query image may be an image stored in the search target image DB 90, an image cut out from such an image by an operation that designates part of its area, or a newly received image. The query image may be a single image or a combination of several images.

The feature vector generation unit 20 generates a feature vector from the query image by performing the feature vector generation process (see FIG. 2), which extracts partial images from the query image and generates a feature vector based on their feature amounts. The feature vector generation process is described later.

The comparison area cutout unit 30 matches the partial images extracted from the query image against the partial images extracted from a search target image, and cuts out, as the comparison area, a region containing the matched partial images in the search target image. The feature vector of each image is used for this matching, that is, for selecting similar partial images.

The similarity calculation unit 40 calculates the similarity between the feature vector of each search target image stored in the index DB 75 and the feature vector generated from the query image. A known measure such as the cosine distance or the Bhattacharyya distance is used to calculate this similarity. The details of cutting out the comparison area and calculating the similarity are described later.
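
The two measures named above can be illustrated on visual keyword histograms as follows. This is a sketch, not the device's implementation: it computes the cosine similarity and the Bhattacharyya coefficient (the similarity forms from which the corresponding distances are usually derived), and the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def bhattacharyya_coefficient(u, v):
    """Overlap of two histograms after normalizing each to sum to 1."""
    total_u, total_v = sum(u), sum(v)
    if total_u == 0 or total_v == 0:
        return 0.0
    return sum(math.sqrt((a / total_u) * (b / total_v)) for a, b in zip(u, v))


query = [4, 1, 0]   # hypothetical keyword counts for a query image
target = [5, 1, 0]  # hypothetical keyword counts for a search target image
print(cosine_similarity(query, target), bhattacharyya_coefficient(query, target))
```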

The search result output unit 50 generates data that ranks the search target images based on the similarity calculated by the similarity calculation unit 40. The data output by the search result output unit 50 is, for example, the image IDs of the search target images sorted by similarity. An address (URL) for accessing the search target image DB 90 may be attached to each image ID.
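
The ranking output described above could look roughly like this. The record layout, the function name, and the example base URL are hypothetical; the text only states that image IDs are sorted by similarity and that a URL may be attached.

```python
def rank_results(similarities, base_url=None):
    """Sort image IDs by descending similarity and optionally attach a URL."""
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    return [
        {
            "image_id": image_id,
            "similarity": score,
            # Hypothetical address scheme for reaching the stored image.
            "url": f"{base_url}/{image_id}" if base_url else None,
        }
        for image_id, score in ranked
    ]


scores = {"0001": 0.2, "0002": 0.9, "0003": 0.5}
for row in rank_results(scores, base_url="http://example.invalid/images"):
    print(row)
```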

The visual keyword generation unit 60 generates the classes (clusters) to which the partial images in an image are mapped when the feature vector of the image data is generated. It extracts a plurality of partial images from the images used for image search or from image data prepared in advance for training, and clusters them based on their feature amounts. Standard clustering methods such as k-means or Hierarchical Agglomerative Clustering (HAC) are used.
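
As an illustration of the clustering step, the following is a deliberately small k-means in plain Python. It is a sketch under assumptions: feature amounts are short numeric tuples, the initial centers are simply the first k points (real systems usually choose them more carefully), and the function name is hypothetical.

```python
def kmeans(points, k, iterations=10):
    """Cluster feature points into k groups; returns (centers, clusters)."""
    centers = [list(p) for p in points[:k]]  # assumed simple initialization
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Assignment step: each point joins its nearest center.
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])),
            )
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centers[i] = [sum(dim) / len(members) for dim in zip(*members)]
    return centers, clusters


features = [(0, 0), (0, 1), (10, 10), (10, 11)]
centers, clusters = kmeans(features, k=2)
print(centers)
```

Each resulting cluster would correspond to one visual keyword.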

The feature vector generation unit 20 generates a feature vector by mapping (classifying) the partial images detected in an image to the clusters formed by the clustering of the visual keyword generation unit 60. Each such cluster is called a "visual keyword", since the clusters form a feature space for representing an image as a collection of visual keywords.

The visual keyword DB 65 is a database that stores, in association with one another, a visual keyword ID (VKID) identifying each cluster formed by the clustering of the visual keyword generation unit 60, the center coordinates of that cluster, that is, the coordinates of its center point in the feature space (a multidimensional space), and a radius indicating the extent of the cluster.

The center coordinates are the mean of the feature amounts of the images belonging to the cluster and are expressed as multidimensional coordinates in the feature space. The radius is obtained, for example, as the distance from the center coordinates to the farthest image belonging to the cluster.
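
The center and radius described above can be computed as in the following sketch, which assumes Euclidean distance in the feature space and represents each member's feature amount as a numeric tuple. The function name is hypothetical.

```python
import math


def cluster_summary(members):
    """Return (center, radius) for the feature points of one cluster."""
    # Center: per-dimension mean of the members' feature amounts.
    center = [sum(dim) / len(members) for dim in zip(*members)]
    # Radius: distance from the center to the farthest member.
    radius = max(math.dist(center, m) for m in members)
    return center, radius


center, radius = cluster_summary([(0, 0), (2, 0), (1, 3)])
print(center, radius)
```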

The indexing unit 70 generates a feature vector for the image data stored in the search target image DB 90 according to the feature vector generation process of FIG. 2, and stores the generated feature vector in the index DB 75 as the index of that image data.

The indexing unit 70 also assigns a region ID to each partial image detected in the image data and stores, in the area management DB 80, the VKID of the visual keyword to which the partial image was mapped, in association with the image ID and the region ID. The region ID may be XY coordinates within the image or the row and column numbers of the region after division.

The index DB 75 is a database that stores, in association with each other, the image ID of each piece of image data stored in the search target image DB 90 and the feature vector generated from that image data (the frequency of partial images per visual keyword).

The area management DB 80 is a database that manages which visual keyword each partial image in a search target image has been mapped to. As shown in FIG. 1, it stores the image ID of the search target image, the region ID, and the VKID in association with one another.
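
An in-memory stand-in for this table might look like the following. The class and method names are hypothetical and the region ID is assumed to be a (row, column) pair, which is one of the options the text allows; an actual device would use a real database.

```python
from collections import defaultdict


class RegionIndex:
    """Toy stand-in for the (image ID, region ID, VKID) association table."""

    def __init__(self):
        # (image_id, vkid) -> list of region IDs, in insertion order.
        self._by_image_vk = defaultdict(list)

    def add(self, image_id, region_id, vkid):
        """Record that a partial image was mapped to a visual keyword."""
        self._by_image_vk[(image_id, vkid)].append(region_id)

    def regions_with_keyword(self, image_id, vkid):
        """Return the regions of an image mapped to the given keyword."""
        return list(self._by_image_vk.get((image_id, vkid), []))


index = RegionIndex()
index.add("0001", (0, 0), 1)
index.add("0001", (0, 1), 2)
index.add("0001", (1, 0), 1)
print(index.regions_with_keyword("0001", 1))
```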

The search target image DB 90 is a database that accumulates image data collected from the Internet as the targets of similar image search (referred to as "search target images"). As shown in FIG. 1, it stores an image ID and image data in association with each other. The image ID is identification information that uniquely identifies each piece of image data and is assigned when the keyword and the image data are stored.

[Feature vector generation process]
The feature vector generation process is now described with reference to the flowchart of FIG. 2 and the conceptual diagram of FIG. 3. The process is performed by the feature vector generation unit 20 on the query image and by the indexing unit 70 on the search target images; the following description takes the case in which the feature vector generation unit 20 performs it.

First, a plurality of partial images are detected from the query image (step S11). There are two ways to detect partial images: detecting characteristic regions (feature regions) in the image, and dividing the image into predetermined regions.

Techniques for detecting feature regions include, for example:
・Harris-affine
・Hessian-affine
・Maximally stable extremal regions (MSER)
・Difference of Gaussians (DoG)
・Laplacian of Gaussian (LoG)
・Determinant of Hessian (DoH)

Feature region detection techniques are described in, for example, “Local Invariant Feature Detectors: A Survey” (Foundations and Trends in Computer Graphics and Vision, Vol. 3, No. 3, pp. 177-280, 2007), and any suitable known technique can be adopted.

Methods of detecting partial images by dividing the image into predetermined regions include dividing it into a predetermined M × N grid of blocks, or dividing it so that each block has a predetermined size of m × n pixels. For example, when an image of 640 × 480 pixels is divided into 10 × 10 blocks, each block is 64 × 48 pixels.
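
The grid division in the example above can be written out as a short sketch. It assumes the image size is an exact multiple of the block count (any remainder pixels are simply ignored here), and the function name and the (x, y, width, height) tuple layout are hypothetical.

```python
def split_into_blocks(width, height, cols, rows):
    """Divide an image into cols x rows blocks.

    Returns a list of (x, y, block_width, block_height) tuples in row-major order.
    """
    block_w = width // cols
    block_h = height // rows
    return [
        (c * block_w, r * block_h, block_w, block_h)
        for r in range(rows)
        for c in range(cols)
    ]


blocks = split_into_blocks(640, 480, cols=10, rows=10)
print(len(blocks), blocks[0])  # 100 blocks, each 64 x 48 pixels
```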

FIG. 3 shows an example in which images are divided into predetermined regions: image No. 0001 is divided into 7 × 6 blocks, image No. 0002 into 5 × 7 blocks, and image No. 0003 into 6 × 6 blocks. The illustrated example uses only a few blocks to simplify the explanation; in practice an image is divided into several hundred to several thousand blocks.

Next, the feature amount of each detected partial image is calculated (step S12). When feature regions have been extracted, local feature amounts that are robust to affine transformations such as scale changes, rotation, and angle changes are extracted. Examples of local feature amounts include the following.

・SIFT
・Gradient location and orientation histogram
・Shape context
・PCA-SIFT
・Spin images
・Steerable filters
・Differential invariants
・Complex filters
・Moment invariants

The extraction of local feature amounts is described in, for example, “A performance evaluation of local descriptors” (IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 27, No. 10, pp. 1615-1630, 2005), and any suitable known technique can be adopted.

A feature vector generated from feature amounts extracted from these feature regions comes from regions where an object is likely to be present, so it is effective as an indicator of the features of the objects in the image.

When partial images have been extracted by region division, image feature amounts are used that express numerically the features of each image, such as its color scheme, texture, and shape. A feature vector generated from the feature amounts of region images detected by region division comes from every part of the image, so it is effective as an indicator of the overall composition of the image.

The partial images detected in the image data are then mapped (classified) to visual keywords based on their feature amounts (step S13). The mapping is performed by selecting the visual keyword (cluster) whose center point is closest, in the feature space, to the feature amount of the region image.
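
Step S13 amounts to a nearest-center lookup, which the following sketch illustrates. It assumes Euclidean distance and a dictionary from VKID to center coordinates; the function name and the sample values are hypothetical.

```python
import math


def nearest_keyword(feature, centers):
    """Return the VKID whose cluster center is closest to the feature amount.

    centers: dict mapping VKID -> center coordinates in the feature space.
    """
    return min(centers, key=lambda vkid: math.dist(feature, centers[vkid]))


centers = {1: (0.0, 0.0), 2: (10.0, 10.0)}  # hypothetical visual keywords
print(nearest_keyword((1.0, 2.0), centers))
print(nearest_keyword((9.0, 8.0), centers))
```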

In the example of FIG. 3, partial images T1 and T3 to T6 detected in the image with image ID '0001' are mapped to visual keyword #1, and partial image T2 is mapped to visual keyword #2. Partial images T12 to T14 detected in the image with image ID '0002' are mapped to visual keyword #3. Partial image T11 of the image with image ID '0002' and partial image T21 of the image with image ID '0003' are mapped to visual keyword #4.

After mapping each partial image to a visual keyword (cluster), the feature vector generation unit 20 counts the frequency of region images for each visual keyword and generates a multidimensional feature vector whose elements are these per-keyword frequencies (step S14). In the feature vector generation process of the indexing unit 70, the generated feature vector is stored in the index DB 75 in association with the image ID.

For example, for image '0001' in FIG. 3, the frequency of region images detected in the image is '5' for visual keyword #1, '1' for visual keyword #2, and '0' for visual keyword #3. The feature vector whose elements are these frequencies over the visual keywords is stored in the index DB 75 as the index of the image.
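
Step S14 is a histogram over visual keywords, sketched below with the assignments listed for image '0001' in FIG. 3 (T1 and T3 to T6 mapped to #1, T2 mapped to #2). The function name and the list-based vector layout are hypothetical.

```python
from collections import Counter


def build_feature_vector(vkids, vocabulary):
    """Count how many partial images fall on each visual keyword.

    vkids: VKID assigned to each partial image of one image.
    vocabulary: ordered list of all VKIDs; fixes the vector layout.
    """
    counts = Counter(vkids)
    return [counts[vk] for vk in vocabulary]


# T1..T6 of image '0001': T2 maps to keyword 2, the others to keyword 1.
assignments_0001 = [1, 2, 1, 1, 1, 1]
print(build_feature_vector(assignments_0001, vocabulary=[1, 2, 3, 4]))
```

For these inputs the first three elements are 5, 1, and 0, matching the frequencies given in the text.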

In the feature vector generation process of the indexing unit 70, a region ID is also assigned to each region image detected in the search target image, and the VKID of the visual keyword to which the region image was mapped is stored in the area management DB 80 in association with the image ID and the region ID. The area management DB 80 makes it possible to check which visual keyword each partial image in each image has been mapped to.

[Cutting out the comparison area and calculating the similarity]
Next, the process of cutting out the comparison area and the process of calculating the similarity are described with reference to the flowchart of FIG. 4 and the conceptual diagrams of FIGS. 5 to 7. The comparison area cutout unit 30 performs these processes by selecting image IDs one at a time from the index DB 75 and the area management DB 80.

First, the comparison area cutout unit 30 matches the partial images extracted from the query image against the partial images extracted from the selected search target image, and selects the matching partial images, that is, the partial images with high similarity (step S21). Specifically, the matching can be based on the per-visual-keyword occurrence frequencies of partial images held in the feature vector of the query image and the feature vector of the search target image.

For example, as shown in FIG. 5, the number of partial images belonging to visual keyword #1 is '4' in the feature vector of the query image G3 and '5' in the feature vector of the search target image G5. These values count partial images mapped to the same visual keyword, that is, the same cluster, so those partial images can be judged to be highly similar to one another.

In the example of FIG. 5, the partial images G30 and G31 and the partial image G51 are mapped to visual keyword #1. By referring to the area management DB 80, partial images mapped to the same visual keyword are looked up and selected as matching partial images. Using feature vectors generated from visual keywords in this way makes it easy to match partial images.
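The matching in step S21 can be sketched as follows. The sketch assumes that the rows of the area management DB 80 for one image are available as a mapping from region ID to VKID; the function name `match_partial_images`, the extra region IDs G35 and G59, and the VKIDs 5 and 7 are made up for illustration.

```python
def match_partial_images(query_regions, target_regions):
    """Return the region IDs on each side that share a visual keyword.

    Each argument maps a region ID to its VKID, as the rows of the
    area management DB for one image would (assumed layout).
    """
    shared = set(query_regions.values()) & set(target_regions.values())
    matched_query = sorted(r for r, vk in query_regions.items() if vk in shared)
    matched_target = sorted(r for r, vk in target_regions.items() if vk in shared)
    return matched_query, matched_target

# G30, G31 and G51 share visual keyword #1 as in FIG. 5;
# G35, G59 and their VKIDs are illustrative only.
query = {"G30": 1, "G31": 1, "G35": 7}
target = {"G51": 1, "G59": 5}
matched_query, matched_target = match_partial_images(query, target)
print(matched_query, matched_target)  # ['G30', 'G31'] ['G51']
```

Matching on a shared VKID is the simplest reading of the passage; an implementation could equally rank candidates by distance within a cluster.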

Next, the comparison area cutout unit 30 cuts out a comparison area based on the partial images in the search target image that matched partial images of the query image (step S22). Specifically, it cuts out an area that contains the matched partial images. In FIG. 5, the partial images G30 to G33 in the query image are judged to match the partial images G51 to G53 in the search target image, a rectangle containing these matched partial images is formed, and this rectangle is cut out as the comparison area R.
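One way to form such a rectangle is to take the smallest axis-aligned box that encloses every matched partial image. The sketch below assumes each partial image is described by a bounding box (left, top, right, bottom) in pixel coordinates; this representation and the coordinates are assumptions, since the embodiment does not state how positions are stored.

```python
def cut_out_comparison_area(boxes):
    """Smallest axis-aligned rectangle containing all matched partial images.

    boxes: list of (left, top, right, bottom) tuples (assumed representation).
    """
    if not boxes:
        raise ValueError("at least one matched partial image is required")
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )

# Illustrative coordinates for three matched partial images.
matched_boxes = [(10, 20, 30, 40), (50, 10, 70, 30), (25, 60, 45, 80)]
area_r = cut_out_comparison_area(matched_boxes)
print(area_r)  # (10, 10, 70, 80)
```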

Next, the similarity calculation unit 40 calculates a similarity score between the query image and the image in the comparison area (step S23). Specifically, it generates a feature vector for the image in the comparison area and calculates the distance between this feature vector and the feature vector of the query image as the similarity score.

The feature vector of the image in the comparison area is generated as follows. The partial images inside the comparison area cut out in step S22, that is, the area cut out as the comparison target with position taken into account, are looked up by region ID in the area management DB 80. The VKID associated with each region ID identifies the visual keyword to which that partial image belongs, so the feature vector is generated by recounting the occurrence frequency of partial images for each visual keyword.

This process only needs to tally the region IDs and VKIDs of partial images that have already been mapped to visual keywords, and it yields the feature vector V7 for the comparison area R as shown in FIG. 6. Alternatively, the similarity score may be obtained by calculating the similarity between each partial image in the query image and each partial image in the comparison area and using the total as the similarity score.
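The recount and the distance calculation can be sketched as below. Two points are assumptions rather than statements of the embodiment: a partial image is treated as lying inside the comparison area when its center point does, and the distance is Euclidean. The text only says "distance" and does not specify how a distance is turned into a score where larger means more similar, so the sketch stops at the distance.

```python
import math

def region_feature_vector(regions, area, num_keywords):
    """Recount visual keyword frequencies for partial images inside an area.

    regions: list of (center_x, center_y, vkid) tuples, vkid is 1-based.
    area: (left, top, right, bottom) of the comparison area.
    """
    left, top, right, bottom = area
    vector = [0] * num_keywords
    for cx, cy, vkid in regions:
        if left <= cx <= right and top <= cy <= bottom:
            vector[vkid - 1] += 1
    return vector

def euclidean_distance(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Illustrative data: the last partial image lies outside the area.
regions = [(15, 15, 1), (40, 40, 1), (60, 70, 2), (200, 200, 3)]
area_vector = region_feature_vector(regions, (10, 10, 70, 80), num_keywords=3)
query_vector = [4, 1, 0]
print(area_vector)                                    # [2, 1, 0]
print(euclidean_distance(query_vector, area_vector))  # 2.0
```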

Next, the similarity calculation unit 40 calculates, for the partial images extracted from the query image, the proportion that matched partial images in the comparison area (step S24). For example, '8' partial images are extracted from the query image G3 shown in FIG. 6 and '4' of them matched partial images in the comparison area, so the ratio of matched partial images to all partial images in the query image G3 is 1/2.

Next, the similarity calculation unit 40 calculates, for the partial images extracted from the comparison area, the proportion that matched partial images in the query image (step S25). For example, '9' partial images are extracted from the comparison area R shown in FIG. 6 and '3' of them matched partial images in the query image G3, so the ratio of matched partial images to all partial images in the comparison area R is 1/3.

The similarity calculation unit 40 then calculates the similarity by multiplying the similarity score calculated in step S23 by the ratios of matched partial images in the query image and in the comparison area, which serve as weights (step S26).

The above calculation of the similarity is expressed by the following formulas.
Similarity = ratio A of matched partial images in the query image × ratio B of matched partial images in the comparison area × similarity score C
Ratio A of matched partial images in the query image = number of matched partial images in the query image / number of partial images in the query image
Ratio B of matched partial images in the comparison area = number of matched partial images in the comparison area / number of partial images in the comparison area
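As a worked example, the counts given for FIG. 6 can be put through these formulas in a few lines of Python. The function names are hypothetical, and the similarity score C = 0.6 is a made-up value used only to show the multiplication.

```python
from fractions import Fraction

def match_ratios(matched_in_query, total_in_query, matched_in_area, total_in_area):
    """Return ratio A (query image side) and ratio B (comparison area side)."""
    ratio_a = Fraction(matched_in_query, total_in_query)
    ratio_b = Fraction(matched_in_area, total_in_area)
    return ratio_a, ratio_b

def weighted_similarity(ratio_a, ratio_b, score_c):
    """Similarity = A x B x C."""
    return float(ratio_a * ratio_b) * score_c

# FIG. 6: 4 of 8 partial images matched in the query image G3,
# 3 of 9 partial images matched in the comparison area R.
ratio_a, ratio_b = match_ratios(4, 8, 3, 9)
print(ratio_a, ratio_b)  # 1/2 1/3

# 0.6 is a hypothetical similarity score C, not a value from the embodiment.
similarity = weighted_similarity(ratio_a, ratio_b, score_c=0.6)
```

With these counts the combined weight is 1/6, so a comparison area in which most partial images fail to match is penalized heavily even when the score C is high.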

For example, when the similarity between the query image G3 in FIG. 7 and the comparison area R is calculated, matching partial images G71 to G73 lie in the upper and lower parts of the image within the comparison area R. In a conventional similarity calculation, the similarity is determined by the degree of similarity to the partial images in the query image, so a case like FIG. 7 yields a high similarity.

However, when no partial image in the central part of the comparison area R matches, a lower similarity is desirable. Using the ratio of the number of matched partial images to the number of partial images in the comparison area, as in the formula above, lowers the similarity for an image that contains unmatched areas.

In this way, within the area of the search target image that matched partial images of the query image, a similarity can be calculated that further focuses on the partial images matched inside that area. A highly accurate similar image search that focuses on the characteristic areas of the query image can therefore be realized.

In the embodiment described above, both the ratio of matched partial images in the query image and the ratio of matched partial images in the comparison area are used to weight the similarity score, but the score may be weighted by only one of them.

The comparison area was also described as being cut out as an area containing all the matched partial images. Alternatively, for example, the partial images located outermost in each of the up, down, left, and right directions may be removed from the set of matched partial images, and the area containing the remaining set of partial images may be cut out as the comparison area.

For example, in FIG. 8, among the set of partial images in the comparison area R, the partial image farthest from the center of the comparison area R is G52. The partial image G52 is therefore removed, and the comparison area R10 containing G51 and G53, the remaining partial images that match partial images of the query image G3, is cut out. The number of partial images to remove may be set as a constant (for example, two) or as a proportion (for example, 5%) of the total number of extracted partial images.
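The trimming can be sketched as follows, following the FIG. 8 description in which the partial image farthest from the center of the comparison area is removed. The bounding-box representation, the coordinates, and the names `enclosing_rectangle` and `trim_outer_regions` are assumptions made for illustration.

```python
import math

def enclosing_rectangle(boxes):
    """Smallest axis-aligned rectangle containing all boxes."""
    boxes = list(boxes)
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )

def trim_outer_regions(boxes, num_remove=1):
    """Drop the num_remove boxes farthest from the center of the enclosing
    rectangle, then return the kept boxes and their new enclosing rectangle.

    boxes: dict mapping a region ID to (left, top, right, bottom).
    """
    if not 0 <= num_remove < len(boxes):
        raise ValueError("num_remove must leave at least one partial image")
    left, top, right, bottom = enclosing_rectangle(boxes.values())
    cx, cy = (left + right) / 2, (top + bottom) / 2

    def distance_from_center(region_id):
        l, t, r, b = boxes[region_id]
        return math.hypot((l + r) / 2 - cx, (t + b) / 2 - cy)

    keep = sorted(boxes, key=distance_from_center)[: len(boxes) - num_remove]
    kept = {region_id: boxes[region_id] for region_id in keep}
    return kept, enclosing_rectangle(kept.values())

# Illustrative coordinates in which G52 is the outlier, as in FIG. 8.
matched = {
    "G51": (40, 10, 60, 30),
    "G53": (30, 50, 50, 70),
    "G52": (180, 80, 200, 100),
}
kept, area_r10 = trim_outer_regions(matched, num_remove=1)
print(sorted(kept), area_r10)  # ['G51', 'G53'] (30, 10, 60, 70)
```

Passing a count derived from a proportion, such as 5% of the partial images rounded to an integer, covers the second option mentioned above.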

By removing the outer partial images from the matched partial images before cutting out the comparison area in this way, the area of the search target image for which the similarity is calculated can be narrowed further, which enables a highly accurate similar image search.

The feature vectors may also be weighted by TF/IDF (term frequency-inverse document frequency), a term weighting technique used in text search.

A known reference on TF/IDF is:
C. D. Manning, P. Raghavan and H. Schütze: "Introduction to Information Retrieval", Cambridge University Press, 2008.

TF/IDF is an algorithm for extracting the characteristic words in a text, and is calculated from two indices: TF, the occurrence frequency of a word, and IDF, the inverse document frequency. Specifically, it is obtained by the following formulas.
TF/IDF(i, j) = TF(i, j) / T(i) * IDF(j)
IDF(j) = log(N / DF(j))

Here,
TF(i, j) is the number of occurrences of keyword j in document i, the document from which keywords are extracted,
T(i) is the total number of words in document i,
N is the total number of documents, and
DF(j) is the number of documents that contain keyword j.

This is applied to images by treating a document as an image and a word as a partial image belonging to a given visual keyword. A TF/IDF value is obtained for each visual keyword of each image, and the feature vector is generated by summing these TF/IDF values per visual keyword.

Taking i as the image ID and k as a visual keyword, TF/IDF(i, k), the weight value of each visual keyword, is calculated by the following formulas.

TF/IDF(i, k) = TF(i, k) / T(i) * IDF(k)
IDF(k) = log(N / DF(k))

TF(i, k) is the weighted number of partial images extracted from image i that occur under visual keyword k. Each partial image belonging to (occurring under) visual keyword k contributes the above-mentioned weight value (0 to 1) based on its distance from the center point of visual keyword k.

T(i) is the total number of partial images extracted from image i, weighted by distance from the visual keywords. It is the sum of the weight values based on the distance between each partial image extracted from image i and the cluster to which it belongs.

DF(k) is the number of partial images classified into visual keyword k that occur under that visual keyword, weighted by their distance from the visual keyword. N is the total number of images in the search target image DB 90.
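The following Python sketch shows the shape of this weighting under two simplifying assumptions that depart from the description above: every partial image has weight 1 (no distance-based weights), and DF(k) is the number of images containing visual keyword k, as in the text-search definition. The function name `tfidf_vectors` and the count vectors are illustrative.

```python
import math

def tfidf_vectors(count_vectors):
    """Turn per-image visual keyword counts into TF/IDF-weighted vectors.

    Simplification: unit weights instead of distance-based weights, and
    DF(k) counts the images in which keyword k occurs at least once.
    """
    n_images = len(count_vectors)
    n_keywords = len(count_vectors[0])
    df = [sum(1 for vec in count_vectors if vec[k] > 0) for k in range(n_keywords)]
    idf = [math.log(n_images / df[k]) if df[k] else 0.0 for k in range(n_keywords)]
    weighted = []
    for vec in count_vectors:
        total = sum(vec)  # T(i): number of partial images in image i
        weighted.append(
            [(vec[k] / total) * idf[k] if total else 0.0 for k in range(n_keywords)]
        )
    return weighted

# Counts per visual keyword #1 to #4 for three images (illustrative values).
counts = [
    [5, 1, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 0, 1],
]
vectors = tfidf_vectors(counts)
```

In this example keyword #4 occurs in two of the three images and so receives a smaller IDF (log 1.5) than keywords #1 to #3 (log 3), which is the intended effect of down-weighting commonly occurring partial images.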

By regarding a document in TF/IDF as an image and a word in the document as a partial image belonging to a given visual keyword, the scalar values of the feature vector can be weighted so as to lower the importance of partial images that appear in every image and raise the importance of characteristic partial images that appear prominently in particular images.

This TF/IDF weighting may be used to obtain the similarity score between the visual keywords to which the partial images in the query image belong and the visual keywords to which the partial images in the search target image belong.

The embodiment disclosed here should be considered illustrative in all respects and not restrictive. The scope of the present invention is defined by the claims rather than by the above description, and is intended to include all modifications within the meaning and scope equivalent to the claims.

DESCRIPTION OF SYMBOLS
1 Image search device
10 Query image reception unit
20 Feature vector generation unit
30 Comparison area cutout unit
40 Similarity calculation unit
50 Search result output unit
60 Visual keyword generation unit
65 Visual keyword DB
70 Indexing unit
75 Index DB
80 Area management DB
90 Search target image DB
R Comparison area

Claims

In an image search device for calculating a similarity with a query image and searching for an image with a high similarity from search target images,
First extraction means for extracting a plurality of partial images from the query image;
Second extraction means for extracting a plurality of partial images from the search target image;
A region cutout unit that selects a partial image in the search target image similar to the partial image extracted from the query image, and cuts out a region including the selected partial image from the search target image;
Similarity calculation means for, when calculating the similarity between the query image and the image in the clipped region, applying as a weight the ratio between the number of partial images included in the clipped region and the number of those partial images in the region that are similar to a partial image extracted from the query image;
An image search apparatus comprising:

The image search apparatus according to claim 1, wherein the similarity calculation means further applies, as a weight on the similarity, the ratio between the number of partial images included in the query image and the number of those partial images in the query image that are similar to partial images in the search target image.

The image search apparatus according to claim 1, wherein the region cutout unit, when selecting partial images in the search target image that are similar to the partial images extracted from the query image, excludes from the selection the partial images located on the outside of the set of partial images, and cuts out a region including the partial images remaining after the exclusion.

In an image search method in which a computer calculates a similarity with a query image and searches an image with a high similarity from search target images,
A first extraction step of extracting a plurality of partial images from the query image;
A second extraction step of extracting a plurality of partial images from the search target image;
Selecting a partial image in the search target image similar to the partial image extracted from the query image, and cutting out a region including the selected partial image from the search target image;
A similarity calculation step of, when calculating the similarity between the query image and the image in the clipped region, applying as a weight the ratio between the number of partial images included in the clipped region and the number of those partial images in the region that are similar to a partial image extracted from the query image;
The image search method characterized in that the computer executes.

A program for causing a computer to execute the image search method according to claim 4.