JPWO2017168601A1 - Similar image retrieval method and system

Info

Publication number: JPWO2017168601A1
Application number: JP2018507912A
Authority: JP
Inventors: 廣池　敦; 敦廣池; 裕樹渡邉
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2016-03-30
Filing date: 2016-03-30
Publication date: 2018-08-16
Anticipated expiration: 2036-03-30
Also published as: WO2017168601A1; JP6445738B2

Abstract

In the prior art, when the area of a partial region of the query image is relatively small, many images that do not match the search intent are returned as hits. The disclosed method comprises: a step of detecting a plurality of partial regions included in a query image; a step of extracting feature values from the detected partial regions; a step of associating each of the extracted feature values with the feature values of image partial regions stored in advance in a DB, and calculating the similarity of each associated pair of feature values; a step of weighting each calculated similarity according to the area of the corresponding partial region included in the query image; and a step of calculating the similarity between the query image and a search target image based on the sum of the weighted similarities of the partial regions.

Description

The present invention relates to information retrieval technology for images.

In recent years, broadband networks and ever-larger storage devices have made it possible to operate services that store, manage, and distribute images and video at large scale.

Search technology is essential in systems that handle such large-scale content. The most common approach searches the text information associated with image and video content. In document retrieval of this kind, one or more keywords are typically entered as a query, and the images or videos associated with text containing those keywords are returned as search results. There are also techniques that extract information from the image itself and search on it. As described in Patent Literature 1, Patent Literature 2, and elsewhere, similar image retrieval achieves high-speed search by extracting image feature values, which quantify the characteristics of an image, from the search target images in advance and storing them in a database.

Patent Literature 1 discloses a configuration that calculates a region similarity, indicating how similar two regions are, for each combination of one of the regions into which a search target image is divided and one of the regions into which the query image is divided; calculates, for each region in the query image, an importance of that region based on the region similarities corresponding to it; and calculates, for each search target image, an image similarity to the query image based on the region similarities and importances corresponding to the combinations of regions in the search target image and regions in the query image.

JP 2013-254367 A

With the technique disclosed in Patent Literature 1, when the area of a partial region of the query image is relatively small, many images that do not match the search intent are returned as hits.

To solve the above problem, the present invention provides a method comprising: a step of detecting a plurality of partial regions included in a query image; a step of extracting feature values from the detected partial regions; a step of associating each of the extracted feature values with the feature values of image partial regions stored in advance in a DB, and calculating the similarity of each associated pair of feature values; a step of weighting each calculated similarity according to the area of the corresponding partial region included in the query image; and a step of calculating the similarity between the query image and a search target image based on the sum of the weighted similarities of the partial regions.

According to the present invention, a partial region of the query image whose area is relatively small receives a small weight, which reduces the likelihood that images outside the search intent are returned as hits.
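The weighting idea can be sketched as follows. This is only an illustration: the exact weighting formula is not given in this part of the description, so the area-proportional weight, the normalization by total area, and the name `weighted_image_similarity` are assumptions.

```python
def weighted_image_similarity(matches):
    """Combine per-region similarities into one image-level score.

    matches: list of (similarity, query_region_area) pairs, one per
    query partial region matched against a region of the target image.
    The area-proportional weight is an assumption for illustration.
    """
    total_area = sum(area for _, area in matches)
    if total_area == 0:
        return 0.0
    # Small query regions get small weights, so a strong match on a tiny
    # region cannot dominate the image-level score.
    return sum(sim * area / total_area for sim, area in matches)


# A tiny region that matches perfectly and a large region that matches poorly.
score = weighted_image_similarity([(1.0, 0.02), (0.2, 0.50)])
```

With these inputs the score stays close to the similarity of the large region, which is the behavior the paragraph above describes.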

FIG. 1 is a system configuration diagram of an embodiment of the present invention.
FIG. 2 is a detailed diagram of the system configuration of the embodiment.
FIG. 3 is an explanatory diagram of the region-of-interest detection technique used in the embodiment.
FIG. 4 is an explanatory diagram of the axes of symmetry in the region-of-interest detection technique used in the embodiment.
FIG. 5 is an example of differential filters.
FIG. 6 is an explanatory diagram of the luminance gradient intensity distribution feature.
FIG. 7 is a diagram showing the flow of the region-of-interest detection processing in the embodiment.
FIG. 8 is a diagram explaining the details of the region-of-interest detection processing.
FIG. 9 is an explanatory diagram of the refinement processing in the region-of-interest detection processing.
FIG. 10 is a schematic diagram showing region-of-interest detection results.
FIG. 11 is a list of the items stored in the database for partial regions.
FIG. 12 shows the flow of the similarity search processing.
FIG. 13 is a schematic diagram showing the results of the similarity search on partial regions.
FIG. 14 is a schematic diagram showing the result of associating partial regions by Method 1.
FIG. 15 is a schematic diagram showing the result of associating partial regions by Method 2.
FIG. 16 is a system configuration diagram of the system that implements the service of Embodiment 2.
FIG. 17 is a list of the items stored in the database for images in the service of Embodiment 2.
FIG. 18 is a schematic diagram of the screen configuration of the application of Embodiment 2.

FIG. 1 shows the system configuration when the method of the present invention is applied to an image search service. Reference numeral 100 in FIG. 1 denotes a computer system that provides the search service. The functions of the search service are provided, via a network system 110, to users of terminal computers 120.

FIG. 2 shows the internal configuration of the computer system 100 that provides the search service. The processing flow of the image search system targeted by the present invention is described with reference to this figure.

First, the processing for registering an image is described. When an image is registered, the image registration unit 211 accepts a registered image 210 that is to become a search target, and the region-of-interest detection unit 212 then detects from the registered image 210 a set of partial regions that deserve attention. The region-of-interest detection processing is described in detail later with reference to FIG. 7. For each detected partial region, the search feature extraction unit 213 extracts image feature values. The extracted image feature values are stored in the database 220 in association with their partial regions.

Next, the processing for searching for images is described. When a search is performed, the query image receiving unit 231 accepts the given query image 230, and the region-of-interest detection unit 232 then detects from the query image 230 a set of partial regions that deserve attention. The region-of-interest detection processing is described in detail later with reference to FIG. 7. Next, for each detected partial region, the search feature extraction unit 233 extracts image feature values. Although FIG. 2 shows the region-of-interest detection units 212 and 232 separately for convenience of explanation, they may be the same processing unit within the computer. The same applies to the search feature extraction units 213 and 233.

The method of the present invention is applied in the similarity search unit 234. The similarity search unit 234 composes the similar image search result from the set of image feature values of the partial regions of the query image 230 and the set of image feature values of the partial regions of the registered images 210 stored in the database 220. Finally, the search result output unit 235 uses the similar image search result and the various information stored in the database 220 to generate the information to be returned, and sends it to the search requester as the search result 240.

In this embodiment, the region-of-interest detection unit 212 for the registered image 210 and the region-of-interest detection unit 232 for the query image 230 use a detection method based on local symmetry, which does not restrict the kind of object being detected. The detection method is described in detail later, starting with FIG. 3.

FIG. 3 illustrates the extraction of image feature values for evaluating local symmetry. An arbitrary rectangular partial region 302 in an image 301 is further divided into 3 × 3 blocks, and an image feature vector is extracted from each block region, giving the image feature vector group 303. These nine image feature vectors are denoted f00, f10, f20, f01, f11, f21, f02, f12, and f22, where the first subscript is the block's position in the x direction and the second subscript is its position in the y direction.

FIG. 4 shows the axes of symmetry used in the region-of-interest detection technique. Consider mirror transformations of each block's image feature vector about four axes. The matrices that apply the mirror transformation about each axis to an image feature vector are denoted T0, T1, T2, and T3: T0 performs the left-right mirror transformation, T1 the mirror transformation about the axis at 45 degrees toward the upper right, T2 the up-down mirror transformation, and T3 the mirror transformation about the axis at 45 degrees toward the lower right.

Applying these transformation matrices to the image feature vectors extracted from the block regions makes it possible to evaluate the symmetry within the rectangular partial region. For example, to evaluate left-right symmetry, consider f00 and f20, which lie at positions symmetric about the y axis. If the vector obtained by mirroring f20 left to right, that is, T0 multiplied by f20, is close to f00, the symmetry is considered high. Likewise, the symmetry is considered high if T0 multiplied by f21 is close to f01 and T0 multiplied by f22 is close to f02. Left-right symmetry can therefore be expressed as a vector D0 composed of three squared distances between vectors, as shown in [Equation 1].
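The equation itself is not reproduced in this text. Going only by the description above, D0 can be written as follows (a reconstruction, with the double bars denoting the Euclidean norm):

```latex
D_0 =
\begin{pmatrix}
\lVert f_{00} - T_0 f_{20} \rVert^2 \\
\lVert f_{01} - T_0 f_{21} \rVert^2 \\
\lVert f_{02} - T_0 f_{22} \rVert^2
\end{pmatrix}
```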

Similarly, the symmetry about the axis at 45 degrees toward the upper right is expressed as D1 in [Equation 2].

Similarly, the up-down symmetry is expressed as D2 in [Equation 3].

Similarly, the symmetry about the axis at 45 degrees toward the lower right is expressed as D3 in [Equation 4].

However, this detection method regards symmetry as high only when the transformation of the feature vectors increases the symmetry. For example, if f00 and f20, which are used in calculating D0, vary little under the left-right mirror transformation to begin with, the left-right symmetry is not considered large. As a correction term that expresses this property quantitatively, a vector E0 is defined, as shown in [Equation 5], composed of the squared distances between the feature vectors of the corresponding block regions when no mirror transformation is applied.

Similarly, the correction term E1 for D1 is given by [Equation 6].

Similarly, the correction term E2 for D2 is given by [Equation 7].

Similarly, the correction term E3 for D3 is given by [Equation 8].
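As a minimal sketch of the left-right case only, the following computes D0 and its correction term E0 for a 3 × 3 grid of block feature vectors. The function name, the array layout `f[x, y]`, and the toy two-dimensional features with a component-swapping `T0` are assumptions for illustration; the other three axes would use their own block pairings and matrices.

```python
import numpy as np


def left_right_terms(f, T0):
    """Return (D0, E0) for a 3x3 grid of block feature vectors.

    f[x][y] is the feature vector of the block at column x, row y,
    matching the subscript order f_xy used in the text.
    """
    pairs = [((0, y), (2, y)) for y in range(3)]  # left column vs right column
    # D0: squared distance after mirroring the right-hand block with T0.
    D0 = np.array([np.sum((f[a][b] - T0 @ f[c][d]) ** 2) for (a, b), (c, d) in pairs])
    # E0: squared distance between the same blocks without mirroring.
    E0 = np.array([np.sum((f[a][b] - f[c][d]) ** 2) for (a, b), (c, d) in pairs])
    return D0, E0


T0 = np.array([[0.0, 1.0], [1.0, 0.0]])  # toy mirror: swaps the two components
f = np.zeros((3, 3, 2))
f[0, :] = [1.0, 2.0]                      # left-column blocks
f[2, :] = [2.0, 1.0]                      # right-column blocks: mirrored copies
D0, E0 = left_right_terms(f, T0)
```

In this toy case D0 is zero while E0 is not, which is the situation the text treats as genuinely symmetric: the blocks match only after mirroring.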

The symmetry of the rectangular partial region is evaluated using D0, D1, D2, and D3 together with E0, E1, E2, and E3. [Equation 9] is used as the concrete evaluation function.

This evaluation function evaluates the symmetry in each of the four directions and then takes the maximum value.

Next, the image feature based on the intensity distribution of luminance gradient vectors, which is used in this method, is described.

FIG. 5 shows an example of filters for numerical differentiation. The luminance gradient vector is derived by applying two-dimensional numerical differentiation to a grayscale image. From the luminance gradient vector (gx, gy) obtained by the differential filters at pixel position (x, y), the direction θ of the vector and its squared norm p can be calculated as in [Equation 10].
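The equation is not reproduced in this text. Based on the definitions above, it can be reconstructed as follows, where the two-argument arctangent is an assumption chosen so that θ covers the full 0 to 360 degree range:

```latex
\theta = \operatorname{atan2}(g_y,\, g_x), \qquad p = g_x^2 + g_y^2
```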

The direction θ of the vector is distributed over the range from 0 to 360 degrees. By quantizing θ at equal intervals into a suitable number of levels and accumulating the squared norm p within a rectangular region, the intensity distribution over luminance gradient directions can be expressed as histogram-like data.

FIG. 6 is a schematic diagram of how the luminance gradient intensity distribution feature is calculated. First, luminance gradient vectors 601 are extracted from the image. Next, histogram-like data 602 is calculated by the accumulation process. In this embodiment, the number of quantization levels is 8, and the center of the first quantization bin is aligned with the x-axis direction.
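The histogram computation can be sketched as below. This is an illustration under assumptions: `numpy.gradient` (central differences) stands in for the differential filters of FIG. 5, which are not reproduced here, and angles follow the array's row and column axes.

```python
import numpy as np


def gradient_histogram(gray, levels=8):
    """Orientation histogram of squared gradient magnitudes for one block."""
    gray = np.asarray(gray, dtype=float)
    # Central differences stand in for the differential filters of FIG. 5.
    gy, gx = np.gradient(gray)            # derivatives along rows (y) and columns (x)
    theta = np.degrees(np.arctan2(gy, gx)) % 360.0
    p = gx ** 2 + gy ** 2                 # squared norm of the gradient vector
    width = 360.0 / levels
    # Shift by half a bin so that bin 0 is centered on the x-axis direction.
    bins = np.floor((theta + width / 2) / width).astype(int) % levels
    return np.bincount(bins.ravel(), weights=p.ravel(), minlength=levels)


# A horizontal ramp has gradient (1, 0) everywhere, so all weight lands in bin 0.
ramp = np.tile(np.arange(4.0), (4, 1))
hist = gradient_histogram(ramp)
```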

With 8 quantization levels, an 8-dimensional image feature vector is extracted from each block region. The mirror transformation matrix T0 for evaluating left-right symmetry is then as shown in [Equation 11].

Similarly, the mirror transformation matrix T1 for evaluating symmetry about the axis at 45 degrees toward the upper right is as shown in [Equation 12].

Similarly, the mirror transformation matrix T2 for evaluating up-down symmetry is as shown in [Equation 13].

Similarly, the mirror transformation matrix T3 for evaluating symmetry about the axis at 45 degrees toward the lower right is as shown in [Equation 14].
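The four matrices are not reproduced in this text, but for an orientation histogram a mirror transformation simply permutes the bins, so they can be derived from geometry. The sketch below assumes angles measured counterclockwise with the y axis pointing up and bin 0 centered on the x axis; with image coordinates whose y axis points down, the roles of T1 and T3 are exchanged. The function name `mirror_matrix` is hypothetical.

```python
import numpy as np


def mirror_matrix(axis_deg, levels=8):
    """Permutation matrix for reflecting an orientation histogram.

    Reflecting a direction theta about a line at angle axis_deg gives
    2 * axis_deg - theta, so bin k moves to bin (2 * a - k) mod levels,
    where a is the axis angle expressed in bin widths.
    """
    width = 360 // levels
    a2, rem = divmod(2 * axis_deg, width)
    if rem:
        raise ValueError("axis does not map bin centers onto bin centers")
    T = np.zeros((levels, levels), dtype=int)
    for k in range(levels):
        T[(a2 - k) % levels, k] = 1
    return T


T0 = mirror_matrix(90)   # left-right flip: the mirror line is vertical
T1 = mirror_matrix(45)   # mirror line toward the upper right
T2 = mirror_matrix(0)    # up-down flip: the mirror line is horizontal
T3 = mirror_matrix(-45)  # mirror line toward the lower right
```

Each matrix is its own inverse, as expected of a reflection, and applying it to a histogram is just a reordering of the 8 bins.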

Next, the processing of the region-of-interest detection units 212 and 232 is described with reference to FIG. 7.

FIG. 7 shows the flow of processing in the region-of-interest detection unit.

S701 generates, from the input image, a plurality of images converted to appropriate scales and aspect ratios.

S702 converts each generated image into a multi-resolution representation.

S703 scans the multi-resolution images to generate rectangular partial regions that are candidates for regions of interest, and calculates the symmetry of each partial region. S701 to S703 are described in detail later with reference to FIG. 8.

S704 sorts the many partial regions generated by the scan by their symmetry evaluation values and narrows the candidates down to a predetermined number of top-ranked regions.

S705 is the convergence determination.

S706 is executed when S705 determines that the process has not converged. Through refinement, it generates new partial regions from the current region-of-interest candidates and calculates the symmetry of each, thereby adding region-of-interest candidates. S706 is described in detail later with reference to FIG. 9. The candidates obtained in this way are again narrowed down by symmetry evaluation (S704). If the convergence determination (S705) finds no change in the regions of interest, the process is judged to have converged and ends. The process may also be judged to have converged, and be ended, when the number of repetitions of S704 to S706 exceeds a fixed count.
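The loop over S704 to S706 can be sketched generically as follows. The function and parameter names are hypothetical, and the symmetry score and the refinement step are passed in as callables because their details are defined elsewhere in the description.

```python
def detect_regions(initial, score, refine, keep=10, max_iters=20):
    """Iterate narrowing (S704), convergence check (S705), refinement (S706).

    initial: candidate regions from the scan (S703), as hashable objects.
    score:   callable returning the symmetry value of a region (higher is better).
    refine:  callable(region, iteration) -> iterable of new candidate regions.
    """
    def top(cands):
        return sorted(set(cands), key=score, reverse=True)[:keep]

    current = top(initial)                      # S704
    for q in range(1, max_iters + 1):           # iteration cap as a fallback
        expanded = list(current)
        for region in current:                  # S706: add refined candidates
            expanded.extend(refine(region, q))
        narrowed = top(expanded)                # S704 again
        if set(narrowed) == set(current):       # S705: no change, so converged
            break
        current = narrowed
    return current


# Toy example: regions are integers and the best "symmetry" is at 7.
best = detect_regions([0], lambda r: -(r - 7) ** 2, lambda r, q: [r - 1, r + 1], keep=1)
```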

FIG. 8 illustrates S701 to S703 of FIG. 7.

In region-of-interest detection, the size of the partial regions in the image must be estimated appropriately for the application. In particular, if regions smaller than necessary are included, unnecessary partial regions are detected, which increases not only false detections but also processing time. For example, if one block in the symmetry evaluation of this method is 8 × 8 pixels, the partial region is 24 × 24 pixels. If it is sufficient for regions of interest to go down to about 10% of the image size, an image size of about 240 × 240 pixels is sufficient.

The region of interest is also not necessarily square; horizontally or vertically elongated rectangular regions often need to be extracted. In this embodiment, when a horizontally elongated rectangle needs to be extracted, the aspect ratio of the original image is changed to make it vertically elongated, and symmetry is then evaluated using the square-grid block division. When a rectangular region generated this way is mapped back to the coordinate system of the original image, it becomes a horizontally elongated rectangle. Likewise, when a vertically elongated rectangle needs to be extracted, the aspect ratio of the original image is changed to make it horizontally elongated before processing.

Reference numeral 802 shows the processing carried out from these two viewpoints. From the input image 801, three images are generated: one scaled after its width is reduced by half, one scaled with its aspect ratio preserved, and one scaled after its height is reduced by half.

Reference numeral 803 shows the multi-resolution processing, which generates, for each image, versions reduced by a factor of 1/2 in two successive stages.
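The sizes of the images produced by the aspect conversion and the multi-resolution step can be sketched as below. The text does not say how the scale conversion is chosen, so fitting the longer side to a `base` of 240 pixels, the rounding, and the function name are assumptions for illustration.

```python
def pyramid_sizes(width, height, base=240, levels=3):
    """Sizes of the images from the aspect variants (3) and resolutions (3).

    base and the rounding rule are illustrative assumptions.
    """
    # Aspect variants: width halved, aspect ratio kept, height halved.
    variants = [(width / 2, height), (width, height), (width, height / 2)]
    sizes = []
    for w, h in variants:
        scale = base / max(w, h)          # fit the longer side to `base`
        w0, h0 = w * scale, h * scale
        for level in range(levels):       # full, 1/2, and 1/4 resolution
            f = 2 ** level
            sizes.append((max(1, round(w0 / f)), max(1, round(h0 / f))))
    return sizes


sizes = pyramid_sizes(640, 480)           # three variants x three levels
```

Three aspect variants at three resolutions give nine images in total.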

Reference numeral 810 shows the scan over the nine images generated in this way. In this process, a rectangular window in each image is translated by a fixed number of pixels at a time to coarsely generate rectangular regions.

FIG. 9 illustrates S706 of FIG. 7.

In S706, for a given region-of-interest candidate, new candidates are generated: rectangular regions 910 obtained by slightly translating it vertically and horizontally, rectangular regions 920 obtained by slightly enlarging or reducing it, and rectangular regions obtained by further translating the enlarged or reduced regions. Translation produces 8 patterns (up, down, left, right, and the four diagonals). Scaling produces 2 patterns, and translating each of the enlarged and reduced rectangles produces 8 more patterns each. In total, up to 26 new rectangular regions (8 + 2 + 2 × 8) are generated from one seed rectangle, and their symmetry is evaluated.
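The enumeration of the 26 patterns can be sketched as follows. Representing a rectangle by its center and size, and scaling about the center, are assumptions made for illustration; the function name is hypothetical.

```python
def refine_candidates(rect, dx, dy, dz):
    """Generate the 26 neighbors of rect = (cx, cy, w, h) used in refinement."""
    cx, cy, w, h = rect
    shifts = [(i, j) for i in (-1, 0, 1) for j in (-1, 0, 1) if (i, j) != (0, 0)]
    out = []
    for zoom in (1.0, dz, 1.0 / dz):            # original size, enlarged, reduced
        zw, zh = w * zoom, h * zoom
        if zoom != 1.0:
            out.append((cx, cy, zw, zh))        # scaled in place: 2 patterns
        for i, j in shifts:                     # 8 shifts per size: 24 patterns
            out.append((cx + i * dx, cy + j * dy, zw, zh))
    return out


neighbors = refine_candidates((100.0, 100.0, 24.0, 24.0), dx=2.0, dy=2.0, dz=1.25)
```

A real implementation would also discard rectangles that fall outside the image, which is why the text speaks of at most 26 patterns.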

As described above, the refinement process S706 is executed repeatedly. The amount of each small variation in each iteration is defined by [Equation 15].

Here, q is the number of repetitions of the refinement process; sx and sy are the horizontal and vertical step widths used for translation in the scan (S703); and dx and dy are the horizontal and vertical shift amounts in the q-th refinement. dz is the enlargement ratio in the q-th refinement, and the reduction ratio is 1/dz. As the equation shows, the size of the variation decreases as the process is repeated. Because the target image is a discrete digital image, once the process has been repeated enough times the small variations no longer produce new region candidates. Once no new region candidates are generated, at the latest, S705 determines convergence and the loop of S704 to S706 ends.

FIG. 10 schematically shows the partial region detection results of the region-of-interest detection unit 212 for registered images and the region-of-interest detection unit 232 for query images. Partial regions of various sizes are detected from an image 1010. Reference numeral 1020 denotes the set of these partial regions, which includes a partial region 1021 with a small area and a partial region 1022 with a large area. These are discussed later in the explanation of Equation 18 in the description of FIG. 15.

The region covering the entire original image may also be added as one of the partial regions. This enables a search that takes the similarity of the whole image into account.

The image feature values used by the search feature extraction unit 213 for the registered image 210 and the search feature extraction unit 233 for the query image 230 are calculated from the color distribution, the distribution of luminance gradient vectors, and the like. Concrete definitions of such image features are disclosed in Non-Patent Literature 1, Non-Patent Literature 3, and elsewhere.

The information on the partial regions extracted from the registered image 210 in this way is stored in the database 220.

FIG. 11 lists, among the information stored for each partial region, the items relevant to the similarity search processing in this embodiment.

Item 1101 identifies which image each partial region belongs to and is represented as an integer image ID. Item 1102 holds the coordinates of each rectangular partial region, storing the coordinates of the upper-left and lower-right corners as four integers. Item 1103 is the ratio of the area that the partial region occupies in the original image; for example, when the partial region coincides with the original image, it takes its maximum value of 1.0. Item 1104 is the image feature representing shape information, and item 1105 is the image feature representing color distribution information. This embodiment uses these two kinds of features, but one kind, or three or more kinds, may be used.
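As an illustration of the stored items, a record for one partial region might look like the sketch below. The class, field, and function names are hypothetical and not taken from the source; only the kinds of values stored follow FIG. 11.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RegionRecord:
    image_id: int                      # which image the region belongs to
    bbox: Tuple[int, int, int, int]    # (left, top, right, bottom) corner coordinates
    area_ratio: float                  # region area / image area, at most 1.0
    shape_feature: Tuple[float, ...]   # image feature expressing shape information
    color_feature: Tuple[float, ...]   # image feature expressing color distribution


def make_record(image_id, bbox, image_size, shape_feature, color_feature):
    """Build a record, deriving the area ratio from the box and image size."""
    left, top, right, bottom = bbox
    img_w, img_h = image_size
    ratio = ((right - left) * (bottom - top)) / (img_w * img_h)
    return RegionRecord(image_id, bbox, ratio, tuple(shape_feature), tuple(color_feature))


whole = make_record(1, (0, 0, 240, 240), (240, 240), [0.1, 0.9], [0.3, 0.7])
half = make_record(1, (0, 0, 120, 240), (240, 240), [0.1, 0.9], [0.3, 0.7])
```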

The same information as shown in FIG. 11 is also extracted for the query image. The application of this embodiment can operate with the query image information held temporarily in memory. However, if multiple search requests with different search conditions are issued for the same query image, analyzing the query image every time is inefficient. It is therefore desirable for the query image information to be cached beyond a single search operation. On the other hand, in large-scale operation where many search requests are received at the same time, caching in memory increases memory consumption and is not appropriate. In such cases, it is preferable to store the query image information temporarily in the database as well.

Although this embodiment describes an example in which partial regions are defined as rectangular regions, the method shown in FIG. 12 does not require partial regions to be rectangular. The requirements on a partial region in that case are that image feature values can be extracted from it and that its area is defined.

FIG. 12 shows the flow of processing in the similarity search unit 234. In S1201, the similarity search unit 234 executes similarity searches against the partial region feature database. In this step, a similarity search that uses the feature values of a partial region detected from the query image as the query is executed once for each partial region of the query image.

S1202 associates the partial regions of the query image with the partial regions of each image, based on the set of search results obtained in S1201. Details are described later with reference to FIG. 13.

S1203 composes the distance between the query image and each image from the association results of S1202.

S1204 composes the similar image search result by sorting the distances composed in S1203.

S1205 passes the result to the search result output unit 235.

In S1201, the similarity search uses the squared distance between the features of a partial region of the query image and a partial region in the database as the measure of dissimilarity. The squared distance in feature space between the i-th partial region of the query image (a rectangular region in this embodiment) and the j-th partial region stored in the database is given by [Equation 16].

Here, q_ik is the k-th element of the image feature of the i-th partial region, f_jk is the k-th element of the image feature of the j-th partial region in the database, and M is the dimensionality of the image feature.
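
Given these definitions, Equation 16 is presumably the ordinary squared Euclidean distance. The following is a reconstruction under that assumption; the name d_ij for the result is introduced here for illustration only.

```latex
d_{ij} = \sum_{k=1}^{M} \left( q_{ik} - f_{jk} \right)^{2}
```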

This squared distance is not computed for every partial region in the database. To speed up the similarity search, this embodiment uses an approximate nearest neighbor vector search algorithm based on clustering. Approximate nearest neighbor search speeds up the search by narrowing the database vectors down to those assumed to lie in the neighborhood of the query vector. The algorithm used in this embodiment returns, as the search result, approximate solutions for the N vectors nearest the query, sorted in ascending order of squared distance. Consequently, the squared distance to any database partial region not included in the N results is unknown at that point.
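
The text does not name the specific algorithm, so the following is only a minimal sketch of one common clustering-based scheme (assign each vector to its nearest centroid, then scan only the clusters closest to the query). The function names, the `n_probe` parameter, and the toy 2-D features are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(q: Vector, f: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((qk - fk) ** 2 for qk, fk in zip(q, f))


def build_index(vectors: Dict[int, Vector],
                centroids: List[Vector]) -> Dict[int, List[int]]:
    """Assign every database vector to its nearest centroid (cluster)."""
    index: Dict[int, List[int]] = {c: [] for c in range(len(centroids))}
    for region_id, vec in vectors.items():
        nearest = min(range(len(centroids)),
                      key=lambda c: squared_distance(vec, centroids[c]))
        index[nearest].append(region_id)
    return index


def approximate_search(query: Vector,
                       vectors: Dict[int, Vector],
                       centroids: List[Vector],
                       index: Dict[int, List[int]],
                       n_results: int,
                       n_probe: int = 1) -> List[Tuple[float, int]]:
    """Return up to n_results (squared_distance, region_id) pairs, nearest first.

    Only the n_probe clusters closest to the query are scanned, so vectors in
    other clusters are never scored and their distances stay unknown.
    """
    probed = sorted(range(len(centroids)),
                    key=lambda c: squared_distance(query, centroids[c]))[:n_probe]
    candidates = [rid for c in probed for rid in index[c]]
    scored = sorted((squared_distance(query, vectors[rid]), rid)
                    for rid in candidates)
    return scored[:n_results]


# Hypothetical 2-D features keyed by partial-region ID.
vectors = {1: (0.0, 0.0), 2: (0.1, 0.0), 3: (5.0, 5.0), 4: (5.2, 5.0)}
centroids = [(0.0, 0.0), (5.0, 5.0)]  # assumed output of a prior clustering step
index = build_index(vectors, centroids)
hits = approximate_search((0.02, 0.0), vectors, centroids, index, n_results=5)
```

With these toy values only the first cluster is scanned, so regions 3 and 4 never receive a distance, which is the situation the later handling of undetermined distances has to address.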

In the method of the present invention, the similarity between the query image and each image stored in the database is composed from the results of the similarity search on partial regions. This is done by associating the partial regions of the query image with the partial regions of each database image and aggregating the per-region similarities for each image. The specific method is described below with reference to FIG. 13.

FIG. 13 is a schematic diagram showing the results of the similarity search on partial regions. If L is the number of partial regions detected from the query image and N is the number of similarity search results per partial region, S1201 described above yields L × N similarity search results. For readability the figure uses L = 4 and N = 5, but in practice a larger number of results is usually obtained for a larger number of query partial regions. Each similarity search result 1300 in this embodiment consists of the computed squared distance, the ID of the partial region, and the ID of the image containing that partial region. The image ID is obtained after the similarity search by looking up the database with the partial region ID. Each row of the table in FIG. 13 is the result of a similarity search using one partial region of the query image as the query. Within each row, the results are ordered by ascending squared distance, that is, by descending similarity, regardless of which image each database partial region belongs to.

In S1202 described above, the partial regions of the query image are associated with the partial regions of each database image using this set of similarity search results. This embodiment provides the following two association methods, and either the system operator or the searching user selects one of them.

In the first method, for each partial region of the query image, the search result with the smallest distance is adopted from among the search results for the partial regions contained in each database image. The search results for different partial regions of the query image may contain the same database partial region. In the first method, therefore, different partial regions of the query image may be associated with the same database partial region. Details of the first method are described with reference to FIG. 14.

FIG. 14 shows the associations produced by the first method from the search results shown in FIG. 13. Because the search results contain image IDs 1, 2, 5, 6, and 7, the search result most similar to each partial region of the query image is selected for each of these five images.
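
The first method can be sketched as follows. This is an illustrative reading of the description, not the embodiment's code: the tuple layout, the function name, and the sample hits are assumptions, and the sample values are not those of FIG. 13.

```python
from typing import Dict, List, Tuple

# One search hit: (squared_distance, partial_region_id, image_id)
Hit = Tuple[float, int, int]


def associate_first_method(
        results: List[List[Hit]]) -> Dict[int, Dict[int, Tuple[float, int]]]:
    """For each database image, keep the closest hit of every query region.

    results[q] holds the hits for query partial region q.
    Returns {image_id: {q: (squared_distance, partial_region_id)}}.
    """
    assoc: Dict[int, Dict[int, Tuple[float, int]]] = {}
    for q, row in enumerate(results):
        for sq_dist, region_id, image_id in row:
            per_image = assoc.setdefault(image_id, {})
            if q not in per_image or sq_dist < per_image[q][0]:
                per_image[q] = (sq_dist, region_id)
    return assoc


# Hypothetical hits: L = 2 query regions, N = 3 results each.
results = [
    [(0.10, 2, 1), (0.20, 7, 5), (0.40, 3, 1)],
    [(0.15, 2, 1), (0.30, 8, 5), (0.50, 9, 6)],
]
assoc = associate_first_method(results)
```

In this toy data, partial region 2 of image 1 ends up associated with both query regions, which is the duplicate association that the first method permits.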

Unlike the first method, the second method never associates the same database partial region more than once. In the second method, for each database image, the search results from all partial regions of the query image are gathered into one set, the set is sorted in ascending order of squared distance, and associations are made in order from the top. During this process, the IDs of the database partial regions already used in an association are retained so that those regions are excluded from further associations. In this way, a duplicate-free association is obtained that prioritizes pairs of a query partial region and a database partial region with a small squared distance, that is, with high similarity. Details of the second method are described with reference to FIG. 15.

FIG. 15 shows the associations produced by the second method from the search results shown in FIG. 13. For example, when the third partial region of the query image is associated with a partial region of the image with image ID 1, the partial region with ID 2, which ranked higher in similarity, has already been used in the association with the first partial region, so the lower-ranked partial region with ID 3 is associated instead. For the fourth partial region, no applicable partial region remains, so no association is made for the image with image ID 1.
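
A minimal sketch of the second method follows. It is a greedy matching written from the description above; the function name and sample hits are hypothetical, and it additionally assumes that each query partial region receives at most one partner per image, which is how the FIG. 15 example reads.

```python
from typing import Dict, List, Tuple

# One search hit: (squared_distance, partial_region_id, image_id)
Hit = Tuple[float, int, int]


def associate_second_method(
        results: List[List[Hit]]) -> Dict[int, Dict[int, Tuple[float, int]]]:
    """Greedy duplicate-free association, processed per database image.

    Returns {image_id: {q: (squared_distance, partial_region_id)}}.
    """
    # Gather the hits of all query regions, grouped by database image.
    per_image: Dict[int, List[Tuple[float, int, int]]] = {}
    for q, row in enumerate(results):
        for sq_dist, region_id, image_id in row:
            per_image.setdefault(image_id, []).append((sq_dist, q, region_id))

    assoc: Dict[int, Dict[int, Tuple[float, int]]] = {}
    for image_id, candidates in per_image.items():
        used_regions = set()  # database regions already taken for this image
        matched: Dict[int, Tuple[float, int]] = {}
        for sq_dist, q, region_id in sorted(candidates):  # smallest distance first
            if q in matched or region_id in used_regions:
                continue
            matched[q] = (sq_dist, region_id)
            used_regions.add(region_id)
        assoc[image_id] = matched
    return assoc


# Hypothetical hits for L = 4 query regions (not the values of FIG. 13).
results = [
    [(0.10, 2, 1)],                # query region 0
    [(0.35, 11, 5)],               # query region 1 only hits image 5
    [(0.20, 2, 1), (0.30, 3, 1)],  # query region 2
    [(0.25, 2, 1)],                # query region 3
]
assoc = associate_second_method(results)
```

For image 1 in this toy data, query region 2 falls back to partial region 3 because region 2 is already taken, and query region 3 is left without a partner.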

For a given database image, it is not usual for every partial region of the query image to receive an association. As noted above, the search results for a partial region do not contain evaluations against all database partial regions. Moreover, because the number of detected partial regions differs from image to image, the second method, which forbids duplicate associations, cannot in principle always associate every partial region. If the similarity of every partial region were determined when composing the similarity between the query image and each image, it would be natural to assume that an image with a higher total of per-region similarities is also more similar as an image. When some per-region similarities are undetermined, and the number of undetermined partial regions differs between images, the handling of these undetermined partial regions becomes a problem.

It is reasonable to assume that data not included in the results of a similarity search is less similar than the least similar data in the results. In this embodiment, therefore, an undetermined partial region is assigned the squared distance of the lowest-ranked result among the search results for that partial region. For the similarity search results shown in FIG. 13, the squared distance assigned when undetermined is 0.80 for the first partial region, 0.60 for the second, 0.50 for the third, and 0.70 for the fourth.
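
The fallback rule can be sketched as below. The function name and the example associations are hypothetical; the four fallback values are the ones quoted above for FIG. 13.

```python
from typing import Dict, List


def fill_undetermined(matched: Dict[int, float],
                      fallback: List[float]) -> List[float]:
    """Return d_j for every query region j, using fallback[j] where no match exists.

    matched maps a query-region index to the squared distance of its associated
    database region; fallback[j] is the squared distance of the lowest-ranked
    hit returned for query region j.
    """
    return [matched.get(j, fallback[j]) for j in range(len(fallback))]


# Lowest-ranked squared distances quoted in the text for the four query regions.
fallback = [0.80, 0.60, 0.50, 0.70]
# Hypothetical associations for one database image: only regions 0 and 2 matched.
matched = {0: 0.10, 2: 0.30}
distances = fill_undetermined(matched, fallback)
```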

As an alternative, statistics of the squared distance, such as its mean and variance, can be computed on a sufficiently large database, and a value based on those statistics can be assigned.

Through the above processing, the squared distance between each partial region of the query image and a database image can be defined as the squared distance to the associated partial region. Using the squared distances defined in this way, the distance between the query image and a database image is defined as in [Equation 17].

Here, D_i is the composite squared distance between the query image and the i-th image in the database, L is the number of partial regions contained in the query image, d_ji is the squared distance between the j-th partial region of the query image and the i-th image, S_j is the ratio of the area of the j-th partial region to the area of the query image, and P is a parameter that controls the effect of S_j. Z is a normalization term defined by [Equation 18].

This normalization term is provided to make the value of D_i easier for the application to handle and does not affect the search process itself.
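
Read together with the symbol definitions above, Equations 17 and 18 are most plausibly an area-weighted mean of the per-region squared distances. The following is a reconstruction under that assumption rather than the published formulas; in particular, the weight S_j^P is inferred from the way the text describes the role of P.

```latex
D_i = \frac{1}{Z} \sum_{j=1}^{L} S_j^{P} \, d_{ji},
\qquad
Z = \sum_{j=1}^{L} S_j^{P}
```

In this form Z depends only on the query image, so dividing by it rescales every D_i by the same factor and leaves the ranking unchanged, which is consistent with the statement that the normalization does not affect the search.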

The control parameter P in Equation 17 functions as follows.

When P = 0.0, all d_ji are weighted equally in computing D_i. When P is set to a larger value, D_i is computed with more emphasis on the d_ji of partial regions whose area ratio S_j is large. In the example of the partial region detection result 1020 in FIG. 10, the similarity of the small-area partial region 1021 is given low weight, and the similarity of the large-area partial region 1022 is emphasized.

Although this embodiment uses the squared distance, a dissimilarity measure, to compare similarity, a similarity measure that increases with similarity may be used instead. One example is the exponential of the negative squared distance, as in [Equation 19].

The similarity in Equation 19 takes its maximum value of 1.0 when the squared distance d is 0.0 and approaches 0.0 as d increases. The composition method of Equation 17 is applicable even when such a similarity is used. That is, the similarity between images is basically computed as the sum of the similarities between partial regions, so the more partial regions that are highly similar, the higher the similarity between images. The effect of the weighted sum based on the areas of the query partial regions is also exactly the same as in the case of Equation 17.
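
The description of Equation 19 (the exponential of the negative squared distance, equal to 1.0 at d = 0.0) suggests the following form. The symbol s and the absence of any scale factor on d are assumptions.

```latex
s = \exp(-d)
```

With a similarity of this kind, images are ranked in descending order of the combined value rather than in ascending order.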

After D_i has been computed for every database image that contains a partial region appearing in the partial region search results, the similar image search result is constructed by sorting on the value of D_i.
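
The composition and sorting steps (S1203 and S1204) can be sketched as follows. The weighting used here, S_j raised to the power P and normalized by the sum of the weights, is an assumed form of Equation 17 chosen to match the described behavior of P; the function names and all numeric values are hypothetical.

```python
from typing import Dict, List, Tuple


def composite_distance(d: List[float], s: List[float], p: float) -> float:
    """Area-weighted mean of per-region squared distances (assumed Equation 17 form)."""
    weights = [sj ** p for sj in s]
    z = sum(weights)  # normalization term Z
    return sum(w * dj for w, dj in zip(weights, d)) / z


def rank_images(per_image: Dict[int, List[float]],
                s: List[float],
                p: float) -> List[Tuple[float, int]]:
    """Return (D_i, image_id) pairs sorted so the most similar image comes first."""
    return sorted((composite_distance(d, s, p), image_id)
                  for image_id, d in per_image.items())


# Hypothetical query: one large region (50% of the image) and one small one (5%).
s = [0.50, 0.05]
# Hypothetical per-region squared distances d_ji for two database images.
per_image = {
    1: [0.10, 0.90],  # matches the large region well, the small one poorly
    2: [0.60, 0.05],  # matches the small region well, the large one poorly
}
ranking_p0 = [image_id for _, image_id in rank_images(per_image, s, p=0.0)]
ranking_p1 = [image_id for _, image_id in rank_images(per_image, s, p=1.0)]
```

With P = 0.0 both regions count equally and image 2 ranks first; with P = 1.0 the large region dominates and image 1 ranks first.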

The above description covers the case where an image is supplied as the search query, but an image already registered in the database may also be used as the query. In that case, the information shown in FIG. 11 can be retrieved from the database, so analysis of the query image is unnecessary, and processing can start from the similarity search unit 234 in FIG. 2.

FIG. 16 is a system configuration diagram of the search service described in this embodiment. The service provides the functions of the search system 1600 to users through the Web API 1602. The hardware of the search system 1600 is allocated appropriate computing resources, such as the number of CPUs, memory capacity, and hard disk capacity, according to the scale of the search target and the frequency of search requests. The programs running on the system exchange information through inter-process communication.

The user issues a search request using the Web browser 1601. The search request is received by the Web API 1602 of the service, and the result is returned to the Web browser 1601 as the response of the Web API 1602. The search-related processes 231, 232, 233, 234, and 235 in FIG. 2 are performed on the Web server 1603. The Web server 1603 sends the analysis result of the received query image to the temporary DB server 1604, which stores it on the file system. If redundancy is needed for stable operation of the system, multiple temporary DB server 1604 processes can be run. When multiple temporary DB server processes are running, the Web server 1603 can select the one with the lowest load as the registration destination.

Of the processing performed by the similarity search unit 234 in FIG. 2, the processing corresponding to the "similarity search on the partial region feature DB" 1201 in FIG. 12 is performed on the search server 1605 rather than on the Web server 1603. For large-scale data, parallel distributed processing is performed by running multiple search server 1605 processes.

The processing corresponding to the image registration processes 211, 212, and 213 in FIG. 2 is performed by the data registration program 1606. The data registration program 1606 sends the analysis result of each acquired registration image to the search server 1605, which stores it on the file system.

The information managed for each partial region on the search server 1605 and the temporary DB server 1604 is the same as that shown in FIG. 11.

FIG. 17 lists the items managed in the database for each image. Item 1701 is the array of IDs of the partial regions contained in the image, managed as a variable-length integer array. Item 1702 is the image size, stored as two integers giving the number of pixels vertically and horizontally. Item 1703 is the date on which the image was registered, managed as an integer. Item 1704 is a keyword used for narrowing searches. Items 1703 and 1704 are not used on the temporary DB server 1604. Item 1705 is the image data and item 1706 is a thumbnail image; both are used to display the search result screen.

This embodiment provides not only simple similar image search but also a similar image search function combined with narrowing of the search target by registration date 1703 and keyword 1704. This function first narrows down the target images by evaluating the search condition expression and then narrows down the partial regions to be searched by referring to the partial region ID array 1701. This enables an efficient search.

FIG. 18 is a schematic diagram of the search screens displayed in the Web browser 1601.

Screen 1810 in FIG. 18 is the search condition setting screen. Area 1811 is where the query image is set: the user sets the similarity search query by dragging and dropping any image file on the user's own file system onto area 1811. When the user clicks the search button 1812, a screen transition occurs and the search result screen 1820 is displayed. On the search result screen 1820, the query image specified in 1811 is displayed as a thumbnail in 1821, and the list of similarity search results is displayed in 1822. The row of buttons 1823 is for paging, that is, for viewing lower-ranked, less similar images. Clicking an image displayed in 1822 executes a similarity search with the clicked image as the query and updates the content of the search result screen 1820. This allows a similarity search that uses an image registered in the database as the query.

Item 1813 on the search condition setting screen 1810 is a GUI component for narrowing the search target images by registration date. The user can enter the registration date range as two numbers representing the lower and upper bounds. If nothing is entered, no narrowing by registration date is performed. If, for example, only the lower bound is set, the search covers images registered from the entered date up to the most recent registration. Item 1814 is a GUI component for setting a keyword narrowing condition. The user can enter any character string in the text field. If nothing is entered, no narrowing by keyword is performed. If both a registration date condition and a keyword condition are set, their logical conjunction (AND) is used as the narrowing condition.
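
The narrowing described for items 1701, 1703, and 1704 might look like the sketch below. The record layout, the YYYYMMDD integer encoding of the date, the use of a keyword list with exact matching, and all names are assumptions; the text only states that the date is an integer and that the conditions are combined with AND.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class ImageRecord:
    region_ids: List[int]  # item 1701: IDs of the partial regions in the image
    registered: int        # item 1703: registration date (YYYYMMDD assumed)
    keywords: List[str]    # item 1704: keywords for narrowing (list assumed)


def candidate_region_ids(images: Dict[int, ImageRecord],
                         date_from: Optional[int] = None,
                         date_to: Optional[int] = None,
                         keyword: Optional[str] = None) -> Set[int]:
    """Collect the partial-region IDs of all images that pass every given condition.

    A condition left as None is not applied, mirroring the "no input" case.
    """
    selected: Set[int] = set()
    for record in images.values():
        if date_from is not None and record.registered < date_from:
            continue
        if date_to is not None and record.registered > date_to:
            continue
        if keyword is not None and keyword not in record.keywords:
            continue
        selected.update(record.region_ids)
    return selected


# Hypothetical records keyed by image ID.
images = {
    1: ImageRecord([10, 11], 20150101, ["logo"]),
    2: ImageRecord([12], 20160330, ["logo", "red"]),
    3: ImageRecord([13, 14], 20160401, ["poster"]),
}
```

For example, `candidate_region_ids(images, date_from=20160101, keyword="logo")` restricts the partial region search to the regions of image 2 only.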

The GUI component 1815, consisting of two toggle buttons, selects the image feature used for the similarity search. To search with emphasis on shape similarity, the user checks the button labeled "SHAPE"; to search with emphasis on color similarity, the user checks the button labeled "COLOR".

The slide bar 1816 is a GUI component for specifying the value of the parameter P in Equation 17, which controls the effect of the partial region area. In this embodiment, the user can set the value freely in the range 0.0 to 1.0. To retrieve images that partially contain the query image, the user sets a value close to 1.0. To retrieve images that contain parts in common with parts of the query image, the user sets a value close to 0.0.

Although this embodiment uses the slide bar 1816, which is suited to specifying a continuous quantity, as the GUI component for this function, the values that can actually be specified are quantized to an appropriate number of levels. The similarity search result changes when the value of P changes. The Web server 1603 caches the search results for each search condition in order to handle screen transitions such as those caused by the page switching buttons 1823. When the search condition changes, a new cache entry must be generated, and generating cache entries to follow subtle changes in P that have no substantive effect is inefficient for system operation. In this embodiment, the value of P received by the Web API is therefore appropriately quantized before the similarity search is performed. Because quantization is assumed, the GUI component need not be a slide bar; a scheme such as selecting a value from a row of buttons corresponding to the quantized values would also work.
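
One way to quantize P before using it in a cache key is sketched below. The number of levels (11, giving steps of 0.1), the key layout, and the function names are assumptions; the text does not specify them.

```python
from typing import Callable, Dict, List, Tuple


def quantize_p(p: float, levels: int = 11) -> float:
    """Clamp p to [0.0, 1.0] and snap it to one of `levels` evenly spaced values."""
    p = min(1.0, max(0.0, p))
    steps = levels - 1
    return round(p * steps) / steps


# Cache keyed by (query ID, feature type, quantized P); the key layout is assumed.
cache: Dict[Tuple[str, str, float], List[int]] = {}


def cached_search(query_id: str,
                  feature: str,
                  p: float,
                  search: Callable[[str, str, float], List[int]]) -> List[int]:
    """Run search(query_id, feature, quantized_p) only when no cached result exists."""
    key = (query_id, feature, quantize_p(p))
    if key not in cache:
        cache[key] = search(query_id, feature, key[2])
    return cache[key]
```

Requests with P = 0.52 and P = 0.49 both map to 0.5 under these settings, so the second request is served from the cache instead of triggering a new search.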

In area 1824 of the search result screen 1820, GUI components equivalent to 1813, 1814, 1815, and 1816 of the search condition setting screen 1810 are arranged. With these GUI components, the user can run a search that changes only the other search conditions while keeping the same query image. This search is executed when the search button 1825 is clicked, and the content of the search result screen 1820 is updated.

100: computer system; 110: network; 120: terminal computer; 210: registration image; 211: image registration unit; 212: region-of-interest detection unit; 213: search feature extraction unit; 220: database; 230: query image; 231: query image reception unit; 232: region-of-interest detection unit; 233: search feature extraction unit; 234: similarity search unit; 235: search result output unit; 240: search result; 301: detection target image; 302: rectangular partial region; 303: image feature of each block; 601: luminance gradient vector; 602: intensity distribution of luminance gradient vectors; 801: input image; 802: scale and aspect conversion; 803: multi-resolution conversion; 810: scanning process; 910: translation; 920: enlargement and reduction; 1010: image; 1020: partial region detection result; 1021: example of a partial region with a small area; 1022: example of a partial region with a large area; 1101: image ID; 1102: rectangle coordinates; 1103: relative area of rectangle; 1104: image feature (shape); 1105: image feature (color); 1600: search system; 1601: Web browser; 1602: Web API; 1603: Web server; 1604: temporary DB server; 1605: search server; 1606: data registration program; 1701: array of partial region IDs; 1702: image size; 1703: registration date; 1704: keyword for narrowing; 1705: image data; 1706: thumbnail image; 1810: search condition setting screen; 1811: query image display area; 1812: search button; 1813: registration date range specification; 1814: keyword specification; 1815: image feature selection; 1816: setting of the parameter that controls the effect of partial region area; 1920: search result display screen; 1921: query image display area; 1922: search result display area; 1923: page transition buttons; 1924: search condition specification; 1925: search button

Claims

1. A similar image search method comprising:
a first step in which a region-of-interest detection unit detects a plurality of partial regions included in a query image;
a second step in which a feature extraction unit extracts features of the detected partial regions;
a third step in which a similarity search unit associates each of the extracted features with features of image partial regions stored in advance in a DB and calculates a similarity for each pair of associated features;
a fourth step in which the similarity search unit weights each calculated similarity according to the area of the corresponding partial region included in the query image;
a fifth step in which the similarity search unit calculates a similarity between the query image and a search target image based on the sum of the weighted similarities of the partial regions; and
a step in which a search result output unit outputs, to a display unit, a search result based on the similarity calculated in the fifth step.

2. The similar image search method according to claim 1, wherein the first step includes a process of converting the query image to multiple resolutions, a process of generating a plurality of partial regions, and a process of narrowing down the generated partial regions based on a symmetry evaluation.

3. The similar image search method according to claim 2, wherein the first step includes a process of generating more detailed partial regions and adding them to the targets of the symmetry evaluation when the number of partial regions narrowed down based on the symmetry evaluation is equal to or greater than a predetermined value, and
the second step is performed when the number of partial regions narrowed down based on the symmetry evaluation is less than the predetermined value.

4. The similar image search method according to claim 3, wherein the second step is performed when the number of narrowing-down trials based on the symmetry evaluation is equal to or greater than a predetermined value.

5. A similar image search system comprising:
a region-of-interest detection unit that detects a plurality of partial regions included in a query image;
a feature extraction unit that extracts features of the detected partial regions;
a similarity search unit that associates each of the extracted features with features of image partial regions stored in advance in a DB, calculates a similarity for each pair of associated features, weights each calculated similarity according to the area of the corresponding partial region included in the query image, and calculates a similarity between the query image and a search target image based on the sum of the weighted values; and
a search result output unit that outputs, to a display unit, a search result based on the calculated similarity.