JP6034970B2

JP6034970B2 - Dictionary generation system, dictionary generation method, and dictionary generation program

Info

Publication number: JP6034970B2
Application number: JP2015529303A
Authority: JP
Inventors: 廣池 敦 (Atsushi Hiroike); 渡邉 裕樹 (Hiroki Watanabe)
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2013-08-02
Filing date: 2013-08-02
Publication date: 2016-11-30
Anticipated expiration: 2033-08-02
Also published as: WO2015015634A1; JPWO2015015634A1

Description

The present invention relates to a dictionary generation system, a dictionary generation method, and a dictionary generation program for generating a dictionary.

Conventionally, there is an object detection method based on similar-image search that is applicable to a wide variety of objects, each available only in small quantities. For each partial region of an input image, this method searches a database of registered examples of the detection target for the nearest-neighbor example, and determines whether the region is an object from its distance in feature space.

Hiroki Watanabe, Hiroto Nagayoshi, Atsushi Hiroike, "Case-Based General Object Detection Based on Similar Image Search" (類似画像検索に基づく事例ベース一般オブジェクト検出), IEICE Technical Report, vol. 111, no. 353 (PRMU2011-124--PRMU2011-146), pp. 101-106, 2011.

In the conventional technique described above, regions containing the object to be detected must be registered as dictionary patterns. Improving detection accuracy requires registering a large number of appropriate dictionary patterns, but in practical operation this registration work is costly.

An object of the present invention is to generate a dictionary automatically by automatically registering highly reliable dictionary patterns.

A dictionary generation system, a dictionary generation method, and a dictionary generation program according to one aspect of the invention disclosed in the present application acquire a first distance in feature space between a selected region, selected from a first region group in a first image, and a first region other than the selected region, selected from the first region group, and a second distance in feature space between the selected region and a second region selected from a second region group in a second image; determine, based on the ratio between the acquired first distance and the acquired second distance, whether the selected region should be made a dictionary pattern; and, when it is determined that it should, register the selected region in a dictionary in which a dictionary pattern group is stored.
Common attribute information is assigned to each image of a first image set including the first image, and the attribute information is not assigned to any image of a second image set including the second image. The first region group is a region group obtained from the images in the first image set, and the second region group is a region group obtained from the images in the second image set.
Alternatively, attribute information may be assigned to neither the images of the first image set including the first image nor the images of the second image set including the second image, with the first region group being a region group obtained from the images in the first image set and the second region group being a region group obtained from the images in the second image set.
Alternatively, attribute information may be absent from each image of the first image set including the first image while common attribute information is assigned to each image of the second image set including the second image, with the first region group being a region group obtained from the images in the first image set and the second region group being a region group obtained from the images in the second image set.
Alternatively, common first attribute information may be assigned to each image of the first image set including the first image and common second attribute information assigned to each image of the second image set including the second image, with the first region group being a region group obtained from the images in the first image set and the second region group being a region group obtained from the images in the second image set.

According to a representative embodiment of the present invention, a dictionary can be generated automatically by automatically registering highly reliable dictionary patterns. Problems, configurations, and effects other than those described above will become apparent from the following description of the embodiments.

An explanatory diagram showing an example of dictionary generation in the dictionary generation system according to the present invention.
An explanatory diagram showing a specific example of dictionary generation in the dictionary generation system.
A block diagram showing a hardware configuration example of the dictionary generation system.
A block diagram showing a functional configuration example of the dictionary generation system according to Embodiment 1.
An explanatory diagram showing an example of region generation from the first image set and the second image set.
An explanatory diagram showing region generation example 1 by perturbation processing.
An explanatory diagram showing region generation example 2 by perturbation processing.
An explanatory diagram showing region generation example 3 by perturbation processing.
A flowchart showing an example of the dictionary generation processing procedure performed by the dictionary generation system.
A flowchart showing a detailed processing procedure example of the generation process (step S901) shown in FIG. 9.
A flowchart showing a detailed processing procedure example of the region group generation process (step S1003) shown in FIG. 10.
A flowchart showing a detailed processing procedure example of the acquisition process (step S902) shown in FIG. 9.
A flowchart showing a detailed processing procedure example of the first average minimum distance acquisition process (step S1201) shown in FIG. 12.
A flowchart showing a detailed processing procedure example of the first average minimum distance calculation process (step S1305) shown in FIG. 13.
A flowchart showing a detailed processing procedure example of the minimum distance accumulation process (step S1405) shown in FIG. 14.
A flowchart showing a detailed processing procedure example of the second average minimum distance acquisition process (step S1202) shown in FIG. 12.
A flowchart showing a detailed processing procedure example of the second average minimum distance calculation process (step S1605) shown in FIG. 16.
A flowchart showing a detailed processing procedure example of the minimum distance accumulation process (step S1703) shown in FIG. 17.
A flowchart showing a detailed processing procedure example of the decision process (step S903) shown in FIG. 9.
A flowchart showing a detailed processing procedure example of the perturbation process (step S905) shown in FIG. 9.
A flowchart showing a detailed processing procedure example of the minimum distance calculation process (step S1605) according to Embodiment 2.
A block diagram showing a system configuration example of the content cloud system according to Embodiment 5.
A block diagram showing an example of an operation scheme of the dictionary generation system.
An explanatory diagram showing a list of the information managed by the image management server.
An explanatory diagram showing the information managed by the comparison pattern management server.
An explanatory diagram showing a list of the information managed by the dictionary pattern management server.
An explanatory diagram showing an example of a screen used for dictionary generation.
An explanatory diagram showing a display example of the confirmation screen.

FIG. 1 is an explanatory diagram showing an example of dictionary generation in the dictionary generation system according to the present invention. A dictionary generation system is a system that generates a dictionary. The system may be a single device, or a group of devices connected to a network such as a LAN (Local Area Network), a WAN (Wide Area Network), or the Internet. A dictionary is information that stores the objects found in images that contain both objects and patterns such as wallpaper. The dictionary generation system automates dictionary generation by registering objects in the dictionary as dictionary patterns.

Dictionary generation uses two kinds of image groups: a first image set 101 and a second image set 102. The first image set 101 is the image set in which objects are to be detected, and the second image set 102 is an image set used to exclude non-detection targets from the first image set 101. The region group obtained from each image of the first image set 101 provides the object candidates. The region group obtained from each image of the second image set 102 provides the regions against which the object candidates are compared.

The dictionary generation system uses image features to determine whether the regions (object candidates) in the region group (object candidate group) obtained from an image 110 of the first image set 101 are close to one another. The rectangles in the region group 111 indicate object candidates. For example, the distance in feature space between object candidates A and B obtained from an image 110 of the first image set 101 is taken as the first distance. The dictionary generation system can evaluate the similarity between object candidates A and B from the first distance.

The dictionary generation system also uses image features to determine whether an object candidate obtained from an image 110 of the first image set 101 is close to a comparison-target region in the region group 122 obtained from an image 120 of the second image set 102. For example, the distance in feature space between object candidate A and a comparison-target region C obtained from an image 120 of the second image set 102 is taken as the second distance. The dictionary generation system can evaluate the similarity between object candidate A and region C from the second distance.

Using the ratio between the first distance and the second distance, the dictionary generation system then determines whether object candidate A is a region similar to object candidate B or a candidate similar to the comparison-target region C. Based on this determination, the dictionary generation system narrows down the object candidates and obtains a narrowing result 130. The dictionary generation system applies the perturbation process described later to the narrowing result 130 to generate new region candidates, and obtains the first and second distances again for the generated region candidates as well, using the region group 122. The determination process described above is then executed again on the original region candidates together with the new region candidates. Repeating this processing until it converges improves the reliability of the object candidates to be registered in the dictionary.

In the dictionary generation system described above, there are four possible combinations of the first image set 101 and the second image set 102: (1) the first image set 101 is tagged and the second image set 102 is untagged; (2) the first image set 101 is untagged and the second image set 102 is also untagged; (3) the first image set 101 is untagged and the second image set 102 is tagged; and (4) the first image set 101 is tagged and the second image set 102 is also tagged.

A tag is information, such as text, assigned to an image. Examples include an arbitrary character string such as "car" or "summer vacation trip", a time stamp indicating when the image was acquired, and position information indicating where the image was acquired. In other words, a tag is information indicating some attribute of the image to which it is assigned. A tagged image set is, for example, an image set retrieved by using a certain tag as a search key. A tagged image set is therefore an image set whose images share the same or a similar tag.

In case (1), the dictionary generation system narrows down the object candidates in the first image set 101 by excluding from them patterns, such as wallpaper, that are contained in the comparison-target region group 122 of the second image set 102. This improves the reliability of dictionary registration.

In case (2), the first image set 101 and the second image set 102 are both untagged image sets. An untagged image set is a group of images to which no tag is assigned, although images that do carry tags may also be treated by the dictionary generation system as untagged. Because neither set has tags in case (2), the first image set 101 may itself be used as the second image set 102, or the two sets may be merged. In case (2), the dictionary generation system excludes object candidates that resemble one another within the same image of the first image set 101. This eliminates simple repeating patterns within an image and allows object candidates that are similar across images to be extracted.

In case (3), suppose for example that each image of the second image set 102 carries a tag indicating a repeating pattern such as wallpaper or background. In this case, the dictionary generation system can eliminate, from the object candidates of the untagged first image set 101, candidates that partially match the repeating pattern, and narrow the object candidates down to regions corresponding to objects such as people and articles.

In case (4), suppose for example that each image of the first image set 101 carries a specific tag X and each image of the second image set 102 carries a tag Y that characterizes the regions to be excluded. Tag Y may also be assigned to images of the first image set 101. In this case, the dictionary generation system can exclude the object candidates contained in the images tagged Y from the object candidates contained in the images tagged X, which improves the accuracy of narrowing down the object candidates.

Although the dictionary generation described above uses the second image set 102, the second image set 102 may be omitted. In that case, the similarity between the object candidates, which are regions in the first image set 101, and the regions in the second image set 102 is not evaluated, but dictionary generation becomes correspondingly faster.

FIG. 2 is an explanatory diagram showing a specific example of dictionary generation in the dictionary generation system. FIG. 2 shows dictionary generation for combination (1) described above. The tag "car" is assigned to the first image set 101. The object candidates include regions that contain an image of a car and regions that do not. The second image set 102 is an untagged image set. Regions containing the various images to be compared are extracted from the second image set 102. Through the determination process described above, the dictionary generation system excludes from the object candidates any region evaluated as similar to a comparison-target region (the bold rectangle at the right end of FIG. 2). This improves the reliability of dictionary registration.

<Hardware configuration example>
FIG. 3 is a block diagram showing a hardware configuration example of the dictionary generation system. The dictionary generation system 300 includes a processor 301, a storage device 302, an input device 303, an output device 304, and a communication interface (communication IF 305). The processor 301, the storage device 302, the input device 303, the output device 304, and the communication IF 305 are connected by a bus. The processor 301 controls the dictionary generation system 300. The storage device 302 serves as a work area for the processor 301 and also stores various programs and data. Examples of the storage device 302 include a ROM (Read Only Memory), a RAM (Random Access Memory), an HDD (Hard Disk Drive), and a flash memory.

The input device 303 inputs data. Examples of the input device 303 include a keyboard, a mouse, a touch panel, a numeric keypad, and a scanner. The output device 304 outputs data. Examples of the output device 304 include a display and a printer. The communication IF 305 is connected to a network and transmits and receives data. Embodiments of the present invention are described below.

(Embodiment 1)
Embodiment 1 is described using case (1) above as an example, in which the first image set 101 is a tagged image set and the second image set 102 is an untagged image set.

<Functional configuration example>
FIG. 4 is a block diagram showing a functional configuration example of the dictionary generation system 300 according to Embodiment 1. In FIG. 4, the dictionary generation system 300 includes a dictionary 400, a generation unit 401, an acquisition unit 402, a decision unit 403, a perturbation processing unit 404, a determination unit 405, a display unit 406, and a registration unit 407. The dictionary 400 stores a dictionary pattern group. Specifically, the function of the dictionary 400 is realized by, for example, the storage device 302 shown in FIG. 3. The functions of the generation unit 401 through the registration unit 407 are realized by, for example, the processor 301 executing a program stored in the storage device 302 shown in FIG. 3.

The generation unit 401 generates regions from the first image and the second image, which are the generation targets. The first image is, for example, an image 110 in the first image set 101 described above. The second image is, for example, an image 120 in the second image set 102 described above. Specifically, for example, the generation unit 401 executes multi-resolution processing on the target image to generate multi-resolution images at several levels. The generation unit 401 then scans each multi-resolution image in a grid pattern, using scan windows with several quantized aspect ratios. In this way, the generation unit 401 generates regions from the first image and the second image. A specific example of generation by the generation unit 401 is described with reference to FIG. 5.
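The region generation just described can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function name `generate_regions`, the scale factors, the aspect ratios, the base window size, and the stride are all hypothetical values chosen for the example and do not come from the patent.

```python
from typing import List, Tuple

# (x, y, width, height, scale): a window position inside one resolution level.
Region = Tuple[int, int, int, int, float]

def generate_regions(width: int, height: int,
                     scales=(1.0, 0.5),
                     aspect_ratios=(0.5, 1.0, 2.0),
                     base: int = 32, stride: int = 16) -> List[Region]:
    """Enumerate grid-scan windows over each resolution level of an image.

    All default values are assumptions made for illustration only.
    """
    regions: List[Region] = []
    for s in scales:
        # Size of the image at this resolution level.
        w_img, h_img = int(width * s), int(height * s)
        for ar in aspect_ratios:
            # Window with roughly constant area and width / height == ar.
            win_w = int(round(base * ar ** 0.5))
            win_h = int(round(base / ar ** 0.5))
            # Grid scan; the ranges are empty if the window does not fit.
            for y in range(0, h_img - win_h + 1, stride):
                for x in range(0, w_img - win_w + 1, stride):
                    regions.append((x, y, win_w, win_h, s))
    return regions
```

In a real system, each window would then be cropped from the corresponding resolution level and converted into an image feature vector; that step is omitted here.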

The acquisition unit 402 acquires a first distance in feature space between a selected region, selected from the first region group in the first image, and a first region other than the selected region, selected from the first region group. The first region group is the region group 111 cut out from each image 110 of the first image set 101; specifically, for example, it is the region group obtained by grid-scanning each image 110 of the first image set 101.

A selected region is a region selected from the first region group, and is a candidate for registration in the dictionary 400 as a dictionary pattern. The selected region corresponds to object candidate A described above. A first region is a region of the first region group other than the selected region, and corresponds to, for example, object candidate B. Both the selected region and the first region are regions in the first image set 101.

Similarly, the acquisition unit 402 acquires a second distance in feature space between the selected region and a second region selected from the second region group in the second image. The second region group is the region group 122 cut out from each image of the second image set 102; specifically, for example, it is the region group obtained by grid-scanning each image of the second image set 102. A region in the second region group therefore corresponds to region C described above.

From the group of first distances between a selected region, which is one of the plural first regions, and the first regions contained in an image of interest, the acquisition unit 402 acquires the smallest value as the first minimum distance. Similarly, from the group of second distances between the selected region and the second regions contained in an image of interest, the acquisition unit 402 acquires the smallest value as the second minimum distance.

Because there are multiple images, each selected region has many first minimum distances. The acquisition unit 402 averages the first minimum distances over the images to obtain a first average minimum distance for each selected region. Similarly, the acquisition unit 402 averages the second minimum distances over the images to obtain a second average minimum distance for each selected region.
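The per-image minimum followed by an average over images can be sketched as below. This is a minimal illustration that assumes feature vectors are plain numeric sequences and that the feature-space distance is Euclidean; the patent text here does not fix the metric, and the function names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]

def min_distance(selected: Vector, regions: Sequence[Vector]) -> float:
    """Smallest distance from the selected region to the regions of one image.

    Euclidean distance is an assumption made for this sketch.
    """
    return min(math.dist(selected, r) for r in regions)

def average_min_distance(selected: Vector,
                         regions_per_image: Sequence[Sequence[Vector]]) -> float:
    """Average, over images, of the per-image minimum distance."""
    mins: List[float] = [min_distance(selected, regs)
                         for regs in regions_per_image if regs]
    if not mins:
        raise ValueError("no image contains any region")
    return sum(mins) / len(mins)

# Toy example: two images, with per-image minimum distances 5.0 and 1.0.
selected = (0.0, 0.0)
images = [[(3.0, 4.0), (6.0, 8.0)], [(0.0, 1.0)]]
print(average_min_distance(selected, images))  # 3.0
```

The same function serves for both averages: pass the first region group per image (with the selected region itself excluded by the caller, since its distance to itself is zero) to obtain the first average minimum distance, and the second region group per image to obtain the second.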

The decision unit 403 decides, based on the ratio between the first average minimum distance and the second average minimum distance, which of the plural selected regions are to be subjected to the perturbation process. Specifically, for example, the decision unit 403 sorts the ratios between the first average minimum distance and the second average minimum distance of the selected regions in ascending order. The decision unit 403 then decides, for example, that the selected regions corresponding to the first M ratios (M is an integer of 1 or more) are the specific selected regions to be subjected to the perturbation process. This narrows the selected regions down to those suitable for registration as dictionary patterns.
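The ascending sort and top-M cut can be sketched as below. The tuple layout `(region_id, d1_avg, d2_avg)`, the function name, and the small `eps` guard against division by zero are assumptions added for this illustration.

```python
from typing import Hashable, List, Sequence, Tuple

# (region id, first average minimum distance, second average minimum distance)
Candidate = Tuple[Hashable, float, float]

def select_top_m(candidates: Sequence[Candidate], m: int,
                 eps: float = 1e-12) -> List[Hashable]:
    """Return the ids of the M candidates with the smallest d1/d2 ratio.

    eps is a hypothetical safeguard for a zero second distance.
    """
    ranked = sorted(candidates, key=lambda c: c[1] / max(c[2], eps))
    return [region_id for region_id, _, _ in ranked[:m]]

# Ratios: a -> 0.25, b -> 1.0, c -> 0.1
candidates = [("a", 1.0, 4.0), ("b", 2.0, 2.0), ("c", 0.5, 5.0)]
print(select_top_m(candidates, 2))  # ['c', 'a']
```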

The perturbation processing unit 404 generates new regions by executing, based on the ratio between the first distance and the second distance, a perturbation process that perturbs a selected region. Specifically, for example, the perturbation processing unit 404 executes the perturbation process on the specific selected regions decided by the decision unit 403. The perturbation process generates a new region by shifting the position of a selected region. The perturbation processing unit 404 executes the perturbation process on, for example, the object candidates that make up the narrowing result 130 shown in FIG. 1, that is, on the selected regions.
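A position-shifting perturbation can be sketched as follows. The four-neighbour shift pattern, the `step` parameter, and the `(x, y, width, height)` region layout are assumptions made for illustration; the region generation examples shown in the patent's figures may use other perturbations that this sketch does not cover.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)

def perturb(region: Box, image_size: Tuple[int, int], step: int = 1) -> List[Box]:
    """Generate new regions by shifting a region left, right, up, and down.

    Shifted regions that would leave the image are discarded.
    """
    x, y, w, h = region
    img_w, img_h = image_size
    shifted: List[Box] = []
    for dx, dy in ((-step, 0), (step, 0), (0, -step), (0, step)):
        nx, ny = x + dx, y + dy
        if nx >= 0 and ny >= 0 and nx + w <= img_w and ny + h <= img_h:
            shifted.append((nx, ny, w, h))
    return shifted

# A region in the top-left corner can only move right or down.
print(perturb((0, 0, 10, 10), (20, 20), step=5))
# [(5, 0, 10, 10), (0, 5, 10, 10)]
```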

The determination unit 405 determines, based on the ratio between the first distance and the second distance, whether a selected region should be made a dictionary pattern. The ratio between the first distance and the second distance is the value obtained by dividing the first distance by the second distance. The smaller this ratio, the more suitable the selected region is as a dictionary pattern.

For example, if the first distance is small, the selected region and the first region are similar within the first image set 101. However, whether both regions correspond to the intended dictionary pattern depends on the second distance. For example, when the second image set 102 is an image set unrelated to the first image set 101, a small second distance means that the selected region is evaluated as also being similar to the second region. Therefore, if the second distance is small, the distance ratio for the selected region becomes large, and the selected region is not suitable as a dictionary pattern.
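A small numeric illustration of this reasoning follows. The distance values are invented for the example, and the helper name `distance_ratio` is hypothetical; only the definition of the ratio (first distance divided by second distance, smaller being better) comes from the text.

```python
def distance_ratio(first_distance: float, second_distance: float) -> float:
    """Ratio used for the decision: first distance divided by second distance."""
    if second_distance <= 0:
        raise ValueError("second distance must be positive")
    return first_distance / second_distance

# Candidate P: close to other regions of the first set, far from the second set.
ratio_p = distance_ratio(1.0, 4.0)   # 0.25
# Candidate Q: equally close within the first set, but also close to the second set.
ratio_q = distance_ratio(1.0, 0.5)   # 2.0

# The smaller ratio marks the better dictionary pattern candidate.
better = "P" if ratio_p < ratio_q else "Q"
print(ratio_p, ratio_q, better)  # 0.25 2.0 P
```

Both candidates have the same first distance, so the first distance alone cannot separate them; the second distance is what pushes Q's ratio up and marks it as resembling the comparison set.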

When the perturbation process has been executed, the determination unit 405 determines whether the selected region should be made a dictionary pattern based on the result produced by the perturbation processing unit 404. Because the selected region is a discrete digital image, repeating the perturbation process eventually reaches a point where minute variations no longer produce new regions. When the perturbation process stops generating new regions, the determination unit 405 can therefore determine that the process has converged and that the remaining selected regions should be registered as dictionary patterns.

The display unit 406 displays the selected regions that the determination unit 405 has determined should be dictionary patterns, in a manner that allows the user to specify whether each may be registered in the dictionary 400. Specifically, for example, the display unit 406 displays those selected regions on the display screen of the output device 304. The user can then specify whether to register each region by using the input device 303.

The registration unit 407 registers the selected region in the dictionary 400 when the determination unit 405 has determined that it should be a dictionary pattern. When attribute information has been assigned to the selected region, the registration unit 407 registers the selected region in the dictionary 400 in association with that attribute information. The attribute information is the tag described above. As a result, when the dictionary 400 is searched using attribute information, the dictionary patterns for the desired object can be extracted. The registration unit 407 may also register only the selected regions that the user has designated for registration, using the input device 303, among those displayed by the display unit 406.

<Region generation example>
FIG. 5 is an explanatory diagram showing an example of region generation from the first image set 101 and the second image set 102. The dictionary generation system 300 generates scanning windows with a plurality of quantized aspect ratios. In the example of FIG. 5, scanning windows w1 to w5 with five different aspect ratios are generated.

The dictionary generation system 300 also executes multi-resolution processing on each of the images 110 and 120. Through multi-resolution processing, the dictionary generation system 300 generates, for example, a group of multi-resolution images in which the resolution of the images 110 and 120 is successively reduced by a factor of 1/2. In the example of FIG. 5, four levels of multi-resolution images 110, 110a, 110b, and 110c are generated for an image 110.

The dictionary generation system 300 performs a grid scan of the multi-resolution images obtained by multi-resolution processing, using the scanning windows obtained by aspect-ratio quantization. In the example of FIG. 5, the dictionary generation system 300 grid-scans the four-level multi-resolution image group with the five types of scanning windows w. Regions are thereby extracted from the image 110. Each extracted region is defined by the aspect ratio of the scanning window applied, the resolution of the multi-resolution image to which it is applied, and the scanning position of the window in the grid scan.
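The grid scan over windows and resolution levels can be sketched as below. This is an illustration under assumptions, not the system's actual code: the `Region` tuple, the window sizes passed in, and the default step widths are hypothetical, and each level is modeled only by its halved width and height rather than by real pixel data.

```python
from typing import Iterator, List, NamedTuple, Tuple


class Region(NamedTuple):
    level: int  # index of the multi-resolution image (0 = original resolution)
    x: int      # left edge of the window on that level
    y: int      # top edge of the window on that level
    w: int      # window width (fixes the aspect ratio together with h)
    h: int      # window height


def generate_regions(width: int, height: int,
                     windows: List[Tuple[int, int]],
                     levels: int = 4,
                     step_x: int = 8, step_y: int = 8) -> Iterator[Region]:
    """Grid-scan every window shape over every resolution level."""
    for level in range(levels):
        # Each level halves the resolution of the previous one.
        level_w, level_h = width >> level, height >> level
        for w, h in windows:
            # The ranges are empty when the window does not fit on this level.
            for y in range(0, level_h - h + 1, step_y):
                for x in range(0, level_w - w + 1, step_x):
                    yield Region(level, x, y, w, h)
```

Each yielded tuple records exactly the three things that define a region in the description: the window shape, the resolution level, and the scan position.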

<Example of region generation by perturbation processing>
FIGS. 6 to 8 are explanatory diagrams showing examples of region generation by the perturbation process. The perturbation process generates position-shifted regions for the object candidates that form the narrowing-down result 130 shown in FIG. 1. In FIGS. 6 to 8, the dotted rectangle is the region of an object candidate in the narrowing-down result 130, and the white solid rectangles are the regions after variation by the perturbation process. FIG. 6 shows regions obtained by shifting an object candidate in the narrowing-down result 130 up, down, left, and right on the image 110 that contains the object candidate.

FIG. 7 shows regions obtained by shifting an object candidate in the narrowing-down result 130 diagonally to the upper right, lower right, upper left, and lower left on the image 110 that contains the object candidate. FIG. 8 shows regions obtained by enlarging and reducing an object candidate in the narrowing-down result 130 on the image 110 that contains the object candidate.

An example of the amount of variation applied by the perturbation process is described below. Let gx be the horizontal step width of the grid scan, gy the vertical step width, and q the number of repetitions of the perturbation process. The horizontal variation dx, the vertical variation dy, and the enlargement ratio dz in the q-th perturbation process are given as follows. The reduction ratio is 1/dz.

$$dx = gx / 2^{q} \quad (1)$$
$$dy = gy / 2^{q} \quad (2)$$
$$dz = 2^{1/2^{q}} \quad (3)$$

With the variation amounts, enlargement ratio, and reduction ratio given by equations (1) to (3), the variation amounts and the enlargement ratio decrease, and the reduction ratio increases, as the number of repetitions q of the perturbation process grows. In other words, drift of the region caused by the perturbation process is suppressed, and the region converges more readily toward the original region. Equations (1) to (3) are only examples. Any other equations may be used as long as the variation amounts and the enlargement ratio decrease and the reduction ratio increases as q increases. A fixed variation amount that does not depend on q may also be used. In that case, equations (1) to (3) need not be computed, which speeds up region generation by the perturbation process.
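As a worked illustration of equations (1) to (3), the sketch below computes the variation amounts for a given q and applies them to one region in the manner of FIGS. 6 to 8. The box layout (centre coordinates plus width and height), scaling about the centre, and the function names are assumptions made for this example only.

```python
def perturbation_amounts(gx: float, gy: float, q: int):
    """Return (dx, dy, dz) for the q-th perturbation, per equations (1)-(3)."""
    dx = gx / 2 ** q
    dy = gy / 2 ** q
    dz = 2 ** (1 / 2 ** q)
    return dx, dy, dz


def perturb(box, gx: float, gy: float, q: int):
    """Generate shifted and rescaled copies of box = (cx, cy, w, h).

    Assumption: the box is stored as centre coordinates plus width and
    height, and enlargement/reduction is applied about the centre.
    """
    cx, cy, w, h = box
    dx, dy, dz = perturbation_amounts(gx, gy, q)
    # Eight shift directions: up/down/left/right (FIG. 6) and diagonals (FIG. 7).
    shifts = [(sx, sy) for sx in (-1, 0, 1) for sy in (-1, 0, 1)
              if (sx, sy) != (0, 0)]
    moved = [(cx + sx * dx, cy + sy * dy, w, h) for sx, sy in shifts]
    # Enlargement by dz and reduction by 1/dz (FIG. 8).
    scaled = [(cx, cy, w * dz, h * dz), (cx, cy, w / dz, h / dz)]
    return moved + scaled
```

With gx = 8 and q = 1 the horizontal shift is 4; at q = 2 it is 2, and dz falls from about 1.41 to about 1.19, so successive perturbations move the region less and less.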

<Dictionary generation process>
FIG. 9 is a flowchart showing an example of the dictionary generation procedure performed by the dictionary generation system 300. The dictionary generation system 300 executes, in order, the generation process by the generation unit 401 shown in FIG. 4 (step S901), the acquisition process by the acquisition unit 402 (step S902), the determination process by the determination unit 403 (step S903), and the perturbation process by the perturbation processing unit 404 (step S904).

The dictionary generation system 300 then executes the convergence determination process by the determination unit 405 (step S905). Because the target images are discrete digital images, repeated execution eventually reaches a point where minute variations no longer produce new regions. Therefore, when the perturbation process (step S904) no longer generates new region candidates, the dictionary generation system 300 determines that the process has converged (step S905: Yes) and proceeds to the display process (step S906).

To make the computation more efficient, an upper limit may be set on the number of repetitions, and convergence may be declared when that limit is reached. When it is determined that the process has not converged (step S905: No), the procedure returns to the acquisition process (step S902). That is, steps S902 to S904 are repeated until convergence. When the process has converged (step S905: Yes), the display process by the display unit 406 (step S906) and the registration process by the registration unit 407 (step S907) are executed, and the series of processes ends.
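The control flow of FIG. 9 can be summarized by the skeleton below. It is only a sketch: the six callables stand in for the units 401 to 407 and are hypothetical, regions are assumed to be hashable values, and carrying the selected regions plus the newly generated ones into the next pass is an assumption, since the description does not spell out how the region set is updated between repetitions.

```python
def run_dictionary_generation(generate, acquire, decide, perturb,
                              display, register, max_iterations=10):
    """Skeleton of steps S901 to S907 with an iteration cap (>= 1 assumed)."""
    regions = list(generate())                  # S901: generation process
    selected = []
    for q in range(1, max_iterations + 1):
        distances = acquire(regions)            # S902: acquisition process
        selected = decide(regions, distances)   # S903: determination process
        candidates = perturb(selected, q)       # S904: perturbation process
        known = set(regions)
        fresh = [r for r in candidates if r not in known]
        if not fresh:                           # S905: Yes, nothing new appeared
            break
        regions = list(selected) + fresh        # S905: No, repeat from S902
    approved = display(selected)                # S906: user confirms candidates
    register(approved)                          # S907: store in the dictionary
    return approved
```

Reaching `max_iterations` without a break corresponds to the optional rule of declaring convergence once the repetition limit is hit.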

<Generation process>
FIG. 10 is a flowchart showing a detailed example of the generation process (step S901) shown in FIG. 9. The first image set 101 is used as the example here, but the same process applies to the second image set 102.

The dictionary generation system 300 determines whether there is an unselected image in the first image set 101 (step S1001). When there is an unselected image (step S1001: Yes), the dictionary generation system 300 selects one unselected image (step S1002). Next, the dictionary generation system 300 executes the region group generation process on the selected image (step S1003). The region group generation process (step S1003), which is described in detail with reference to FIG. 11, extracts a plurality of regions from the selected image.

The dictionary generation system 300 then determines whether there is an unselected region in the region group extracted from the selected image (step S1004). When there is an unselected region (step S1004: Yes), the dictionary generation system 300 selects one unselected region (step S1005) and extracts the image feature amount of the selected region (step S1006). The method of extracting image feature amounts is described in detail in Non-Patent Document 1 mentioned above. Using the extracted image feature amounts makes it possible to evaluate the similarity between regions within the same image set and between regions in different image sets.

After the image feature amount is extracted (step S1006), the procedure returns to step S1004. When there is no unselected region in step S1004 (step S1004: No), the procedure returns to step S1001. When there is no unselected image in step S1001 (step S1001: No), the generation process (step S901) ends and the procedure proceeds to the acquisition process (step S902) in FIG. 9.

<Region group generation process>
FIG. 11 is a flowchart showing a detailed example of the region group generation process (step S1003) shown in FIG. 10. First, as shown in FIG. 5, the dictionary generation system 300 executes multi-resolution processing, for each quantized aspect ratio, on the image selected in step S1002 (step S1101). Next, the dictionary generation system 300 determines whether there is an unselected aspect ratio (step S1102). When there is an unselected aspect ratio (step S1102: Yes), the dictionary generation system 300 selects an unselected aspect ratio (step S1103) and determines whether there is an unselected multi-resolution image (step S1104).

When there is an unselected multi-resolution image (step S1104: Yes), the dictionary generation system 300 selects an unselected multi-resolution image (step S1105). The dictionary generation system 300 then grid-scans the selected multi-resolution image with the scanning window of the selected aspect ratio, thereby generating a group of regions with the same shape and size as the scanning window (step S1106). The procedure then returns to step S1104, where the dictionary generation system 300 determines whether there is an unselected multi-resolution image. When there is none (step S1104: No), the procedure returns to step S1102, where the dictionary generation system 300 determines whether there is an unselected aspect ratio. When there is none (step S1102: No), the series of processes ends and the procedure proceeds to step S1004 in FIG. 10.

<Acquisition process>
FIG. 12 is a flowchart showing a detailed example of the acquisition process (step S902) shown in FIG. 9. The acquisition process (step S902) acquires minimum distances between regions based on their image feature amounts and averages them to obtain an average minimum distance. The dictionary generation system 300 first executes the first average minimum distance acquisition process (step S1201). The first average minimum distance acquisition process (step S1201) acquires the minimum distances between regions extracted from the images in the first image set 101 and averages them to obtain the first average minimum distance. Details of the first average minimum distance acquisition process (step S1201) are described with reference to FIG. 13.

Next, the dictionary generation system 300 executes the second average minimum distance acquisition process (step S1202). The second average minimum distance acquisition process (step S1202) acquires the minimum distances between regions extracted from the images in the first image set 101 and regions extracted from the images in the second image set 102, and averages them to obtain the second average minimum distance. Details of the second average minimum distance acquisition process (step S1202) are described with reference to FIG. 16. When the second average minimum distance acquisition process (step S1202) ends, the procedure proceeds to the determination process (step S903).

FIG. 13 is a flowchart showing a detailed example of the first average minimum distance acquisition process (step S1201) shown in FIG. 12. The dictionary generation system 300 first initializes variables (step S1301). Here, the index i over the images in the first image set 101 is set to i = 1. The image with index i in the first image set 101 is denoted as image Ai. i is an integer satisfying 1 ≦ i ≦ Na, where Na is the number of images in the first image set 101.

The index over the regions extracted from the image Ai is denoted as j, and the j-th region extracted from the image Ai is denoted as aij. j is an integer satisfying 1 ≦ j ≦ nai, where nai is the number of regions in the image Ai.

Next, the dictionary generation system 300 determines whether i > Na (step S1302). That is, the dictionary generation system 300 determines whether all the images in the first image set 101 have been processed. When i > Na does not hold (step S1302: No), the dictionary generation system 300 sets j = 1 (step S1303) and determines whether j > nai (step S1304). That is, the dictionary generation system 300 determines whether all the regions aij in the image Ai have been processed.

When j > nai does not hold (step S1304: No), the dictionary generation system 300 extracts the first feature amount, which is the image feature amount of the region aij (step S1305). The dictionary generation system 300 then executes the first average minimum distance calculation process (step S1306). The first average minimum distance calculation process (step S1306) extracts the second feature amount, which is the image feature amount of a region akl in an image Ak that differs from the image Ai in the first image set 101, and calculates the first average minimum distance, which is the average of the minimum distances between the region aij and the regions akl. The index k is an integer satisfying 1 ≦ k ≦ Na with k ≠ i. The index l is an integer satisfying 1 ≦ l ≦ nak, where nak is the number of regions in the image Ak. Details of the first average minimum distance calculation process (step S1306) are described with reference to FIG. 14.

The dictionary generation system 300 then increments the index j (step S1307) and returns to step S1304. When j > nai in step S1304 (step S1304: Yes), the dictionary generation system 300 increments i (step S1308) and returns to step S1302. When i > Na in step S1302 (step S1302: Yes), the procedure proceeds to the second average minimum distance acquisition process (step S1202) in FIG. 12. The first average minimum distance acquisition process (step S1201) thereby ends.

FIG. 14 is a flowchart showing a detailed example of the first average minimum distance calculation process (step S1306) shown in FIG. 13. The dictionary generation system 300 first initializes variables (step S1401). Here, k = 1 and l = 1 are set. The dictionary generation system 300 also sets the variable s to s = 0. The variable s holds the cumulative value of the minimum distances between a given region aij and the regions akl.

Next, the dictionary generation system 300 determines whether k = i (step S1402). This check excludes the regions akl with k = i from processing.

When k = i (step S1402: Yes), the dictionary generation system 300 increments k (step S1403) and proceeds to step S1404. When k ≠ i (step S1402: No), the procedure proceeds directly to step S1404.

Next, the dictionary generation system 300 determines whether k > Na (step S1404). That is, the dictionary generation system 300 determines whether all the images in the first image set 101 have been processed. When k > Na does not hold (step S1404: No), the dictionary generation system 300 executes the minimum distance accumulation process (step S1405). The minimum distance accumulation process (step S1405) accumulates the minimum distance between a given region aij and the regions akl, yielding the cumulative value (variable s) of those minimum distances. Details of the minimum distance accumulation process (step S1405) are described with reference to FIG. 15.

The dictionary generation system 300 then increments k (step S1406) and returns to step S1402. When k > Na in step S1404 (step S1404: Yes), the dictionary generation system 300 calculates the first average minimum distance (step S1407) and proceeds to step S1307 in FIG. 13. The first average minimum distance Dij is calculated by equation (4) below. The first average minimum distance calculation process (step S1306) thereby ends.

$$D_{ij} = \frac{s}{N_a - 1} \quad (4)$$
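The calculation of Dij described in FIGS. 13 and 14 can be sketched as follows. Several points are assumptions made for illustration: feature amounts are plain numeric tuples, indices are 0-based, every image contains at least one region, there are at least two images, and the Euclidean distance is used only as a stand-in because the distance formula itself is not reproduced in this passage.

```python
import math


def euclidean(u, v):
    """Stand-in distance between two feature vectors (an assumption)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def first_average_minimum_distance(features, i, j, distance=euclidean):
    """Compute D_ij, where features[k][l] is the feature vector of region a_kl."""
    target = features[i][j]
    s = 0.0
    for k, regions in enumerate(features):
        if k == i:
            continue  # S1402/S1403: skip the image that contains a_ij
        # S1405: add the distance to the nearest region of image A_k
        s += min(distance(target, other) for other in regions)
    return s / (len(features) - 1)  # equation (4): divide by Na - 1
```

Dividing by Na - 1 rather than Na reflects that the image containing the region itself contributes no term to the sum.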

FIG. 15 is a flowchart showing a detailed example of the minimum distance accumulation process (step S1405) shown in FIG. 14. The dictionary generation system 300 sets t = dmax and l = 1 (step S1501). t is a variable used in calculating the distance between regions, and dmax is the maximum value of that distance. Next, the dictionary generation system 300 determines whether l > nak (step S1502), where nak is the number of regions in the image Ak. That is, the dictionary generation system 300 determines whether all the regions akl in the image Ak, which differs from the image Ai in the first image set 101, have been processed.

When l > nak does not hold (step S1502: No), the dictionary generation system 300 extracts the second feature amount, which is the image feature amount of the region akl (step S1503). The dictionary generation system 300 then calculates the distance d between the region aij and the region akl (step S1504). The distance d is calculated from the first feature amount of the region aij and the second feature amount of the region akl. Equation (5) below is an example of calculating the distance d.

In equation (5), v_r is the r-th component of the vector representing the first feature amount of the region aij, u_r is the r-th component of the vector representing the second feature amount of the region akl, and R is the number of components. The dictionary generation system 300 then determines whether the calculated distance d satisfies d < t (step S1505). Here, t is the value added to the variable s in step S1508, and its initial value is the range of distance evaluation in the feature amount space, that is, the maximum distance dmax.

When d < t (step S1505: Yes), the dictionary generation system 300 sets t = d (step S1506) and proceeds to step S1507. When d < t does not hold (step S1505: No), the procedure proceeds directly to step S1507. That is, the initial value of t is dmax, and the value of t decreases each time d < t holds.

In step S1507, the dictionary generation system 300 increments l and returns to step S1502. When l > nak in step S1502 (step S1502: Yes), the dictionary generation system 300 updates the variable s (step S1508) and proceeds to step S1406. In step S1508, therefore, the minimum value of the distance d over the regions akl from l = 1 to l = nak, that is, the minimum distance, is added to the variable s as t.
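The loop of FIG. 15 maps almost line for line onto the sketch below. The function name and the idea of passing the distance function as a parameter are assumptions for illustration. The step numbers in the comments follow the flowchart description above.

```python
def accumulate_minimum_distance(s, target, regions, distance, dmax):
    """One pass of steps S1501 to S1508 for a single image.

    `target` is the feature of region a_ij, `regions` holds the features of
    the regions a_kl of one image A_k, and `distance` is any callable that
    returns a non-negative distance (a hypothetical parameter).
    """
    t = dmax                          # S1501: start from the maximum distance
    for other in regions:             # S1502 / S1507: loop over l
        d = distance(target, other)   # S1503 / S1504: feature and distance
        if d < t:                     # S1505
            t = d                     # S1506: keep the smallest distance so far
    return s + t                      # S1508: add the minimum distance to s
```

One consequence of initializing t to dmax is that the contribution of an image is capped at dmax: an image with no regions, or whose regions are all farther than dmax, adds exactly dmax to s.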

FIG. 16 is a flowchart showing a detailed example of the second average minimum distance acquisition process (step S1202) shown in FIG. 12. The dictionary generation system 300 first initializes variables (step S1601). Here, i = 1 is set.

Next, the dictionary generation system 300 determines whether i > Na (step S1602). That is, the dictionary generation system 300 determines whether all the images in the first image set 101 have been processed. When i > Na does not hold (step S1602: No), the dictionary generation system 300 sets j = 1 (step S1603) and determines whether j > nai (step S1604). That is, the dictionary generation system 300 determines whether all the regions aij in the image Ai have been processed.

When j > nai does not hold (step S1604: No), the dictionary generation system 300 extracts the first feature amount, which is the image feature amount of the region aij (step S1605). The dictionary generation system 300 then executes the second average minimum distance calculation process (step S1606). The second average minimum distance calculation process (step S1606) extracts the third feature amount, which is the image feature amount of a region bkl in an image Bk in the second image set 102, and calculates the second average minimum distance, which is the average of the minimum distances between the region aij and the regions bkl. Details of the second average minimum distance calculation process (step S1606) are described with reference to FIG. 17.

The dictionary generation system 300 then increments the index j (step S1607) and returns to step S1604. When j > nai in step S1604 (step S1604: Yes), the dictionary generation system 300 increments i (step S1608) and returns to step S1602. When i > Na in step S1602 (step S1602: Yes), the procedure proceeds to the determination process (step S903) in FIG. 9. The second average minimum distance acquisition process (step S1202) thereby ends.

FIG. 17 is a flowchart showing a detailed example of the second average minimum distance calculation process (step S1606) shown in FIG. 16. The dictionary generation system 300 first initializes variables (step S1701). Here, k = 1 and l = 1 are set, and the variable s is set to s = 0. The variable s holds the cumulative value of the minimum distances between a given region aij and the regions bkl. Details of the calculation are described with reference to FIG. 18. The index k is an integer satisfying 1 ≦ k ≦ Nb, where Nb is the number of images in the second image set 102. The index l is an integer satisfying 1 ≦ l ≦ nbk, where nbk is the number of regions in the image Bk.

Next, the dictionary generation system 300 determines whether k > Nb (step S1702). That is, the dictionary generation system 300 determines whether all the images in the second image set 102 have been processed. When k > Nb does not hold (step S1702: No), the dictionary generation system 300 executes the minimum distance accumulation process (step S1703). Like step S1405, the minimum distance accumulation process (step S1703) accumulates the minimum distance between a given region aij and the regions bkl, yielding the cumulative value (variable s) of those minimum distances. Details of the minimum distance accumulation process (step S1703) are described with reference to FIG. 18.

Thereafter, the dictionary generation system 300 increments k (step S1704) and returns to step S1702. If k > Nb in step S1702 (step S1702: Yes), the dictionary generation system 300 calculates the second average minimum distance Eij (step S1705), which ends the second average minimum distance calculation process (step S1606), and proceeds to step S1607 in FIG. 16. The second average minimum distance Eij is calculated by the following formula (6).

Eij = s / Nb ... (6)

FIG. 18 is a flowchart showing a detailed example of the minimum distance accumulation process (step S1703) shown in FIG. 17. The dictionary generation system 300 sets t = dmax and l = 1 (step S1801). Here, t is a variable used in the distance calculation between regions, and dmax is the maximum value of that distance. Next, the dictionary generation system 300 determines whether l > nbk (step S1802), where nbk is the number of regions in the image Bk. That is, it determines whether all the regions bkl in the image Bk, which is different from the image Ai in the first image set 101, have been processed.

If l > nbk does not hold (step S1802: No), the dictionary generation system 300 extracts a third feature amount, which is the image feature amount of the region bkl (step S1803). It then calculates the distance e between the region aij and the region bkl (step S1804), using the first feature amount of the region aij and the third feature amount of the region bkl. Formula (7) shows an example of calculating the distance e.

In formula (7), v_r is the r-th component of the vector representing the first feature amount of the region aij, u_r is the r-th component of the vector representing the third feature amount of the region bkl, and R is the number of components. The dictionary generation system 300 then determines whether the calculated distance e satisfies e < t (step S1805). Here, t is the value added to the variable s in step S1808; its initial value is the range of distance evaluation in the feature amount space, that is, the maximum distance dmax.

If e < t (step S1805: Yes), the dictionary generation system 300 sets t = e (step S1806) and proceeds to step S1807. If e < t does not hold (step S1805: No), it proceeds directly to step S1807. In other words, t starts at dmax and decreases each time e < t holds.

In step S1807, the dictionary generation system 300 increments l and returns to step S1802. If l > nbk in step S1802 (step S1802: Yes), the dictionary generation system 300 updates the variable s (step S1808) and proceeds to step S1704. Thus, in step S1808, the minimum of the distances e obtained for the regions bkl from l = 1 to l = nbk is added to the variable s as t.
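The loops of FIGS. 17 and 18 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: formula (7) is not reproduced in this text, so the `distance` function assumes a squared Euclidean distance over the R components, and the function names and the list-of-lists layout for the regions of each image Bk are hypothetical.

```python
from typing import Sequence

Vector = Sequence[float]


def distance(v: Vector, u: Vector) -> float:
    """Assumed stand-in for formula (7): squared Euclidean distance."""
    return sum((vr - ur) ** 2 for vr, ur in zip(v, u))


def min_distance_to_image(a_feat: Vector, b_feats: Sequence[Vector], d_max: float) -> float:
    """Steps S1801 to S1807: smallest distance from region aij to any region of image Bk."""
    t = d_max                          # S1801: start from the maximum distance
    for b_feat in b_feats:             # S1802 / S1807: l = 1 .. nbk
        e = distance(a_feat, b_feat)   # S1804
        if e < t:                      # S1805
            t = e                      # S1806
    return t                           # the value added to s in S1808


def second_average_min_distance(a_feat: Vector,
                                images_b: Sequence[Sequence[Vector]],
                                d_max: float) -> float:
    """Steps S1701 to S1705: Eij = s / Nb (formula (6))."""
    s = 0.0                            # S1701
    for b_feats in images_b:           # S1702 / S1704: k = 1 .. Nb
        s += min_distance_to_image(a_feat, b_feats, d_max)  # S1703
    return s / len(images_b)           # S1705
```

Note that an image Bk with no regions contributes dmax to the sum, because t is never lowered from its initial value.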

<Decision process>
FIG. 19 is a flowchart showing a detailed example of the decision process (step S903) shown in FIG. 9. In the decision process (step S903), the dictionary generation system 300 decides which regions are subject to the convergence determination process (step S904). The dictionary generation system 300 first initializes a variable (step S1901), setting i = 1. Next, it determines whether i > Na (step S1902), that is, whether all the images in the first image set 101 have been processed. If i > Na does not hold (step S1902: No), it sets j = 1 (step S1903) and determines whether j > nai (step S1904), that is, whether all the regions aij in the image Ai have been processed.

If j > nai does not hold (step S1904: No), the dictionary generation system 300 calculates the normalized average minimum distance Fij by dividing the first average minimum distance Dij by the second average minimum distance Eij (step S1905). The normalized average minimum distance Fij is an index of whether the region aij should be registered in the dictionary 400: the smaller Fij is, the more appropriate the region is for registration in the dictionary 400.

After calculating the normalized average minimum distance Fij, the dictionary generation system 300 increments j (step S1906) and returns to step S1904. If j > nai in step S1904 (step S1904: Yes), it increments i (step S1907) and returns to step S1902. If i > Na in step S1902 (step S1902: Yes), the dictionary generation system 300 sorts the normalized average minimum distances Fij in ascending order (step S1908).

The dictionary generation system 300 then acquires the regions aij ranked in the top M by the value of the normalized average minimum distance Fij, which narrows the regions aij down to M. It then updates the indexes (i, j, nai) for the M acquired regions aij (step S1909). For example, suppose M = 10 and the top ten regions aij are {a12, a24, a15, a26, a61, a31, a47, a63, a48, a69}.

In this case, the contents of {a11, a12} are updated to those of {a12, a15}, and the contents of {a21, a22} to those of {a24, a26}. {a31} is left unchanged. The contents of {a41, a42} are updated to those of {a47, a48}, and the contents of {a61, a62, a63} to those of {a61, a63, a69}. The number of regions nai in each image Ai is updated to na1 = 2, na2 = 2, na3 = 1, na4 = 2, and na6 = 3, with nai = 0 for all other images. This ends the decision process (step S903), and the process proceeds to the convergence determination process (step S904), in which convergence is determined for each of the M regions aij.
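The selection and re-indexing of steps S1905 to S1909 can be sketched as follows. This is an illustrative sketch only: the function name and the dictionary keyed by (i, j) are hypothetical, and every Eij is assumed to be greater than zero so that the division is defined.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Key = Tuple[int, int]  # (image index i, region index j)


def select_top_regions(D: Dict[Key, float], E: Dict[Key, float], M: int):
    """Keep the M regions with the smallest Fij = Dij / Eij and regroup them per image."""
    # S1905: normalized average minimum distance (assumes E[key] > 0)
    F = {key: D[key] / E[key] for key in D}
    # S1908: ascending sort, then keep the first M entries
    top = sorted(F, key=F.get)[:M]
    # S1909: regroup the surviving regions per image and recount nai
    per_image: Dict[int, List[int]] = defaultdict(list)
    for i, j in sorted(top):
        per_image[i].append(j)
    counts = {i: len(js) for i, js in per_image.items()}
    return dict(per_image), counts
```

With the ten regions of the example above, the returned counts are 2, 2, 1, 2, and 3 for images 1, 2, 3, 4, and 6; images that are absent from the result correspond to nai = 0.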

<Perturbation process>
FIG. 20 is a flowchart showing a detailed example of the perturbation process (step S905) shown in FIG. 9. The dictionary generation system 300 first increments the repetition count q of the perturbation process (initial value q = 0) (step S2001) and calculates the horizontal variation dx, the vertical variation dy, the enlargement ratio dz, and the reduction ratio 1/dz (step S2002). Next, it determines whether there is an unselected region aij (step S2003). If there is (step S2003: Yes), it selects an unselected region aij (step S2004), generates new regions based on the selected region aij using dx, dy, dz, and 1/dz (step S2005), and returns to step S2003.

If there is no unselected region in step S2003 (step S2003: No), the dictionary generation system 300 updates the number of regions nai contained in each image, because regions were added in step S2005 (step S2006). For example, in the normalization example above, after the regions have been narrowed down to the top M, the dictionary generation system 300 updates the contents aij of the array storing the region candidates of each image and the number of region candidates nai so that the added regions are included. This ends the perturbation process (step S905), and the process proceeds to the convergence determination process (step S904).
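One possible reading of step S2005 is sketched below. The text does not state how dx, dy, and dz are derived from the repetition count q, nor exactly which new regions are generated, so this sketch simply takes dx, dy, and dz as inputs and assumes six variants per region: a shift in each horizontal and vertical direction, an enlargement by dz, and a reduction by 1/dz about the region center. The `Rect` layout and function names are hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (x0, y0, x1, y1), assumed layout


def scale(rect: Rect, factor: float) -> Rect:
    """Scale a rectangle about its center by the given factor."""
    x0, y0, x1, y1 = rect
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half_w = (x1 - x0) / 2 * factor
    half_h = (y1 - y0) / 2 * factor
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def perturb(rect: Rect, dx: float, dy: float, dz: float) -> List[Rect]:
    """Generate shifted and rescaled variants of one selected region (assumed S2005)."""
    x0, y0, x1, y1 = rect
    return [
        (x0 - dx, y0, x1 - dx, y1),  # shift left
        (x0 + dx, y0, x1 + dx, y1),  # shift right
        (x0, y0 - dy, x1, y1 - dy),  # shift up
        (x0, y0 + dy, x1, y1 + dy),  # shift down
        scale(rect, dz),             # enlarge by dz
        scale(rect, 1 / dz),         # reduce by 1/dz
    ]
```

The returned variants would be appended to the candidate list of the image that contains the original region, after which nai is recounted as in step S2006.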

As described above, according to Embodiment 1, patterns such as wallpaper that are contained in the group of comparison regions in the second image set 102 are excluded from the object candidates of the first image set 101, which narrows down the object candidates in the first image set 101 and improves the reliability of dictionary registration. The second image set 102 may also contain images that should have been given the attribute of interest, that is, images that belong in the first image set 101. If the second image set 102 has enough elements, the effect of such incomplete attribute assignment is sufficiently reduced in the course of calculating the average minimum value.

(Embodiment 2)
Next, Embodiment 2 will be described. Embodiment 2 is an example of case (2) described above, in which the first image set 101 is an untagged image set and the second image set 102 is also an untagged image set. In this case, the second image set 102 is simply replaced with the first image set 101 in the flowcharts of FIGS. 9 to 20 of Embodiment 1. In addition, in Embodiment 2 the second average minimum distance calculation process (step S1605) shown in FIGS. 17 and 18 differs from Embodiment 1 and is replaced by FIG. 21. As the counterpart of Eij in Embodiment 1, Embodiment 2 uses the minimum distance to the other regions of the image that contains the region of interest. The second minimum distance calculation process (step S1605) according to Embodiment 2 is described below.

FIG. 21 is a flowchart showing a detailed example of the minimum distance calculation process (step S1605) according to Embodiment 2. After extracting the first feature amount, which is the image feature amount of the region aij, in step S1604 of FIG. 16, the dictionary generation system 300 first initializes a variable (step S2101), setting k = 1. Next, it determines whether k = j (step S2102). This check excludes the region aik with k = j, that is, the region aij itself, from processing.

If k = j (step S2102: Yes), the dictionary generation system 300 increments k (step S2103) and proceeds to step S2104. If k ≠ j (step S2102: No), it proceeds directly to step S2104.

Next, the dictionary generation system 300 determines whether k > nai (step S2104), that is, whether all the regions in the image have been processed. If k > nai does not hold (step S2104: No), it sets t = dmax (step S2105). Here, t is a variable used in the distance calculation between regions, and dmax is the maximum value of that distance. Next, the dictionary generation system 300 extracts a third feature amount, which is the image feature amount of the region aik (step S2106), and calculates the distance e between the region aij and the region aik (step S2107). Formula (7) above is used to calculate the distance e.

If e < t (step S2108: Yes), the dictionary generation system 300 sets t = e (step S2109) and proceeds to step S2110. If e < t does not hold (step S2108: No), it proceeds directly to step S2110. In other words, t starts at dmax and decreases each time e < t holds.

In step S2110, the dictionary generation system 300 increments k and returns to step S2102. If k > nai in step S2104 (step S2104: Yes), the second minimum distance Eij is set to t (step S2111). Thus, the minimum value t of the distances e obtained for the regions aik from k = 1 to k = nai is determined as the minimum distance Eij for the region aij. This ends the minimum distance calculation process (step S1605), and the process proceeds to step S1606 in FIG. 16.

According to Embodiment 2, the dictionary generation system 300 excludes object candidates that are similar to one another within the same image of the first image set 101. This eliminates simple repeating patterns within an image and allows object candidates that are similar across images to be extracted.
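The within-image minimum of FIG. 21 can be sketched as follows. As assumptions for illustration, the distance is a squared Euclidean stand-in for formula (7), t is initialized once before the loop so that the minimum carries across iterations, and indexes are 0-based rather than the 1-based k and j of the flowchart. The function names are hypothetical.

```python
from typing import Sequence

Vector = Sequence[float]


def distance(v: Vector, u: Vector) -> float:
    """Assumed stand-in for formula (7): squared Euclidean distance."""
    return sum((vr - ur) ** 2 for vr, ur in zip(v, u))


def min_distance_within_image(j: int, feats: Sequence[Vector], d_max: float) -> float:
    """Eij for Embodiment 2: smallest distance from region j to the other regions of the same image."""
    t = d_max                            # start from the maximum distance
    for k, feat_k in enumerate(feats):   # k runs over all regions of image Ai
        if k == j:                       # S2102 / S2103: skip the region itself
            continue
        e = distance(feats[j], feat_k)   # S2107
        if e < t:                        # S2108
            t = e                        # S2109
    return t                             # S2111
```

A region that closely resembles another region of its own image gets a small Eij and therefore a large Fij = Dij / Eij, which pushes it down the ranking used in the decision process.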

(Embodiment 3)
Next, Embodiment 3 will be described. Embodiment 3 is an example of case (3) described above, in which the first image set 101 is an untagged image set and the second image set 102 is a tagged image set. Except that the first image set 101 is untagged and the second image set 102 is tagged, the processing is the same as in the flowcharts of FIGS. 9 to 20 of Embodiment 1. Thus, according to Embodiment 3, by using, for example, a tag corresponding to a simple repeating pattern such as a background as the tag that designates the second image set, candidates that partially match the repeating pattern can be excluded from the object candidates of the untagged first image set 101, and the object candidates can be narrowed down to regions corresponding to objects such as persons and articles.

(Embodiment 4)
Next, Embodiment 4 will be described. Embodiment 4 is an example of case (4) described above, in which the first image set 101 is a tagged image set and the second image set 102 is also a tagged image set. In Embodiment 4, the tag X given to the first image set 101 and the tag Y given to the second image set 102 are different tags. Except that tags are given to both the first image set 101 and the second image set 102, the processing is the same as in the flowcharts of FIGS. 9 to 20 of Embodiment 1. In this way, object candidates contained in images with the tag Y can be excluded from the object candidates contained in images with the tag X, which improves the accuracy of narrowing down the object candidates.

(Embodiment 5)
Next, Embodiment 5 will be described. Embodiment 5 is an example in which the dictionary generation system 300 according to Embodiments 1 to 4 is incorporated into a content cloud system.

FIG. 22 is a block diagram showing an example system configuration of the content cloud system according to Embodiment 5. The content cloud system 2200 includes an Extract Transform Load (ETL) module 2203, a content storage 2204, a search engine 2205, a metadata server 2206, and a multimedia server 2207. The content cloud system 2200 runs on a computer having one or more processors 301 and a storage device 302 (see, for example, FIG. 3) and is composed of various modules. Each module may also run on a separate computer; in that case, the storages and modules are connected via a network or the like, and the system is implemented as distributed processing in which data is exchanged over those connections.

The application program 2208 sends a request to the content cloud system 2200 via a network or the like, and the content cloud system 2200 sends information corresponding to the request back to the application program 2208.

The content cloud system 2200 receives as input data 2201 in any format, such as video data, image data, document data, or audio data. The data 2201 is, for example, a graphic trademark and its publicity documents, website images and HTML documents, or video data with closed captions or audio, and may be structured or unstructured. Data input to the content cloud system 2200 is temporarily stored in the storage 2202.

The ETL module 2203 monitors the storage 2202. When data 2201 is stored in the storage 2202, the ETL module 2203 archives the information (metadata) obtained from the file system and saves it in the content storage 2204.

The content storage 2204 stores the information extracted by the ETL module 2203 and the unprocessed data 2201 temporarily held in the storage 2202.

On receiving a request from the application program 2208, the search engine 2205 performs, in the case of a text search for example, a text search based on the index created by the ETL module 2203 and sends the search results to the application program 2208. Known techniques can be applied to the algorithm of the search engine 2205. The search engine 2205 may also include modules for searching not only text but also data such as images and audio.

The metadata server 2206 manages metadata stored in an RDB (Relational DataBase). For example, assume that the data file name, data registration date, original data type, metadata text information, and the like extracted by the ETL module 2203 are registered in the RDB. On receiving a request from the application program 2208, the metadata server 2206 sends information in the RDB to the application program 2208 in accordance with the request.

The multimedia server 2207 runs, on the data archived in the content storage 2204, an information extraction processing module 2209 suited to that data and extracts metadata about the content of the data.

The information extraction processing module 2209 consists of, for example, a text indexing module and an image recognition module. Examples of metadata include the time, an N-gram index, image recognition results (object name and region coordinates in the image), image feature amounts and their related words, and speech recognition results. Any program that extracts some kind of information (metadata) can be used as the information extraction processing module 2209, and known techniques can be adopted, so a description of the information extraction processing module 2209 is omitted here.

The metadata extracted from each piece of media data is associated with one another and stored in the graph DB 2311, which is structured in graph form. As an example of such association, for the speech recognition result "apple" stored in the content storage 2204, the correspondences with the original audio file, image data, related words, and so on can be expressed in network form. On receiving a request from the application program 2208, the multimedia server 2207 sends the corresponding meta information to the application program 2208. For example, given the request "apple", it provides meta information associated on the network graph, such as images containing apples, average market prices, and song titles by artists, based on the constructed graph structure.
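The association of metadata in graph form can be illustrated with a small in-memory structure. This is only a sketch of the idea: the `MetadataGraph` class, its methods, and the node labels are hypothetical and do not describe the actual graph DB.

```python
from collections import defaultdict
from typing import Dict, List, Set


class MetadataGraph:
    """Undirected graph linking a term to the metadata associated with it."""

    def __init__(self) -> None:
        self._edges: Dict[str, Set[str]] = defaultdict(set)

    def relate(self, a: str, b: str) -> None:
        """Record that two pieces of metadata are associated with each other."""
        self._edges[a].add(b)
        self._edges[b].add(a)

    def related(self, node: str) -> List[str]:
        """Return everything directly associated with the node, sorted by label."""
        return sorted(self._edges.get(node, ()))


graph = MetadataGraph()
graph.relate("apple", "image:apple.jpg")       # hypothetical image node
graph.relate("apple", "audio:apple.wav")       # hypothetical audio node
graph.relate("apple", "word:average price")    # hypothetical related word
print(graph.related("apple"))
```

A request for "apple" then amounts to looking up the neighbors of that node.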

The information extraction processing module 2209 executes object detection on images. The dictionary patterns for object detection generated by the dictionary generation system 300 of Embodiments 1 to 4 are registered in the dictionary 400. For each dictionary pattern in the dictionary 400, metadata indicating what the pattern represents is defined together with the image feature amount of the pattern.

When the information extraction processing module 2209 detects a desired object in a given image by matching against the dictionary 400, it registers detection information, such as the detected position and the size of the region, and the metadata of the matched dictionary pattern in the graph DB 2310. When a plurality of information extraction processing modules 2209 are incorporated in the multimedia server 2207, they may share the resources of one computer, or a separate computer may be used for each module.

In this way, by using the dictionary patterns for object detection generated by the dictionary generation system 300 of Embodiments 1 to 4, the content cloud system 2200 can generate metadata that can be used in common across the different kinds of media data. Information can therefore be integrated across media, and information with higher added value can be provided to the user.

(Embodiment 6)
Next, Embodiment 6 will be described. Embodiment 6 is an example of an operation scheme for the dictionary generation system 300 of Embodiments 1 to 5.

FIG. 23 is a block diagram showing an example of an operation scheme of the dictionary generation system 300. The dictionary generation system 300 is a system in which an image management server 2310, a word management server 2320, a dictionary generation service 2330, a comparison pattern management server 2340, a dictionary pattern management server 2350, and a terminal device 2360 are interconnected via a network 2300.

The image management server 2310 manages image data. The word management server 2320 manages the linguistic information attached to the image data. The dictionary generation service 2330 generates dictionary patterns and is a computer having the generation unit 401 through the determination unit 405 shown in FIG. 4.

The comparison pattern management server 2340 manages the image feature amounts of the regions obtained from the second image set 102, which are compared with the regions obtained from the first image set 101 when dictionary patterns are generated. The dictionary pattern management server 2350 manages dictionary patterns and is a computer having the registration unit 407 shown in FIG. 4. The terminal device 2360 issues various requests to the servers, checks the generated dictionary patterns, and so on; it is a computer having the display unit 406 shown in FIG. 4.

FIG. 24 is an explanatory diagram showing a list of the information managed by the image management server 2310. The item 2401, "image", is image data expressed as a byte string. The image data itself is often managed separately on a file server or the like. In that case, the information needed to obtain the image, such as the URL where the image file resides, is stored as the information corresponding to the item 2401. The item 2402, "keyword", is a set of words related to the image data. Each word may be managed as a character string. In Embodiment 6, to save memory and make data processing more efficient, each word is managed by the word management server 2320, and the item 2402 stores a sequence of integer values, each being the management number of a word on the word management server 2320. Hereinafter, the item 2402 is referred to as the "word".

The item 2403, "used_keyword", is the word ID of the word that was used as the attribute when the image was used by the dictionary generation service 2330. This word ID is recorded so that the same image is not used more than once for dictionary generation. In this way, the image management server 2310 manages the image data defined by the items 2401 to 2403.

FIG. 25 is an explanatory diagram showing the information managed by the comparison pattern management server 2340. The item 2501, "image", is the management number, expressed as an integer value, of the source image on the image management server 2310. The item 2502, "rect", holds the coordinate values of the two points that define the rectangular region of the comparison pattern, as a four-dimensional integer array. The item 2503, "feature", is the image feature amount of the rectangular region of the comparison pattern defined by the item 2502, "rect". In this way, the comparison pattern management server 2340 manages the comparison patterns defined by the items 2501 to 2503.

FIG. 26 is an explanatory diagram showing a list of the information managed by the dictionary pattern management server 2350. Item 2601, "image", is the management number of the source image on the image management server 2410. Item 2602, "rect", holds the coordinate values of the two points that define the rectangular region of the dictionary pattern. Item 2603, "feature", is the image feature amount of the rectangular region of the dictionary pattern. Item 2604, "keyword", is the list of word IDs that were used as attributes when the dictionary was generated. In this way, the dictionary pattern management server 2350 manages dictionary patterns defined by items 2601 to 2604.
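The three record layouts of FIGS. 24 to 26 can be sketched as plain data classes. This is only an illustration: the class names are hypothetical, the field names follow the item labels in the figures, and the representation of "feature" as a list of floats is an assumption, since the text does not fix the feature format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

# Two corner points (x1, y1, x2, y2) stored as four integers, as for "rect".
Rect = Tuple[int, int, int, int]


@dataclass
class ImageRecord:
    """Record on the image management server (FIG. 24, items 2401 to 2403)."""
    image: Union[bytes, str]  # raw bytes, or a URL when the file lives elsewhere
    keyword: List[int] = field(default_factory=list)       # word IDs tied to the image
    used_keyword: List[int] = field(default_factory=list)  # word IDs already used for generation


@dataclass
class ComparisonPattern:
    """Record on the comparison pattern management server (FIG. 25)."""
    image: int            # management number of the source image
    rect: Rect
    feature: List[float]  # assumed vector form of the image feature amount


@dataclass
class DictionaryPattern:
    """Record on the dictionary pattern management server (FIG. 26)."""
    image: int
    rect: Rect
    feature: List[float]
    keyword: List[int] = field(default_factory=list)  # word IDs used as attributes
```

Storing word IDs rather than strings mirrors the memory-saving choice described for item 2402.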

Next, the dictionary generation processing in the sixth embodiment is described. The images managed by the image management server 2310 may be, for example, images collected from the Web by a Web crawler. The word 2402 assigned to an image may be assigned by an operator. When the image is collected by a Web crawler, however, the word can also be extracted automatically from the text that appears before and after the point where the image is referenced in the HTML document containing it, or from the title of that HTML document.

As preprocessing for dictionary generation, the dictionary generation service samples an appropriate number of images from those managed by the image management server 2310 and generates, from the second image set 102, a group of regions that serve as comparison patterns. The generated region group is registered in the comparison pattern management server 2340.

FIG. 27 is an explanatory diagram showing an example of the screens used for dictionary generation. Screen 2710 is presented to a user of the dictionary generation system 300 when the user issues a dictionary generation request from the terminal device 2460 to the dictionary generation service 2330. First, the user lists the words to be used as attributes in the input field 2711 on screen 2710 of the application program running on the terminal device 2360, and sends them to the dictionary generation service 2330.

The dictionary generation service 2330 obtains from the image management server 2310 the management numbers of the images whose word 2402 list contains a word in the specified word string, composes the information needed to display those images as a list, and sends it to the application program on the terminal device 2360. The screen presented to the user then changes from screen 2710 to screen 2720, which displays a list 2721 of the matching images. The images in list 2721 become the source images for dictionary generation, but the user can also mark on this screen any image judged unsuitable for dictionary generation.
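The lookup of candidate source images can be sketched as follows. The function name and the in-memory mapping are hypothetical stand-ins for a query against the image management server, and the sketch assumes that an image matches when it shares at least one word ID with the query; the text does not say whether one or all of the specified words must match.

```python
from typing import Dict, Iterable, List, Set


def find_candidate_images(
    keywords_by_image: Dict[int, Set[int]],
    query_word_ids: Iterable[int],
) -> List[int]:
    """Return management numbers of images whose word list shares a word with the query."""
    query = set(query_word_ids)
    # Assumption: a single shared word ID is enough for an image to qualify.
    return sorted(
        image_id
        for image_id, words in keywords_by_image.items()
        if words & query
    )
```

For example, with images `{1: {10, 11}, 2: {12}, 3: {11, 13}}` and the query `[11]`, images 1 and 3 are returned for display in the list.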

In general, there are a large number of candidate source images. If necessary, the user can check all of them by pressing the page switching button 2722. However, because the sixth embodiment obtains appropriate dictionary patterns automatically, no operational problem arises even if the user does not perform this check.

When the number of images is large, generating the dictionary patterns takes a long time. It is therefore more practical to use a fixed number of the matching images as source image candidates rather than all of them. The list of word IDs in item 2403 of the image management server 2310 (hereinafter, word ID list 2403) makes it possible to determine whether a given image has already been used for dictionary generation. Images that have not previously been used for dictionary patterns can thus be selected and used as source images.
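A minimal sketch of this selection step, assuming the used word IDs are available as a mapping from image management number to a set of word IDs. The function name, the mapping, and the `limit` parameter are illustrative and do not appear in the original.

```python
from typing import Dict, Iterable, List, Set


def select_unused_images(
    candidates: List[int],
    used_by_image: Dict[int, Set[int]],
    query_word_ids: Iterable[int],
    limit: int,
) -> List[int]:
    """Keep candidates not yet used with any of the query words, capped at `limit`."""
    query = set(query_word_ids)
    fresh = [
        image_id
        for image_id in candidates
        # An image counts as "used" if its used_keyword list overlaps the query.
        if not (used_by_image.get(image_id, set()) & query)
    ]
    return fresh[:limit]
```

Capping the result keeps generation time bounded, and a later request with the same words picks up the images that were left over.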

Pressing the start button 2723 on screen 2720 sends the dictionary generation service 2330 a request to generate dictionary patterns using the images confirmed by the user as source images. The dictionary generation service 2330 executes the dictionary generation processing using the set of images confirmed by the user as the first image set 101. The second image set 102, in turn, is managed on the comparison pattern management server 2340. Here, by referring to the image management number in item 2501, images for which the word string specified by the user is included in the word 2502 of the word ID list 2403 are excluded.
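The claims characterize the decision made during dictionary generation by the ratio between a first distance (to other regions from the first image set) and a second distance (to regions from the second image set). The sketch below illustrates that idea under stated assumptions: Euclidean distance, the nearest-neighbour minimum as the distance to each group, and acceptance when the ratio falls below a hypothetical `threshold`. None of these specifics are fixed by this section.

```python
import math
from typing import Sequence

Feature = Sequence[float]


def min_distance(feature: Feature, others: Sequence[Feature]) -> float:
    """Smallest Euclidean distance from `feature` to any vector in `others`."""
    return min(math.dist(feature, other) for other in others)


def is_dictionary_pattern(
    selected: Feature,
    first_group: Sequence[Feature],   # other regions from the first image set
    second_group: Sequence[Feature],  # comparison regions from the second image set
    threshold: float = 0.5,           # hypothetical cut-off, not from the source
) -> bool:
    """Accept the region when it is much closer to the first group than to the second."""
    d1 = min_distance(selected, first_group)
    d2 = min_distance(selected, second_group)
    if d2 == 0.0:
        # Identical to a comparison pattern: treat as not distinctive.
        return False
    return d1 / d2 < threshold
```

A region that recurs across images sharing the attribute but has no close match among the comparison patterns yields a small ratio and is kept.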

If the number of images registered in the comparison pattern management server 2340 that satisfy this condition does not reach the desired number, the dictionary generation service 2330 queries the image management server 2310. The dictionary generation service then obtains images for the second image set 102, that is, images for which the specified word string is not included in the word 2402 of the word ID list 2403, and generates candidate regions for comparison from those images. The generated rectangular regions are additionally registered in the comparison pattern management server 2340 as the region group to be compared against.
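The exclusion and top-up logic for the second image set might look like the following. This is a sketch with hypothetical names: `registered_image_ids` stands for the image numbers of patterns already on the comparison pattern management server, `keywords_by_image` for the word lists on the image management server, and `wanted` for the desired number of images.

```python
from typing import Dict, Iterable, List, Set


def select_comparison_images(
    registered_image_ids: Iterable[int],
    keywords_by_image: Dict[int, Set[int]],
    query_word_ids: Iterable[int],
    wanted: int,
) -> List[int]:
    """Pick images for the second image set, excluding any that carry a query word."""
    query = set(query_word_ids)

    def eligible(image_id: int) -> bool:
        return not (keywords_by_image.get(image_id, set()) & query)

    # Several patterns can come from one image, so de-duplicate while keeping order.
    chosen = [i for i in dict.fromkeys(registered_image_ids) if eligible(i)]
    if len(chosen) < wanted:
        # Not enough registered images: top up from the image management server.
        already = set(chosen)
        extra = [
            i for i in sorted(keywords_by_image)
            if eligible(i) and i not in already
        ]
        chosen.extend(extra[: wanted - len(chosen)])
    return chosen[:wanted]
```

Regions would then be generated only for the newly added images and registered as additional comparison patterns.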

When the dictionary generation processing finishes, the dictionary generation service 2330 registers the result in the dictionary pattern management server 2350. At the same time, it stores the IDs of the words in the specified word string in the word ID list of item 2604. It also updates the word ID list 2403 on the image management server 2310. The user can check the registered dictionary patterns on the terminal device 2360.

FIG. 28 is an explanatory diagram showing a display example of the confirmation screen. The generated dictionary patterns are listed in the display area 2801 on the confirmation screen 2800. If the list includes a pattern that is not suitable as a dictionary pattern, such as icon 2802, the user can delete it from the data registered in the dictionary pattern management server 2350 by selecting it on this screen. If the user wishes, the data judged unsuitable as a dictionary pattern can also be registered in the comparison pattern management server 2340. Patterns similar to those registered in the comparison pattern management server 2340 are then more likely to be excluded in subsequent dictionary pattern generation.
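The user-driven correction described here can be sketched as a small function that removes a pattern from the dictionary store and optionally re-registers it as a comparison pattern. The in-memory lists and the function name are hypothetical stand-ins for the two management servers.

```python
from typing import Any, Dict, List

Pattern = Dict[str, Any]


def reject_pattern(
    dictionary: List[Pattern],
    comparison: List[Pattern],
    index: int,
    keep_as_negative: bool = False,
) -> Pattern:
    """Delete a dictionary pattern; optionally register it as a comparison pattern."""
    pattern = dictionary.pop(index)
    if keep_as_negative:
        # Comparison patterns carry only image, rect and feature (no keyword list).
        comparison.append({k: pattern[k] for k in ("image", "rect", "feature")})
    return pattern
```

Registering the rejected region as a comparison pattern shortens the second distance for similar regions in later runs, which is why such regions become more likely to be excluded.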

As described above, according to this embodiment, a dictionary can be generated automatically by automatically registering highly reliable dictionary patterns.

Although the present invention has been described in detail with reference to the accompanying drawings, the present invention is not limited to these specific configurations and includes various modifications and equivalent configurations within the spirit of the appended claims.

Claims

A dictionary generation system comprising:
an acquisition unit that acquires a first distance in a feature amount space between a selected region selected from a first region group in a first image and a first region, other than the selected region, selected from the first region group, and a second distance in the feature amount space between the selected region and a second region selected from a second region group in a second image;
a determination unit that determines, based on a ratio between the first distance and the second distance acquired by the acquisition unit, whether the selected region should be a dictionary pattern; and
a registration unit that registers the selected region in a dictionary storing a dictionary pattern group when the determination unit determines that the selected region should be a dictionary pattern,
wherein common attribute information is assigned to each image of a first image set including the first image, the attribute information is not assigned to any image of a second image set including the second image, the first region group is a group of regions obtained from each image in the first image set, and the second region group is a group of regions obtained from each image in the second image set.

A dictionary generation system comprising:
an acquisition unit that acquires a first distance in a feature amount space between a selected region selected from a first region group in a first image and a first region, other than the selected region, selected from the first region group, and a second distance in the feature amount space between the selected region and a second region selected from a second region group in a second image;
a determination unit that determines, based on a ratio between the first distance and the second distance acquired by the acquisition unit, whether the selected region should be a dictionary pattern; and
a registration unit that registers the selected region in a dictionary storing a dictionary pattern group when the determination unit determines that the selected region should be a dictionary pattern,
wherein attribute information is assigned neither to the images of a first image set including the first image nor to the images of a second image set including the second image, the first region group is a group of regions obtained from each image in the first image set, and the second region group is a group of regions obtained from each image in the second image set.

A dictionary generation system comprising:
an acquisition unit that acquires a first distance in a feature amount space between a selected region selected from a first region group in a first image and a first region, other than the selected region, selected from the first region group, and a second distance in the feature amount space between the selected region and a second region selected from a second region group in a second image;
a determination unit that determines, based on a ratio between the first distance and the second distance acquired by the acquisition unit, whether the selected region should be a dictionary pattern; and
a registration unit that registers the selected region in a dictionary storing a dictionary pattern group when the determination unit determines that the selected region should be a dictionary pattern,
wherein attribute information is not assigned to any image of a first image set including the first image, common attribute information is assigned to each image of a second image set including the second image, the first region group is a group of regions obtained from each image in the first image set, and the second region group is a group of regions obtained from each image in the second image set.

A dictionary generation system comprising:
an acquisition unit that acquires a first distance in a feature amount space between a selected region selected from a first region group in a first image and a first region, other than the selected region, selected from the first region group, and a second distance in the feature amount space between the selected region and a second region selected from a second region group in a second image;
a determination unit that determines, based on a ratio between the first distance and the second distance acquired by the acquisition unit, whether the selected region should be a dictionary pattern; and
a registration unit that registers the selected region in a dictionary storing a dictionary pattern group when the determination unit determines that the selected region should be a dictionary pattern,
wherein common first attribute information is assigned to each image of a first image set including the first image, common second attribute information is assigned to each image of a second image set including the second image, the first region group is a group of regions obtained from each image in the first image set, and the second region group is a group of regions obtained from each image in the second image set.

A dictionary generation system comprising:
an acquisition unit that acquires a first minimum distance, which is the smallest of a plurality of first distances in a feature amount space between a selected region selected from a first region group in a first image and first regions, other than the selected region, selected from the first region group, and a second minimum distance, which is the smallest of a plurality of second distances in the feature amount space between the selected region and second regions selected from a second region group in a second image;
a determination unit that determines, based on a ratio between the first minimum distance and the second minimum distance acquired by the acquisition unit, whether the selected region should be a dictionary pattern; and
a registration unit that registers the selected region in a dictionary storing a dictionary pattern group.

The dictionary generation system according to claim 5,
wherein the acquisition unit acquires a first average minimum distance by obtaining the first minimum distance for each selected region and averaging the results, and acquires a second average minimum distance by obtaining the second minimum distance for each selected region and averaging the results, and
the determination unit determines, based on a ratio between the first average minimum distance and the second average minimum distance, whether the selected region should be a dictionary pattern.

The dictionary generation system according to any one of claims 1 to 4,
further comprising a perturbation processing unit that generates a new region by executing, based on the ratio between the first distance and the second distance, a perturbation process that perturbs the selected region,
wherein the determination unit determines, based on a processing result of the perturbation processing unit, whether the selected region should be a dictionary pattern.

The dictionary generation system according to claim 7,
wherein the perturbation processing unit executes the perturbation process while reducing the amount by which the selected region is perturbed as the number of executions of the perturbation process increases.

The dictionary generation system according to claim 7,
wherein the acquisition unit acquires a first average minimum distance by obtaining, for each selected region, a first minimum distance that is the smallest of a plurality of the first distances and averaging the results, and acquires a second average minimum distance by obtaining, for each selected region, a second minimum distance that is the smallest of a plurality of the second distances and averaging the results,
the system further comprises a determination unit that determines, from among a plurality of the selected regions and based on a ratio between the first average minimum distance and the second average minimum distance, a specific selected region to be subjected to the perturbation process, and
the perturbation processing unit generates a new region by executing the perturbation process on the specific selected region determined by the determination unit.

The dictionary generation system according to claim 1,
wherein the registration unit registers in the dictionary, in association with the attribute information, the selected region that the determination unit has determined should be a dictionary pattern.

The dictionary generation system according to claim 3,
wherein the registration unit registers in the dictionary, in association with the attribute information, the selected region that the determination unit has determined should be a dictionary pattern.

The dictionary generation system according to claim 1,
further comprising a display unit that displays the selected region that the determination unit has determined should be a dictionary pattern, in a manner that allows designation of whether the selected region is to be registered in the dictionary,
wherein the registration unit registers in the dictionary the selected region that, among the selected regions displayed on the display unit, has been designated for registration in the dictionary.

A dictionary generation method executed by a dictionary generation system comprising a processor that executes a program and a memory that stores the program executed by the processor,
wherein the processor executes:
an acquisition procedure of acquiring a first distance in a feature amount space between a selected region selected from a first region group in a first image and a first region, other than the selected region, selected from the first region group, and a second distance in the feature amount space between the selected region and a second region selected from a second region group in a second image;
a determination procedure of determining, based on a ratio between the first distance and the second distance acquired by the acquisition procedure, whether the selected region should be a dictionary pattern; and
a registration procedure of registering the selected region in a dictionary storing a dictionary pattern group when the determination procedure determines that the selected region should be a dictionary pattern,
and wherein common attribute information is assigned to each image of a first image set including the first image, the attribute information is not assigned to any image of a second image set including the second image, the first region group is a group of regions obtained from each image in the first image set, and the second region group is a group of regions obtained from each image in the second image set.

A dictionary generation program that causes a dictionary generation system comprising a processor that executes a program and a memory that stores the program executed by the processor to execute:
an acquisition procedure of acquiring a first distance in a feature amount space between a selected region selected from a first region group in a first image and a first region, other than the selected region, selected from the first region group, and a second distance in the feature amount space between the selected region and a second region selected from a second region group in a second image;
a determination procedure of determining, based on a ratio between the first distance and the second distance acquired by the acquisition procedure, whether the selected region should be a dictionary pattern; and
a registration procedure of registering the selected region in a dictionary storing a dictionary pattern group,
wherein common attribute information is assigned to each image of a first image set including the first image, the attribute information is not assigned to any image of a second image set including the second image, the first region group is a group of regions obtained from each image in the first image set, and the second region group is a group of regions obtained from each image in the second image set.