JP5347897B2

JP5347897B2 - Annotation apparatus, method and program

Info

Publication number: JP5347897B2
Application number: JP2009238350A
Authority: JP
Inventors: 盈輝徐
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2009-10-15
Filing date: 2009-10-15
Publication date: 2013-11-20
Anticipated expiration: 2029-10-15
Also published as: JP2011086113A

Description

The present invention relates to an annotation assigning apparatus, method, and program that analyze the features of an image and assign annotations to it.

Image annotation, which assigns to an image information describing its content (annotations, tags, labels), is generally performed by hand and is therefore burdensome, time-consuming, and costly. To address this, research has been conducted on executing image annotation automatically. An automatic image annotation system learns semantic categories (annotations) with a single learner, and therefore combines various image feature amounts into one feature vector (for example, Non-Patent Document 1). Using learning images to which annotations have been assigned, the learner then learns the relationship between the feature vector containing the feature amounts of each learning image and the annotations assigned to that learning image. By referring to this learning result, an appropriate annotation can be determined for an image to be annotated.

With such a method, however, the relationship between the feature amounts of an image and a specific annotation is weak, so an appropriate annotation cannot always be assigned to a given image.

The present invention has been made in view of the above, and its object is to provide an annotation assigning apparatus, method, and program that can more appropriately determine the annotations to be assigned to an image.

To solve the above problem and achieve the object, the present invention is an annotation assigning apparatus comprising: an index storage unit that stores an index associating a plurality of predetermined images with feature amounts of the images; a correspondence storage unit that stores the images in association with a plurality of predetermined annotations; a division unit that divides a target image to be annotated into a plurality of divided images; a feature extraction unit that analyzes the divided images and extracts feature amounts of the divided images; a search unit that retrieves from the index storage unit a predetermined number of the images whose corresponding feature amounts are highly similar to the extracted feature amounts; a calculation unit that calculates, for each annotation associated with the retrieved images, a score of the annotation that is larger as the appearance frequency of the annotation in the correspondence storage unit is smaller, larger as the similarity of the corresponding image is larger, and larger as the feature amounts of the plurality of associated images are more similar to one another; and a first determination unit that determines the annotations to be assigned to the target image, giving priority to annotations with larger scores.

The present invention is also a method and a program that can be executed by the above apparatus.

According to the present invention, the annotations to be assigned to an image can be determined more appropriately.

FIG. 1 is a block diagram showing a configuration example of an information processing system including the annotation assigning apparatus according to the present embodiment.
FIG. 2 is a diagram outlining the function of determining classification codes from a plurality of retrieved images.
FIG. 3 is a flowchart showing the overall flow of the annotation determination process in the present embodiment.
FIG. 4 is a flowchart showing the overall flow of the score calculation process.
FIG. 5 is a flowchart showing the overall flow of the element a calculation process.
FIG. 6 is a flowchart showing the overall flow of the element b calculation process.
FIG. 7 is an explanatory diagram showing an example of the image pattern detection process.
FIG. 8 is a diagram showing an example of the data structure of the trust table.
FIG. 9 is a diagram showing an example of the rank, similarity, and associated classification codes for each learning image.
FIG. 10 is a diagram showing an example of calculating the mutual similarity between learning images.
FIG. 11 is a diagram showing an example of the hardware configuration of the annotation assigning apparatus according to the present embodiment.

An embodiment of the annotation assigning apparatus, method, and program according to the present invention is described in detail below with reference to the accompanying drawings.

The annotation assigning apparatus of the present embodiment determines annotations for a target image to be annotated by referring to a learning result obtained through learning with learning data in which learning images are associated with a plurality of annotations. The apparatus can be applied, for example, to an apparatus that assigns, to a target image representing a trademark, annotations corresponding to the "Vienna classification", which is defined for classifying the figurative elements of marks.

The Vienna classification has a hierarchical structure in which all figurative elements are divided into major, middle, and minor classifications. Each major, middle, and minor classification is assigned a number following a predetermined coding scheme (hereinafter referred to as a classification code). For example, the classification code "1.1" is defined to represent "stars, comets".
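The hierarchical code structure described above can be illustrated with a minimal sketch. It assumes that a classification code is written as one to three dot-separated integers (such as "1.1" or "26.4.1") whose positions correspond to the major, middle, and minor classification; the function names are hypothetical and are not part of the embodiment.

```python
def split_code(code: str) -> dict:
    """Split a dotted classification code into its hierarchy levels.

    Assumption: codes have 1 to 3 dot-separated integer components
    (major, middle, minor), e.g. "26.4.1".
    """
    parts = code.split(".")
    if not 1 <= len(parts) <= 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"unexpected classification code: {code!r}")
    levels = ("major", "middle", "minor")
    return dict(zip(levels, (int(p) for p in parts)))


def is_ancestor(parent: str, child: str) -> bool:
    """Return True if `parent` is a strictly higher level containing `child`."""
    p, c = parent.split("."), child.split(".")
    return len(p) < len(c) and c[: len(p)] == p
```

For example, `split_code("26.4.1")` yields the three levels of the code, and `is_ancestor("26.4", "26.4.1")` holds because "26.4.1" lies under "26.4" in the hierarchy.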

The following describes an example in which the annotation assigning apparatus of the present embodiment is configured to determine the Vienna classification codes to be assigned to a target image. Once a classification code is determined, the content (annotation) corresponding to that code can be determined. The applicable annotations are not limited to those corresponding to Vienna classification codes. Any annotation can be used as long as it can be assigned to a learning image.

The annotation assigning apparatus of the present embodiment first divides the target image into a plurality of divided images. It then finds the top k learning images whose feature amounts are similar to those of the divided images and acquires the classification codes corresponding to those learning images. Next, for each acquired classification code, it calculates a score representing the priority of the code according to factors such as the appearance frequency of the code in the learning data and the similarity between the divided image and the retrieved image. Finally, it determines a predetermined number of top-scoring classification codes as the classification codes to be assigned. In this way, classification codes strongly related to the feature amounts of the image can be determined appropriately.

FIG. 1 is a block diagram showing a configuration example of an information processing system including the annotation assigning apparatus according to the present embodiment. As shown in FIG. 1, the information processing system has a configuration in which an annotation assigning apparatus 100 and a plurality of user terminals 200a and 200b are connected via a network 300. Any network configuration, such as a LAN (Local Area Network) or the Internet, can be used as the network 300. The user terminals 200a and 200b are terminal devices used by users, such as PCs (personal computers). Because the user terminals 200a and 200b have the same configuration, they may be referred to below simply as the user terminal 200. The number of user terminals 200 is not limited to two.

The user terminal 200 transmits to the annotation assigning apparatus 100 a target image, that is, the image to which classification codes are to be assigned. The user terminal 200 also receives from the annotation assigning apparatus 100 the classification codes that the apparatus assigned to the target image, and displays them on a display unit (not shown) such as a display.

The annotation assigning apparatus 100 determines the classification codes to be assigned to the designated target image and outputs the determined classification codes to the user terminal 200 or the like. The method of determining the classification codes is described in detail later.

The system configuration is not limited to that of FIG. 1. For example, the annotation assigning apparatus 100 may itself accept the designation of a target image and output the classification codes determined for that image to a display unit (not shown) or the like provided in the annotation assigning apparatus 100.

Next, the functional configuration of the annotation assigning apparatus 100 is described. As shown in FIG. 1, the annotation assigning apparatus 100 includes an index storage unit 151, a correspondence storage unit 152, a rule storage unit 153, an image reception unit 101, a division unit 111, a feature extraction unit 112, a search unit 113, a calculation unit 114, a first determination unit 115, a classification unit 116, a second determination unit 117, a pattern detection unit 118, a third determination unit 119, an output unit 121, and a communication unit 122.

The index storage unit 151 stores an index that associates a plurality of learning images with the feature amounts of each learning image. The index stored in the index storage unit 151 is referred to when the search unit 113 searches for learning images similar to the target image. The index is constructed, through learning with learning data in which learning images are associated with a plurality of classification codes, so that a plurality of learning images whose feature amounts are similar to those of a given image can be retrieved.

The index is learned, for example, as follows. First, the feature amounts of each learning image included in the learning data are extracted by the same method as that of the feature extraction unit 112. Then an index for retrieving learning images whose feature amounts are similar to those of a divided image is learned by, for example, the k-nearest neighbors (KNN) method. In this case, the index storage unit 151 can be configured to store an index represented as, for example, a kd (k-dimensional) tree usable with the KNN method.
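The retrieval that such an index supports can be sketched as follows. This is a brute-force stand-in rather than a kd tree, and it assumes Euclidean distance between feature vectors (the text does not fix a distance measure); `knn_search` and the sample index are hypothetical names and data used only for illustration.

```python
import heapq
import math


def knn_search(index, query, k):
    """Return the k (distance, image_id) pairs closest to `query`.

    `index` maps an image id to its feature vector. A kd tree would
    return the same neighbors while visiting fewer entries; this
    brute-force version scans every entry.
    """
    scored = ((math.dist(vec, query), image_id) for image_id, vec in index.items())
    return heapq.nsmallest(k, scored)


# Hypothetical two-dimensional feature vectors for three learning images.
index = {"img_a": (0.0, 0.0), "img_b": (1.0, 0.0), "img_c": (5.0, 5.0)}
nearest = knn_search(index, (0.9, 0.1), k=2)
nearest_ids = [image_id for _, image_id in nearest]
```

Here `nearest_ids` lists the two learning images closest to the query vector, nearest first.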

The index learning method and data structure are not limited to the KNN method and the kd tree. Any learning method and corresponding data structure can be used as long as learning images with similar feature amounts can be retrieved.

The correspondence storage unit 152 stores each learning image stored in the index storage unit 151 in association with the plurality of classification codes assigned in advance to that learning image. The correspondence storage unit 152 is referred to when acquiring the classification codes corresponding to a retrieved learning image.

The rule storage unit 153 stores rules for classifying a divided image, according to its feature amounts, into one of a plurality of classes each corresponding to a classification code. For example, the rule storage unit 153 stores rules learned by the random forest method using learning data in which each learning image is associated with one classification code for that learning image.

The rules are learned, for example, as follows. First, the feature amounts of each learning image included in the learning data are extracted by the same method as that of the feature extraction unit 112. Then rules for retrieving learning images whose feature amounts are similar to those of a divided image are learned by the random forest method. In this case, the rule storage unit 153 stores rules represented as a plurality of decision trees, each of which determines the class into which an image is classified by branching according to the feature amounts.

The rule learning method is not limited to the random forest method. Using the random forest method, however, makes it possible to overcome the so-called curse of dimensionality that arises from using a high-dimensional feature vector formed from a large number of feature amounts. In the random forest method, a subset of the feature amounts included in the feature vector is selected at random, and the divided image is classified according to the selected feature amounts. This classification is performed for each of the plurality of decision trees, and the class into which the image is finally classified is determined by a vote (majority decision) over the results of the decision trees. The class (classification code) can therefore be determined more objectively.
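The voting step described above can be sketched as follows. The three "trees" are hand-written stand-ins for learned decision trees, each looking at only one feature to mimic the random feature subsets; the feature names and thresholds are hypothetical, and no tree learning is shown.

```python
from collections import Counter


def forest_classify(trees, features):
    """Classify by majority vote over the outputs of several decision trees."""
    votes = Counter(tree(features) for tree in trees)
    # most_common(1) returns the class with the most votes.
    return votes.most_common(1)[0][0]


# Hand-written stand-ins for learned trees (hypothetical features and thresholds).
trees = [
    lambda f: "26.1" if f["roundness"] > 0.8 else "26.4",
    lambda f: "26.1" if f["corner_count"] == 0 else "26.4",
    lambda f: "26.4" if f["edge_straightness"] > 0.5 else "26.1",
]
label = forest_classify(
    trees, {"roundness": 0.9, "corner_count": 0, "edge_straightness": 0.7}
)
```

In this example two of the three trees vote for class "26.1", so that class wins even though one tree disagrees.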

In the rule storage unit 153, one classification code is associated with each class. Accordingly, for a divided image classified by the classification unit 116, one classification code corresponding to the assigned class is obtained. In the correspondence storage unit 152 described above, by contrast, a plurality of classification codes are associated with each learning image, and the search unit 113 retrieves a plurality of similar learning images for each divided image. Accordingly, a plurality of classification codes are obtained as the classification codes to be assigned to a divided image processed by the search unit 113.

The communication unit 122 transmits and receives various kinds of information to and from external devices such as the user terminal 200. For example, the communication unit 122 receives from the user terminal 200 a target image or information identifying a target image. The communication unit 122 also transmits the classification codes determined for the target image to the user terminal 200.

The image reception unit 101 accepts the designation of a target image. For example, the image reception unit 101 accepts a target image received from the user terminal 200 via the communication unit 122. The method of accepting a target image is not limited to this. For example, the unit may accept information identifying a target image and, based on that information, acquire the target image from inside the annotation assigning apparatus 100 or from an external device.

The division unit 111 divides the target image into a plurality of divided images. For example, the division unit 111 recognizes the graphic regions and character regions included in the target image and divides the image into a plurality of divided images corresponding to those regions. The division method is not limited to this, and any division method can be used.

The feature extraction unit 112 analyzes each divided image and extracts its feature amounts. Any conventionally used measure, such as a color histogram, color layout, edges, texture, or composition, can be used as a feature amount.

For each divided image, the search unit 113 retrieves from the index storage unit 151 a plurality of learning images whose feature amounts are similar to the extracted feature amounts. For example, the search unit 113 uses the KNN method to retrieve from the index storage unit 151 the k learning images most similar to the feature amounts extracted from the divided image. The applicable search method is not limited to the KNN method. For example, the approximate nearest neighbor (ANN) method may be used.

The calculation unit 114 calculates a score for each classification code associated with the learning images retrieved by the search unit 113. Specifically, the calculation unit 114 calculates a score that is larger as the appearance frequency of the classification code in the learning data used to learn the index of the index storage unit 151 is smaller, larger as the similarity between the divided image and the learning image is larger, and larger as the feature amounts of the plurality of learning images associated with the code are more similar to one another. The score calculation method is described in detail later.
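The three tendencies stated above can be illustrated with a purely hypothetical score. The formula below is not the score of the embodiment, whose calculation is described later; the IDF-style logarithm, the sum of similarities, and the product form are all assumptions chosen only because they satisfy the three stated monotonic properties.

```python
import math


def illustrative_score(code_count, total_images, similarities, mutual_similarity):
    """Hypothetical score for one classification code (not the patent's formula).

    code_count        -- number of learning images carrying this code
    total_images      -- number of learning images in the learning data
    similarities      -- similarities of the retrieved images carrying the code
    mutual_similarity -- similarity among those retrieved images (0..1)
    """
    if code_count <= 0 or total_images <= 0:
        raise ValueError("counts must be positive")
    rarity = math.log(1.0 + total_images / code_count)  # larger for rarer codes
    return rarity * sum(similarities) * mutual_similarity


base = illustrative_score(10, 1000, [0.9, 0.8], 0.7)
rarer = illustrative_score(5, 1000, [0.9, 0.8], 0.7)
closer = illustrative_score(10, 1000, [0.95, 0.8], 0.7)
tighter = illustrative_score(10, 1000, [0.9, 0.8], 0.9)
```

Each of `rarer`, `closer`, and `tighter` changes one input in the favorable direction and therefore exceeds `base`.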

The first determination unit 115 determines the classification codes to be assigned to the target image, giving priority to classification codes with larger calculated scores. For example, the first determination unit 115 determines a predetermined number of top-scoring classification codes as the classification codes to be assigned to the target image. As described later, classification codes to be assigned are also determined by the second determination unit 117 and the third determination unit 119. The output unit 121 then selects, from among the determined classification codes, those with high trust values (described later) as the classification codes finally assigned to the target image, and outputs them.

The classification unit 116 classifies each divided image produced by the division unit 111 into one of the classes, using the feature amounts extracted by the feature extraction unit 112 and the rules stored in the rule storage unit 153. For example, the classification unit 116 uses the learned rules to classify each divided image, by the random forest method, into the class corresponding to its feature amounts.

The second determination unit 117 determines the classification code associated with the assigned class as a classification code to be assigned to the target image.

The pattern detection unit 118 detects predetermined image patterns in the target image. Easily identifiable geometric patterns, such as circles and ellipses (26.1 in the Vienna classification) and quadrilaterals (26.4 in the Vienna classification), can be used as image patterns.

The third determination unit 119 determines the classification code associated in advance with a detected image pattern as a classification code to be assigned to the target image. For example, when a square is detected, the third determination unit 119 determines "26.4.1", the classification code corresponding to "square" in the Vienna classification, as a classification code to be assigned.
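The association between detected patterns and classification codes amounts to a table lookup, sketched below. The code values are the ones quoted in the text; the pattern names used as keys and the function name are hypothetical, and pattern detection itself is not shown.

```python
# Codes as quoted in the text; the key names are illustrative assumptions.
PATTERN_TO_CODE = {
    "circle_or_ellipse": "26.1",
    "quadrilateral": "26.4",
    "square": "26.4.1",
}


def codes_for_patterns(detected_patterns):
    """Map detected pattern names to classification codes, skipping unknown ones."""
    return [PATTERN_TO_CODE[p] for p in detected_patterns if p in PATTERN_TO_CODE]
```

A detector that reports a square would thus contribute "26.4.1", while a pattern with no entry in the table contributes nothing.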

The output unit 121 outputs the determined classification codes. In the present embodiment, the output unit 121 merges the classification codes determined by the first determination unit 115, the second determination unit 117, and the third determination unit 119, selects from all of them a predetermined number of classification codes with high trust values, and outputs these as the final classification codes. A trust value is a value representing the certainty of a classification code. Trust values are calculated in advance, for example when learning with the learning data is performed, and are stored in a predetermined storage unit (not shown) as a table that associates classification codes with trust values (hereinafter referred to as a trust table). The trust table is described in detail later.
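The merging and selection performed by the output unit can be sketched as follows. The trust values shown are invented for illustration, and two details are assumptions not stated in the text: a code missing from the trust table is treated as having trust 0.0, and ties are broken by the code string.

```python
def select_final_codes(codes_a, codes_b, codes_c, trust_table, limit):
    """Merge candidate codes from the three determination units and keep
    the `limit` codes with the highest trust values."""
    merged = set(codes_a) | set(codes_b) | set(codes_c)  # drop duplicates
    ranked = sorted(merged, key=lambda code: (-trust_table.get(code, 0.0), code))
    return ranked[:limit]


# Hypothetical trust table (classification code -> trust value).
trust_table = {"1.1": 0.62, "26.1": 0.91, "26.4.1": 0.88, "5.3": 0.40}
final_codes = select_final_codes(
    ["1.1", "5.3"], ["26.1"], ["26.4.1", "26.1"], trust_table, limit=3
)
```

With these values, the code "26.1" proposed by two units appears once in the result, and the lowest-trust candidate "5.3" is dropped.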

As described above, in the present embodiment classification codes are determined by three functions: (A) a function that determines classification codes from a plurality of images retrieved by the KNN method or the like (hereinafter referred to as function A); (B) a function that determines classification codes according to the class assigned by the random forest method or the like (hereinafter referred to as function B); and (C) a function that determines classification codes corresponding to image patterns detected in the target image (hereinafter referred to as function C).

It suffices for the annotation assigning apparatus 100 of the present embodiment to have at least function A. Function A makes it possible to determine, as the classification codes to be assigned, a predetermined number of classification codes whose scores, calculated according to factors such as the appearance frequency of each classification code in the learning data, are highest. That is, classification codes strongly related to the feature amounts of the image can be determined appropriately. If function B is also provided, the curse of dimensionality can be overcome as described above and classification codes can be determined more objectively.

If function C is further provided, the problem of imbalanced classification codes can be overcome. This problem arises because simple image patterns such as circles or quadrilaterals are likely to appear in much of the learning data, so the classification codes corresponding to such patterns come to dominate the learning result.

Function C makes it easy to determine the classification codes corresponding to image patterns that can appear in many images. Furthermore, if functions A and B are configured to determine only the classification codes other than those that function C can determine, a situation in which a few classification codes appear in a large share of the learning results can be avoided.

When both functions A and B are provided, the classification codes each of them determines may be separated so that they do not overlap. For example, function B may be trained on selected classification codes that can be learned from a small amount of learning data, with the remaining classification codes determined by function A.

Next, an outline of function A is described. FIG. 2 is a diagram outlining the function of determining classification codes from a plurality of images retrieved by the KNN method or the like.

As shown in FIG. 2, an input target image 21 is divided into a plurality of divided images 22a to 22c. The number of divisions is not limited to three. Although omitted from FIG. 2, the feature extraction unit 112 extracts feature amounts from each divided image.

For each divided image, the search unit 113 refers to the index storage unit 151 and uses the KNN method to retrieve learning images whose feature amounts are similar to the extracted feature amounts. A search result 23 is thereby obtained for each divided image. The search result 23 contains k/n similar learning images for each divided image (where n is the number of divisions). Each of the k/n learning images obtained is associated with its similarity to the divided image. As a result, k learning images are obtained for the target image. FIG. 2 shows an example in which the k learning images obtained are arranged in descending order of similarity as Img_1 to Img_k. In FIG. 2, an image group 24 represents the k learning images obtained.
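The way the per-segment results combine into one ranked list of k images can be sketched as follows. The sketch assumes that k is divisible by n (integer division is used otherwise) and that the same learning image retrieved for two divided images is kept twice; neither point is specified in the text, and the function name and sample data are hypothetical.

```python
def merge_results(per_segment_hits, k):
    """Keep the k/n most similar hits per divided image, then rank all of them.

    per_segment_hits -- one list per divided image of (image_id, similarity)
    Returns (rank, image_id, similarity) tuples, with rank 1 the most similar.
    """
    n = len(per_segment_hits)
    per_segment = k // n  # assumption: k is a multiple of n
    merged = []
    for hits in per_segment_hits:
        best = sorted(hits, key=lambda h: h[1], reverse=True)[:per_segment]
        merged.extend(best)
    merged.sort(key=lambda h: h[1], reverse=True)
    return [(rank, image_id, sim) for rank, (image_id, sim) in enumerate(merged, start=1)]


# Hypothetical hits for n = 3 divided images, with k = 6.
hits = [
    [("a", 0.9), ("b", 0.5), ("c", 0.4)],
    [("d", 0.8), ("e", 0.7), ("f", 0.1)],
    [("g", 0.95), ("h", 0.6), ("i", 0.3)],
]
ranked = merge_results(hits, k=6)
ranked_ids = [image_id for _, image_id, _ in ranked]
```

The resulting order plays the role of Img_1 to Img_k: two images survive from each divided image and are then ranked together by similarity.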

The calculation unit 114 acquires from the correspondence storage unit 152 the classification codes associated with the learning images obtained in this way for each divided image. The calculation unit 114 then calculates a score for each acquired classification code using its appearance frequency, the similarity, the rank (the position in descending order of similarity), and the mutual similarity (described later) of the feature amounts of the associated learning images. A scored classification code list 25 is thereby obtained.

Next, the annotation determination process performed by the annotation assigning apparatus 100 of the present embodiment configured as described above is described. FIG. 3 is a flowchart showing the overall flow of the annotation determination process in the present embodiment.

First, the image receiving unit 101 receives a designated target image (step S301). Thereafter, function A (steps S302 to S307), function B (steps S302 to S309), and function C (steps S310 and S311) are each executed. The functions can be executed in any order, and they may also be configured to run in parallel.

As processing common to function A and function B, the dividing unit 111 first divides the target image into a plurality of divided images (step S302). The feature extraction unit 112 then extracts a feature amount from each divided image (step S303).

Next, as processing specific to function A, the search unit 113 searches the index storage unit 151 for k learning images whose feature amounts are similar to the extracted feature amounts (step S304). The search unit 113 acquires the classification codes corresponding to the retrieved learning images from the correspondence storage unit 152 (step S305).

Next, the calculation unit 114 executes a score calculation process that calculates the scores of the acquired classification codes (step S306). Details of the score calculation process will be described later. The first determination unit 115 then determines a predetermined number of classification codes with the highest calculated scores as the classification codes to be assigned to the target image (step S307).

As processing specific to function B, the classification unit 116 refers to the rule storage unit 153 and classifies each divided image into one of the classes by the random forest method (step S308). The second determination unit 117 then determines the classification code corresponding to the resulting class as a classification code to be assigned to the target image (step S309).

As processing specific to function C, the pattern detection unit 118 detects a predetermined image pattern from the target image (step S310). Details of the image pattern detection process will be described later. The third determination unit 119 then determines the classification code predetermined for the detected image pattern as a classification code to be assigned to the target image (step S311).

After the classification codes have been determined by each function (steps S307, S309, and S311), the output unit 121 selects the optimum classification code from among the determined classification codes and determines it as the classification code to be finally output (step S312). The output unit 121 outputs the determined classification code to, for example, the user terminal 200 via the communication unit 122 (step S312), and the annotation determination process ends.

Next, details of the score calculation process in step S306 will be described. FIG. 4 is a flowchart showing the overall flow of the score calculation process.

First, the calculation unit 114 acquires an unprocessed classification code from the classification codes acquired in step S305 of FIG. 3 (step S401). More specifically, the calculation unit 114 takes one unprocessed classification code from the classification code group containing all the classification codes acquired for the divided images.

Next, the calculation unit 114 executes an element a calculation process that calculates element a (factor_a), a value (first element value) used to calculate the score (step S402). Element a takes a larger value as the similarity to the divided image is larger and as the position in descending order of similarity is earlier (that is, as the rank number is smaller). Details of the element a calculation process will be described later. Element a is calculated by the following formula (1).

Here, code_r represents the r-th classification code among the acquired classification codes (1 ≦ r ≦ number of classification codes). factor_a(code_r) represents the value of element a used to calculate the score for code_r. SimScore(Img_i) represents the similarity to the divided image obtained for Img_i at search time. Img_i represents a learning image included in IM(code_r), the set of learning images associated with code_r. base is a predetermined base for the exponentiation. rank(Img_i) represents the position (rank) of the learning image Img_i among the k search results when ordered by similarity. The rank is an integer from 1 to k, and it is smaller the larger the similarity.

base is set to a value less than 1 (for example, 0.95). As a result, the power of base with rank(Img_i) as the exponent becomes smaller as rank(Img_i) becomes larger. Alternatively, rank(Img_i) - 1 may be used as the exponent.
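Formula (1) is not reproduced in this text. Based on the definitions above and on step S504 of the element a calculation process, it can plausibly be written as the following sum over the learning images associated with code_r. This is a reconstruction from the surrounding description, not the original rendering of the formula.

```latex
\mathrm{factor\_a}(\mathrm{code}_r)
  = \sum_{\mathrm{Img}_i \in IM(\mathrm{code}_r)}
    \mathrm{SimScore}(\mathrm{Img}_i) \times \mathrm{base}^{\,\mathrm{rank}(\mathrm{Img}_i)}
```

With base = 0.95, an image at rank 1 contributes 0.95 times its similarity, while an image at rank 5 contributes roughly 0.774 times its similarity, so lower-ranked images are discounted.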

After calculating element a, the calculation unit 114 executes an element b calculation process that calculates element b (factor_b), another value (second element value) used to calculate the score (step S403). When a plurality of learning images are associated with the classification code, element b takes a larger value the more similar the feature amounts of those learning images are to one another. When three or more learning images are associated with the classification code, the mutual similarity, which represents the degree to which the feature amounts resemble each other, is calculated for every pair of learning images and added to element b. Details of the element b calculation process will be described later. Element b is calculated by the following formulas (2) to (5).

factor_b(code_r) represents the value of element b used to calculate the score for code_r. In formula (4), w(Img_s, Img_t) represents the mutual similarity between Img_s and Img_t, and dist(Img_s, Img_t) represents the Euclidean distance, in the feature vector space, between the feature vector containing the feature amount of Img_s and the feature vector containing the feature amount of Img_t. C in formula (4) is a predetermined constant; for example, C = 100 is used.

After calculating element b, the calculation unit 114 calculates the IDF (Inverse Document Frequency) of the classification code (code_r) (step S404). The IDF of code_r is calculated by the following formula (6).
$$\mathrm{idf}(\mathrm{code}_r) = \log\left(\frac{N}{\mathrm{df}(\mathrm{code}_r)}\right) \tag{6}$$

N represents the total number of learning images stored in the index storage unit 151. df(code_r) represents the number of learning images, among those stored in the index storage unit 151, with which code_r is associated.
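As a rough illustration of formula (6), the sketch below counts N and df(code_r) from a plain dictionary. The dictionary layout and the function name `idf` are assumptions made for the example, and the natural logarithm is used because the text does not state the base of the logarithm.

```python
import math

def idf(code, annotations_by_image):
    """Inverse document frequency of a classification code, formula (6).

    annotations_by_image: dict mapping learning-image id -> set of codes
    (an assumed stand-in for the stored learning images and their codes).
    """
    n_total = len(annotations_by_image)  # N: total number of learning images
    # df: number of learning images associated with the code
    df = sum(1 for codes in annotations_by_image.values() if code in codes)
    if df == 0:
        raise ValueError(f"code {code!r} is not associated with any learning image")
    return math.log(n_total / df)  # natural log is an assumption
```

A code attached to few learning images receives a larger IDF than a code attached to many, so rare codes are weighted more heavily in the score.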

Next, using the calculated IDF, element a, and element b, the calculation unit 114 calculates score(code_r), which represents the score of the classification code (code_r), by the following formula (7) (step S405).
$$\mathrm{score}(\mathrm{code}_r) = \alpha \times \mathrm{idf}(\mathrm{code}_r) \times \mathrm{factor\_a}(\mathrm{code}_r) + \beta \times \mathrm{factor\_b}(\mathrm{code}_r) \tag{7}$$

α and β are predetermined constants. In this way, the calculation unit 114 calculates, as the score, the linear sum of (i) the product of the IDF and element a and (ii) element b.
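Formula (7) and the subsequent selection of the highest-scoring codes (step S307) can be sketched as below. The function names and the default values α = β = 1.0 are assumptions chosen only for illustration; the text says only that α and β are predetermined constants.

```python
def score(idf_value, factor_a, factor_b, alpha=1.0, beta=1.0):
    """Formula (7): alpha * idf * factor_a + beta * factor_b.

    The defaults for alpha and beta are illustrative assumptions.
    """
    return alpha * idf_value * factor_a + beta * factor_b

def top_codes(scores, n):
    """Return the n classification codes with the highest scores (cf. step S307)."""
    return sorted(scores, key=scores.get, reverse=True)[:n]
```

Raising β relative to α shifts the ranking toward codes whose associated learning images resemble one another, while raising α favors rare codes attached to highly similar, highly ranked images.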

Next, the calculation unit 114 determines whether all the classification codes have been processed (step S406). If not (step S406: No), it acquires the next unprocessed classification code and repeats the process (step S401). If all the classification codes have been processed (step S406: Yes), the score calculation process ends.

Next, details of the element a calculation process in step S402 will be described. FIG. 5 is a flowchart showing the overall flow of the element a calculation process.

First, the calculation unit 114 initializes to 0 the value of element a (factor_a(code_r)) corresponding to the classification code (code_r) currently being processed (step S501). Next, the calculation unit 114 acquires an unprocessed learning image (hereinafter referred to as Img_i) from among the retrieved learning images (step S502). The calculation unit 114 then determines whether the classification code code_r is associated with the learning image Img_i (step S503), for example by referring to the correspondence storage unit 152.

When the classification code code_r is associated with the learning image Img_i (step S503: Yes), the calculation unit 114 raises base to the power rank(Img_i), multiplies the result by the similarity obtained for the learning image Img_i (SimScore(Img_i)), and adds the product to factor_a(code_r) (step S504).

When the classification code code_r is not associated with the learning image Img_i (step S503: No), and also after step S504, the calculation unit 114 determines whether all the learning images have been processed (step S505).

If not all the learning images have been processed (step S505: No), the calculation unit 114 acquires the next unprocessed learning image and repeats the process (step S502). If all the learning images have been processed (step S505: Yes), the element a calculation process ends.
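Steps S501 to S505 amount to a single pass over the retrieved learning images. A minimal sketch follows, assuming the search results are available as (image id, rank, similarity) tuples and the code associations as a dictionary; both data layouts and the function name are hypothetical.

```python
def factor_a(code, results, codes_by_image, base=0.95):
    """Element a for one classification code (steps S501 to S505).

    results: list of (image_id, rank, sim_score) for the k retrieved images.
    codes_by_image: dict image_id -> set of codes associated with that image
                    (an assumed stand-in for the correspondence storage unit 152).
    """
    total = 0.0                                  # S501: initialize to 0
    for image_id, rank, sim_score in results:    # S502 / S505: visit every image
        if code in codes_by_image[image_id]:     # S503: is the code associated?
            total += sim_score * base ** rank    # S504: add SimScore * base^rank
    return total
```

Images that do not carry the code contribute nothing, so a code's element a grows both with the number of retrieved images carrying it and with how similar and highly ranked those images are.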

Next, details of the element b calculation process in step S403 will be described. FIG. 6 is a flowchart showing the overall flow of the element b calculation process.

First, the calculation unit 114 initializes to 0 both the value of element b (factor_b(code_r)) corresponding to the classification code (code_r) currently being processed and the counter i (step S601). The calculation unit 114 also initializes the counter j to i + 1 (step S602).

Next, the calculation unit 114 acquires the learning image whose rank is i-th (referred to as Im_i) (step S603), and likewise acquires the learning image whose rank is j-th (referred to as Im_j) (step S604). The calculation unit 114 then determines whether the classification code code_r is associated with both Im_i and Im_j (step S605).

If it is associated with both (step S605: Yes), the calculation unit 114 calculates the mutual similarity between Im_i and Im_j and adds it to factor_b(code_r) (step S606). As in formula (4) above, the mutual similarity is calculated so that it becomes larger as the Euclidean distance in the feature vector space becomes smaller.

When the classification code code_r is not associated with both Im_i and Im_j (step S605: No), and also after step S606, the calculation unit 114 adds 1 to the value of j (step S607). Next, the calculation unit 114 determines whether j is smaller than k (step S608); if so (step S608: Yes), the process returns to step S604 and is repeated. When j becomes k or more (step S608: No), the calculation unit 114 adds 1 to the value of i (step S609). Next, the calculation unit 114 determines whether i is smaller than k - 1 (step S610); if so (step S610: Yes), the process returns to step S603 and is repeated. When i becomes k - 1 or more (step S610: No), the element b calculation process ends.
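The pairwise loop of steps S601 to S610 can be sketched as a nested loop over zero-based positions in the ranked result list. Because formula (4) is not reproduced in this text, `mutual_similarity` below is a hypothetical stand-in of the form C / (C + dist) that only preserves the stated property (a smaller Euclidean distance gives a larger value); it should not be read as the formula of the embodiment. The sketch restarts j at i + 1 for every i so that each pair is visited exactly once.

```python
import math

def mutual_similarity(vec_s, vec_t, c=100.0):
    """Hypothetical stand-in for formula (4), which is not reproduced here.

    It only keeps the stated property: smaller distance, larger value.
    """
    return c / (c + math.dist(vec_s, vec_t))

def factor_b(code, ranked_ids, codes_by_image, features, similarity=mutual_similarity):
    """Element b for one code (steps S601 to S610).

    ranked_ids: the k retrieved image ids in rank order.
    codes_by_image: dict image_id -> set of associated codes (assumed layout).
    features: dict image_id -> feature vector (assumed layout).
    """
    k = len(ranked_ids)
    total = 0.0                                   # S601
    for i in range(k - 1):                        # S609 / S610
        for j in range(i + 1, k):                 # S602 / S607 / S608
            im_i, im_j = ranked_ids[i], ranked_ids[j]              # S603 / S604
            both = code in codes_by_image[im_i] and code in codes_by_image[im_j]
            if both:                              # S605
                total += similarity(features[im_i], features[im_j])  # S606
    return total
```

A code associated with only one retrieved image has no pairs and therefore an element b of 0.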

Next, details of the image pattern detection process in step S310 will be described. FIG. 7 is an explanatory diagram showing an example of the image pattern detection process, in which image patterns containing one or more quadrilaterals are detected.

The pattern detection unit 118 removes noise from the target image (step S701) and detects contours (step S702). Next, the pattern detection unit 118 calculates the vertices of each detected contour (step S703) and acquires the contours that have four vertices (step S704). The pattern detection unit 118 then calculates the length of each side and the size of each angle of the contour (step S705). From these side lengths and angles, the pattern detection unit 118 can determine whether the contour is a rectangle (26.4.2 in the Vienna Classification), a square (26.4.1 in the Vienna Classification), or some other quadrilateral. That is, the pattern detection unit 118 can detect image patterns that are rectangles and squares.

When a plurality of contours are detected, the pattern detection unit 118 further detects the positional relationship between the contours (step S706). For example, the pattern detection unit 118 detects that two quadrilaterals overlap by comparing the coordinates of the four vertices of each quadrilateral. In this case, for example, the image pattern "several quadrilaterals juxtaposed, joined or intersecting" (26.4.9 in the Vienna Classification) is detected. Image patterns corresponding to other classification codes of the Vienna Classification (26.4.4, 26.4.7, and 26.4.8 in FIG. 7) can be detected in the same manner.
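The decision made from side lengths and angles in step S705 can be illustrated with plain geometry, assuming the four vertices of a contour have already been obtained in order around the contour. The function name, the two tolerances, and the label strings are assumptions for this sketch; only the code numbers 26.4.1 and 26.4.2 come from the text.

```python
import math

# Codes named in the text for squares and rectangles.
VIENNA_CODES = {"square": "26.4.1", "rectangle": "26.4.2"}

def classify_quadrilateral(vertices, angle_tol_deg=5.0, side_tol=0.05):
    """Classify four (x, y) vertices given in order around the contour.

    Returns "square", "rectangle", or "other quadrilateral".
    Both tolerances are illustrative assumptions.
    """
    if len(vertices) != 4:
        raise ValueError("expected exactly four vertices")
    sides, angles = [], []
    for idx in range(4):
        prev_pt, cur_pt = vertices[idx - 1], vertices[idx]
        next_pt = vertices[(idx + 1) % 4]
        v1 = (prev_pt[0] - cur_pt[0], prev_pt[1] - cur_pt[1])
        v2 = (next_pt[0] - cur_pt[0], next_pt[1] - cur_pt[1])
        len1, len2 = math.hypot(*v1), math.hypot(*v2)
        if len1 == 0.0 or len2 == 0.0:
            return "other quadrilateral"  # degenerate contour
        sides.append(len2)
        cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / (len1 * len2)
        angles.append(math.degrees(math.acos(max(-1.0, min(1.0, cos_angle)))))
    # Every interior angle must be close to 90 degrees.
    if any(abs(a - 90.0) > angle_tol_deg for a in angles):
        return "other quadrilateral"
    # All four sides (nearly) equal -> square; otherwise rectangle.
    if max(sides) - min(sides) <= side_tol * max(sides):
        return "square"
    return "rectangle"
```

Looking up the returned label in `VIENNA_CODES` gives the classification code for squares and rectangles; other quadrilaterals have no entry in this sketch.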

Next, the trust table that the output unit 121 refers to when determining the optimum classification code will be described. FIG. 8 is a diagram showing an example of the data structure of the trust table. As shown in FIG. 8, the trust table stores, for each classification code, the trust value of that classification code when it is detected by function A (KNN), function B (random forest), and function C (pattern detection), respectively.

The trust values are calculated in advance, for example by detecting classification codes for the learning data with each function, and are stored in the trust table. As the trust value, for example, the F-measure, which is the harmonic mean of recall and precision, can be used. The trust value is not limited to this; any index that represents the reliability of a classification code can be applied. For example, either recall or precision alone may be used as the trust value.

The output unit 121 refers to this trust table and acquires the trust value corresponding to each classification code determined by the first determination unit 115, the second determination unit 117, and the third determination unit 119. It then selects a predetermined number of classification codes with the highest trust values and outputs them as the final classification codes. The output unit 121 acquires each trust value from the column of the trust table that corresponds to the determination unit (the first determination unit 115, the second determination unit 117, or the third determination unit 119) that determined the classification code. For example, when the classification code "26.1.1" is obtained by the first determination unit 115, the trust value "0.41" in the "KNN" column is acquired from the trust table. When the same classification code is determined by a plurality of determination units, the largest of the corresponding trust values is acquired.
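The selection rule described above can be sketched as follows. The nested-dictionary layout of the trust table, the column names, and the function name are assumptions; in the usage shown in the test data, only the value 0.41 for code 26.1.1 under KNN is taken from the text, and any other values would be made up for illustration.

```python
def select_final_codes(candidates, trust_table, n):
    """Pick the n classification codes with the highest trust values.

    candidates: dict function name -> iterable of codes decided by that
                function, e.g. {"KNN": [...], "Pattern": [...]}.
    trust_table: dict code -> dict function name -> trust value
                 (an assumed in-memory form of the trust table of FIG. 8).
    """
    best = {}
    for function, codes in candidates.items():
        for code in codes:
            value = trust_table[code][function]
            # A code decided by several functions keeps its largest trust value.
            best[code] = max(value, best.get(code, value))
    return sorted(best, key=best.get, reverse=True)[:n]
```

Reading the trust value from the column of the function that produced the code means the same code can be trusted differently depending on how it was found.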

Next, a specific example of the process of calculating classification code scores by function A will be described. The following description takes as an example the case where five learning images, Img_1 to Img_5, are obtained as learning images similar to the divided images of the target image. FIG. 9 is a diagram showing an example of the rank, the similarity, and the associated classification codes (related classification codes) obtained for each learning image.

As shown in FIG. 9, in this example {1.1, 2.3, 4.2, 3.5, 5.1} is obtained as the set of classification codes corresponding to the learning images. The score calculation process in step S306 is therefore executed for these five classification codes.

A classification code may be associated with a plurality of learning images. In the example of FIG. 9, the classification code "1.1" is associated with the two learning images Img_1 and Img_3. The set IM(1.1) of learning images associated with the classification code "1.1" is therefore IM(1.1) = {Img_1, Img_3}. Similarly, IM(2.3) = {Img_1, Img_2}, IM(4.2) = {Img_1, Img_4}, IM(3.5) = {Img_2, Img_4, Img_5}, and IM(5.1) = {Img_2, Img_3, Img_5}.

Element a of the classification code "1.1" (factor_a(1.1)) is calculated as follows.
factor_a(1.1) = 0.57 × power(0.95, 0) + 0.48 × power(0.95, 2) = 1.0032

power(a, b) denotes a function that calculates the power with a as the base and b as the exponent. Note that rank(Img_i) - 1 is used as the exponent here.

Similarly, element a of the other classification codes is calculated as follows.
factor_a(2.3) = 0.57 × power(0.95, 0) + 0.52 × power(0.95, 1) = 1.064
factor_a(4.2) = 0.57 × power(0.95, 0) + 0.46 × power(0.95, 3) ≈ 0.964
factor_a(3.5) = 0.52 × power(0.95, 1) + 0.46 × power(0.95, 3) + 0.32 × power(0.95, 4) ≈ 1.149
factor_a(5.1) = 0.52 × power(0.95, 1) + 0.48 × power(0.95, 2) + 0.32 × power(0.95, 4) ≈ 1.188

In this example, the value of element a is largest for the classification code "5.1". The score for the classification code "5.1" is therefore expected to be comparatively large.
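The element a figures in this example can be checked with a few lines of Python. The ranks and similarities below are read off the terms of the worked example (FIG. 9 itself is not reproduced in this text), and the names `RESULTS`, `IM`, and `factor_a` are chosen only for this sketch.

```python
BASE = 0.95

# (rank, similarity) per retrieved learning image, inferred from the example terms.
RESULTS = {"Img1": (1, 0.57), "Img2": (2, 0.52), "Img3": (3, 0.48),
           "Img4": (4, 0.46), "Img5": (5, 0.32)}

# IM(code): the learning images associated with each classification code.
IM = {"1.1": ["Img1", "Img3"], "2.3": ["Img1", "Img2"], "4.2": ["Img1", "Img4"],
      "3.5": ["Img2", "Img4", "Img5"], "5.1": ["Img2", "Img3", "Img5"]}

def factor_a(code):
    """Element a with rank - 1 as the exponent, as in the worked example."""
    return sum(sim * BASE ** (rank - 1)
               for rank, sim in (RESULTS[img] for img in IM[code]))

# Element a per code, rounded to three decimals for comparison with the text.
values = {code: round(factor_a(code), 3) for code in IM}
best_code = max(IM, key=factor_a)
```

Evaluating `values` gives figures that agree with those listed above to three decimal places, and `best_code` is the classification code "5.1".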

FIG. 10 is a diagram showing an example of calculating the mutual similarity between learning images, that is, an example of the values calculated by formula (4) above. Using the values in FIG. 10, element b of each classification code is calculated by formula (2) above as follows.
factor_b(1.1) = 0.41
factor_b(2.3) = 0.43
factor_b(4.2) = 0.35
factor_b(3.5) = 0.40 + 0.38 + 0.15 = 0.93
factor_b(5.1) = 0.53 + 0.38 + 0.56 = 1.47

In this example, the value of element b is largest for the classification code "5.1". The score for the classification code "5.1" is therefore expected to be comparatively large. The score actually calculated varies with the IDF of each classification code and with the values of α and β, as formula (7) above shows. Nevertheless, from the calculated values of element a and element b in the above example, the score for the classification code "5.1" is highly likely to be the maximum.
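The element b sums can likewise be reproduced by adding the pairwise mutual similarities over every pair of images in IM(code). The pair-to-value mapping below is inferred from the terms of the worked example (FIG. 10 is not reproduced in this text); in particular, which of 0.40 and 0.15 belongs to (Img2, Img4) and which to (Img4, Img5) is assumed from the order of the terms, and the sums do not depend on that choice.

```python
from itertools import combinations

# Pairwise mutual similarities inferred from the worked example.
W = {frozenset(pair): w for pair, w in {
    ("Img1", "Img3"): 0.41, ("Img1", "Img2"): 0.43, ("Img1", "Img4"): 0.35,
    ("Img2", "Img4"): 0.40, ("Img2", "Img5"): 0.38, ("Img4", "Img5"): 0.15,
    ("Img2", "Img3"): 0.53, ("Img3", "Img5"): 0.56}.items()}

# IM(code): the learning images associated with each classification code.
IM = {"1.1": ["Img1", "Img3"], "2.3": ["Img1", "Img2"], "4.2": ["Img1", "Img4"],
      "3.5": ["Img2", "Img4", "Img5"], "5.1": ["Img2", "Img3", "Img5"]}

def factor_b(code):
    """Sum the mutual similarity over every pair of images carrying the code."""
    return sum(W[frozenset(pair)] for pair in combinations(IM[code], 2))

best_code = max(IM, key=factor_b)
```

With these values, `factor_b` reproduces the five sums listed above up to floating-point rounding, and `best_code` is again "5.1".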

As shown in FIG. 9, the classification code "5.1" is associated with three of the five retrieved images (Img_2, Img_3, and Img_5). Also, as shown in FIG. 10, the classification code "5.1" is associated with both Img_3 and Img_5, the two images with the highest mutual similarity. Thus, according to the present embodiment, a larger score is calculated for a classification code that is associated with more images and whose associated images have a high mutual similarity. The annotation to be assigned to an image can therefore be determined more appropriately.

Next, the hardware configuration of the annotation assigning apparatus according to the present embodiment will be described with reference to FIG. 11. FIG. 11 is a diagram showing an example of the hardware configuration of the annotation assigning apparatus according to the present embodiment.

The annotation assigning apparatus of the present embodiment includes a control device such as a CPU 51, storage devices such as a ROM (Read Only Memory) 52 and a RAM 53, external storage devices such as an HDD and a CD drive, a communication I/F 54 that connects to a network and performs communication, a display device such as a display, input devices such as a keyboard and a mouse, and a bus 61 that connects these components. It thus has a hardware configuration that uses an ordinary computer.

The program executed by the annotation assigning apparatus of the present embodiment is recorded, as a file in an installable or executable format, on a computer-readable recording medium such as a CD-ROM, a flexible disk (FD), a CD-R, or a DVD (Digital Versatile Disk), and is provided as a computer program product.

The program executed by the annotation assigning apparatus of the present embodiment may also be stored on a computer connected to a network such as the Internet and provided by being downloaded via the network. The program may likewise be provided or distributed via a network such as the Internet.

The program of the present embodiment may also be provided by being incorporated in advance in the ROM 52 or the like.

The program executed by the annotation assigning apparatus of the present embodiment has a module configuration that includes the units described above (the image receiving unit, dividing unit, feature extraction unit, search unit, calculation unit, first determination unit, classification unit, second determination unit, pattern detection unit, third determination unit, output unit, and communication unit). As actual hardware, the CPU (processor) 51 reads the program from the storage medium and executes it, whereby each of the above units is loaded onto and generated on the main storage device.

DESCRIPTION OF SYMBOLS
100 Annotation assigning apparatus
101 Image receiving unit
111 Dividing unit
112 Feature extraction unit
113 Search unit
114 Calculation unit
115 First determination unit
116 Classification unit
117 Second determination unit
118 Pattern detection unit
119 Third determination unit
121 Output unit
122 Communication unit
151 Index storage unit
152 Correspondence storage unit
153 Rule storage unit
200a, 200b User terminal
300 Network

Jia-Yu Pan et al., "Cross-Modal Correlation Mining Using Graph Algorithms" (Knowledge Discovery and Data Mining: Challenges and Realities with Real-word Data, Chapter IV), Information Science Reference, June 2006.

Claims

An index storage unit that stores an index in which a plurality of predetermined images and feature amounts of the images are associated;
A correspondence storage unit that associates and stores the image and a plurality of predetermined annotations;
A dividing unit that divides the target image to be annotated into a plurality of divided images;
A feature extraction unit that analyzes the divided image and extracts a feature amount of the divided image;
A search unit that searches the index storage unit for a predetermined number of images having a high degree of similarity between the corresponding feature value and the extracted feature value;
A calculation unit that calculates, for each of the annotations associated with the searched images, a score of the annotation that is larger as the appearance frequency of the annotation in the correspondence storage unit is smaller, larger as the similarity of the corresponding image is larger, and larger as the feature amounts of the plurality of associated images are more similar to each other;
A first determination unit that determines the annotation to be given to the target image, giving priority to annotations with larger scores;
An annotation giving apparatus comprising:

The calculation unit calculates a reverse appearance frequency that is larger as the appearance frequency is smaller, a first element value that is larger as the similarity of the corresponding image is larger, and a second element value that is larger as the feature amounts of the plurality of associated images are more similar to each other, and calculates the score as a linear sum of (i) the product of the first element value and the reverse appearance frequency and (ii) the second element value;
The annotation giving apparatus according to claim 1.
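The linear-sum score recited above can be sketched as follows. The claim fixes only the direction in which each element varies, not its formula, so the concrete choices here are assumptions: a logarithmic reverse appearance frequency, a first element value equal to the summed similarity of the retrieved images carrying the annotation, a second element value equal to the mean pairwise similarity of those images' feature amounts, and the hypothetical weights `alpha` and `beta`.

```python
import math
from itertools import combinations

def similarity(a, b):
    """Assumed similarity measure: 1.0 for identical features, smaller otherwise."""
    return 1.0 / (1.0 + math.dist(a, b))

def annotation_scores(hits, features, annotations, alpha=1.0, beta=1.0):
    """Score each annotation attached to the retrieved images.

    hits:        list of (image_id, similarity to the extracted feature amount)
    features:    image_id -> feature amount (index storage unit)
    annotations: image_id -> set of annotations (correspondence storage unit)
    alpha, beta: hypothetical weights of the linear sum
    """
    n_images = len(annotations)
    hit_sim = dict(hits)
    candidates = set().union(*(annotations[i] for i in hit_sim))
    scores = {}
    for tag in candidates:
        # Reverse appearance frequency: rarer annotations weigh more (assumed log form).
        freq = sum(1 for tags in annotations.values() if tag in tags)
        raf = math.log(n_images / freq) + 1.0
        # First element value: grows with the similarity of the associated images.
        carriers = [i for i in hit_sim if tag in annotations[i]]
        first = sum(hit_sim[i] for i in carriers)
        # Second element value: grows as the associated images resemble each other.
        pairs = list(combinations(carriers, 2))
        second = (
            sum(similarity(features[a], features[b]) for a, b in pairs) / len(pairs)
            if pairs else 0.0
        )
        scores[tag] = alpha * first * raf + beta * second
    return scores
```

The first determination unit could then take, for example, `sorted(scores, key=scores.get, reverse=True)[:n]` to pick the `n` highest-scoring annotations; the cut-off `n` is likewise an assumption.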

The annotation giving apparatus according to claim 1, wherein the calculation unit further calculates the score so that the score is larger as the rank, which represents the order of the similarity of the associated image, is larger.
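A rank-dependent contribution of this kind might be sketched as below. The claim does not define how rank enters the score, so this sketch rests on an interpretation: an image placed nearer the top of the similarity ordering contributes a larger weight, computed as `(k - position) / k` over the `k` retrieved images. The function names `rank_weights` and `rank_element` are hypothetical.

```python
def rank_weights(hits):
    """Map image_id -> weight in (0, 1]; the most similar image receives 1.0.

    hits: list of (image_id, similarity) pairs returned by the search.
    """
    ordered = sorted(hits, key=lambda p: p[1], reverse=True)
    k = len(ordered)
    return {img_id: (k - pos) / k for pos, (img_id, _) in enumerate(ordered)}

def rank_element(tag, hits, annotations):
    """Sum the rank weights of the retrieved images that carry `tag`."""
    weights = rank_weights(hits)
    return sum(w for img_id, w in weights.items() if tag in annotations[img_id])
```

Such a value could be added to the score as one more term of the linear sum; whether it is added or multiplied is not stated in the claim.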

The annotation giving apparatus according to claim 1, further comprising:
a rule storage unit that stores a rule for classifying an image, according to the feature amount of the image, into one of a plurality of classes each associated with an annotation;
a classification unit that applies the feature amount extracted by the feature extraction unit to the rule to classify the divided image into one of the classes; and
a second determination unit that determines the annotation associated with the class into which the divided image is classified as an annotation to be given to the target image.
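The rule-based path recited above can be illustrated with a small sketch. The claim does not say what form the rule takes, so this example assumes simple threshold predicates over a four-bin histogram feature, evaluated in order with the first match winning. The class names, thresholds, annotations, and the names `RULES`, `CLASS_ANNOTATION`, `classify`, and `annotations_from_rules` are all hypothetical.

```python
# Hypothetical rule store: (class name, predicate over the feature amount).
RULES = [
    ("bright", lambda f: f[-1] >= 0.5),  # most pixels in the brightest bin
    ("dark", lambda f: f[0] >= 0.5),     # most pixels in the darkest bin
]

# Hypothetical class-to-annotation table; "other" carries no annotation.
CLASS_ANNOTATION = {"bright": "sky", "dark": "night", "other": None}

def classify(feature, rules=RULES, default="other"):
    """Return the class of the first rule that the feature amount satisfies."""
    for name, predicate in rules:
        if predicate(feature):
            return name
    return default

def annotations_from_rules(tile_features):
    """Collect, without duplicates, the annotations of the classes of all divided images."""
    found = []
    for feature in tile_features:
        tag = CLASS_ANNOTATION[classify(feature)]
        if tag is not None and tag not in found:
            found.append(tag)
    return found
```

A learned classifier could stand in for the threshold rules without changing the surrounding flow; the claim leaves that choice open.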

The annotation giving apparatus according to claim 1, further comprising:
a pattern detection unit that detects a predetermined image pattern from the target image; and
a third determination unit that determines an annotation associated in advance with the image pattern as an annotation to be given to the target image.
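The pattern-detection path can be sketched in the same spirit. The claim does not specify a detection technique, so this example substitutes the simplest possible one, an exact sub-grid match over grey levels, purely to show how a detected pattern maps to a pre-associated annotation. A practical detector would need to tolerate noise, scale, and rotation. The names `find_pattern` and `annotations_from_patterns` and the pattern table layout are hypothetical.

```python
def find_pattern(image, pattern):
    """Return (row, col) of the first exact occurrence of `pattern` in `image`, else None."""
    ph, pw = len(pattern), len(pattern[0])
    for r in range(len(image) - ph + 1):
        for c in range(len(image[0]) - pw + 1):
            if all(image[r + dr][c:c + pw] == pattern[dr] for dr in range(ph)):
                return (r, c)
    return None

def annotations_from_patterns(image, pattern_table):
    """Return the annotations whose associated pattern occurs in the target image.

    pattern_table: list of (annotation, pattern) pairs registered in advance.
    """
    return [
        tag for tag, pattern in pattern_table
        if find_pattern(image, pattern) is not None
    ]
```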

An annotation method comprising:
a dividing step in which a dividing unit divides a target image to be annotated into a plurality of divided images;
a feature extraction step of analyzing the divided images and extracting a feature amount of each divided image;
a search step in which a search unit searches an index storage unit, which stores an index in which a plurality of predetermined images are associated with feature amounts of the images, for a predetermined number of the images whose associated feature amounts have a high degree of similarity to the extracted feature amount;
a calculation step in which a calculation unit calculates, for each of the annotations associated with the retrieved images in a correspondence storage unit that stores the images in association with a plurality of predetermined annotations, a score of the annotation that is larger as the appearance frequency of the annotation in the correspondence storage unit is smaller, larger as the similarity of the associated image is larger, and larger as the feature amounts of the plurality of images associated with the annotation are more similar to each other; and
a first determination step in which a first determination unit determines the annotation to be given to the target image, giving priority to annotations with a large score.

An annotation program that causes a computer, which comprises an index storage unit that stores an index in which a plurality of predetermined images are associated with feature amounts of the images and a correspondence storage unit that stores the images in association with a plurality of predetermined annotations, to function as:
a dividing unit that divides a target image to be annotated into a plurality of divided images;
a feature extraction unit that analyzes the divided images and extracts a feature amount of each divided image;
a search unit that searches the index storage unit for a predetermined number of the images whose associated feature amounts have a high degree of similarity to the extracted feature amount;
a calculation unit that calculates, for each of the annotations associated with the retrieved images, a score of the annotation that is larger as the appearance frequency of the annotation in the correspondence storage unit is smaller, larger as the similarity of the associated image is larger, and larger as the feature amounts of the plurality of images associated with the annotation are more similar to each other; and
a first determination unit that determines the annotation to be given to the target image, giving priority to annotations with a large score.