JP5633424B2

JP5633424B2 - Program and information processing system

Info

Publication number: JP5633424B2
Application number: JP2011036675A
Authority: JP
Inventors: 文渊戚; 加藤　典司; 典司加藤
Original assignee: Fuji Xerox Co Ltd; Fujifilm Business Innovation Corp
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2011-02-23
Filing date: 2011-02-23
Publication date: 2014-12-03
Anticipated expiration: 2031-02-23
Also published as: JP2012174083A

Description

The present invention relates to a program and an information processing system.

In recent years, automatic image annotation, which automatically assigns keywords (labels) for retrieval to images, has become an important technique for image retrieval. It removes the need for a person to assign keywords to images by hand.

Non-Patent Document 1 discloses a technique that extracts local features from an image, computes their distribution over the whole image as the image feature, computes the posterior probability of each label with a probabilistic binary support vector machine provided for each label, and assigns the label with the highest posterior probability to the unknown image.

K.S. Goh, "Using One-Class and Two-Class SVMs for Multiclass Image Annotation", IEEE Trans. on Knowledge and Data Engineering, Vol. 17, No. 10, Oct. 2005

An object of the present invention is to make image classification more accurate than in the prior art.

The invention according to claim 1 is a program that causes a computer to function as: image receiving means for receiving an image for which at least one of a plurality of classifications is to be identified; discrimination reference value specifying means for specifying, for each of the plurality of classifications, a discrimination reference value that serves as the reference for determining whether the image received by the image receiving means belongs to that classification, based on a feature of the image received by the image receiving means and on features of a plurality of images each belonging to at least one of the plurality of classifications; classification membership possibility specifying means for specifying, for each of the plurality of classifications, a value representing how likely the image received by the image receiving means is to belong to that classification, based on (a) correlation information that is specified from a plurality of images each belonging to at least one of the plurality of classifications and that represents at least one of the possibility that an image belonging to that classification also belongs to another classification and the possibility that an image not belonging to that classification also does not belong to the other classification, and (b) the discrimination reference values for that classification and the other classification; and output means for outputting information indicating at least one classification to which the image received by the image receiving means belongs, identified based on the values specified by the classification membership possibility specifying means.

The invention according to claim 2 is the program according to claim 1, wherein the classification membership possibility specifying means specifies, for each classification, the value representing how likely the image received by the image receiving means is to belong to that classification, based on the combinations of the correlation information and the discrimination reference value for each of all classifications different from that classification.

The invention according to claim 3 is an information processing system comprising: image receiving means for receiving an image for which at least one of a plurality of classifications is to be identified; discrimination reference value specifying means for specifying, for each of the plurality of classifications, a discrimination reference value that serves as the reference for determining whether the image received by the image receiving means belongs to that classification, based on a feature of the image received by the image receiving means and on features of a plurality of images each belonging to at least one of the plurality of classifications; classification membership possibility specifying means for specifying, for each of the plurality of classifications, a value representing how likely the image received by the image receiving means is to belong to that classification, based on (a) correlation information that is specified from a plurality of images each belonging to at least one of the plurality of classifications and that represents at least one of the possibility that an image belonging to that classification also belongs to another classification and the possibility that an image not belonging to that classification also does not belong to the other classification, and (b) the discrimination reference values for that classification and the other classification; and output means for outputting information indicating at least one classification to which the image received by the image receiving means belongs, identified based on the values specified by the classification membership possibility specifying means.

According to the inventions of claims 1 and 3, image classification accuracy is higher than when the configuration of the present invention is not used.

According to the invention of claim 2, how likely an image is to belong to a given classification is specified based on its correlations with all classifications other than that classification.

A diagram showing an example of the hardware configuration of an information processing apparatus according to an embodiment of the present invention.
A functional block diagram showing an example of the functions realized by the information processing apparatus according to an embodiment of the present invention.
A diagram showing an example of learning model information.
A diagram showing an example of first correlation matrix information.
A diagram showing an example of second correlation matrix information.
A diagram showing an example of probability model information.
A flowchart showing an example of the flow of processing performed by the information processing apparatus according to the present embodiment.
A diagram showing an example of probability model information in a comparative example.
A diagram showing an example of first correlation matrix information.
A diagram showing an example of second correlation matrix information.
A diagram showing an example of a comparison of label recognition rates.
A diagram showing an example of the sigmoid function of the label sheep optimized using the discrimination reference value for the label motorbike.
A diagram showing an example of the sigmoid function of the label sheep optimized using the discrimination reference value for the label motorbike.
A diagram showing an example of a comparison of label recognition rates.
A diagram showing an example of the sigmoid function of the label cat optimized using the discrimination reference value for the label person.
A diagram showing an example of the sigmoid function of the label cat optimized using the discrimination reference value for the label person.
A diagram showing a specific example of the parameter tilde A included in the probability model information.

An embodiment of the present invention is described in detail below with reference to the drawings.

FIG. 1 is a diagram showing an example of the hardware configuration of an information processing apparatus 10 that functions as the information processing system according to the present embodiment. As illustrated in FIG. 1, the information processing apparatus 10 includes, for example, a control unit 12, a storage unit 14, and a user interface (UI) unit 16, which are connected via a bus or the like. The control unit 12 is a program-controlled device such as a CPU and operates according to a program installed in the information processing apparatus 10. The storage unit 14 is a storage element such as a ROM or RAM, a hard disk drive, or the like. It stores the programs executed by the control unit 12 and also serves as working memory for the control unit 12. The UI unit 16 comprises a display, a microphone, a mouse, a keyboard, and the like, and outputs to the control unit 12 the operations performed by the user and the speech input by the user. The UI unit 16 also displays information or outputs audio according to instructions from the control unit 12.

FIG. 2 is a functional block diagram showing an example of the functions realized by the information processing apparatus 10 according to the present embodiment. As illustrated in FIG. 2, in the present embodiment the information processing apparatus 10 functions as including, for example, an image receiving unit 20, a learning model information storage unit 22, a correlation matrix information storage unit 24, a probability model information storage unit 26, a feature extraction unit 28, a discrimination reference value calculation unit 30, a discrimination reference value weighting unit 32, a membership possibility calculation unit 34, a classification determination unit 36, an output unit 38, and a learning unit 40. The learning model information storage unit 22, the correlation matrix information storage unit 24, and the probability model information storage unit 26 are realized mainly by the storage unit 14. The other elements are realized mainly by the control unit 12. The learning unit 40 includes a classifier such as a support vector machine.

These elements are realized by the control unit 12 executing a program installed in the information processing apparatus 10, which is a computer. The program is supplied to the information processing apparatus 10 via a computer-readable information recording medium such as a CD-ROM or DVD-ROM, or via a communication means such as the Internet.

In the present embodiment, at least one of L predetermined kinds of labels is associated with each image received by the image receiving unit 20. Each of the L kinds of labels is assigned a label number j (j = 1, ..., L). Labels and image classifications correspond one to one: an image belonging to a given classification is associated with the label corresponding to that classification. An image may belong to more than one classification, in which case more than one label is associated with the image.

In the present embodiment, the learning model information storage unit 22 stores in advance the learning model information 50 illustrated in FIG. 3. The learning model information 50 is generated by the information processing apparatus 10 learning a plurality of images to be learned (learning images). It contains the label number described above, support vector information, and parameter information. The learning model information 50 is, for example, information representing the learned model of a binary support vector machine that separates the features corresponding to a label from the features corresponding to the other labels. The support vector information and parameter information are the basis for calculating the discrimination reference value described later. As shown in FIG. 3, the support vector information associated with label number i has the values (x_i1, x_i2, ..., x_iNi), and the parameter information has the values (a_i1, a_i2, ..., a_iNi) and b_i. The process of generating the learning model information 50 from learning images is described later.

In the present embodiment, the correlation matrix information storage unit 24 also stores in advance the first correlation matrix information 52-1 illustrated in FIG. 4 and the second correlation matrix information 52-2 illustrated in FIG. 5. Both are generated by the information processing apparatus 10 learning a plurality of learning images. These learning images may be the same as or different from those used to generate the learning model information 50. The value R_ij in row i, column j of the first correlation matrix information 52-1 is set to, for example, the absolute value R_ij = |P(C_i) - P(C_i|C_j)| of the difference between the conditional probability P(C_i|C_j) that a learning image associated with the label of label number j is also associated with the label of label number i, and the probability P(C_i) that a learning image is associated with the label of label number i. When a learning image that is associated with the label of label number j is also associated with the label of label number i, P(C_i|C_j) is larger than it would otherwise be. Conversely, when a learning image that is not associated with the label of label number j is also not associated with the label of label number i, P(C_i|C_j) is smaller than it would otherwise be. In either case, if the labels of label numbers i and j are correlated in whether they are associated with learning images, |P(C_i) - P(C_i|C_j)| is large. If they are not correlated, it is small, and if they are independent, it is zero.

The value R'_ij in row i, column j of the second correlation matrix information 52-2 is set to, for example, the absolute value R'_ij = |P(C'_i) - P(C'_i|C'_j)| of the difference between the conditional probability P(C'_i|C'_j) that a learning image not associated with the label of label number j is also not associated with the label of label number i, and the probability P(C'_i) that a learning image is not associated with the label of label number i.
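The two matrices can be estimated by simple counting over the labelled learning images. The following is a minimal sketch, assuming the labels are given as an N-by-L table of 0/1 flags and that probabilities are estimated as relative frequencies. The function name `correlation_matrices` and the choice to leave an entry at 0.0 when the conditioning event never occurs are illustrative assumptions, not taken from the patent.

```python
def correlation_matrices(labels):
    """Estimate R and R' from labels[n][i] in {0, 1} (1 = label i attached to image n).

    Returns (R, R_neg), two L x L nested lists with
    R[i][j]     = |P(C_i)  - P(C_i  | C_j)|
    R_neg[i][j] = |P(C'_i) - P(C'_i | C'_j)|
    """
    n_images = len(labels)
    n_labels = len(labels[0])

    def build(flags):
        # Marginal frequency of the event for each label i.
        marginal = [sum(row[i] for row in flags) / n_images for i in range(n_labels)]
        matrix = [[0.0] * n_labels for _ in range(n_labels)]
        for j in range(n_labels):
            given = [row for row in flags if row[j] == 1]
            if not given:
                continue  # conditional undefined; entries stay 0.0 (assumption)
            for i in range(n_labels):
                conditional = sum(row[i] for row in given) / len(given)
                matrix[i][j] = abs(marginal[i] - conditional)
        return matrix

    absent = [[1 - v for v in row] for row in labels]  # C' = "label not attached"
    return build(labels), build(absent)
```

With two labels that always occur together, the off-diagonal entries are large. With two labels that occur independently, they are zero, as the text describes.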

In the present embodiment, the probability model information storage unit 26 also stores in advance the probability model information 54 illustrated in FIG. 6. The probability model information 54 is generated by the information processing apparatus 10 learning a plurality of learning images. These learning images may be the same as or different from the learning images used when the probability model information 54 is generated. The probability model information 54 contains a parameter tilde A, which is an L-by-L matrix whose value in row i, column j is tilde A_ij (A_ij with a tilde placed above it), and a parameter tilde B, which consists of one element assigned to each row (the element associated with row i has the value tilde B_i, that is, B_i with a tilde placed above it). As shown in FIG. 6, the parameters in row i of the probability model information 54 are the parameters corresponding to the label of label number i. In the present embodiment, the probability model information 54 is, for example, the parameter vector of a sigmoid function. The process of generating the probability model information 54 from learning images is described later.

An example of the flow of processing performed by the information processing apparatus 10 according to the present embodiment is now described with reference to the flowchart illustrated in FIG. 7.

First, the image receiving unit 20 receives an image to be classified (that is, an image with which at least one of the L kinds of labels described above is to be associated) and outputs the pixel matrix of that image (S101). The image received in S101 is referred to below as the received image. General preprocessing (for example, translation, color correction, deformation, format conversion, or noise removal) may be performed in S101.

Next, the feature extraction unit 28 specifies features of the received image (S102). In this processing example, the feature extraction unit 28 specifies, for each pixel of the received image, for example the RGB value, normalized-RGB value, HSV value, LAB value, robustHue feature (see van de Weijer, C. Schmid, "Coloring Local Feature Extraction", ECCV 2006), Gabor feature, DCT feature, SIFT feature, GIST feature, and so on. It then quantizes the extracted features using a codebook generated in advance from a learning corpus by K-Means clustering, and takes the histogram of the quantized features over the entire received image as the feature vector x of S102.
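The quantization and histogram step of S102 can be sketched as follows, assuming the local features are already available as a list of numeric vectors and the codebook as a list of cluster centres. The function name `quantize_to_histogram`, the squared Euclidean distance, and the normalization of the histogram to sum to one are illustrative assumptions. The text only states that a histogram of quantized features is used.

```python
def quantize_to_histogram(descriptors, codebook):
    """Map each local descriptor to its nearest codeword and count occurrences.

    descriptors: list of feature vectors extracted from one image
    codebook:    list of cluster centres (e.g. obtained beforehand by K-Means)
    Returns a histogram with one bin per codeword, normalized to sum to 1.
    """
    counts = [0] * len(codebook)
    for d in descriptors:
        # Index of the codeword with the smallest squared Euclidean distance.
        nearest = min(
            range(len(codebook)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(d, codebook[k])),
        )
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]
```

The resulting list plays the role of the feature vector x used in the later steps.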

Next, the discrimination reference value calculation unit 30 calculates, from the features of the received image specified in S102 and the learning model information 50 stored in advance in the learning model information storage unit 22, a discrimination reference value, which is the value used as the reference for determining whether the received image belongs to each classification (that is, whether a label is to be associated with the received image) (S103). In the present embodiment, the discrimination reference value calculation unit 30 calculates a discrimination reference value for each classification (each label), for example. Specifically, it calculates the discrimination reference value F_i (i = 1, ..., L) associated with the label of label number i by evaluating the following equation. In that equation, K is a kernel function (for example, a Gaussian kernel) and the bold x is the feature vector of the received image specified in S102. x_in is a value of the support vector information included in the learning model information 50, and a_in and b_i are values of the parameter information included in the learning model information 50. The discrimination reference value F_i takes a large value for images belonging to the classification corresponding to the label and a small value for other images. The discrimination reference value is, for example, the output of the decision function of a binary support vector machine that separates the corresponding classification from the other classifications.
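As an illustration of S103, the sketch below assumes that the equation has the usual kernel SVM decision-function form, F_i = sum over n of a_in * K(x, x_in) + b_i, which matches the symbols defined above but is not reproduced from the original. The function names and the kernel width `gamma` are hypothetical.

```python
import math


def gaussian_kernel(u, v, gamma=1.0):
    """Gaussian (RBF) kernel; gamma is an assumed width parameter."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def discrimination_value(x, support_vectors, coeffs, bias, kernel=gaussian_kernel):
    """Assumed form of F_i: kernel expansion over the support vectors plus a bias.

    x:               feature vector of the received image
    support_vectors: x_i1 ... x_iNi for label i
    coeffs:          a_i1 ... a_iNi for label i
    bias:            b_i for label i
    """
    return sum(a * kernel(x, sv) for a, sv in zip(coeffs, support_vectors)) + bias
```

Calling `discrimination_value` once per label with that label's row of the learning model information yields the L values F_1, ..., F_L.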

Next, the discrimination reference value weighting unit 32 weights the discrimination reference values F_i (i = 1, ..., L) calculated in S103, each associated with the label of label number i, based on the correlation matrix information 52 (S104). Specifically, for each label, the discrimination reference value weighting unit 32 calculates, for example, L weighted discrimination reference values associated with that label. The L weighted discrimination reference values associated with the label of label number i are denoted F'_ij (j = 1, ..., L) below.

For example, when the discrimination reference value F_j associated with the label of label number j (j = 1, ..., L) is greater than zero, the discrimination reference value weighting unit 32 checks whether the value R_ji in the first correlation matrix information 52-1 is greater than a predetermined threshold. It sets the weight w_ij to 1 if R_ji is greater than the threshold and to 0 otherwise. When F_j is less than zero, it checks whether the value R'_ji in the second correlation matrix information 52-2 is greater than a predetermined threshold, and sets the weight w_ij to 1 if R'_ji is greater than the threshold and to 0 otherwise. It then calculates the values F'_ij (j = 1, ..., L) according to F'_ij = F_j × w_ij. Repeating this process for each i (i = 1, ..., L) yields the values F'_ij (i = 1, ..., L; j = 1, ..., L).
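The weighting rule of S104 can be written down almost directly. The sketch below uses 0-based indices and a single threshold for both matrices; both choices, and the function name, are assumptions made for illustration. The case F_j = 0 is not covered by the text, and since the product is zero either way the weight is simply set to 0 here.

```python
def weight_discrimination_values(F, R, R_neg, threshold):
    """Return the L x L table F'[i][j] = F[j] * w[i][j].

    F:      discrimination reference values, one per label
    R:      first correlation matrix  (used when F[j] > 0)
    R_neg:  second correlation matrix (used when F[j] < 0)
    """
    n_labels = len(F)
    weighted = [[0.0] * n_labels for _ in range(n_labels)]
    for i in range(n_labels):
        for j in range(n_labels):
            if F[j] > 0:
                w = 1 if R[j][i] > threshold else 0      # note the index order j, i
            elif F[j] < 0:
                w = 1 if R_neg[j][i] > threshold else 0
            else:
                w = 0  # F_j == 0: unspecified in the text; F'_ij is 0 regardless
            weighted[i][j] = F[j] * w
    return weighted
```

The effect is that the score of label j contributes to label i only when the two labels were found to be correlated in the learning images.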

Next, the membership possibility calculation unit 34 calculates, for each i (i = 1, ..., L), the membership possibility value tilde p_i (p_i with a tilde placed above it) by evaluating the following equation based on the values F'_ij and the probability model information 54 (S105). In that equation, T represents a feature. The values of tilde A_ij and tilde B_i are the values included in the probability model information 54 described above. Tilde p_i is the posterior probability corresponding to the label of label number i.
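Since the probability model is described as a sigmoid parameterized by tilde A and tilde B, one plausible reading of S105 is a Platt-style sigmoid applied to a weighted sum of the F'_ij. The sketch below assumes the form tilde p_i = 1 / (1 + exp(sum over j of tilde A_ij * F'_ij + tilde B_i)). The sign convention inside the exponential and the function name are assumptions, as the equation itself is not reproduced in this text.

```python
import math


def membership_probabilities(weighted, A, B):
    """Assumed sigmoid model: p_i = 1 / (1 + exp(sum_j A[i][j] * F'[i][j] + B[i])).

    weighted: L x L table of weighted discrimination values F'[i][j]
    A, B:     parameters tilde A (L x L) and tilde B (length L)
    """
    probs = []
    for i, row in enumerate(weighted):
        z = sum(a * f for a, f in zip(A[i], row)) + B[i]
        # Evaluate 1 / (1 + exp(z)) without overflowing for large |z|.
        if z >= 0:
            e = math.exp(-z)
            probs.append(e / (1.0 + e))
        else:
            probs.append(1.0 / (1.0 + math.exp(z)))
    return probs
```

Under this convention, negative entries of tilde A make a large positive F'_ij raise the probability, as in the usual Platt scaling of SVM outputs.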

Next, the classification determination unit 36 determines the classifications to which the received image belongs (that is, the labels to be associated with the received image) based on the values tilde p_i (i = 1, ..., L) (S106). In the present embodiment, the classification determination unit 36 first identifies, for example, the label numbers whose value tilde p_i is at or above a predetermined threshold. From those label numbers, it then selects, in descending order of tilde p_i, no more than a predetermined number of label numbers. The classifications corresponding to the label numbers selected in this way are taken as the classifications to which the received image belongs.
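The selection rule of S106 (threshold first, then keep at most a fixed number of the highest-scoring labels) can be sketched as below. The function and parameter names are hypothetical, label numbers are 1-based as in the text, and breaking ties by smaller label number is an added assumption.

```python
def select_labels(probs, threshold, max_labels):
    """Return up to max_labels label numbers (1-based) with probs >= threshold,
    ordered from the highest probability to the lowest."""
    candidates = [(p, i) for i, p in enumerate(probs, start=1) if p >= threshold]
    # Sort by descending probability; ties go to the smaller label number.
    candidates.sort(key=lambda item: (-item[0], item[1]))
    return [i for _, i in candidates[:max_labels]]
```

For example, `select_labels([0.2, 0.9, 0.6, 0.75], threshold=0.5, max_labels=2)` keeps labels 2 and 4 and drops label 3, which passes the threshold but ranks third.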

Finally, the output unit 38 associates at least one label corresponding to the classifications determined in S106 with the received image, outputs it to the storage unit 14, and displays the label numbers on the UI unit 16 such as a display (S107). In this way, according to this processing example, the classifications to which the received image belongs are determined, the corresponding labels are associated with the received image, and the label numbers are displayed on a display or the like.

An example of the process by which the information processing apparatus 10 generates the learning model information 50 is now described.

First, the image receiving unit 20 receives a plurality of (for example, N) images to be learned (learning images). Each learning image is associated with at least one label corresponding to a classification to which that learning image belongs. The feature extraction unit 28 then extracts the features of each learning image. Specifically, the feature extraction unit 28 specifies, for each pixel of the learning image, for example the RGB value, normalized-RGB value, HSV value, LAB value, robustHue feature (see van de Weijer, C. Schmid, "Coloring Local Feature Extraction", ECCV 2006), Gabor feature, DCT feature, SIFT feature, GIST feature, and so on. It then quantizes the extracted features using a codebook generated in advance from a learning corpus by K-Means clustering, and takes the histogram of the quantized features over the entire image as the feature vector of the learning image. The feature vector specified from the n-th learning image (n = 1, ..., N) is denoted x_n below.

The learning unit 40 then executes, for each identification number j (j = 1, ..., L), a process of identifying support vector information and parameter information. For identification number j, this is realized by executing the following processes in order: a process that pairs the feature vector x_n of the n-th learning image (n = 1, ..., N) with a value t_n indicating whether the n-th learning image is associated with the label corresponding to identification number j (t_n = +1 if it is, and t_n = 0 otherwise), which yields N vector-value pairs (x_1, t_1), ..., (x_N, t_N); a process that learns these N pairs with a support vector machine (SVM), for example by learning the parameters of a 1-v-Others binary SVM under the optimization criterion of N. Cristianini and J. Shawe-Taylor, "An Introduction to Support Vector Machines and Other Kernel-based Learning Methods", Chapter 6, Cambridge University Press, 2000; and a process that identifies the support vector information and parameter information corresponding to identification number j. In this embodiment, the learning unit 40 learns the feature vectors x_n paired with t_n = +1 as positive examples and the feature vectors x_n paired with t_n = 0 as negative examples. The learning unit 40 then outputs, to the learning model information storage unit 22, the learning model information 50 in which the support vector information and parameter information identified in this way are associated with the corresponding identification number.
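The pairing step that precedes SVM training can be illustrated as follows. This is a minimal sketch under stated assumptions: the function name and the representation of each image's labels as a set of identification numbers are hypothetical, and the SVM training itself is deliberately left out.

```python
def one_vs_others_targets(image_labels, j):
    """Build the targets t_1..t_N for identification number j.

    `image_labels[n]` is the set of identification numbers attached to
    the n-th learning image (a hypothetical representation). Following
    the description, t_n is +1 when label j is attached and 0 otherwise.
    """
    return [1 if j in labels else 0 for labels in image_labels]


# Three learning images; the first carries two labels.
image_labels = [{1, 2}, {2}, {3}]
feature_vectors = [[0.2, 0.8], [0.5, 0.5], [0.9, 0.1]]  # x_1..x_3, made up

# One training set of (x_n, t_n) pairs per identification number j.
training_sets = {
    j: list(zip(feature_vectors, one_vs_others_targets(image_labels, j)))
    for j in (1, 2, 3)
}
```

Because an image may carry several labels, the same feature vector can be a positive example for more than one identification number, as the first image is here for j = 1 and j = 2.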

In this way, the learning model information 50 is stored in the learning model information storage unit 22.

Next, an example of the process by which the information processing apparatus 10 generates the probability model information 54 is described.

First, the image receiving unit 20 receives a plurality of images to be learned (learning images). These learning images may be the same as or different from the learning images used to generate the learning model information 50. Here, the number of learning images used to learn the label with label number i is M_i. Each learning image is associated with at least one label corresponding to a classification to which that learning image belongs. For each learning image, the information processing apparatus 10 identifies the weighted discrimination reference values (F_ij^n)' by processing similar to that of S102 to S104 described above. The learning unit 40 then treats the weighted discrimination reference values of learning images associated with the label being learned (for example, the label with label number i) as positive examples (t_i^n = +1), and those of learning images not associated with that label as negative examples (t_i^n = 0). With these settings, it computes the parameters Ã_ij and B̃_i that maximize the likelihood given by the following equation, using Newton's method with a backtracking line search (see Nocedal, J. and S. J. Wright, "Numerical Optimization", Algorithm 6.2, New York, NY: Springer-Verlag, 1999). In the following equation, (F_ij^n)' denotes the weighted discrimination reference value for the n-th learning image (n = 1, ..., M_i) used to learn the label with label number i, and T denotes a feature.
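The equation referred to above is not reproduced in this text. Purely as an aid to reading, one form consistent with the surrounding definitions would be a sigmoid over a weighted sum of the weighted discrimination reference values. The form below is an assumption, not the equation of the original publication; in particular, the sign convention inside the exponential is chosen only because the later discussion of FIG. 17 treats a negative Ã as indicating a positive correlation.

```latex
% Assumed form, not taken from the original publication.
p_i^n = \frac{1}{1 + \exp\!\left( \sum_{j=1}^{L} \tilde{A}_{ij}\,(F_{ij}^{n})' + \tilde{B}_i \right)},
\qquad
\mathcal{L}_i = \prod_{n=1}^{M_i} \left(p_i^n\right)^{t_i^n} \left(1 - p_i^n\right)^{1 - t_i^n}
```

Under this assumed form, maximizing the likelihood over Ã_ij and B̃_i is a logistic-regression problem with the weighted discrimination reference values as inputs.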

The learning unit 40 then outputs the probability model information 54, which contains the values of Ã_ij and B̃_i identified as described above, to the probability model information storage unit 26.

In this way, the probability model information 54 is stored in the probability model information storage unit 26.

A comparative example for this embodiment is described next.

The information processing apparatus 10 according to this comparative example differs from the configuration illustrated in FIG. 2 in that it does not include the discrimination reference value weighting unit 32.

In this comparative example, the probability model information storage unit 26 stores the probability model information 54 illustrated in FIG. 8. In this probability model information 54, two parameters are associated with each identification number (for example, parameters A_i and B_i for identification number i).

In this comparative example, for each of the learning images, the learning unit 40 treats the discrimination reference value F_k as a positive example (t_k = +1) when the k-th learning image is associated with the label being learned (for example, the label with label number i), and as a negative example (t_k = 0) when it is not. With these settings, it computes the parameters A_i and B_i that maximize the likelihood given by the following equation, using Newton's method with a backtracking line search (see Nocedal, J. and S. J. Wright, "Numerical Optimization", Algorithm 6.2, New York, NY: Springer-Verlag, 1999). In the following equation, F_k denotes the discrimination reference value for the k-th learning image used to learn the label with label number i, and A and B denote the values of the parameters A and B corresponding to label number i.
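A fit of this kind can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the learning unit 40 itself: the posterior is assumed to have the form p = 1 / (1 + exp(A·F + B)), which the text does not reproduce, and the function names, the small ridge term, the Armijo constant 1e-4, and the iteration limits are choices made for the example.

```python
import math


def sigmoid_posterior(a, b, f):
    """Assumed posterior p = 1 / (1 + exp(a*f + b)), overflow-safe."""
    z = a * f + b
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def neg_log_likelihood(a, b, scores, targets):
    """Negative log-likelihood of targets t_k in {0, 1} given scores F_k."""
    total = 0.0
    for f, t in zip(scores, targets):
        z = a * f + b
        log1pexp = z + math.log1p(math.exp(-z)) if z > 0 else math.log1p(math.exp(z))
        total += log1pexp - (1 - t) * z
    return total


def fit_sigmoid(scores, targets, iterations=100, tol=1e-10):
    """Fit (A, B) by Newton's method with a backtracking line search."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        ga = gb = haa = hab = hbb = 0.0
        for f, t in zip(scores, targets):
            p = sigmoid_posterior(a, b, f)
            w = p * (1.0 - p)
            ga += (t - p) * f          # gradient of the negative log-likelihood
            gb += t - p
            haa += w * f * f           # Hessian entries
            hab += w * f
            hbb += w
        if abs(ga) < tol and abs(gb) < tol:
            break
        haa += 1e-12                   # tiny ridge keeps the Hessian invertible
        hbb += 1e-12
        det = haa * hbb - hab * hab
        da = -(hbb * ga - hab * gb) / det   # Newton direction: -H^{-1} g
        db = -(haa * gb - hab * ga) / det
        current = neg_log_likelihood(a, b, scores, targets)
        slope = ga * da + gb * db      # negative for a descent direction
        step = 1.0
        while step > 1e-10:            # halve the step until sufficient decrease
            trial = neg_log_likelihood(a + step * da, b + step * db, scores, targets)
            if trial <= current + 1e-4 * step * slope:
                break
            step *= 0.5
        else:
            break                      # no acceptable step was found
        a += step * da
        b += step * db
    return a, b


# Made-up scores and targets; positives tend to have larger scores.
scores = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
targets = [0, 0, 1, 0, 1, 0, 1, 1]
A, B = fit_sigmoid(scores, targets)
```

With the assumed sign convention, a fitted A below zero means that the posterior rises as the discrimination reference value rises. The example data are not linearly separable, which keeps the maximum-likelihood parameters finite.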

In this comparative example, the affiliation possibility calculation unit 34 calculates the affiliation possibility value p_i for each i (i = 1, ..., L) by evaluating the following equation, based on the discrimination reference values F_i (i = 1, ..., L) for the received image and the probability model information 54 (S105). In the following equation, p_i is the posterior probability corresponding to the label with label number i, and A and B denote the values of the parameters A and B corresponding to label number i.

The information processing apparatus 10 according to this embodiment differs from the information processing apparatus 10 according to the comparative example at least in that it calculates the affiliation possibility values using the correlation matrix information 52.

An example of the results of comparing the case where the information processing apparatus 10 according to this embodiment calculates the weighted discrimination reference values based on the correlation matrix information 52 with the case where it does not is described next. The comparison uses the first correlation matrix information 52-1 illustrated in FIG. 9 and the second correlation matrix information 52-2 illustrated in FIG. 10.

In the following description, for example, the weighted discrimination reference value associated with the label sheep is written F_sheep, the posterior probability corresponding to the label sheep is written P_sheep, and the value of t described above corresponding to the label sheep is written t_sheep. The same notation is used for labels other than sheep.

First, an example of the results of comparing the case where the weighted discrimination reference values are calculated based on the first correlation matrix information 52-1 with the case where they are not is described. FIG. 11 compares the label recognition rates with and without calculation of the weighted discrimination reference values using the first correlation matrix information 52-1. The comparison in FIG. 11 used images whose discrimination reference value is greater than 0. In FIG. 11, circles indicate data assigned a weight of 1, and crosses indicate data assigned a weight of 0.

Focusing on F_motorbike in FIG. 11, the difference between calculating and not calculating the weighted discrimination reference values is visualized as follows. FIG. 12 shows the sigmoid function of the label sheep optimized using the discrimination reference values for the label motorbike. Here, discrimination reference values with F_motorbike > 0 are also selected. FIG. 13 shows, for the same discrimination reference values, the part of the sigmoid function of the label sheep that depends on the discrimination reference value of the label motorbike when, based on the first correlation matrix information 52-1 illustrated in FIG. 9, discrimination reference values with F_motorbike > 0 are excluded from the basis for calculating the posterior probability. In both FIG. 12 and FIG. 13, discrimination reference values with F_motorbike < 0 are selected.

In FIG. 12 and FIG. 13, the circled points are features extracted from images associated with the label sheep (t_sheep = 1), and the crossed points are features extracted from images not associated with the label sheep (t_sheep = 0). The horizontal axis indicates the discrimination reference value F_motorbike corresponding to the learning model of the label motorbike. In the example of FIG. 13, the weight of the discrimination reference value is 1 for data with F_motorbike < 0 and 0 for data with F_motorbike > 0.

In FIG. 12, the data with F_motorbike > 0 (the right half of FIG. 12) make the sigmoid function a gentler curve than the sigmoid function of FIG. 13, so the value of F_motorbike has less influence on the value of P_sheep. In FIG. 13, by contrast, the value of F_motorbike has more influence on the value of P_sheep than with the sigmoid function of FIG. 12. In addition, as shown in FIG. 11, in this comparison the recognition rate for the label sheep is higher when the weighted discrimination reference values are calculated using the first correlation matrix information 52-1 than when they are not.

Next, an example of the results of comparing the case where the weighted discrimination reference values are calculated using the second correlation matrix information 52-2 with the case where they are not is described. FIG. 14 compares the label recognition rates with and without calculation of the weighted discrimination reference values using the second correlation matrix information 52-2. The comparison in FIG. 14 used images whose discrimination reference value is less than 0. In FIG. 14, circles indicate data assigned a weight of 1, and crosses indicate data assigned a weight of 0.

Focusing on F_person in FIG. 14, the difference between calculating and not calculating the weighted discrimination reference values is visualized as follows. FIG. 15 shows the sigmoid function of the label cat optimized using the discrimination reference values for the label person. Here, discrimination reference values with F_person < 0 are also selected. FIG. 16 shows, for the same discrimination reference values, the part of the sigmoid function of the label cat that depends on the discrimination reference value of the label person when, based on the second correlation matrix information 52-2 illustrated in FIG. 10, discrimination reference values with F_person < 0 are excluded from the basis for calculating the posterior probability. In both FIG. 15 and FIG. 16, discrimination reference values with F_person > 0 are selected.

In FIG. 15 and FIG. 16, the circled points are features extracted from images associated with the label cat (t_cat = 1), and the crossed points are features extracted from images not associated with the label cat (t_cat = 0). The horizontal axis indicates the discrimination reference value F_person corresponding to the learning model of the label person.

In the example of FIG. 16, the weight of the discrimination reference value is 1 for data with F_person > 0 and 0 for data with F_person < 0.
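The 0/1 weighting described for FIG. 13 and FIG. 16 can be sketched as a simple masking step. The following Python fragment is only an illustration: the function name and the `drop` argument are hypothetical, multiplying the discrimination reference value by its weight is an assumption about how the weighted value is formed, and the handling of a value of exactly 0 (not covered by the text) is chosen arbitrarily.

```python
def weighted_reference_value(f_other, drop):
    """Apply a 0/1 weight to another label's discrimination reference value.

    drop="positive": weight 0 when f_other > 0 (the FIG. 13 case).
    drop="negative": weight 0 when f_other < 0 (the FIG. 16 case).
    A value of exactly 0 receives weight 0 here, which is an assumption.
    """
    if drop == "positive":
        weight = 1 if f_other < 0 else 0
    elif drop == "negative":
        weight = 1 if f_other > 0 else 0
    else:
        raise ValueError("drop must be 'positive' or 'negative'")
    return weight * f_other


# FIG. 13 style: only negative motorbike scores inform the sheep posterior.
sheep_inputs = [weighted_reference_value(f, "positive") for f in (-1.5, 2.0)]

# FIG. 16 style: only positive person scores inform the cat posterior.
cat_inputs = [weighted_reference_value(f, "negative") for f in (-0.7, 0.9)]
```

Masked values contribute nothing to the sigmoid fit, which is why the fitted curve can be steeper over the retained half of the axis.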

In FIG. 15, the data with F_person < 0 (the left half of FIG. 15) make the sigmoid function a gentler curve than the sigmoid function of FIG. 16, so the value of F_person has less influence on the value of P_cat. In FIG. 16, by contrast, the value of F_person has more influence on the value of P_cat than with the sigmoid function of FIG. 15. In addition, as shown in FIG. 14, in this comparison the recognition rate for the label cat is higher when the weighted discrimination reference values are calculated using the second correlation matrix information 52-2 than when they are not.

FIG. 17 shows a specific example of the parameter Ã included in the probability model information 54 in this comparison. FIG. 17 also shows that the correlation between labels is exploited in this comparison. For example, the value of Ã at row bus and column car is negative with a large absolute value, which expresses the positive correlation between the label bus and the label car. Similarly, the value of Ã at row bus and column motorbike is positive with a large absolute value, which expresses the negative correlation between the label bus and the label motorbike. As a result, the discrimination performance for images associated with the label bus is high. On the other hand, the value of Ã for the labels dog and car is close to zero, which indicates that there is no correlation between the label dog and the label car.

The present invention is not limited to the embodiment described above. The specific numerical values and character strings given above are examples, and the invention is not limited to them.

DESCRIPTION OF SYMBOLS: 10 information processing apparatus, 12 control unit, 14 storage unit, 16 user interface (UI) unit, 20 image receiving unit, 22 learning model information storage unit, 24 correlation matrix information storage unit, 26 probability model information storage unit, 28 feature extraction unit, 30 discrimination reference value calculation unit, 32 discrimination reference value weighting unit, 34 affiliation possibility calculation unit, 36 classification determination unit, 38 output unit, 40 learning unit, 50 learning model information, 52 correlation matrix information, 54 probability model information.

Claims

Image receiving means for receiving an image for which at least one of a plurality of classifications is to be specified;
discrimination reference value specifying means for specifying, for each of the plurality of classifications, a discrimination reference value serving as a reference for determining whether the image received by the image receiving means belongs to the classification, based on the feature quantity of the image received by the image receiving means and the feature quantities of a plurality of images each belonging to at least one of the plurality of classifications;
weighted discrimination reference value specifying means for specifying, for each of the plurality of classifications, a weighted discrimination reference value associated with a combination of the classification and another classification, based on the plurality of discrimination reference values for determining whether the image belongs to the respective classifications and on either first correlation information or second correlation information, the first correlation information being specified based on a plurality of images that each belong to at least one of the plurality of classifications and are selected based on the discrimination reference values and representing the possibility that an image belonging to the classification also belongs to the other classification, and the second correlation information representing the possibility that an image not belonging to the classification belongs to the other classification;
classification affiliation possibility specifying means for specifying, for each of the plurality of classifications, a value indicating how likely the image received by the image receiving means is to belong to the classification, based on the weighted discrimination reference values associated with the combinations of the classification and each of a plurality of other classifications; and
output means for outputting information indicating at least one classification to which the image received by the image receiving means belongs, the classification being specified based on the value specified by the classification affiliation possibility specifying means;
a program causing a computer to function as each of the above means.

The program according to claim 1, wherein the classification affiliation possibility specifying means specifies, for each classification, the value indicating how likely the image received by the image receiving means is to belong to that classification, based on combinations of the correlation information and the discrimination reference value for each of a plurality of classifications different from that classification.

An information processing system comprising:
image receiving means for receiving an image for which at least one of a plurality of classifications is to be specified;
discrimination reference value specifying means for specifying, for each of the plurality of classifications, a discrimination reference value serving as a reference for determining whether the image received by the image receiving means belongs to the classification, based on the feature quantity of the image received by the image receiving means and the feature quantities of a plurality of images each belonging to at least one of the plurality of classifications;
weighted discrimination reference value specifying means for specifying, for each of the plurality of classifications, a weighted discrimination reference value associated with a combination of the classification and another classification, based on the plurality of discrimination reference values for determining whether the image belongs to the respective classifications and on either first correlation information or second correlation information, the first correlation information being specified based on a plurality of images that each belong to at least one of the plurality of classifications and are selected based on the discrimination reference values and representing the possibility that an image belonging to the classification also belongs to the other classification, and the second correlation information representing the possibility that an image not belonging to the classification belongs to the other classification;
classification affiliation possibility specifying means for specifying, for each of the plurality of classifications, a value indicating how likely the image received by the image receiving means is to belong to the classification, based on the weighted discrimination reference values associated with the combinations of the classification and each of a plurality of other classifications; and
output means for outputting information indicating at least one classification to which the image received by the image receiving means belongs, the classification being specified based on the value specified by the classification affiliation possibility specifying means.