JP2013061732A - Image identification information provision program and image identification information provision device

Info

Publication number: JP2013061732A
Application number: JP2011198654A
Authority: JP
Inventors: Yukihiro Tsuboshita; 幸寛坪下; Noriji Kato; 典司加藤
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2011-09-12
Filing date: 2011-09-12
Publication date: 2013-04-04

Abstract

PROBLEM TO BE SOLVED: To provide an image identification information provision program and an image identification information provision device that assign identification information to input image information by using a learning result obtained with a probability model. SOLUTION: An image identification information provision device 1 includes: image obtainment means 100 for obtaining image information, for each piece of identification information, from image information pre-correlated with the identification information; feature vector generation means 102 for generating feature information from the obtained image information; and model learning means 104 for generating a learning result by learning the relation between the generated feature information and the identification information of the image information corresponding to the feature information, using a probability model formed as a mixture of probability distributions. Further, the model learning means 104 determines the variables of the process that generates the mixing elements of the probability model by using a set of feature information obtained from all pieces of image information obtained by the obtainment means, regardless of the contents of the identification information, and determines a prior distribution for the variables of the probability distributions.

Description

The present invention relates to an image identification information providing program and an image identification information providing apparatus.

As a conventional image identification information providing apparatus, there is one in which, when identification information (hereinafter, "label") to be assigned on the basis of feature information obtained from image information or the like is prepared in advance, the relationship between the feature information and the labels is learned beforehand, and the label to which input image information belongs is identified on the basis of the learning result (see, for example, Patent Document 1).

The pattern recognition device described in Patent Document 1 generates multidimensional feature information from image information and learns by associating labels with feature information, using feature information for which labels have been prepared in advance as learning data. It then calculates the feature information of input image information, refers to the learned feature information to calculate, for each label, a value representing the degree to which the input feature information belongs to that label, and recognizes the input image information as belonging to the label for which this value is largest.

Patent Document 1: JP-A-9-231366

An object of the present invention is to provide an image identification information providing program and an image identification information providing apparatus that assign identification information to input image information by using a learning result obtained with a probability model.

To achieve the above object, one aspect of the present invention provides the following image identification information providing program and image identification information providing apparatus.

[1] An image identification information providing program causing a computer to function as:
acquisition means for acquiring image information, for each piece of identification information, from image information associated in advance with identification information;
generation means for generating feature information from the image information acquired by the acquisition means;
learning means for generating a learning result by learning, with a probability model formed as a mixture of probability distributions, the relationship between the feature information generated by the generation means and the identification information of the image information corresponding to that feature information, the learning means determining the variables of the mixing-element generation process of the probability model by using a set of feature information obtained from all image information acquired by the acquisition means regardless of the content of the identification information, and determining a prior distribution for the variables of the probability distributions;
calculation means for calculating, on the basis of the learning result generated by the learning means and using the probability model for each piece of identification information, a posterior distribution of the feature information generated by the generation means from image information to which identification information is to be assigned; and
estimation means for estimating, on the basis of the posterior distribution, the identification information to be associated with the image information to which identification information is to be assigned.

[2] The image identification information providing program according to [1], wherein the learning means determines the initial values of the variables of the prior distribution by using a set of feature information obtained from all image information regardless of the content of the identification information.

[3] An image identification information providing apparatus comprising:
acquisition means for acquiring image information, for each piece of identification information, from image information associated in advance with identification information;
generation means for generating feature information from the image information acquired by the acquisition means;
learning means for generating a learning result by learning, with a probability model formed as a mixture of probability distributions, the relationship between the feature information generated by the generation means and the identification information of the image information corresponding to that feature information, the learning means determining the variables of the mixing-element generation process of the probability model by using a set of feature information obtained from all image information acquired by the acquisition means regardless of the content of the identification information, and determining a prior distribution for the variables of the probability distributions;
calculation means for calculating, on the basis of the learning result generated by the learning means and using the probability model for each piece of identification information, a posterior distribution of the feature information generated by the generation means from image information to which identification information is to be assigned; and
estimation means for estimating, on the basis of the posterior distribution, the identification information to be associated with the image information to which identification information is to be assigned.

According to the invention of claim 1 or 3, compared with a case that lacks this configuration, overfitting to learning data whose identification information is known is suppressed, identification performance for input image information is improved, and identification information can be assigned to the input image information.

According to the invention of claim 2, the initial values of the variables of the prior distribution can be determined by using a set of feature information obtained from all image information regardless of the content of the identification information.

FIG. 1 is a diagram showing an example of the configuration of an image identification information providing apparatus according to an embodiment of the present invention.
FIG. 2 is a schematic diagram showing an example of the probabilistic generative model used by the model learning means for the learning operation.
FIG. 3 is a flowchart showing an operation example of the image identification information providing apparatus.
FIG. 4 is a flowchart showing the contents of the learning algorithm.
FIG. 5 is a flowchart showing an operation example of the image identification information providing apparatus.

(Configuration of the image identification information providing apparatus)
FIG. 1 is a diagram showing an example of the configuration of an image identification information providing apparatus according to an embodiment of the present invention.

The image identification information providing apparatus 1 includes a control unit 10 that is composed of a CPU (Central Processing Unit) or the like, controls each unit, and executes various programs; a storage unit 11 that is composed of a storage medium such as an HDD (Hard Disk Drive) or flash memory and stores information; and a communication unit 12 that communicates with the outside via a network.

When an image input via the communication unit 12 or the like contains subjects such as "mountain", "river", or "child", the image identification information providing apparatus 1 assigns these words (hereinafter, "annotation words") to the image information as identification information (hereinafter, "labels"). The apparatus learns by using learning image information, stored in the storage unit 11 or the like, to which labels have been assigned in advance.

By executing an image identification information providing program 110 described later, the control unit 10 functions as image acquisition means 100, image division means 101, feature vector generation means 102, learning data set acquisition means 103, model learning means 104, likelihood calculation means 105, annotation word estimation means 106, output means 107, and the like.

The image acquisition means 100 acquires image information stored in the storage unit 11 at the time of learning, and acquires image information input from an external terminal device or the like via the communication unit 12 at the time of label estimation.

The image division means 101 divides each of the image information acquired by the image acquisition means 100 and the learning image information 111 stored in the storage unit 11 into a plurality of regions to generate partial regions. The image division means 101 uses, for example, a method of dividing the image into a rectangular mesh, or a method that uses a clustering technique such as the k-nearest neighbor method to treat similar, adjacent pixels as the same partial region.
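The rectangular mesh division mentioned above can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the function name `split_into_mesh`, the `Box` tuple layout, and the choice of integer boundaries are assumptions introduced here.

```python
from typing import List, Tuple

# (top, left, bottom, right); bottom and right are exclusive. Hypothetical layout.
Box = Tuple[int, int, int, int]


def split_into_mesh(height: int, width: int, rows: int, cols: int) -> List[Box]:
    """Divide a height x width image into rows x cols rectangular partial regions."""
    if not (0 < rows <= height and 0 < cols <= width):
        raise ValueError("rows and cols must be positive and no larger than the image")
    boxes: List[Box] = []
    for r in range(rows):
        # Integer boundaries chosen so that the regions tile the image exactly.
        top, bottom = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            boxes.append((top, left, bottom, right))
    return boxes
```

Each returned box would then be passed to the feature extraction step, giving one feature vector per partial region.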

The feature vector generation means 102 generates a feature vector from each of the partial regions generated by the image division means 101, for example by a method that uses Gabor filters or by a method that extracts feature quantities such as RGB, normalized RG, or CIELAB. Here, a feature vector is an example of feature information.

The learning data set acquisition means 103 acquires, from the image information 111, the image information to which the same label has been assigned, and acquires the set of feature vectors contained in that image information as a learning data set. It also acquires, as a learning data set, the set of feature vectors obtained from all of the image information 111 regardless of the content of the labels (hereinafter, "universal model").
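The two kinds of learning data set described above can be sketched as a simple grouping step. The function name `build_learning_sets` and the input layout (a list of label lists paired with per-region feature vectors) are assumptions made for illustration; the text does not specify a data format.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]


def build_learning_sets(
    images: Sequence[Tuple[Sequence[str], Sequence[Vector]]],
) -> Tuple[Dict[str, List[Vector]], List[Vector]]:
    """Return (per-label feature vector sets, label-independent 'universal' set)."""
    per_label: Dict[str, List[Vector]] = defaultdict(list)
    universal: List[Vector] = []
    for labels, vectors in images:
        # Every feature vector contributes to the universal set, whatever its label.
        universal.extend(vectors)
        # An image with several labels contributes its vectors to each of them.
        for label in labels:
            per_label[label].extend(vectors)
    return dict(per_label), universal
```

The per-label sets feed the label-specific models, while the universal set feeds the model that is learned once from all images.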

The model learning means 104 performs learning using the learning data set acquired by the learning data set acquisition means 103.

The likelihood calculation means 105 calculates the likelihood, for an arbitrary label, of the feature vectors of the image information acquired by the image acquisition means 100.

The annotation word estimation means 106 estimates the annotation words corresponding to labels with high likelihood as the identification information of the input image information.

The output means 107 outputs, for example, the several annotation words with the highest likelihood among those estimated by the annotation word estimation means 106 to a display device, a printing device, the storage unit 11, or the like.

The storage unit 11 stores the image identification information providing program 110 that causes the control unit 10 to operate as each of the means described above, the image information 111 used for learning, label information 112 that associates the image information contained in the image information 111 with labels, learning information 113 saved as the learning result of the model learning means 104, and the like.

(Operation of the image identification information providing apparatus)
An operation example of the image identification information providing apparatus is described below with reference to the drawings, divided into (1) basic learning operation, (2) detailed learning operation, and (3) annotation estimation operation.

FIG. 3 is a flowchart showing an operation example of the image identification information providing apparatus.

(1) Basic learning operation
First, the image acquisition means 100 acquires the image information 111 from the storage unit 11 as learning data (S1). The image information 111 contains a plurality of pieces of image information.

Next, the image division means 101 divides each image into n regions to generate partial regions (S2). As an example, the image is divided into a rectangular mesh. This operation is performed for each of the pieces of image information contained in the image information 111.

Next, the feature vector generation means 102 extracts a plurality of feature quantities from each partial region, using Gabor filters or the like as an example, and generates for each partial region a feature vector x_1, x_2, ..., x_n whose components are these feature quantities (S3). This operation is also performed for each of the pieces of image information contained in the image information 111.

Next, the learning data set acquisition means 103 refers to the label information 112, first acquires from the image information 111 the image information associated with label C_1, and acquires the set of feature vectors generated from the acquired image information as a learning data set (S4, S5).

Next, the model learning means 104 performs learning on the learning data of label C_1 acquired by the learning data set acquisition means 103 (S6), and saves the learning result in the learning information 113 of the storage unit 11 (S7).

Steps S5 to S7 are executed for all labels (up to M) (S18, S19).

The details of the learning operation executed by the model learning means 104 in step S6 are described in "(2) Detailed learning operation" below.

(2) Detailed learning operation
The model learning means 104 uses a Gaussian mixture model (GMM) as the probabilistic generative model. Let the input learning data set be X = {x_1, ..., x_N} and let D be the dimension of the feature vectors; the Gaussian mixture model p is then defined by the following equation.

Here, N is the number of input learning data items and K is the number of mixture components. π_k denotes the mixing ratio, and N(x_i | μ_k, Λ_k^-1) is a D-dimensional Gaussian distribution with mean μ_k and covariance Λ_k^-1.
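Written out with the symbols defined above, and assuming the feature vectors are independent draws from the mixture, the model takes the familiar form below. This is a sketch of the standard Gaussian mixture density, given as an assumption rather than as a reproduction of the original formula.

```latex
p(X \mid \pi, \mu, \Lambda)
  = \prod_{i=1}^{N} \sum_{k=1}^{K} \pi_k \,
    \mathcal{N}\!\left(x_i \mid \mu_k, \Lambda_k^{-1}\right)
```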

The mixing ratios satisfy the following equation.

Here, the number of mixture components K is not fixed in advance; the sum runs over an infinite number of mixing ratios.

To select mixture components appropriately for the data, the mixing ratios π are assumed to be generated from a Dirichlet process (stick-breaking representation).

Here, v_k follows a beta distribution Beta(1, α) with parameter α, and α in turn follows a gamma distribution Gamma(c_1, c_2) with hyperparameters C = {c_1, c_2}.
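The stick-breaking construction can be illustrated with a short sampling sketch. It assumes the usual form π_k = v_k (1 - v_1) ... (1 - v_{k-1}) with v_k drawn from Beta(1, α), and it truncates the infinite sequence at a finite number of sticks so that the weights sum to 1; the truncation level and the function name `stick_breaking_weights` are assumptions introduced here, not part of the embodiment.

```python
import random
from typing import List


def stick_breaking_weights(alpha: float, truncation: int, rng: random.Random) -> List[float]:
    """Draw mixing ratios pi_1..pi_T from a truncated stick-breaking process."""
    if alpha <= 0 or truncation < 1:
        raise ValueError("alpha must be positive and truncation at least 1")
    weights: List[float] = []
    remaining = 1.0  # length of the stick that has not been assigned yet
    for k in range(truncation):
        # v_k ~ Beta(1, alpha); the final stick takes whatever is left (truncation).
        v = 1.0 if k == truncation - 1 else rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights
```

A small α concentrates the mass on the first few components, while a large α spreads it over many, which is how the effective number of mixture components adapts to the data.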

To allow more flexible probability estimation, prior distributions are placed on the parameters (mean and variance) of the Gaussian distributions. Here, the Gaussian distribution and the Wishart distribution, which are the conjugate priors for the mean and the variance respectively, are used as the priors.

The Wishart distribution W(Λ_k | W_0, ν_0) is defined by the following equation.
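For reference, the textbook form of a Gaussian-Wishart prior with hyperparameters m_0, β_0, W_0, and ν_0 is given below. It is offered as an assumption about the intended definitions, using the same symbols as the text; note that the π inside the normalizer B is the circle constant, not a mixing ratio.

```latex
p(\mu_k \mid \Lambda_k) = \mathcal{N}\!\left(\mu_k \mid m_0, (\beta_0 \Lambda_k)^{-1}\right),
\qquad
p(\Lambda_k) = \mathcal{W}(\Lambda_k \mid W_0, \nu_0)

\mathcal{W}(\Lambda \mid W, \nu) = B(W, \nu)\, |\Lambda|^{(\nu - D - 1)/2}
  \exp\!\left(-\tfrac{1}{2} \operatorname{Tr}\!\left(W^{-1} \Lambda\right)\right)

B(W, \nu) = |W|^{-\nu/2}
  \left( 2^{\nu D / 2}\, \pi^{D(D-1)/4}
  \prod_{i=1}^{D} \Gamma\!\left(\tfrac{\nu + 1 - i}{2}\right) \right)^{-1}
```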

The above describes the probabilistic generative model used in the embodiment of the present invention. Expressed as a Bayesian network, it is as shown in FIG. 2.

FIG. 2 is a schematic diagram showing an example of the probabilistic generative model used by the model learning means 104 for the learning operation.

The filled circles c_1, c_2, W_0, ν_0, β_0, and m_0 denote constants, and the open circles α, π, v, z_n, x_n, Λ, and μ denote random variables. x_n is an observed node. Arrows represent probabilistic dependencies. The box enclosing z_n and x_n denotes N independent and identically distributed observations.

In machine learning, finding the maximum likelihood solution of the model parameters {τ_1,k, τ_2,k, λ_1, λ_2, β_k, ν_k, W_k, m_k} for a learning data set is called learning the model parameters. A simple EM algorithm cannot be applied here, so the solution is obtained using the variational Bayes (VB) method.

The learning algorithm derived with the variational Bayes method is described below.

FIG. 4 is a flowchart showing the contents of the learning algorithm.

First, the model learning means 104 initializes the parameters {τ_1,k, τ_2,k, λ_1, λ_2, β_k, ν_k, W_k, m_k} (S11). Possible initialization methods include random initialization and using the result of k-means clustering; here, the parameters obtained by learning the universal model are used as the initial values.

Next, the variational lower bound defined by the following equation is calculated.

Next, the model learning means 104 calculates the responsibilities r_jk as the VB-E step.

Here, φ(·) is the digamma function, and I(·) is an indicator function that returns 1 when its argument is true and 0 when it is false.
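The final part of a VB-E step is usually a normalization of per-component weights into responsibilities. The sketch below assumes that unnormalized log weights log ρ_jk have already been computed from the expectations in the step above (those expressions are not shown here) and that r_jk = ρ_jk / Σ_k ρ_jk; the name `normalize_responsibilities` is hypothetical.

```python
import math
from typing import List, Sequence


def normalize_responsibilities(log_rho: Sequence[Sequence[float]]) -> List[List[float]]:
    """Convert unnormalized log weights log(rho_jk) into responsibilities r_jk."""
    result: List[List[float]] = []
    for row in log_rho:
        shift = max(row)  # subtract the row maximum so exp() cannot overflow
        exps = [math.exp(value - shift) for value in row]
        total = sum(exps)
        result.append([e / total for e in exps])  # each row now sums to 1
    return result
```

Working in the log domain matters in practice because the weights of a D-dimensional Gaussian can underflow to zero when exponentiated directly.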

Next, the model learning means 104 updates the parameters as the VB-M step.

Next, the model learning means 104 determines whether the convergence condition is satisfied (S14). If the change in the variational lower bound (equation (7)) is at or below a predetermined threshold (S14; Yes), the calculation ends; otherwise (S14; No), the process returns to step S12.
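The iterate-and-check structure of steps S11 to S14 can be sketched generically. The callables `update` and `lower_bound` stand in for the VB-E/VB-M updates and the variational lower bound, whose formulas are not given here; the function name, the default tolerance, and the iteration cap are all assumptions for illustration.

```python
from typing import Callable, Tuple, TypeVar

P = TypeVar("P")


def run_until_converged(
    params: P,
    update: Callable[[P], P],
    lower_bound: Callable[[P], float],
    tolerance: float = 1e-6,
    max_iterations: int = 1000,
) -> Tuple[P, int]:
    """Repeat update() until the lower bound changes by no more than tolerance."""
    previous = lower_bound(params)
    for iteration in range(1, max_iterations + 1):
        params = update(params)  # stands in for one VB-E step plus one VB-M step
        current = lower_bound(params)
        if abs(current - previous) <= tolerance:
            return params, iteration  # convergence condition satisfied
        previous = current
    return params, max_iterations  # safety cap; not part of the described flow
```

The iteration cap is a defensive addition so that the loop always terminates even if the bound oscillates because of numerical error.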

In the embodiment of the present invention, more flexible learning is achieved by placing prior distributions on the parameters, but appropriate values must still be chosen for the hyperparameters β_0, ν_0, m_0, and W_0 that determine those distributions.

The model learning means 104 uses the universal model to set plausible hyperparameters. The hyperparameters of the universal model itself are set to default values, and the universal model is learned in advance according to steps S11 to S14 above.

The default hyperparameters are denoted as follows.

By using the learning result of the universal model, instead of giving the same hyperparameters to every Gaussian distribution, different hyperparameters that reflect the overall tendency of the learning data can be set for each mixture component.

Let the hyperparameters of the k-th mixture component be {m_0^k, W_0^k, ν_0^k, β_0^k}. Using the learning result of the universal model, these hyperparameters are set as follows.

Here, a superscript u on a variable indicates a learning result of the universal model.

Furthermore, when the universal model has many samples its influence becomes too strong, so a new parameter κ (0 ≤ κ ≤ 1) that controls the influence of the universal model (the prior distribution) may be introduced.

As a simpler method, the following may also be used.

In either case, κ = 1 gives equations (27) to (28), and κ = 0 gives the default hyperparameters.
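One form that satisfies both limiting cases (κ = 1 gives the universal-model value, κ = 0 gives the default value) is a linear interpolation. The sketch below uses that form purely as an illustrative assumption; it is not claimed to be the formula of the embodiment, and the name `blend_hyperparameter` is hypothetical.

```python
from typing import List, Sequence


def blend_hyperparameter(
    default: Sequence[float], universal: Sequence[float], kappa: float
) -> List[float]:
    """Interpolate element-wise between a default and a universal-model value."""
    if not 0.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie in [0, 1]")
    if len(default) != len(universal):
        raise ValueError("default and universal must have the same length")
    # kappa = 1 returns the universal-model value, kappa = 0 the default value.
    return [kappa * u + (1.0 - kappa) * d for d, u in zip(default, universal)]
```

A vector such as the prior mean m_0^k can be passed directly; a scalar such as β_0^k can be passed as a one-element list.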

One way to choose κ is to compare the number of samples of the universal model with that of the model currently being learned: κ is made small when the universal model has many samples and large when it has few. For example, with N^u and N denoting the numbers of samples of the universal model and of the model being learned, respectively, the following may be used as a function that determines κ.

Even more simply, the following may be used.

Here, F(·) is a monotonically increasing function that passes through the origin and approaches 1 asymptotically.
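As a concrete illustration of such a function, F(x) = 1 - exp(-x) passes through the origin, increases monotonically, and approaches 1. The sketch below applies it to the ratio N / N^u, so that κ shrinks as the universal model's sample count grows; both the choice of F and the choice of argument are assumptions made here, not the embodiment's formula.

```python
import math


def kappa_from_sample_counts(n_model: int, n_universal: int) -> float:
    """Map sample counts to kappa in [0, 1) using F(x) = 1 - exp(-x)."""
    if n_model < 0 or n_universal <= 0:
        raise ValueError("n_model must be non-negative and n_universal positive")
    ratio = n_model / n_universal  # small when the universal model dominates
    return 1.0 - math.exp(-ratio)
```

Other saturating functions such as tanh would satisfy the same three properties.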

(3) Annotation estimation operation
FIG. 5 is a flowchart showing an operation example of the image identification information providing apparatus.

First, the image acquisition means 100 acquires image information input from the outside via the communication unit 12 as the target of label estimation (S21).

Next, the image division means 101 divides the image into n regions to generate partial regions (S22).

Next, the feature vector generation means 102 extracts a plurality of feature quantities from each partial region and generates for each partial region a feature vector x_1, x_2, ..., x_n whose components are these feature quantities (S23).

Next, the likelihood calculation means 105 reads the model of each label learned in step S6 from the learning information 113 (S24). Specifically, the model parameters {τ_1,k, τ_2,k, λ_1, λ_2, β_k, ν_k, W_k, m_k} are read from the storage unit 11 and loaded into a memory (not shown).

Next, the likelihood calculation means 105 calculates the posterior probability for the feature vector of each partial region (S25). Given the set of feature vectors X = {x_1, ..., x_n} extracted from the input image I to be predicted, the posterior probability p(c | X) of label c is calculated with Bayes' theorem as follows.

Here, p(c) is the prior probability of label c, for which the relative frequency in the learning data set is used. p(x_1, ..., x_n) is the prior distribution of the feature vector set, which is constant with respect to the label. Therefore, apart from the constant part, the log likelihood of label c for image I can be expressed by the following equation.

The larger the value of equation (40), the more suitable the label is considered to be for image I. The several labels with the largest values are taken as the labels (annotation words) of image I.
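The two relations described above can be written out as follows. The first is Bayes' theorem as stated; the second additionally assumes that the feature vectors of the partial regions are conditionally independent given the label, which is an assumption made here to obtain a per-region sum and may differ in detail from the original formula.

```latex
p(c \mid X) = \frac{p(c)\, p(x_1, \dots, x_n \mid c)}{p(x_1, \dots, x_n)},
\qquad
\log p(c \mid X) = \log p(c) + \sum_{i=1}^{n} \log p(x_i \mid c) + \text{const.}
```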

Next, the likelihood calculation means 105 calculates the likelihood of the feature vector x_i of a partial image for a given label c by the following equation (S26).

Here, Γ(·) denotes the gamma function.

Once the likelihoods have been calculated as above, the annotation word estimation means 106 takes, for example, the top five labels in descending order of likelihood and assigns the annotation words associated with those labels as identification information of the image information (S27).
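The selection of the top-ranked labels can be sketched as below. The sketch assumes that a log prior per label and per-region log likelihoods per label are already available and that the score of a label is their sum; the function name `rank_labels` and the dictionary layout are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def rank_labels(
    log_prior: Dict[str, float],
    region_log_likelihood: Dict[str, Sequence[float]],
    top_k: int = 5,
) -> List[Tuple[str, float]]:
    """Return the top_k (label, score) pairs in descending order of score."""
    scores = {
        # score(c) = log p(c) + sum over regions of log p(x_i | c)
        label: log_prior[label] + sum(region_log_likelihood[label])
        for label in log_prior
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:top_k]
```

The annotation words associated with the returned labels would then be attached to the image as its identification information.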

Next, the output means 107 outputs the annotation word estimation result to a predetermined output device (display, printer, hard disk, or the like; not shown) (S28).

[Other embodiments]
The present invention is not limited to the above embodiment, and various modifications are possible without departing from the spirit of the present invention.

The image identification information providing program 110 used in the above embodiment may be read into the storage unit of the apparatus from a storage medium such as a CD-ROM, or may be downloaded into the storage unit of the apparatus from a server apparatus or the like connected to a network such as the Internet. Some or all of the means 100 to 107 used in the above embodiment may be implemented in hardware such as an ASIC.

DESCRIPTION OF SYMBOLS
1 Image identification information providing apparatus
10 Control unit
11 Storage unit
12 Communication unit
100 Image acquisition means
101 Image division means
102 Feature vector generation means
103 Learning data set acquisition means
104 Model learning means
105 Likelihood calculation means
106 Annotation word estimation means
107 Output means
110 Image identification information providing program
111 Image information
112 Label information
113 Learning information

Claims

1. An image identification information providing program causing a computer to function as:
acquisition means for acquiring image information, for each piece of identification information, from image information associated in advance with identification information;
generation means for generating feature information from the image information acquired by the acquisition means;
learning means for generating a learning result by learning, with a probability model formed as a mixture of probability distributions, the relationship between the feature information generated by the generation means and the identification information of the image information corresponding to that feature information, the learning means determining the variables of the mixing-element generation process of the probability model by using a set of feature information obtained from all image information acquired by the acquisition means regardless of the content of the identification information, and determining a prior distribution for the variables of the probability distributions;
calculation means for calculating, on the basis of the learning result generated by the learning means and using the probability model for each piece of identification information, a posterior distribution of the feature information generated by the generation means from image information to which identification information is to be assigned; and
estimation means for estimating, on the basis of the posterior distribution, the identification information to be associated with the image information to which identification information is to be assigned.

2. The image identification information providing program according to claim 1, wherein the learning means determines the initial values of the variables of the prior distribution by using a set of feature information obtained from all image information acquired by the acquisition means regardless of the content of the identification information.

3. An image identification information providing apparatus comprising:
acquisition means for acquiring image information, for each piece of identification information, from image information associated in advance with identification information;
generation means for generating feature information from the image information acquired by the acquisition means;
learning means for generating a learning result by learning, with a probability model formed as a mixture of probability distributions, the relationship between the feature information generated by the generation means and the identification information of the image information corresponding to that feature information, the learning means determining the variables of the mixing-element generation process of the probability model by using a set of feature information obtained from all image information acquired by the acquisition means regardless of the content of the identification information, and determining a prior distribution for the variables of the probability distributions;
calculation means for calculating, on the basis of the learning result generated by the learning means and using the probability model for each piece of identification information, a posterior distribution of the feature information generated by the generation means from image information to which identification information is to be assigned; and
estimation means for estimating, on the basis of the posterior distribution, the identification information to be associated with the image information to which identification information is to be assigned.