JP6402607B2

JP6402607B2 - Information processing apparatus, information processing method, and information processing program

Info

Publication number: JP6402607B2
Application number: JP2014241717A
Authority: JP
Inventors: 友哉岩倉
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2014-11-28
Filing date: 2014-11-28
Publication date: 2018-10-10
Anticipated expiration: 2034-11-28
Also published as: JP2016103192A

Description

The present invention relates to a technique for learning a model using a learner.

One document discloses a boosting technique that, in each of repeated rounds, updates the weight of each learning case and learns a weak hypothesis that outputs a binary classification result, and that learns a model combining the resulting weak hypotheses (a combined model). In this technique, a certainty factor is calculated for the weak hypothesis learned in each round. The final model is expressed as a combination of the weak hypotheses based on their certainty factors.

When the weak hypothesis is realized by a classifier that outputs a binary classification result, the certainty factor can be calculated analytically, so the processing load of calculating the certainty factor is relatively small.

Non-Patent Document 1: Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann (1993)

Non-Patent Document 2: Rosenblatt, F.: The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review 65(6) (1958) 386-408

Non-Patent Document 3: Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences 55(1) (1997)

Non-Patent Document 4: Schapire, R.E., Singer, Y.: Improved boosting algorithms using confidence-rated predictions. Machine Learning 37(3) (1999) 297-336

In one aspect, an object of the present invention is to reduce the processing load of calculating the certainty factor of a hypothesis model that outputs classification results taking multiple values.

An information processing apparatus according to one aspect includes: (A) a learning unit that learns a hypothesis model based on a plurality of sets, each including a learning case and a label indicating one of two predetermined values for that learning case, and on a coefficient value for each learning case; and (B) a calculation unit that (b1) obtains, for each learning case, based on the hypothesis model and the learning case, a classification result indicating a tendency toward one of the two values and the degree of that tendency, (b2) identifies the learning cases whose tendency in the classification result matches the corresponding label, obtains, for each identified learning case, the product of the corresponding coefficient value and the degree in the classification result, and calculates a first total of those products, (b3) identifies the learning cases whose tendency in the classification result does not match the corresponding label, obtains, for each identified learning case, the product of the corresponding coefficient value and the degree in the classification result, and calculates a second total of those products, and (b4) calculates the certainty factor of the hypothesis model by dividing the logarithm of the ratio of the first total to the second total by twice the maximum of the absolute values of the degrees in the classification results.

In one aspect, the processing load of calculating the certainty factor of a hypothesis model that outputs classification results taking multiple values can be reduced.

FIG. 1 is a diagram showing the main processing flow of AdaBoost.
FIG. 2 is a diagram showing the functional blocks of the model learning device.
FIG. 3 is a diagram showing the configuration of the model learning unit.
FIG. 4 is a diagram showing the functional blocks of the model learning device and the model application device.
FIG. 5 is a diagram showing the model learning processing flow in the present embodiment.
FIG. 6 is a diagram showing an example of learning case data.
FIG. 7 is a diagram showing an example of label data.
FIG. 8 is a diagram showing an example of the combined model data in the initial state.
FIG. 9 is a diagram showing the weight data in the first round.
FIG. 10 is a diagram showing an outline of the weak learning process in the first round.
FIG. 11 is a diagram showing the certainty factor calculation processing flow.
FIG. 12 is a diagram showing an outline of classification by the first weak hypothesis.
FIG. 13 is a diagram showing an outline of the certainty factor calculation in the first round.
FIG. 14 is a diagram showing an outline of the update of the combined model in the first round.
FIG. 15 is a diagram showing the weight update processing flow.
FIG. 16 is a diagram showing an outline of the weight update in the first round.
FIG. 17 is a diagram showing an outline of the weak learning process in the second round.
FIG. 18 is a diagram showing an outline of classification by the second weak hypothesis.
FIG. 19 is a diagram showing an outline of the certainty factor calculation in the second round.
FIG. 20 is a diagram showing an outline of the update of the combined model in the second round.
FIG. 21 is a diagram showing the model application processing flow.
FIG. 22 is a diagram showing an example of model application using the combined model data of the first round.
FIG. 23 is a diagram showing an example of model application using the combined model data of the second round.
FIG. 24 is a diagram showing the functional blocks of a computer.

First, AdaBoost (Non-Patent Documents 1 and 2) will be described as an example of a boosting technique. The purpose of learning is to derive a mapping F: X -> Y from given learning cases X to a label set Y. In this example, a label takes one of the two values {-1, +1}.

FIG. 1 shows the main processing flow of AdaBoost. The learning system receives learning data S and the number of boosting repetitions T (S1). The learning data S includes m sets, each consisting of a learning case and a label, and is expressed as {(x_1, y_1), ..., (x_m, y_m)}.

x_i ∈ X represents the learning case of the i-th set, and y_i ∈ Y represents the label of the i-th set. The number of repetitions T is the number of times boosting is repeated.

The learning system initializes the weights of the learning cases (w_{1,1}, ..., w_{1,m}) (S3). The weight w_{1,i} is the weight of the learning case x_i. Each weight w_{1,i} (1 ≤ i ≤ m) is set to the initial value 1/m. That is, the initial weights of the learning cases are equal.

The learning system then sets the counter t to 1 (S5).

The learning system causes a weak learner to obtain a weak hypothesis h_t (S7). The weak learner learns the weak hypothesis h_t using the learning data S described above and the learning-case weights (w_{t,1}, ..., w_{t,m}). For example, a classifier such as a decision tree learner (Non-Patent Document 1) or a perceptron (Non-Patent Document 2) is used as the weak learner. h_t represents the weak hypothesis obtained in the t-th round.

Next, the learning system calculates a certainty factor α_t for the obtained weak hypothesis h_t (S9).

Furthermore, the learning system updates the weights of the learning cases according to the following equation (S11).

In equation (1), h_t(x_i) is the classification result of h_t for the learning case x_i, and e is Napier's constant. The denominator of equation (1) is expressed as follows.

Equation (2) is a coefficient for performing the following normalization.
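The images of equations (1) to (3) did not survive in this text. The block below is a reconstruction, not the patent's own typesetting: it assumes the standard AdaBoost update of Non-Patent Document 4 and is chosen to agree with the surrounding description (h_t(x_i) and e appear in equation (1), equation (2) is its denominator, and equation (3) states that the updated weights sum to 1).

```latex
\begin{align}
w_{t+1,i} &= \frac{w_{t,i}\, e^{-y_i \alpha_t h_t(x_i)}}{Z_t(\alpha_t)} \tag{1}\\
Z_t(\alpha_t) &= \sum_{i=1}^{m} w_{t,i}\, e^{-y_i \alpha_t h_t(x_i)} \tag{2}\\
\sum_{i=1}^{m} w_{t+1,i} &= 1 \tag{3}
\end{align}
```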

The learning system adds 1 to the counter t (S13) and determines whether the counter t has exceeded the number of repetitions T (S15). When it determines that the counter t has not exceeded the number of repetitions T, the learning system returns to S7 and repeats the series of processes described above.

When it determines that the counter t has exceeded the number of repetitions T, the learning system obtains the final hypothesis F (S17). The learning system obtains the final hypothesis F by combining the T weak hypotheses obtained by the loop processing described above according to the following equation.
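The image of equation (4) is missing from this text. A reconstruction that matches the description (a certainty-weighted combination of the T weak hypotheses passed through sign) is sketched below; it follows the usual AdaBoost form and should be read as an assumption about the lost image.

```latex
\begin{equation}
F(x) = \operatorname{sign}\!\left( \sum_{t=1}^{T} \alpha_t\, h_t(x) \right) \tag{4}
\end{equation}
```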

In the equation, sign is a function that returns +1 when its input is positive and -1 otherwise.

FIG. 1 shows an example procedure in which the final hypothesis F is calculated by combining the weak hypotheses together after the iterative processing has finished. As described later with reference to FIG. 5, however, the final hypothesis F may instead be obtained by merging each weak hypothesis into the combined model as it is produced during the iterative processing.
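The flow S1 to S17 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: `stump_learner` is a hypothetical binary weak learner on one-dimensional inputs, the certainty factor uses the standard closed form for binary outputs, and the smoothing term `eps` is an addition of this sketch.

```python
import math


def stump_learner(xs, ys, ws):
    """Hypothetical weak learner: the threshold stump with the least weighted error."""
    best = None
    for thr in sorted(set(xs)):
        for pol in (+1, -1):
            err = sum(w for x, y, w in zip(xs, ys, ws)
                      if (pol if x >= thr else -pol) != y)
            if best is None or err < best[0]:
                best = (err, thr, pol)
    _, thr, pol = best
    return lambda x: pol if x >= thr else -pol


def adaboost(xs, ys, rounds, weak_learner=stump_learner, eps=1e-10):
    m = len(xs)
    w = [1.0 / m] * m                                # S3: equal initial weights
    ensemble = []
    for _ in range(rounds):                          # S5, S13, S15: rounds 1..T
        h = weak_learner(xs, ys, w)                  # S7: learn weak hypothesis
        w_ok = sum(wi for x, y, wi in zip(xs, ys, w) if h(x) == y)
        w_ng = sum(wi for x, y, wi in zip(xs, ys, w) if h(x) != y)
        # S9: eps is an assumed smoothing term that avoids log(0).
        alpha = 0.5 * math.log((w_ok + eps) / (w_ng + eps))
        # S11: reweight, then normalize so that the weights sum to 1.
        unnorm = [wi * math.exp(-y * alpha * h(x))
                  for x, y, wi in zip(xs, ys, w)]
        z = sum(unnorm)
        w = [u / z for u in unnorm]
        ensemble.append((alpha, h))

    def final_hypothesis(x):                         # S17: combine weak hypotheses
        total = sum(a * g(x) for a, g in ensemble)
        return 1 if total > 0 else -1

    return final_hypothesis
```

Passing the weak learner as a parameter keeps the loop independent of the classifier, which mirrors the text's remark that a decision tree learner or a perceptron may be used.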

It has been proved for AdaBoost that, as shown in the following expression, there is an upper bound on the number of learning errors of the final hypothesis F composed of T weak hypotheses.

[[π]] is 1 when the proposition π holds and 0 when it does not.
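The image of expression (5) is missing here. The bound below is the form given by Schapire and Singer (Non-Patent Document 4) and is offered as a likely reconstruction; the normalization by 1/m in particular is an assumption taken from that paper.

```latex
\begin{equation}
\frac{1}{m} \sum_{i=1}^{m} [\![\, F(x_i) \neq y_i \,]\!]
\;\le\; \prod_{t=1}^{T} Z_t(\alpha_t) \tag{5}
\end{equation}
```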

If a certainty factor α_t that satisfies the condition of the following expression is obtained for the weak hypothesis h_t in each round t, it follows from expression (5) that learning by AdaBoost converges.
Z_t(α_t) < 1 (6)

However, even if there is a round in which the coefficient Z_t(α_t) = 1, that round does not affect the upper bound described above.

Next, methods for calculating the certainty factor α_t are described. First, the calculation of α_t is described for the case where a classifier that outputs a binary classification result (for example, a decision tree learner) is used as the weak learner.

According to Non-Patent Document 4, when a classifier that outputs a binary classification result is used as the weak learner, the certainty factor α_t is calculated analytically.

The derivative of Z_t(α_t) shown in equation (2) with respect to α_t is obtained as follows.

The α_t at which the derivative of Z_t(α_t) with respect to α_t is 0 is obtained by the following equation.

The base of the logarithm is e.

The equation uses two weight totals. The first is the sum of the weights of the learning cases x_Ot that were classified correctly by the t-th weak hypothesis.

The second is the sum of the weights of the learning cases x_Nt that were classified incorrectly by the t-th weak hypothesis.

As shown in the following expression, Ot corresponds to the IDs, among the IDs 1 to m that identify the learning cases, of the learning cases classified correctly by the t-th weak hypothesis.

As shown in the following expression, Nt corresponds to the IDs, among the IDs 1 to m that identify the learning cases, of the learning cases classified incorrectly by the t-th weak hypothesis.
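The images of equations (7) to (10) and of the two weight totals are missing from this text. The block below reconstructs them for the binary case h_t(x_i) ∈ {-1, +1} from the standard derivation in Non-Patent Document 4. The symbols W_{t,+} and W_{t,-} are names chosen for this sketch, since the original symbols were lost.

```latex
\begin{align}
\frac{\partial Z_t(\alpha_t)}{\partial \alpha_t}
  &= -\sum_{i=1}^{m} w_{t,i}\, y_i\, h_t(x_i)\, e^{-y_i \alpha_t h_t(x_i)}
   = -W_{t,+}\, e^{-\alpha_t} + W_{t,-}\, e^{\alpha_t} \tag{7}\\
\alpha_t &= \frac{1}{2} \log \frac{W_{t,+}}{W_{t,-}} \tag{8}\\
W_{t,+} &= \sum_{i \in O_t} w_{t,i}, \qquad
W_{t,-} = \sum_{i \in N_t} w_{t,i} \notag\\
O_t &= \{\, i \mid 1 \le i \le m \,\wedge\, h_t(x_i) = y_i \,\} \tag{9}\\
N_t &= \{\, i \mid 1 \le i \le m \,\wedge\, h_t(x_i) \neq y_i \,\} \tag{10}
\end{align}
```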

Next, the calculation of the certainty factor α_t is described for the case where a classifier that outputs a multi-valued classification result (for example, a perceptron) is used as the weak learner.

The calculation method of equation (11) of Non-Patent Document 4 above does not assume the use of a classifier that outputs classification results expressed as multi-valued real numbers.

When a classifier that outputs classification results expressed as multi-valued real numbers is used as the weak learner, it is not appropriate to obtain the certainty factor α_t by equation (8), because the multiple values would not be correctly reflected in α_t.

As a second-best alternative to analytical calculation, α_t may be obtained by the bisection method. The bisection method finds a solution by narrowing a search range that corresponds to an interval defined by two points. Because the bisection method is a known technique, the calculation of α_t by bisection is described only briefly.

As shown in Non-Patent Document 4, the second derivative of Z_t with respect to the certainty factor α_t is positive. That is, the graph of Z_t(α_t) is convex downward. The bisection method in this example corresponds to searching for the minimum of Z_t by trial.

However, obtaining the certainty factor α_t by the bisection method tends to increase the processing load.
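The text does not detail its bisection procedure. One common realization, sketched below as an assumption, bisects on the derivative of Z_t, which is increasing because Z_t is convex. The bracket [-10, 10] and the tolerance are arbitrary choices of this sketch. Every iteration sums over all m learning cases, which illustrates where the processing load comes from.

```python
import math


def z_derivative(alpha, weights, labels, scores):
    """dZ_t/d(alpha) for Z_t(alpha) = sum_i w_i * exp(-alpha * y_i * h_t(x_i))."""
    return sum(-w * y * s * math.exp(-alpha * y * s)
               for w, y, s in zip(weights, labels, scores))


def bisection_certainty(weights, labels, scores, lo=-10.0, hi=10.0, tol=1e-9):
    """Find the alpha where the derivative of Z_t crosses zero (the minimum of Z_t)."""
    if (z_derivative(lo, weights, labels, scores) >= 0
            or z_derivative(hi, weights, labels, scores) <= 0):
        raise ValueError("the minimum of Z_t is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if z_derivative(mid, weights, labels, scores) > 0:
            hi = mid      # minimum lies to the left of mid
        else:
            lo = mid      # minimum lies to the right of mid
    return 0.5 * (lo + hi)


# Binary example: correctly classified weight 0.8, misclassified weight 0.2,
# so the closed form 0.5 * log(0.8 / 0.2) applies and can be compared.
weights = [0.5, 0.3, 0.2]
labels = [+1, -1, +1]
scores = [+1, -1, -1]
alpha = bisection_certainty(weights, labels, scores)
```

With these binary scores the search should agree with the analytic value 0.5 * log(4), while the same function also accepts real-valued scores.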

In the present embodiment, the certainty factor α_t is calculated approximately in order to reduce the processing load. An approximate expression for obtaining a certainty factor α_t that satisfies the condition shown in expression (6) is described below.

First, equation (2) can be transformed as follows.

As shown in the following equation, the first term of equation (11) is the sum of the weights of the learning cases for which the classification result of the t-th weak hypothesis is 0.

As described above using expression (9), Ot in the second term of equation (11) corresponds to the IDs of the learning cases classified correctly by the t-th weak hypothesis. That is, the second term of equation (11) relates to the learning cases classified correctly by the t-th weak hypothesis.

As described above using expression (10), Nt in the third term of equation (11) corresponds to the IDs of the learning cases classified incorrectly by the t-th weak hypothesis. That is, the third term of equation (11) relates to the learning cases classified incorrectly by the t-th weak hypothesis.

As shown in the following equation, U_t in the second and third terms of equation (11) is the largest of the absolute values of the classification results of the t-th weak hypothesis for the learning cases.
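The images of equation (11), of its first term and of U_t are missing. The reconstruction below is inferred from the description of the three terms and from the later use of the Bernoulli inequality. It assumes that, for a real-valued h_t, O_t and N_t are read as the cases with y_i h_t(x_i) > 0 and y_i h_t(x_i) < 0 respectively. The symbol W_{t,0} is a name chosen for this sketch.

```latex
\begin{align}
Z_t(\alpha_t) &= \sum_{i:\, h_t(x_i) = 0} w_{t,i}
  + \sum_{i \in O_t} w_{t,i} \left( e^{-\alpha_t U_t} \right)^{|h_t(x_i)| / U_t}
  + \sum_{i \in N_t} w_{t,i} \left( e^{\alpha_t U_t} \right)^{|h_t(x_i)| / U_t} \tag{11}\\
W_{t,0} &= \sum_{i:\, h_t(x_i) = 0} w_{t,i} \notag\\
U_t &= \max_{1 \le i \le m} |h_t(x_i)| \notag
\end{align}
```

Each term follows from equation (2): y_i h_t(x_i) equals |h_t(x_i)| on O_t, equals -|h_t(x_i)| on N_t, and equals 0 for the first term.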

The Bernoulli inequality shown below is then applied to the second and third terms of equation (11). It is known that, if 0 < x, the following inequality holds for any r satisfying 0 < r ≤ 1.
x^r ≤ r(x − 1) + 1

Therefore, the following inequality holds for the second term of equation (11).

Similarly, the following inequality holds for the third term of equation (11).
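The two inequalities are missing from this text. Under the assumption that the second and third terms of equation (11) have the form sum of w_{t,i} times (e^{∓α_t U_t}) raised to the power |h_t(x_i)| / U_t, applying the Bernoulli inequality with r = |h_t(x_i)| / U_t (so that 0 < r ≤ 1) and x = e^{∓α_t U_t} (so that x > 0) gives the following sketch.

```latex
\begin{align}
\sum_{i \in O_t} w_{t,i} \left( e^{-\alpha_t U_t} \right)^{|h_t(x_i)| / U_t}
  &\le \sum_{i \in O_t} w_{t,i}
       \left( \frac{|h_t(x_i)|}{U_t} \left( e^{-\alpha_t U_t} - 1 \right) + 1 \right) \notag\\
\sum_{i \in N_t} w_{t,i} \left( e^{\alpha_t U_t} \right)^{|h_t(x_i)| / U_t}
  &\le \sum_{i \in N_t} w_{t,i}
       \left( \frac{|h_t(x_i)|}{U_t} \left( e^{\alpha_t U_t} - 1 \right) + 1 \right) \notag
\end{align}
```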

Let Z^~_t(α_t) (where ^~ denotes a hat over Z) be the expression obtained by replacing the second and third terms of equation (11) with the right-hand sides of the inequalities above. The following relation then holds between Z^~_t(α_t) and Z_t(α_t).
Z_t(α_t) ≤ Z^~_t(α_t) (12)

Z^~_t(α_t) is expressed by the following equation.

Meanwhile, as shown in equation (3), the weights of the learning cases in the t-th round sum to 1, so the following equation holds.

Accordingly, Z^~_t(α_t) described above can be rewritten as follows.
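The three equations referred to here are missing. A reconstruction consistent with the Bernoulli bound and with the weights summing to 1 is sketched below, writing Z^~_t as \tilde{Z}_t. The totals \tilde{W}_{t,+} and \tilde{W}_{t,-} are names chosen for this sketch. They correspond to the first and second totals of the apparatus description: the weight times the absolute classification result, summed over the correctly classified cases O_t and the incorrectly classified cases N_t.

```latex
\begin{align}
\tilde{Z}_t(\alpha_t) &= \sum_{i:\, h_t(x_i) = 0} w_{t,i}
  + \sum_{i \in O_t} w_{t,i}
      \left( \frac{|h_t(x_i)|}{U_t} \left( e^{-\alpha_t U_t} - 1 \right) + 1 \right)
  + \sum_{i \in N_t} w_{t,i}
      \left( \frac{|h_t(x_i)|}{U_t} \left( e^{\alpha_t U_t} - 1 \right) + 1 \right) \notag\\
1 &= \sum_{i=1}^{m} w_{t,i}
   = \sum_{i:\, h_t(x_i) = 0} w_{t,i} + \sum_{i \in O_t} w_{t,i} + \sum_{i \in N_t} w_{t,i} \notag\\
\tilde{Z}_t(\alpha_t) &= 1
  + \frac{e^{-\alpha_t U_t} - 1}{U_t}\, \tilde{W}_{t,+}
  + \frac{e^{\alpha_t U_t} - 1}{U_t}\, \tilde{W}_{t,-}, \notag\\
\tilde{W}_{t,+} &= \sum_{i \in O_t} w_{t,i}\, |h_t(x_i)|, \qquad
\tilde{W}_{t,-} = \sum_{i \in N_t} w_{t,i}\, |h_t(x_i)| \notag
\end{align}
```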

The derivative of Z^~_t with respect to α_t is expressed by the following equation.

Therefore, the α_t at which the derivative of Z^~_t(α_t) with respect to α_t is 0, that is, the α_t at which Z^~_t(α_t) takes its minimum value, can be expressed by the following equation.

In the present embodiment, the certainty factor α_t is calculated according to this equation.
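The images of the derivative and of equation (13) are missing. The form below is reconstructed to agree with the apparatus description earlier in this document (the logarithm of the ratio of the first total to the second total, divided by twice the maximum absolute classification result). The symbols \tilde{W}_{t,+} and \tilde{W}_{t,-} are names chosen for this sketch: they are the sums of w_{t,i} |h_t(x_i)| over the correctly and incorrectly classified learning cases, and U_t is the largest |h_t(x_i)|.

```latex
\begin{align}
\frac{\partial \tilde{Z}_t(\alpha_t)}{\partial \alpha_t}
  &= -\,e^{-\alpha_t U_t}\, \tilde{W}_{t,+} + e^{\alpha_t U_t}\, \tilde{W}_{t,-} \notag\\
\alpha_t &= \frac{1}{2 U_t} \log \frac{\tilde{W}_{t,+}}{\tilde{W}_{t,-}} \tag{13}
\end{align}
```

Setting the derivative to 0 gives e^{2 α_t U_t} = \tilde{W}_{t,+} / \tilde{W}_{t,-}, from which equation (13) follows.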

Even when the certainty factor α_t is calculated by this equation, equation (4) for calculating the final hypothesis does not change.
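The closed-form calculation can be sketched in Python as follows. The sketch follows the apparatus description (first total, second total, twice the maximum absolute score). The function names, the sample values and the error raised when a total is zero are assumptions of this example, not part of the source.

```python
import math


def approximate_certainty(weights, labels, scores):
    """Certainty factor from one pass over the cases, with no iterative search."""
    u = max(abs(s) for s in scores)                      # largest |h_t(x_i)|
    # First total: weight * |score| over cases whose score sign matches the label.
    first_total = sum(w * abs(s) for w, y, s in zip(weights, labels, scores)
                      if y * s > 0)
    # Second total: weight * |score| over cases whose score sign contradicts it.
    second_total = sum(w * abs(s) for w, y, s in zip(weights, labels, scores)
                       if y * s < 0)
    if first_total == 0.0 or second_total == 0.0:
        # Assumed handling: the source does not say what to do in this case.
        raise ValueError("both totals must be positive")
    return math.log(first_total / second_total) / (2.0 * u)


def z_value(alpha, weights, labels, scores):
    """Normalization coefficient Z_t(alpha) = sum_i w_i * exp(-alpha * y_i * score_i)."""
    return sum(w * math.exp(-alpha * y * s)
               for w, y, s in zip(weights, labels, scores))


# Hypothetical real-valued scores for three equally weighted cases.
weights = [1.0 / 3.0] * 3
labels = [+1, -1, +1]
scores = [2.0, 1.0, 0.5]
alpha = approximate_certainty(weights, labels, scores)
z = z_value(alpha, weights, labels, scores)
```

For these values the totals are 2.5/3 and 1/3, so alpha equals log(2.5) / 4, and the resulting Z_t stays below 1.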

Next, it is shown that learning by AdaBoost converges when a certainty factor α_t satisfying equation (13) is obtained in each round t.

Substituting the α_t given by equation (13) into Z^~_t yields the following equation.
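The resulting equation is missing from this text. Assuming that Z^~_t has the form 1 + (e^{-α_t U_t} - 1) \tilde{W}_{t,+} / U_t + (e^{α_t U_t} - 1) \tilde{W}_{t,-} / U_t and that equation (13) is α_t = log(\tilde{W}_{t,+} / \tilde{W}_{t,-}) / (2 U_t), where \tilde{W}_{t,+} and \tilde{W}_{t,-} are names chosen here for the weighted totals over the correctly and incorrectly classified cases, the substitution works out as sketched below.

```latex
\begin{align}
\tilde{Z}_t(\alpha_t)
  &= 1 + \frac{1}{U_t} \left(
       \left( \sqrt{\frac{\tilde{W}_{t,-}}{\tilde{W}_{t,+}}} - 1 \right) \tilde{W}_{t,+}
     + \left( \sqrt{\frac{\tilde{W}_{t,+}}{\tilde{W}_{t,-}}} - 1 \right) \tilde{W}_{t,-}
     \right) \notag\\
  &= 1 - \frac{ \left( \sqrt{\tilde{W}_{t,+}} - \sqrt{\tilde{W}_{t,-}} \right)^{2} }{U_t} \notag
\end{align}
```

Because the squared term is non-negative and U_t is positive, the right-hand side cannot exceed 1.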

This equation shows that Z^~_t(α_t) ≤ 1. Taking expression (12) into account gives the following.
Z_t(α_t) ≤ Z^~_t(α_t) ≤ 1

Therefore, the certainty factor α_t calculated by equation (13) satisfies either the condition Z_t(α_t) < 1 shown in expression (6) or the condition Z_t(α_t) = 1. In other words, the upper bound on the learning error either decreases or stays the same, so boosting at least never raises it.

The specific configuration and operation of the apparatus are described below. FIG. 2 shows the functional blocks of the model learning device. The model learning device includes a learning data input unit 101 for inputting learning data (including label data and combined model data) and the like, a learning data storage unit 103 that stores the learning data and the like input through the learning data input unit 101, and a model learning unit 105 that performs model learning using the data stored in the learning data storage unit 103.

The model learning device further includes a parameter storage unit 106 that stores parameters used in the processing of the model learning unit 105, a weight data storage unit 107 that stores weight data calculated by the model learning unit 105, a certainty factor storage unit 108 that stores certainty factor data calculated by the model learning unit 105, and a model data storage unit 109 that stores model data (for example, weak hypothesis data and combined model data) produced as the processing result of the model learning unit 105.

The model learning device further includes a target data input unit 111 for inputting processing target data that specifies the targets to be classified by applying the generated model data, a target data storage unit 113 that stores the processing target data input through the target data input unit 111, a model application unit 115 that performs classification by applying the model data stored in the model data storage unit 109 to the processing target data stored in the target data storage unit 113, a classification result storage unit 117 that stores the classification results of the model application unit 115, and an output unit 119 that outputs the classification results stored in the classification result storage unit 117.

FIG. 3 shows the configuration of the model learning unit. The model learning unit 105 includes a boosting unit 201 and a weak learning unit 203. The boosting unit 201 performs the boosting process. The weak learning unit 203 executes the weak learning process. The weak learning unit 203 in this example uses, as its classifier, a perceptron that outputs a multi-valued score as the classification result. The perceptron in this example outputs a positive or negative real value as the score. However, the weak learning unit 203 may use another classifier.

The boosting unit 201 includes an initialization unit 205, a certainty factor calculation unit 207, a combined model update unit 209, and a weight update unit 211. The initialization unit 205 initializes data. The certainty factor calculation unit 207 calculates the certainty factor of a weak hypothesis. The combined model update unit 209 updates the combined model. The weight update unit 211 updates the weights of the learning cases.

The model learning device shown in FIG. 2 performs both the model learning process and the model application process. However, the model application process may be performed by a device other than the one that performed the model learning process.

FIG. 4 shows an example in which a model application device is provided separately from the model learning device. The model learning device has a model data storage unit 109a similar to the model data storage unit 109 shown in FIG. 2. It further has an output unit 119a that outputs the model data (for example, the final combined model data) stored in the model data storage unit 109a.

The model application device, in turn, has a reception unit 121 that receives the model data output from the model learning device. The model application device further has a model data storage unit 109b for storing the model data received by the reception unit 121. The target data input unit 111 through the classification result storage unit 117 of the model application device are the same as those of the model learning device in FIG. 2. The model application device also has an output unit 119b similar to the output unit 119 of the model learning device in FIG. 2.

The learning data input unit 101, model learning unit 105, target data input unit 111, model application unit 115, output unit 119, boosting unit 201, weak learning unit 203, initialization unit 205, certainty factor calculation unit 207, combined model update unit 209, and weight update unit 211 described above are realized using hardware resources (for example, FIG. 24) and a program that causes a processor to execute the processing described below.

The learning data storage unit 103, parameter storage unit 106, weight data storage unit 107, certainty factor storage unit 108, model data storage unit 109, target data storage unit 113, and classification result storage unit 117 described above are realized using hardware resources (for example, FIG. 24).

Next, the processing of the model learning unit 105 and related units in the present embodiment is described with reference to FIGS. 5 to 20.

FIG. 5 shows the model learning processing flow in the present embodiment. As described above, the certainty factor calculation process in the present embodiment differs from that of the conventional technique.

First, the learning data input unit 101 receives input of the learning data S and the number of repetitions T, for example in accordance with an instruction from the user (S101). The learning data S includes learning case data and label data.

FIG. 6 shows an example of learning case data 601. The learning case data 601 includes the features corresponding to each learning case ID. In this example, the learning case identified by x_1 consists of feature a, feature b, and feature c. Similarly, the learning case identified by x_2 consists of feature a, feature b, and feature d. The learning case identified by x_3 consists of feature a and feature b. The number m of learning cases in this example is 3.

FIG. 7 shows an example of label data 701. Label IDs correspond to learning case IDs: y_1 corresponds to x_1, y_2 corresponds to x_2, and y_3 corresponds to x_3. The label data 701 includes the label corresponding to each label ID. In this example, the label identified by y_1 is "+1". Similarly, the label identified by y_2 is "-1", and the label identified by y_3 is "+1".

Learning data in which learning cases and labels are stored together may be used instead.

The initialization unit 205 sets an initial combined model in the model data storage unit 109 (S103). FIG. 8 shows an example of combined model data 801a in the initial state. The combined model data 801 includes a third score for each feature. Three kinds of scores are used in this example. The first score is included in the weak hypothesis data. The second score is included in the classification result data. The first and second scores are described later. The initialization unit 205 sets the third score of each feature to 0. In this example, the third scores of features a through d are each set to 0.

Next, the initialization unit 205 initializes the learning-case weights (w_{1,1}, ..., w_{1,m}) stored in the weight data storage unit 107 (S105). Specifically, every weight is set to the same value, 1/m.

FIG. 9 shows weight data 901a in the first round. Weight IDs correspond to learning case IDs: w_{1,1} corresponds to x_1, w_{1,2} corresponds to x_2, and w_{1,3} corresponds to x_3. The weight data 901 includes the weight corresponding to each weight ID. In this example, the weights identified by w_{1,1}, w_{1,2}, and w_{1,3} are all set to "0.33333".
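The example data of FIGS. 6 to 9 and the initialization steps S103 and S105 can be written down as a small Python sketch. The container types and function names are assumptions of this example, since the text does not prescribe a storage format. Only the feature sets, labels and initial values come from the description.

```python
# Learning case data 601: features per learning case ID (FIG. 6).
learning_cases = {
    "x1": ["a", "b", "c"],
    "x2": ["a", "b", "d"],
    "x3": ["a", "b"],
}

# Label data 701: label per label ID, aligned with the case IDs (FIG. 7).
labels = {"y1": +1, "y2": -1, "y3": +1}


def init_combined_model(cases):
    """S103: set the third score of every feature to 0 (FIG. 8)."""
    features = sorted({f for feats in cases.values() for f in feats})
    return {f: 0.0 for f in features}


def init_weights(m):
    """S105: give each of the m learning cases the same weight 1/m (FIG. 9)."""
    return [1.0 / m] * m


combined_model = init_combined_model(learning_cases)
weights = init_weights(len(learning_cases))
```

With m = 3 each weight is 1/3, which the text shows rounded to 0.33333.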

ブースティング部２０１は、パラメータｔに１を設定する（Ｓ１０７）。パラメータｔは、Ｓ１０９からＳ１１７までのルーチンの実行回数を計数するための変数であり、ｔによってラウンドを特定する。 The boosting unit 201 sets 1 to the parameter t (S107). The parameter t is a variable for counting the number of executions of the routine from S109 to S117, and the round is specified by t.

そして、弱学習部２０３は、弱学習処理を実行する（Ｓ１０９）。この例における弱学習処理では、分類器としてパーセプトロンを用いる。この例では、パーセプトロンの学習は、１度だけの繰り返しとする。 The weak learning unit 203 executes weak learning processing (S109). In the weak learning process in this example, a perceptron is used as a classifier. In this example, perceptron learning is repeated only once.

図１０に、第１ラウンドにおける弱学習処理の概要を示す。図１０は、上述した学習事例データ６０１、ラベルデータ７０１及び第１ラウンドにおける重みデータ９０１ａを用いて弱学習処理を行った結果、弱仮説データ１００１ａが生成された様子を示している。 FIG. 10 shows an overview of weak learning processing in the first round. FIG. 10 shows how weak hypothesis data 1001a is generated as a result of performing weak learning processing using the above-described learning case data 601, label data 701, and weight data 901a in the first round.

第１ラウンドで生成された弱仮説を、第１弱仮説という。弱仮説データ１００１ａは、第１弱仮説のデータである。弱仮説データ１００１は、各素性に対する第１スコアを含んでいる。正の値である第１スコアは、当該素性を含む学習事例のラベルが「＋１」である傾向があることを示している。負の値である第１スコアは、当該素性を含む学習事例のラベルが「−１」である傾向があることを示している。 The weak hypothesis generated in the first round is referred to as the first weak hypothesis. The weak hypothesis data 1001a is data of the first weak hypothesis. The weak hypothesis data 1001 includes a first score for each feature. The first score that is a positive value indicates that the label of the learning case including the feature tends to be “+1”. The first score which is a negative value indicates that the label of the learning case including the feature tends to be “−1”.

事例の重みを用いたモデル更新部分以外は、従来技術と同様であるので、パーセプトロンを用いた弱学習処理の詳細については省略する。 The details other than the model update portion using the case weights are the same as those in the prior art, and the details of the weak learning processing using the perceptron will be omitted.

確信度算出部２０７は、確信度算出処理を実行する（Ｓ１１１）。確信度算出処理において、確信度算出部２０７は、上述した式（１３）に従って確信度α_tを算出する。 The certainty factor calculation unit 207 executes a certainty factor calculation process (S111). In the certainty factor calculation process, the certainty factor calculation unit 207 calculates the certainty factor α _t according to the above-described equation (13).

図１１に、確信度算出処理フローを示す。確信度算出部２０７は、まず、Ｓ１０９において求めた弱仮説による分類処理を実行する（Ｓ２０１）。 FIG. 11 shows a certainty factor calculation processing flow. The certainty factor calculation unit 207 first executes classification processing based on the weak hypothesis obtained in S109 (S201).

図１２に、第１弱仮説による分類の概要を示す。学習事例データ６０１に含まれる各学習事例について第１弱仮説に係る弱仮説データ１００１ａを適用した分類が行われ、分類結果が得られる。分類結果データ１２０１ａは、第１弱仮説による分類結果を示している。この例では、分類結果に対する評価が付されている。分類結果とラベルとの正負が一致する場合に、「正しい」と評価される。他方、分類結果とラベルとの正負が一致しない場合に、「誤り」と評価される。つまり、分類結果がラベルと合致する傾向を示す場合に「正しい」と評価され、分類結果がラベルと合致しない傾向を示す場合に「誤り」と評価される。 FIG. 12 shows an overview of classification based on the first weak hypothesis. For each learning case included in the learning case data 601, classification is performed by applying the weak hypothesis data 1001a related to the first weak hypothesis, and a classification result is obtained. The classification result data 1201a indicates the classification result based on the first weak hypothesis. In this example, the classification result is evaluated. When the classification result and the label have the same sign, it is evaluated as “correct”. On the other hand, if the classification result and the label do not coincide with each other, it is evaluated as “error”. That is, it is evaluated as “correct” when the classification result shows a tendency to match the label, and is evaluated as “error” when the classification result shows a tendency not to match the label.

各学習事例に対する分類結果は、第２スコアとして得られる。この例で、学習事例ｘ₁に対して第１弱仮説を適用した分類結果である第２スコアｈ₁（ｘ₁）は、「１」であり、その評価は「正しい」である。同様に、学習事例ｘ₂に対して第１弱仮説を適用した分類結果である第２スコアｈ₁（ｘ₂）は、「０．３３３３３」であり、その評価は「誤り」である。更に、学習事例ｘ₃に対して第１弱仮説を適用した分類結果である第２スコアｈ₁（ｘ₃）は、「０．６６６６７」であり、その評価は「正しい」である。 The classification result for each learning case is obtained as the second score. In this example, the second score h ₁ (x ₁ ), which is the classification result obtained by applying the first weak hypothesis to the learning example x ₁ , is “1”, and the evaluation is “correct”. Similarly, the second score h ₁ (x ₂ ), which is a classification result obtained by applying the first weak hypothesis to the learning case x ₂ , is “0.33333”, and the evaluation is “error”. Furthermore, the second score h ₁ (x ₃ ), which is the classification result obtained by applying the first weak hypothesis to the learning case x ₃ , is “0.66667”, and the evaluation is “correct”.
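The classification step can be sketched as follows, assuming that the second score of a learning case is the sum of the first scores of the features the case contains. The first scores are the ones implied by the first-round update figures (0.33333 for features a, b and c, and −0.33333 for feature d). The feature sets assigned to x₁ to x₃ are hypothetical: the learning case data 601 is not reproduced in this passage, and these sets were chosen only because they yield the quoted second scores. The function names are likewise illustrative.

```python
def second_score(first_scores, features):
    """Sum the first scores of the features contained in one learning case."""
    return sum(first_scores.get(f, 0.0) for f in features)


def evaluate(label, score):
    """Compare the sign of the second score with the label (+1 or -1)."""
    if label * score > 0:
        return "correct"
    if label * score < 0:
        return "error"
    return "undecided"  # a product of 0 is neither correct nor an error


# First scores of the first weak hypothesis (weak hypothesis data 1001a).
first_scores = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3, "d": -1 / 3}

# Hypothetical feature sets; assumed, not taken from learning case data 601.
cases = {"x1": ("a", "b", "c"), "x2": ("a", "b", "d"), "x3": ("a", "b")}
labels = {"x1": +1, "x2": -1, "x3": +1}

scores = {k: second_score(first_scores, v) for k, v in cases.items()}
evaluations = {k: evaluate(labels[k], scores[k]) for k in cases}
```

Under these assumptions the scores for x₁, x₂ and x₃ are 1, 0.33333 and 0.66667, and only x₂ is evaluated as an error, because its positive score disagrees with its label of −1.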

この例では、説明に資するために評価を分類結果に付したが、評価は省略するようにしてもよい。 In this example, evaluation is given to the classification result for the purpose of explanation, but the evaluation may be omitted.

確信度算出部２０７は、分類結果である第２スコアの絶対値のうち最大の値Ｕ_tを求める（Ｓ２０３）。図１２に示した分類結果データ１２０１ａの例では、第２スコアｈ₁（ｘ₁）の絶対値「１」が、最大値Ｕ_tに相当する。 The certainty factor calculation unit 207 obtains U_t, the largest of the absolute values of the second scores that make up the classification result (S203). In the example of the classification result data 1201a shown in FIG. 12, the absolute value “1” of the second score h₁(x₁) is the maximum value U_t.

確信度算出部２０７は、パラメータｉに１を設定する（Ｓ２０５）。パラメータｉは、Ｓ２０７からＳ２１５までのルーチンの実行回数を計数するための変数であり、ｉによって学習事例を特定する。 The certainty factor calculation unit 207 sets the parameter i to 1 (S205). The parameter i is a variable for counting the number of executions of the routine from S207 to S215, and the learning case is specified by i.

確信度算出部２０７は、パラメータｉによって特定されるラベルｙ_iに分類結果ｈ_t（ｘ_i）を乗じた値が正であるか否かを判定する（Ｓ２０７）。当該値が正であることは、分類結果ｈ_t（ｘ_i）の評価が「正しい」であることに相当する。 The certainty factor calculation unit 207 determines whether the value obtained by multiplying the label y _i specified by the parameter i by the classification result h _t (x _i ) is positive (S207). The positive value corresponds to the evaluation of the classification result h _t (x _i ) being “correct”.

ラベルｙ_iに分類結果ｈ_t（ｘ_i）を乗じた値が正であると判定した場合には、パラメータｔ及びパラメータｉによって特定される重みｗ_t,iに分類結果ｈ_t（ｘ_i）の絶対値を乗じた値を、正に関する総和を算出するためのパラメータに加算する（Ｓ２０９）。そして、Ｓ２１５の処理に移る。 When the value obtained by multiplying the label y_i by the classification result h_t(x_i) is determined to be positive, the weight w_{t,i} specified by the parameters t and i is multiplied by the absolute value of the classification result h_t(x_i), and the product is added to the parameter that accumulates the sum for the positive case (S209). The process then proceeds to S215.

一方、ラベルｙ_iに分類結果ｈ_t（ｘ_i）を乗じた値が正ではないと判定した場合には、確信度算出部２０７は、当該値が負であるか否かを判定する（Ｓ２１１）。当該値が負であることは、分類結果ｈ_t（ｘ_i）の評価が「誤り」であることに相当する。 On the other hand, when it is determined that the value obtained by multiplying the label y _i by the classification result h _t (x _i ) is not positive, the certainty factor calculation unit 207 determines whether or not the value is negative (S211). ). The negative value corresponds to the evaluation of the classification result h _t (x _i ) being “error”.

ラベルｙ_iに分類結果ｈ_t（ｘ_i）を乗じた値が負であると判定した場合には、パラメータｔ及びパラメータｉによって特定される重みｗ_t,iに分類結果ｈ_t（ｘ_i）の絶対値を乗じた値を、負に関する総和を算出するためのパラメータに加算する（Ｓ２１３）。そして、Ｓ２１５の処理に移る。 When the value obtained by multiplying the label y_i by the classification result h_t(x_i) is determined to be negative, the weight w_{t,i} specified by the parameters t and i is multiplied by the absolute value of the classification result h_t(x_i), and the product is added to the parameter that accumulates the sum for the negative case (S213). The process then proceeds to S215.

一方、ラベルｙ_iに分類結果ｈ_t（ｘ_i）を乗じた値が負ではないと判定した場合、つまり当該値が０である場合には、そのままＳ２１５の処理に移る。 On the other hand, if it is determined that the value obtained by multiplying the label y _i by the classification result h _t (x _i ) is not negative, that is, if the value is 0, the process proceeds to S215 as it is.

確信度算出部２０７は、パラメータｉに１を加え（Ｓ２１５）、パラメータｉが学習事例の数ｍを超えたか否かを判定する（Ｓ２１７）。パラメータｉが学習事例の数ｍを超えていないと判定した場合には、Ｓ２０７へ戻り一連の処理を繰り返す。 The certainty factor calculation unit 207 adds 1 to the parameter i (S215), and determines whether the parameter i exceeds the number m of learning cases (S217). If it is determined that the parameter i does not exceed the number m of learning cases, the process returns to S207 and a series of processing is repeated.

パラメータｉが学習事例の数ｍを超えたと判定した場合には、確信度算出部２０７は、（正に関する総和／負に関する総和）の対数を求める（Ｓ２１９）。そして、確信度算出部２０７は、求めた対数に１／（２×Ｕ_t）を乗ずる（Ｓ２２１）。その結果、第ｔ仮説に対する確信度α_tが得られる。第ｔ仮説に対する確信度α_tは、確信度格納部１０８に格納される。 When it is determined that the parameter i has exceeded the number m of learning examples, the certainty factor calculation unit 207 obtains the logarithm of (total for positive / total for negative) (S219). Then, the certainty factor calculation unit 207 multiplies the obtained logarithm by 1 / (2 × U _t ) (S221). As a result, a certainty factor α _t for the t-th hypothesis is obtained. The certainty factor α _t for the t-th hypothesis is stored in the certainty factor storage unit 108.

図１３に、第１ラウンドにおける確信度算出の概要を示す。この例で、正に関する総和は、０．３３３３３×１＋０．３３３３３×０．６６６６７である。また、負に関する総和は、０．３３３３３×０．３３３３３である。そして、第１弱仮説に対する確信度α₁が、「０．８０４７２」となる。 FIG. 13 shows an outline of the certainty factor calculation in the first round. In this example, the sum for the positive case is 0.33333 × 1 + 0.33333 × 0.66667, and the sum for the negative case is 0.33333 × 0.33333. The certainty factor α₁ for the first weak hypothesis is therefore “0.80472”.
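The flow from S203 to S221 can be sketched as a single function. This is a minimal illustration of the steps as described here, not the patented implementation itself; the function and variable names are assumptions. The sketch does not handle the case where the negative sum is zero (a weak hypothesis that makes no errors), which this passage does not discuss.

```python
import math


def certainty_factor(weights, labels, scores):
    """Certainty factor of one weak hypothesis, following S203 to S221.

    weights: w_{t,i} for each learning case
    labels:  y_i, each +1 or -1
    scores:  second scores h_t(x_i)
    """
    u_t = max(abs(h) for h in scores)            # S203: largest |h_t(x_i)|
    positive_sum = 0.0
    negative_sum = 0.0
    for w, y, h in zip(weights, labels, scores):  # S205 to S217
        if y * h > 0:                             # S207: evaluated "correct"
            positive_sum += w * abs(h)            # S209
        elif y * h < 0:                           # S211: evaluated "error"
            negative_sum += w * abs(h)            # S213
        # a product of exactly 0 contributes to neither sum
    # S219 and S221; raises ZeroDivisionError if negative_sum is 0
    return math.log(positive_sum / negative_sum) / (2.0 * u_t)


# First-round values quoted in the text.
alpha_1 = certainty_factor(
    weights=[1 / 3, 1 / 3, 1 / 3],
    labels=[+1, -1, +1],
    scores=[1.0, 1 / 3, 2 / 3],
)
```

With the first-round values the ratio of the two sums is 5 and U_t is 1, so the result is ln(5)/2, which rounds to the quoted 0.80472.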

確信度算出処理を終えると、図５に示したＳ１１３の処理に戻る。結合モデル更新部２０９は、前回のラウンドにおける結合モデルデータ８０１、今回のラウンドで求めた弱仮説データ１００１及び確信度α_tに基づいて、今回のラウンドにおける結合モデルデータ８０１を算出する（Ｓ１１３）。具体的には、各素性について、今回の弱仮説データ１００１の第１スコアに確信度α_tを乗じた値を、前回のラウンドにおける結合モデルデータ８０１の第３スコアに加える。そして、求められた和を今回のラウンドにおける結合モデルの第３スコアに設定する。 When the certainty calculation process is completed, the process returns to the process of S113 shown in FIG. The combined model update unit 209 calculates the combined model data 801 in the current round based on the combined model data 801 in the previous round, the weak hypothesis data 1001 obtained in the current round, and the certainty factor α _t (S113). Specifically, for each feature, a value obtained by multiplying the first score of the current weak hypothesis data 1001 by the certainty factor α _t is added to the third score of the combined model data 801 in the previous round. Then, the obtained sum is set as the third score of the combined model in the current round.

図１４に、第１ラウンドにおける結合モデルの更新の概要を示す。第１ラウンドにおける結合モデルデータ８０１ｂの素性ａに対する第３スコア「０．２６８２４」は、０．８０４７２×０．３３３３３＋０によって求められる。同様に、素性ｂに対する第３スコア「０．２６８２４」は、０．８０４７２×０．３３３３３＋０によって求められる。同様に、素性ｃに対する第３スコア「０．２６８２４」は、０．８０４７２×０．３３３３３＋０によって求められる。更に、素性ｄに対する第３スコア「−０．２６８２４」は、０．８０４７２×（−０．３３３３３）＋０によって求められる。 FIG. 14 shows an outline of the update of the combined model in the first round. The third score “0.26824” for feature a in the first-round combined model data 801b is obtained as 0.80472 × 0.33333 + 0. Likewise, the third score “0.26824” for feature b is obtained as 0.80472 × 0.33333 + 0, and the third score “0.26824” for feature c is obtained as 0.80472 × 0.33333 + 0. Finally, the third score “−0.26824” for feature d is obtained as 0.80472 × (−0.33333) + 0.
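The update in S113 amounts to a per-feature multiply-and-add. The sketch below reproduces the first-round figures; the dictionary representation and the function name are assumptions made for illustration.

```python
def update_combined_model(third_scores, first_scores, alpha):
    """S113: add alpha times each first score to the previous third score."""
    features = sorted(set(third_scores) | set(first_scores))
    return {
        f: third_scores.get(f, 0.0) + alpha * first_scores.get(f, 0.0)
        for f in features
    }


# Initial combined model (801a): every third score is 0.
model_801a = {"a": 0.0, "b": 0.0, "c": 0.0, "d": 0.0}
# First scores of the first weak hypothesis (1001a) and its certainty factor.
weak_1001a = {"a": 0.33333, "b": 0.33333, "c": 0.33333, "d": -0.33333}
alpha_1 = 0.80472

model_801b = update_combined_model(model_801a, weak_1001a, alpha_1)
```

Rounded to five decimal places, the resulting third scores are 0.26824 for features a, b and c, and −0.26824 for feature d.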

続いて、重み更新部２１１は、重み更新処理を実行する（Ｓ１１５）。重み更新処理において、重み更新部２１１は、重みデータ格納部１０７に格納されている重みデータ９０１を更新する。 Subsequently, the weight update unit 211 executes a weight update process (S115). In the weight update process, the weight update unit 211 updates the weight data 901 stored in the weight data storage unit 107.

図１５に、重み更新処理フローを示す。重み更新部２１１は、上述した式（２）に従って、正規化のための係数Ｚ_t（α_t）を算出する（Ｓ３０１）。重み更新部２１１は、パラメータｉに１を設定する（Ｓ３０３）。パラメータｉは、Ｓ３０５及びＳ３０７までのルーチンの実行回数を計数するための変数であり、ｉによって学習事例を特定する。 FIG. 15 shows the weight update processing flow. The weight update unit 211 calculates the normalization coefficient Z_t(α_t) according to equation (2) described above (S301). The weight update unit 211 then sets the parameter i to 1 (S303). The parameter i is a variable that counts the number of executions of the routine from S305 to S307, and i identifies the learning case.

重み更新部２１１は、上述した式（１）に従って、次の重みｗ_t+1,iを算出する（Ｓ３０５）。 The weight update unit 211 calculates the next weight w _{t + 1, i} according to the above-described equation (1) (S305).

重み更新部２１１は、パラメータｉに１を加え（Ｓ３０７）、パラメータｉが学習事例の数ｍを超えたか否かを判定する（Ｓ３０９）。パラメータｉが学習事例の数ｍを超えていないと判定した場合には、Ｓ３０５へ戻り上述した処理を繰り返す。 The weight updating unit 211 adds 1 to the parameter i (S307), and determines whether the parameter i exceeds the number m of learning cases (S309). If it is determined that the parameter i does not exceed the number m of learning cases, the process returns to S305 and the above-described processing is repeated.

パラメータｉが学習事例の数ｍを超えたと判定した場合には、重み更新処理を終え、図５に示したＳ１１７の処理に戻る。 If it is determined that the parameter i has exceeded the number m of learning cases, the weight update process is terminated, and the process returns to S117 shown in FIG.

図１６に、第１ラウンドにおける重み更新の概要を示す。図１６は、第１弱仮説による分類結果データ１２０１ａと、ラベルデータ７０１と、第１ラウンドにおける重みデータ９０１ａと、第１弱仮説に対する確信度α₁とに基づいて、第２ラウンドにおける重みデータ９０１ｂが生成される様子を示している。 FIG. 16 shows an outline of the weight update in the first round. FIG. 16 shows how the weight data 901b for the second round is generated from the classification result data 1201a of the first weak hypothesis, the label data 701, the weight data 901a of the first round, and the certainty factor α₁ for the first weak hypothesis.

「正しい」の評価を得た学習事例ｘ₁に対応する重みは、「０．３３３３３」から「０．１９１１４」へ減っている。同様に、学習事例ｘ₃に対応する重みも、「０．３３３３３」から「０．２４９９５」へ減っている。一方、「誤り」の評価を得た学習事例ｘ₂に対応する重みは、「０．３３３３３」から「０．５５８９１」へ増えている。このように「正しい」の評価を得た学習事例に対する重みを減らし、「誤り」の評価を得た学習事例に対する重みを増やすことによって、次の弱学習処理において修正された弱仮説が得られるようになる。 The weight for the learning case x₁, which was evaluated as “correct”, decreases from “0.33333” to “0.19114”. Similarly, the weight for the learning case x₃ decreases from “0.33333” to “0.24995”. In contrast, the weight for the learning case x₂, which was evaluated as “error”, increases from “0.33333” to “0.55891”. Reducing the weights of learning cases evaluated as “correct” and increasing the weights of those evaluated as “error” in this way leads the next weak learning process to produce a corrected weak hypothesis.
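Equations (1) and (2) appear earlier in the description and are not repeated in this passage. The sketch below assumes they take the usual boosting form, in which each weight is multiplied by exp(−α_t · y_i · h_t(x_i)) and the results are divided by their sum Z_t. That assumption is consistent with the numbers quoted here, since it reproduces the second-round weights from the first-round values. The function name is illustrative.

```python
import math


def update_weights(weights, labels, scores, alpha):
    """Assumed form of equations (1) and (2): exponential reweighting
    followed by normalization so that the new weights sum to 1."""
    unnormalized = [
        w * math.exp(-alpha * y * h)
        for w, y, h in zip(weights, labels, scores)
    ]
    z_t = sum(unnormalized)                      # normalization coefficient
    return [u / z_t for u in unnormalized]


weights_901b = update_weights(
    weights=[1 / 3, 1 / 3, 1 / 3],               # weight data 901a
    labels=[+1, -1, +1],                          # label data 701
    scores=[1.0, 1 / 3, 2 / 3],                   # classification result data 1201a
    alpha=0.80472,                                # certainty factor of round 1
)
```

Under this assumption the new weights round to 0.19114, 0.55891 and 0.24995, matching the weight data 901b described in the text.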

図５の説明に戻る。ブースティング部２０１は、パラメータｔに１を加える（Ｓ１１７）。そして、ブースティング部２０１は、パラメータｔが繰り返し回数Ｔを超えたか否かを判定する（Ｓ１１９）。パラメータｔが繰り返し回数Ｔを超えていないと判定した場合には、Ｓ１０９に戻って一連の処理を繰り返す。パラメータｔが繰り返し回数Ｔを超えたと判定した場合には、図４に示したモデル適用装置の出力部１１９ｂは、最後的な結合モデルデータ８０１を出力する（Ｓ１２１）。図２に示したモデル学習装置の場合には、Ｓ１２１の処理を省くようにしてもよい。 Returning to the description of FIG. The boosting unit 201 adds 1 to the parameter t (S117). Then, the boosting unit 201 determines whether or not the parameter t has exceeded the number of repetitions T (S119). If it is determined that the parameter t does not exceed the number of repetitions T, the process returns to S109 and a series of processing is repeated. When it is determined that the parameter t exceeds the number of repetitions T, the output unit 119b of the model application apparatus illustrated in FIG. 4 outputs the final combined model data 801 (S121). In the case of the model learning apparatus shown in FIG. 2, the process of S121 may be omitted.
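The overall loop of FIG. 5 (S103 to S119) can be sketched end to end. Several parts are assumptions rather than details taken from the text: the weak learner is a hypothetical single-pass perceptron that adds (weight × label) to every feature of a case it misclassifies or scores as 0; the weight update uses the usual exponential form; the loop stops early if either sum is zero; and the three feature sets are hypothetical stand-ins for the learning case data 601.

```python
import math


def one_pass_perceptron(cases, labels, weights):
    """Hypothetical weak learner: one weighted pass of a perceptron."""
    model = {}
    for x, y, w in zip(cases, labels, weights):
        score = sum(model.get(f, 0.0) for f in x)
        if y * score <= 0:                       # misclassified or undecided
            for f in x:
                model[f] = model.get(f, 0.0) + w * y
    return model


def boost(cases, labels, weak_learn, rounds):
    m = len(cases)
    combined = {}                                # S103: absent feature means score 0
    weights = [1.0 / m] * m                      # S105
    alphas = []
    for _ in range(rounds):                      # S107, S117, S119
        weak = weak_learn(cases, labels, weights)              # S109
        scores = [sum(weak.get(f, 0.0) for f in x) for x in cases]
        triples = list(zip(weights, labels, scores))
        pos = sum(w * abs(h) for w, y, h in triples if y * h > 0)
        neg = sum(w * abs(h) for w, y, h in triples if y * h < 0)
        if pos == 0.0 or neg == 0.0:
            break                                # assumption: alpha is undefined
        u_t = max(abs(h) for h in scores)
        alpha = math.log(pos / neg) / (2.0 * u_t)              # S111
        alphas.append(alpha)
        for f, s in weak.items():                              # S113
            combined[f] = combined.get(f, 0.0) + alpha * s
        unnorm = [w * math.exp(-alpha * y * h) for w, y, h in triples]
        z_t = sum(unnorm)
        weights = [v / z_t for v in unnorm]                    # S115
    return combined, alphas


# Hypothetical learning cases (feature sets) and their labels.
cases = [("a", "b", "c"), ("a", "b", "d"), ("a", "b")]
labels = [+1, -1, +1]
combined, alphas = boost(cases, labels, one_pass_perceptron, rounds=2)
```

With these assumed inputs the two certainty factors come out at approximately 0.805 and 1.186, and after two rounds the combined model gives every learning case a score whose sign agrees with its label.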

図１７乃至図２１に、第２ラウンドにおける処理の概要を示す。まず、図１７に、第２ラウンドにおける弱学習処理の概要を示す。第２ラウンドでは、第１ラウンドで更新された重みデータ９０１ｂに基づいて、弱学習処理が行われる。 17 to 21 show an outline of processing in the second round. First, FIG. 17 shows an outline of weak learning processing in the second round. In the second round, weak learning processing is performed based on the weight data 901b updated in the first round.

第２ラウンドで生成された弱仮説を、第２弱仮説という。弱仮説データ１００１ｂは、第２弱仮説のデータである。 The weak hypothesis generated in the second round is referred to as the second weak hypothesis. The weak hypothesis data 1001b is the data of the second weak hypothesis.

図１８に、第２弱仮説による分類の概要を示す。第２ラウンドでは、第２弱仮説に係る弱仮説データ１００１ｂを適用した分類が行われ、分類結果が得られる。分類結果データ１２０１ｂは、第２弱仮説による分類結果を示している。第２ラウンドでは、学習事例ｘ₂に対する評価が「正しい」に変わっている。 FIG. 18 shows an overview of classification based on the second weak hypothesis. In the second round, classification using the weak hypothesis data 1001b related to the second weak hypothesis is performed, and a classification result is obtained. The classification result data 1201b indicates the classification result based on the second weak hypothesis. In the second round, the evaluation for the learning example x ₂ is changed to “correct”.

図１９に、第２ラウンドにおける確信度算出の概要を示す。第２弱仮説に対するα₂は「１．１８６４７」であり、第１弱仮説に対する確信度α₁よりも大きい。 FIG. 19 shows an outline of the certainty factor calculation in the second round. The certainty factor α₂ for the second weak hypothesis is “1.18647”, which is larger than the certainty factor α₁ for the first weak hypothesis.

図２０に、第２ラウンドにおける結合モデルの更新の概要を示す。図２０は、第２弱仮説に係る弱仮説データ１００１ｂと、第２弱仮説に対するα₂と基づいて、第１ラウンドにおける結合モデルデータ８０１ｂが第２ラウンドにおける結合モデルデータ８０１ｃに更新される様子を示している。 FIG. 20 shows an outline of the update of the combined model in the second round. FIG. 20 shows how the combined model data 801b of the first round is updated to the combined model data 801c of the second round, based on the weak hypothesis data 1001b of the second weak hypothesis and the certainty factor α₂ for the second weak hypothesis.

最後に、モデル適用部１１５におけるモデル適用処理について説明する。図２１に、モデル適用処理フローを示す。 Finally, the model application process in the model application unit 115 will be described. FIG. 21 shows a model application processing flow.

対象データ入力部１１１は、処理対象データを入力する（Ｓ４０１）。処理対象データは、対象データ格納部１１３に格納される。処理対象データは、学習事例データ６０１と同様に素性の組み合わせを含む。学習事例データ６０１に含まれる学習事例における素性の組み合わせ以外の未知の組み合わせを処理対象としてもよい。 The target data input unit 111 inputs processing target data (S401). The processing target data is stored in the target data storage unit 113. The processing target data includes a combination of features as in the learning case data 601. An unknown combination other than the combination of features in the learning case included in the learning case data 601 may be processed.

モデル適用部１１５は、結合モデルによる分類処理を実行する（Ｓ４０３）。つまり、モデル学習によって生成された結合モデルデータ８０１を用いて分類器による分類が行われる。分類結果は、分類結果格納部１１７に格納される。そして、出力部１１９は、分類結果格納部１１７に格納されている分類結果を出力する（Ｓ４０５）。 The model application unit 115 executes classification processing based on the combined model (S403). That is, classification by the classifier is performed using the combined model data 801 generated by model learning. The classification result is stored in the classification result storage unit 117. Then, the output unit 119 outputs the classification result stored in the classification result storage unit 117 (S405).

多くの場合には、最終的に生成された結合モデルデータ８０１が用いられる。但し、以下では、上述した第１ラウンドにおける結合モデルデータ８０１ｂと第２ラウンドにおける結合モデルデータ８０１ｃとの適性を比較するために、これらの結合モデルデータ８０１ｂ及び８０１ｃを用いたモデル適用の例を示す。 In most cases, the finally generated combined model data 801 is used. In the following, however, examples of model application using the combined model data 801b of the first round and the combined model data 801c of the second round are shown in order to compare how well each of them fits.

また、多くの場合には、学習事例における素性の組み合わせ以外の未知の組み合わせを処理対象とするが、以下では、改善の様子を説明しやすくするために、学習事例における組み合わせと同じ組み合わせを処理対象とする。 Also, in most cases the processing targets are unknown combinations of features that differ from those in the learning cases. In the following, however, the same combinations as in the learning cases are used as processing targets so that the improvement is easier to explain.

図２２に、第１ラウンドにおける結合モデルデータ８０１ｂを用いたモデル適用の例を示す。処理対象データ２２０１に含まれる処理対象ｘ₁₁における素性の組み合わせは、学習事例ｘ₁に係る素性の組み合わせと同様である。従って、処理対象ｘ₁₁は、ラベル「＋１」に分類されるべきである。同じく処理対象ｘ₁₂における素性の組み合わせは、学習事例ｘ₂に係る素性の組み合わせと同様である。従って、処理対象ｘ₁₂は、ラベル「−１」に分類されるべきである。更に、処理対象ｘ₁₃における素性の組み合わせは、学習事例ｘ₃に係る素性の組み合わせと同様である。従って、処理対象ｘ₁₃は、ラベル「＋１」に分類されるべきである。 FIG. 22 shows an example of model application using the combined model data 801b in the first round. The combination of features in the processing target x ₁₁ included in the processing target data 2201 is the same as the combination of features related to the learning example x ₁ . Therefore, the processing target x ₁₁ should be classified into the label “+1”. Similarly, the combination of features in the processing target x ₁₂ is the same as the combination of features related to the learning example x ₂ . Therefore, the processing target x ₁₂ should be classified into the label “−1”. Further, the combination of features in the processing target x ₁₃ is the same as the combination of features related to the learning example x ₃ . Therefore, the processing target x ₁₃ should be classified into the label “+1”.

処理対象ｘ₁₁に対する分類結果である第２スコアＨ₁（ｘ₁₁）は、分類されるべきラベルと同様に正である。従って、評価は「正しい」である。また、処理対象ｘ₁₃に対する分類結果である第２スコアＨ₁（ｘ₁₃）も、分類されるべきラベルと同様に正である。従って、評価は「正しい」である。一方、処理対象ｘ₁₂に対する分類結果である第２スコアＨ₁（ｘ₁₂）は、期待される負の値になっていない。従って、評価は「誤り」である。つまり、処理対象うち１つは正しく分類されていない。 The second score H₁(x₁₁), which is the classification result for the processing target x₁₁, is positive, in agreement with the label it should be assigned. The evaluation is therefore “correct”. The second score H₁(x₁₃), which is the classification result for the processing target x₁₃, is also positive, in agreement with the label it should be assigned, so its evaluation is “correct” as well. In contrast, the second score H₁(x₁₂), which is the classification result for the processing target x₁₂, is not the expected negative value. Its evaluation is therefore “error”. In other words, one of the processing targets is not classified correctly.
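Applying the first-round combined model can be sketched as a sum of third scores followed by a sign check. The third scores are those of the combined model data 801b given above. The feature sets of the processing targets are hypothetical, since the processing target data 2201 is not reproduced in this passage, and the function name is illustrative.

```python
def apply_model(third_scores, features):
    """S403: score a processing target with the combined model and
    return the score together with the label implied by its sign."""
    score = sum(third_scores.get(f, 0.0) for f in features)
    label = +1 if score > 0 else (-1 if score < 0 else 0)
    return score, label


# Third scores of the first-round combined model (801b).
model_801b = {"a": 0.26824, "b": 0.26824, "c": 0.26824, "d": -0.26824}

# Hypothetical feature sets for the processing targets, with the labels
# they should receive according to the text.
targets = {"x11": ("a", "b", "c"), "x12": ("a", "b", "d"), "x13": ("a", "b")}
expected = {"x11": +1, "x12": -1, "x13": +1}

results = {k: apply_model(model_801b, v) for k, v in targets.items()}
misclassified = [k for k in targets if results[k][1] != expected[k]]
```

Under these assumptions only x₁₂ is misclassified: its score is positive although the label −1 is expected, which mirrors the single “error” evaluation described for FIG. 22.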

図２３に、第２ラウンドにおける結合モデルデータ８０１ｃを用いたモデル適用の例を示す。処理対象データ２２０１は、図２２の場合と同様である。第２ラウンドにおける結合モデルデータ８０１ｃを用いた場合には、処理対象ｘ₁₁乃至処理対象ｘ₁₃に対する分類結果に対する評価が「正しい」である。つまり、処理対象が残らず正しく分類されている。この例は、ラウンドが増すと結合モデルの適性が高まることを示している。 FIG. 23 shows an example of model application using the combined model data 801c of the second round. The processing target data 2201 is the same as in FIG. 22. When the combined model data 801c of the second round is used, the classification results for the processing targets x₁₁ to x₁₃ are all evaluated as “correct”. In other words, every processing target is classified correctly. This example shows that the combined model fits better as the number of rounds increases.

本実施の形態によれば、ラベルが示す所定の２値のいずれかに合致する傾向（この例では、第２スコアにおける正又は負の別）とその傾向の程度（この例では、第２スコアの絶対値）を示す分類結果（この例では、第２スコア）を求める弱仮説の確信度を算出する処理の負荷を軽減できる。 According to the present embodiment, it is possible to reduce the processing load of calculating the certainty factor of a weak hypothesis that produces a classification result (in this example, the second score) indicating both a tendency to match one of the two predetermined values indicated by the label (in this example, whether the second score is positive or negative) and the degree of that tendency (in this example, the absolute value of the second score).

更に、ラベルが示す所定の２値のいずれかに合致する傾向とその傾向の程度を示す分類結果を求める弱仮説を結合させるブースティングにおける学習の収束性を担保できる。 Furthermore, it is possible to secure the convergence of learning in boosting in which a tendency that matches one of the predetermined two values indicated by the label and a weak hypothesis for obtaining a classification result indicating the degree of the tendency are combined.

以上本発明の一実施の形態を説明したが、本発明はこれに限定されるものではない。例えば、上述の機能ブロック構成はプログラムモジュール構成に一致しない場合もある。 Although one embodiment of the present invention has been described above, the present invention is not limited to this. For example, the functional block configuration described above may not match the program module configuration.

また、上で説明した各記憶領域の構成は一例であって、上記のような構成でなければならないわけではない。さらに、処理フローにおいても、処理結果が変わらなければ、処理の順番を入れ替えることや複数の処理を並列に実行させるようにしても良い。 Further, the configuration of each storage area described above is an example, and the above configuration is not necessarily required. Further, in the processing flow, if the processing result does not change, the processing order may be changed or a plurality of processes may be executed in parallel.

なお、上で述べたモデル学習装置は、コンピュータ装置であって、図２４に示すように、メモリ２５０１とＣＰＵ（Central Processing Unit）２５０３とハードディスク・ドライブ（ＨＤＤ：Hard Disk Drive）２５０５と表示装置２５０９に接続される表示制御部２５０７とリムーバブル・ディスク２５１１用のドライブ装置２５１３と入力装置２５１５とネットワークに接続するための通信制御部２５１７とがバス２５１９で接続されている。オペレーティング・システム（ＯＳ：Operating System）及び本実施例における処理を実施するためのアプリケーション・プログラムは、ＨＤＤ２５０５に格納されており、ＣＰＵ２５０３により実行される際にはＨＤＤ２５０５からメモリ２５０１に読み出される。ＣＰＵ２５０３は、アプリケーション・プログラムの処理内容に応じて表示制御部２５０７、通信制御部２５１７、ドライブ装置２５１３を制御して、所定の動作を行わせる。また、処理途中のデータについては、主としてメモリ２５０１に格納されるが、ＨＤＤ２５０５に格納されるようにしてもよい。本発明の実施例では、上で述べた処理を実施するためのアプリケーション・プログラムはコンピュータ読み取り可能なリムーバブル・ディスク２５１１に格納されて頒布され、ドライブ装置２５１３からＨＤＤ２５０５にインストールされる。インターネットなどのネットワーク及び通信制御部２５１７を経由して、ＨＤＤ２５０５にインストールされる場合もある。このようなコンピュータ装置は、上で述べたＣＰＵ２５０３、メモリ２５０１などのハードウエアとＯＳ及びアプリケーション・プログラムなどのプログラムとが有機的に協働することにより、上で述べたような各種機能を実現する。 The model learning device described above is a computer device, and as shown in FIG. 24, a memory 2501, a CPU (Central Processing Unit) 2503, a hard disk drive (HDD: Hard Disk Drive) 2505, and a display device 2509. A display control unit 2507 connected to the computer, a drive device 2513 for a removable disk 2511, an input device 2515, and a communication control unit 2517 for connecting to a network are connected by a bus 2519. An operating system (OS) and an application program for executing the processing in this embodiment are stored in the HDD 2505, and are read from the HDD 2505 to the memory 2501 when executed by the CPU 2503. The CPU 2503 controls the display control unit 2507, the communication control unit 2517, and the drive device 2513 according to the processing content of the application program, and performs a predetermined operation. Further, data in the middle of processing is mainly stored in the memory 2501, but may be stored in the HDD 2505. In the embodiment of the present invention, an application program for performing the above-described processing is stored in a computer-readable removable disk 2511 and distributed, and installed in the HDD 2505 from the drive device 2513. 
In some cases, the application program is installed on the HDD 2505 via a network such as the Internet and the communication control unit 2517. Such a computer device realizes the various functions described above through the organic cooperation of hardware such as the CPU 2503 and the memory 2501 with programs such as the OS and the application program.

以上述べた本発明の実施の形態をまとめると、以下のようになる。 The embodiment of the present invention described above is summarized as follows.

本実施の形態に係る情報処理装置は、（Ａ）学習事例と当該学習事例に対する、所定の２値のうちのいずれかを示すラベルとを含む複数のセットと、各学習事例に対する係数値とに基づいて、仮説モデルを学習する学習部と、（Ｂ）（ｂ１）仮説モデルと各学習事例とに基づき、学習事例毎に上記２値のいずれかに対応する傾向及び当該傾向の程度を示す分類結果を求め、（ｂ２）学習事例のうち、分類結果における上記傾向が、対応するラベルに対応する学習事例を特定し、特定した各学習事例について、対応する係数値と、分類結果における上記程度との積を求め、当該積の第１合計を算出し、（ｂ３）学習事例のうち、分類結果における上記傾向が、対応するラベルに対応しない学習事例を特定し、特定した各学習事例について、対応する係数値と、分類結果における上記程度との積を求め、当該積の第２合計を算出し、（ｂ４）第２合計に対する第１合計の比の対数を、各分類結果における上記程度の絶対値のうちの最大値の２倍の値で除することによって、仮説モデルの確信度を算出する算出部とを含む。 The information processing apparatus according to the present embodiment includes (A) a learning unit that learns a hypothesis model based on a plurality of sets, each including a learning case and a label indicating one of two predetermined values for that learning case, and on a coefficient value for each learning case, and (B) a calculation unit that (b1) obtains, based on the hypothesis model and each learning case, a classification result indicating for each learning case a tendency corresponding to one of the two values and the degree of that tendency, (b2) identifies, among the learning cases, those whose tendency in the classification result corresponds to the corresponding label, obtains for each identified learning case the product of the corresponding coefficient value and the degree in the classification result, and calculates a first sum of those products, (b3) identifies, among the learning cases, those whose tendency in the classification result does not correspond to the corresponding label, obtains for each identified learning case the product of the corresponding coefficient value and the degree in the classification result, and calculates a second sum of those products, and (b4) calculates the certainty factor of the hypothesis model by dividing the logarithm of the ratio of the first sum to the second sum by twice the maximum of the absolute values of the degrees in the classification results.

このようにすれば、ラベルが示す所定の２値のいずれかに対応する傾向とその傾向の程度を示す分類結果を求める仮説モデルの確信度を算出する処理の負荷を軽減できる。 In this way, it is possible to reduce the processing load for calculating the certainty factor of the hypothesis model for obtaining the classification result indicating the tendency corresponding to one of the predetermined two values indicated by the label and the degree of the tendency.

更に、上記算出部は、複数回繰り返す各ラウンドにおいて、当該ラウンドにおける各学習事例に対する係数値を用いて、当該ラウンドにおける仮説モデル及び当該仮説モデルの確信度を算出するようにしてもよい。また、各ラウンドにおける確信度に基づいて、当該ラウンドにおける仮説モデルを結合させることによって学習モデルを更新する第１更新部を含むようにしてもよい。また、確信度に基づいて、各学習事例に対する係数値を次のラウンドのための係数値に更新する第２更新部を含むようにしてもよい。 Further, in each of a plurality of repeated rounds, the calculation unit may calculate the hypothesis model for the round and the certainty factor of that hypothesis model, using the coefficient value for each learning case in the round. The apparatus may also include a first update unit that updates a learning model by combining the hypothesis model of each round based on the certainty factor for that round. The apparatus may further include a second update unit that updates, based on the certainty factor, the coefficient value for each learning case to the coefficient value for the next round.

このようにすれば、ラベルが示す所定の２値のいずれかに対応する傾向とその傾向の程度を示す分類結果を求める仮説モデルを結合させるブースティングにおける学習の収束性を担保できる。 In this way, it is possible to guarantee the convergence of learning in boosting that combines a tendency corresponding to one of the predetermined two values indicated by the label and a hypothesis model for obtaining a classification result indicating the degree of the tendency.

なお、上で述べた情報処理装置における処理をコンピュータに行わせるためのプログラムを作成することができ、当該プログラムは、例えばフレキシブルディスク、ＣＤ−ＲＯＭ、光磁気ディスク、半導体メモリ、ハードディスク等のコンピュータ読み取り可能な記憶媒体又は記憶装置に格納されるようにしてもよい。尚、中間的な処理結果は、一般的にメインメモリ等の記憶装置に一時保管される。 A program for causing a computer to perform the processing of the information processing apparatus described above can be created, and the program may be stored in a computer-readable storage medium or storage device such as a flexible disk, a CD-ROM, a magneto-optical disk, a semiconductor memory, or a hard disk. Intermediate processing results are generally held temporarily in a storage device such as a main memory.

以上の実施例を含む実施形態に関し、さらに以下の付記を開示する。 The following supplementary notes are further disclosed with respect to the embodiments including the above examples.

（付記１）
学習事例と当該学習事例に対する、所定の２値のうちのいずれかを示すラベルとを含む複数のセットと、各学習事例に対する係数値とに基づいて、仮説モデルを学習する学習部と、
前記仮説モデルと前記各学習事例とに基づき、前記学習事例毎に前記２値のいずれかに対応する傾向及び当該傾向の程度を示す分類結果を求め、
前記学習事例のうち、前記分類結果における前記傾向が、対応する前記ラベルに対応する学習事例を特定し、特定した各学習事例について、対応する前記係数値と、前記分類結果における前記程度との積を求め、当該積の第１合計を算出し、
前記学習事例のうち、前記分類結果における前記傾向が、対応する前記ラベルに対応しない学習事例を特定し、特定した各学習事例について、対応する前記係数値と、前記分類結果における前記程度との積を求め、当該積の第２合計を算出し、
前記第２合計に対する前記第１合計の比の対数を、前記各分類結果における前記程度の絶対値のうちの最大値の２倍の値で除することによって、前記仮説モデルの確信度を算出する算出部と
を含む情報処理装置。 (Appendix 1)
An information processing apparatus including:
a learning unit that learns a hypothesis model based on a plurality of sets, each including a learning case and a label indicating one of two predetermined values for that learning case, and on a coefficient value for each learning case; and
a calculation unit that obtains, based on the hypothesis model and each learning case, a classification result indicating for each learning case a tendency corresponding to one of the two values and the degree of that tendency,
identifies, among the learning cases, those whose tendency in the classification result corresponds to the corresponding label, obtains for each identified learning case the product of the corresponding coefficient value and the degree in the classification result, and calculates a first sum of those products,
identifies, among the learning cases, those whose tendency in the classification result does not correspond to the corresponding label, obtains for each identified learning case the product of the corresponding coefficient value and the degree in the classification result, and calculates a second sum of those products, and
calculates the certainty factor of the hypothesis model by dividing the logarithm of the ratio of the first sum to the second sum by twice the maximum of the absolute values of the degrees in the classification results.

(Appendix 2)
The information processing apparatus according to appendix 1, wherein
the calculation unit calculates, in each of a plurality of repeated rounds, the hypothesis model of the round and the certainty factor of that hypothesis model using the coefficient value for each learning case in the round,
the information processing apparatus further including:
a first update unit that updates a learning model by combining the hypothesis model of each round based on the certainty factor of that round; and
a second update unit that updates, based on the certainty factor, the coefficient value for each learning case to a coefficient value for the next round.
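
The round structure described here can be sketched as a boosting loop. The sketch rests on several assumptions that the text above does not state: labels are -1 or +1, classification results are real-valued scores, the combined model is a weighted sum of the hypothesis models, and the coefficient values are updated with the exponential rule used in confidence-rated AdaBoost followed by normalization. The names `boost`, `predict`, `certainty_factor` and `weak_learner` are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Example = Sequence[float]
Hypothesis = Callable[[Example], float]  # returns a real-valued score
WeakLearner = Callable[[Sequence[Example], Sequence[int], Sequence[float]], Hypothesis]


def certainty_factor(scores, labels, weights, eps=1e-12):
    """log(first sum / second sum) / (2 * maximum degree); eps is an assumption."""
    first_sum = sum(w * abs(s) for s, y, w in zip(scores, labels, weights) if s * y > 0)
    second_sum = sum(w * abs(s) for s, y, w in zip(scores, labels, weights) if s * y <= 0)
    max_degree = max(abs(s) for s in scores)
    if max_degree == 0.0:
        return 0.0
    return math.log((first_sum + eps) / (second_sum + eps)) / (2.0 * max_degree)


def boost(
    examples: Sequence[Example],
    labels: Sequence[int],
    weak_learner: WeakLearner,
    rounds: int,
) -> List[Tuple[float, Hypothesis]]:
    n = len(examples)
    weights = [1.0 / n] * n  # assumption: uniform initial coefficient values
    combined: List[Tuple[float, Hypothesis]] = []
    for _ in range(rounds):
        # Learn the hypothesis model of this round from the current coefficient values.
        hypothesis = weak_learner(examples, labels, weights)
        scores = [hypothesis(x) for x in examples]
        alpha = certainty_factor(scores, labels, weights)
        # First update: add the hypothesis model, scaled by its certainty factor.
        combined.append((alpha, hypothesis))
        # Second update (assumed rule): raise the coefficient values of cases the
        # hypothesis got wrong, lower the others, then renormalize.
        weights = [w * math.exp(-alpha * y * s) for w, y, s in zip(weights, labels, scores)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return combined


def predict(combined: List[Tuple[float, Hypothesis]], x: Example) -> int:
    """Classify with the sign of the certainty-weighted sum of scores."""
    return 1 if sum(alpha * h(x) for alpha, h in combined) >= 0 else -1
```

Any learner that accepts per-case coefficient values and returns a scoring function can be passed as `weak_learner`, for example a perceptron or a decision tree whose output is read as a signed score.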

(Appendix 3)
An information processing method executed by a computer, the method including:
a first step of learning a hypothesis model based on a plurality of sets, each including a learning case and a label indicating one of two predetermined values for the learning case, and on a coefficient value for each learning case;
a second step of obtaining, based on the hypothesis model and each learning case, a classification result for each learning case indicating a tendency toward one of the two values and a degree of that tendency;
a third step of identifying, among the learning cases, those whose tendency in the classification result matches the corresponding label, obtaining, for each identified learning case, the product of the corresponding coefficient value and the degree in the classification result, and calculating a first sum of the products;
a fourth step of identifying, among the learning cases, those whose tendency in the classification result does not match the corresponding label, obtaining, for each identified learning case, the product of the corresponding coefficient value and the degree in the classification result, and calculating a second sum of the products; and
a fifth step of calculating a certainty factor of the hypothesis model by dividing the logarithm of the ratio of the first sum to the second sum by twice the maximum of the absolute values of the degrees in the classification results.
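
The third to fifth steps can also be written as formulas. The notation is introduced here only for illustration and does not appear in the source: $h(x_i)$ is the classification result for learning case $x_i$ (its sign is the tendency, its absolute value the degree), $y_i \in \{-1, +1\}$ is the label, $w_i$ is the coefficient value, and $\alpha$ is the certainty factor (confidence). The natural logarithm is assumed.

```latex
\begin{align}
W_{+} &= \sum_{i \,:\, \operatorname{sign}(h(x_i)) = y_i} w_i \,\lvert h(x_i) \rvert
  && \text{(first sum, third step)} \\
W_{-} &= \sum_{i \,:\, \operatorname{sign}(h(x_i)) \neq y_i} w_i \,\lvert h(x_i) \rvert
  && \text{(second sum, fourth step)} \\
\alpha &= \frac{1}{2 \max_i \lvert h(x_i) \rvert} \, \ln \frac{W_{+}}{W_{-}}
  && \text{(fifth step)}
\end{align}
```

When every classification result is exactly $\pm 1$, the maximum degree is 1 and the expression reduces to $\alpha = \tfrac{1}{2} \ln (W_{+} / W_{-})$, the familiar form for weak hypotheses with binary output.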

(Appendix 4)
The information processing method according to appendix 3, wherein
in each round in which the first to fifth steps are repeated, the hypothesis model of the round and the certainty factor of that hypothesis model are calculated using the coefficient value for each learning case in the round,
the method further including:
a sixth step of updating a learning model by combining the hypothesis model of each round based on the certainty factor of that round; and
a seventh step of updating, based on the certainty factor, the coefficient value for each learning case to a coefficient value for the next round.

(Appendix 5)
An information processing program that causes a computer to execute:
a first step of learning a hypothesis model based on a plurality of sets, each including a learning case and a label indicating one of two predetermined values for the learning case, and on a coefficient value for each learning case;
a second step of obtaining, based on the hypothesis model and each learning case, a classification result for each learning case indicating a tendency toward one of the two values and a degree of that tendency;
a third step of identifying, among the learning cases, those whose tendency in the classification result matches the corresponding label, obtaining, for each identified learning case, the product of the corresponding coefficient value and the degree in the classification result, and calculating a first sum of the products;
a fourth step of identifying, among the learning cases, those whose tendency in the classification result does not match the corresponding label, obtaining, for each identified learning case, the product of the corresponding coefficient value and the degree in the classification result, and calculating a second sum of the products; and
a fifth step of calculating a certainty factor of the hypothesis model by dividing the logarithm of the ratio of the first sum to the second sum by twice the maximum of the absolute values of the degrees in the classification results.

Description of Reference Signs

101 Learning data input unit
103 Learning data storage unit
105 Model learning unit
106 Parameter storage unit
107 Weight data storage unit
108 Certainty factor storage unit
109 Model data storage unit
111 Target data input unit
113 Target data storage unit
115 Model application unit
117 Classification result storage unit
119 Output unit
121 Reception unit
201 Boosting unit
203 Weak learning unit
205 Initialization unit
207 Certainty factor calculation unit
209 Combined model update unit
211 Weight update unit
601 Learning case data
701 Label data
801 Combined model data
901 Weight data
1001 Weak hypothesis data
1201 Classification result data
2201 Processing target data

Claims

An information processing apparatus including:
a learning unit that learns a hypothesis model based on a plurality of sets, each including a learning case and a label indicating one of two predetermined values for the learning case, and on a coefficient value for each learning case; and
a calculation unit that
obtains, based on the hypothesis model and each learning case, a classification result for each learning case indicating a tendency toward one of the two values and a degree of that tendency,
identifies, among the learning cases, those whose tendency in the classification result matches the corresponding label, obtains, for each identified learning case, the product of the corresponding coefficient value and the degree in the classification result, and calculates a first sum of the products,
identifies, among the learning cases, those whose tendency in the classification result does not match the corresponding label, obtains, for each identified learning case, the product of the corresponding coefficient value and the degree in the classification result, and calculates a second sum of the products, and
calculates a certainty factor of the hypothesis model by dividing the logarithm of the ratio of the first sum to the second sum by twice the maximum of the absolute values of the degrees in the classification results.

The information processing apparatus according to claim 1, wherein
the calculation unit calculates, in each of a plurality of repeated rounds, the hypothesis model of the round and the certainty factor of that hypothesis model using the coefficient value for each learning case in the round,
the information processing apparatus further including:
a first update unit that updates a learning model by combining the hypothesis model of each round based on the certainty factor of that round; and
a second update unit that updates, based on the certainty factor, the coefficient value for each learning case to a coefficient value for the next round.

An information processing method executed by a computer, the method including:
a first step of learning a hypothesis model based on a plurality of sets, each including a learning case and a label indicating one of two predetermined values for the learning case, and on a coefficient value for each learning case;
a second step of obtaining, based on the hypothesis model and each learning case, a classification result for each learning case indicating a tendency toward one of the two values and a degree of that tendency;
a third step of identifying, among the learning cases, those whose tendency in the classification result matches the corresponding label, obtaining, for each identified learning case, the product of the corresponding coefficient value and the degree in the classification result, and calculating a first sum of the products;
a fourth step of identifying, among the learning cases, those whose tendency in the classification result does not match the corresponding label, obtaining, for each identified learning case, the product of the corresponding coefficient value and the degree in the classification result, and calculating a second sum of the products; and
a fifth step of calculating a certainty factor of the hypothesis model by dividing the logarithm of the ratio of the first sum to the second sum by twice the maximum of the absolute values of the degrees in the classification results.

An information processing program that causes a computer to execute:
a first step of learning a hypothesis model based on a plurality of sets, each including a learning case and a label indicating one of two predetermined values for the learning case, and on a coefficient value for each learning case;
a second step of obtaining, based on the hypothesis model and each learning case, a classification result for each learning case indicating a tendency toward one of the two values and a degree of that tendency;
a third step of identifying, among the learning cases, those whose tendency in the classification result matches the corresponding label, obtaining, for each identified learning case, the product of the corresponding coefficient value and the degree in the classification result, and calculating a first sum of the products;
a fourth step of identifying, among the learning cases, those whose tendency in the classification result does not match the corresponding label, obtaining, for each identified learning case, the product of the corresponding coefficient value and the degree in the classification result, and calculating a second sum of the products; and
a fifth step of calculating a certainty factor of the hypothesis model by dividing the logarithm of the ratio of the first sum to the second sum by twice the maximum of the absolute values of the degrees in the classification results.