JPH0962652A

JPH0962652A - Maximum likelihood estimation method

Info

Publication number: JPH0962652A
Application number: JP22081195A
Authority: JP
Inventors: Shuko Ueda; 修功上田
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1995-08-29
Filing date: 1995-08-29
Publication date: 1997-03-07

Abstract

PROBLEM TO BE SOLVED: To provide a maximum likelihood estimation method that resolves the local-optimality problem of the EM algorithm and obtains a better parameter estimate than the EM algorithm even for an arbitrary initial value of the parameter. SOLUTION: A joint probability density function of an observed data set and an unobserved data set is given parametrically, a log-likelihood function based on the observed data is calculated using this joint probability density function, and the parameter that maximizes the function is obtained by a nonlinear numerical solution method.

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a maximum likelihood estimation method that estimates parameters from observed data that are incomplete, that is, data that include unobserved data because part of the data is missing or cannot be observed. More particularly, it relates to a maximum likelihood estimation method that is effective as a basic technique for point estimation in parametric statistics, such as parameter estimation of probability density functions, for example mixture density functions.

[0002]

[Prior Art] Suppose that a probability model is given as a parametric model $p(x; \Theta)$ and that the unknown parameter $\Theta$ is to be estimated from a data set $X = \{x_1, \ldots, x_N\}$. In maximum likelihood estimation, the parameter that maximizes the log-likelihood function defined by the following equation is taken as the estimate. This estimate is called the maximum likelihood estimate.

[0003]

$$L(\Theta; X) = \sum_{n=1}^{N} \log p(x_n; \Theta) \tag{1}$$

This is based on the likelihood principle: because the data set $X$ is generated from the probability density function $p(x; \Theta)$, it follows conversely that, given $X$, the most plausible $\Theta$ is the value most likely to have produced $X$, that is, the $\Theta$ for which $p(X; \Theta)$ is largest. The logarithm is taken only to simplify the mathematical handling and is not essential.
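As a minimal sketch of equation (1), assume a one-dimensional Gaussian model with a known unit standard deviation and a small hypothetical data set (neither appears in the original). The log-likelihood is evaluated on a grid of candidate means, and the maximizer agrees with the sample mean up to the grid spacing.

```python
import numpy as np

def log_likelihood(x, mean, std=1.0):
    """Equation (1) for a Gaussian model: sum over n of log p(x_n; mean, std)."""
    z = (np.asarray(x) - mean) / std
    return float(np.sum(-0.5 * z**2 - np.log(std) - 0.5 * np.log(2.0 * np.pi)))

# Hypothetical data set X = {x_1, ..., x_N}.
x = np.array([1.8, 2.1, 2.4, 1.9, 2.3])

# Evaluate the log-likelihood on a grid of candidate means and keep the best one.
grid = np.linspace(0.0, 4.0, 401)
values = np.array([log_likelihood(x, m) for m in grid])
best_mean = grid[np.argmax(values)]
print(best_mean, x.mean())
```

A grid search is used here only because it makes the maximization visible; for this model the maximizer is available in closed form.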

[0004] Next, consider maximum likelihood estimation when part of the data set $X$ is missing or cannot be observed. That is, suppose that $X$ consists of an observed data set $X_{\mathrm{obs}}$ and an unobserved data set $X_{\mathrm{mis}}$. The data set $X$ is then called the complete data (set), and the observed data set $X_{\mathrm{obs}}$ is called the incomplete data (set). If the probability density function of the complete data is given parametrically as $p(X_{\mathrm{obs}}, X_{\mathrm{mis}}; \Theta)$, the log-likelihood function based on the observed data is obtained from the marginal distribution as

$$L(\Theta; X_{\mathrm{obs}}) = \log \int p(X_{\mathrm{obs}}, X_{\mathrm{mis}}; \Theta)\, dX_{\mathrm{mis}} \tag{2}$$
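The marginalization in equation (2) can be sketched for a mixture model, where the unobserved data are the component labels and the integral therefore becomes a sum. The unit-variance Gaussian components, the example weights and the sample values are assumptions for illustration, and the samples are assumed independent so that the log of the marginal is a sum over samples.

```python
import numpy as np

def joint_density(x, k, means, weights):
    """p(x, k; Theta): weight of component k times a unit-variance Gaussian at x."""
    return weights[k] * np.exp(-0.5 * (x - means[k]) ** 2) / np.sqrt(2.0 * np.pi)

def observed_log_likelihood(x_obs, means, weights):
    """Equation (2) with the integral over X_mis replaced by a sum over labels."""
    total = 0.0
    for x in x_obs:
        # Marginalize the unobserved label k out of the joint density.
        marginal = sum(joint_density(x, k, means, weights) for k in range(len(means)))
        total += np.log(marginal)
    return float(total)

# Hypothetical observed samples; the labels that generated them are not observed.
x_obs = [-2.3, -1.7, 1.9, 2.4]
print(observed_log_likelihood(x_obs, means=[-2.0, 2.0], weights=[0.3, 0.7]))
```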

[0005] The maximum likelihood estimate is obtained by maximizing $L$, that is, as the solution of the likelihood equation

$$\frac{\partial L(\Theta; X_{\mathrm{obs}})}{\partial \Theta} = 0 \tag{3}$$

Because this is in general a nonlinear equation, an analytical solution is difficult to obtain, and the equation is usually solved numerically by some iterative algorithm. Moreover, for maximum likelihood estimation from incomplete data there is, instead of solving the likelihood equation directly by numerical means, a more general and efficient known method called the EM (Expectation Maximization) algorithm, described below (A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm", Journal of the Royal Statistical Society, Ser. B, vol. 39, pp. 1-38, 1977).

[0006] In the EM algorithm, the parameters are estimated by successive iterations. Let $\Theta^{(t)}$ denote the estimate after the $t$-th iteration; $\Theta^{(t+1)}$ is then obtained by the following E and M steps.

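[0007]

E step: compute the Q function

$$Q(\Theta; \Theta^{(t)}) = \int p(X_{\mathrm{mis}} \mid X_{\mathrm{obs}}; \Theta^{(t)}) \log p(X_{\mathrm{obs}}, X_{\mathrm{mis}}; \Theta)\, dX_{\mathrm{mis}} \tag{4}$$

M step: $\Theta^{(t+1)}$ must satisfy

$$\Theta^{(t+1)} = \arg\max_{\Theta} Q(\Theta; \Theta^{(t)})$$

Here $p(X_{\mathrm{mis}} \mid X_{\mathrm{obs}}; \Theta^{(t)})$ is the posterior probability density function, which is calculated from the probability density function of the complete data as shown in the following equation.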

[0008]

$$p(X_{\mathrm{mis}} \mid X_{\mathrm{obs}}; \Theta^{(t)}) = \frac{p(X_{\mathrm{obs}}, X_{\mathrm{mis}}; \Theta^{(t)})}{\int p(X_{\mathrm{obs}}, X_{\mathrm{mis}}; \Theta^{(t)})\, dX_{\mathrm{mis}}} \tag{7}$$

It has been proved theoretically that, in the EM algorithm, successive maximization of the Q function with respect to $\Theta$ is equivalent to successive maximization of the log-likelihood function (A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm", Journal of the Royal Statistical Society, Ser. B, vol. 39, pp. 1-38, 1977).
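The E and M steps can be sketched for a mixture of two one-dimensional Gaussians in which only the two means are unknown. The mixing weights (0.3, 0.7), the unit variances, the synthetic data and the initial value are assumptions chosen for illustration. The recorded log-likelihood values illustrate the property cited above: each EM iteration does not decrease the log-likelihood.

```python
import numpy as np

WEIGHTS = np.array([0.3, 0.7])  # assumed mixing weights, held fixed

def component_densities(x, means):
    """w_k * N(x_n; m_k, 1) for every sample n and component k, shape (N, 2)."""
    return WEIGHTS * np.exp(-0.5 * (x[:, None] - means[None, :]) ** 2) / np.sqrt(2.0 * np.pi)

def log_likelihood(x, means):
    """Observed-data log-likelihood: labels are summed out for each sample."""
    return float(np.sum(np.log(component_densities(x, means).sum(axis=1))))

def em_step(x, means):
    joint = component_densities(x, means)
    post = joint / joint.sum(axis=1, keepdims=True)            # E step: posterior of labels
    return (post * x[:, None]).sum(axis=0) / post.sum(axis=0)  # M step: weighted means

# Synthetic observed data (labels are discarded after sampling).
rng = np.random.default_rng(0)
labels = rng.random(100) < 0.3
x = np.where(labels, rng.normal(-2.0, 1.0, 100), rng.normal(2.0, 1.0, 100))

means = np.array([0.5, -0.5])  # arbitrary initial value
history = [log_likelihood(x, means)]
for _ in range(50):
    means = em_step(x, means)
    history.append(log_likelihood(x, means))
print(means, history[0], history[-1])
```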

[0009]

[Problems to be Solved by the Invention] The conventional EM algorithm described above merely converges to a locally optimal solution (a local maximum) of the log-likelihood function and does not necessarily find its global maximum. Because there is generally no effective guideline for choosing the initial value, in practice an inefficient approach is taken: the EM algorithm is run from various initial values and the best of the resulting solutions is selected.
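The multiple-restart practice described above can be sketched as follows. The model (two unit-variance Gaussians with assumed weights 0.3 and 0.7), the number of restarts and the range of the random initial values are all illustrative assumptions.

```python
import numpy as np

WEIGHTS = np.array([0.3, 0.7])  # assumed mixing weights, held fixed

def log_likelihood(x, means):
    dens = WEIGHTS * np.exp(-0.5 * (x[:, None] - means[None, :]) ** 2) / np.sqrt(2.0 * np.pi)
    return float(np.sum(np.log(dens.sum(axis=1))))

def run_em(x, means, n_iter=200):
    """Plain EM for the two means, started from the given initial value."""
    for _ in range(n_iter):
        dens = WEIGHTS * np.exp(-0.5 * (x[:, None] - means[None, :]) ** 2)
        post = dens / dens.sum(axis=1, keepdims=True)               # E step
        means = (post * x[:, None]).sum(axis=0) / post.sum(axis=0)  # M step
    return means

rng = np.random.default_rng(1)
x = np.where(rng.random(100) < 0.3, rng.normal(-2.0, 1.0, 100), rng.normal(2.0, 1.0, 100))

# Run EM from several random initial values and keep the highest-likelihood result.
starts = rng.uniform(-4.0, 4.0, size=(10, 2))
results = [run_em(x, s) for s in starts]
scores = [log_likelihood(x, m) for m in results]
best = results[int(np.argmax(scores))]
print(best, max(scores))
```

Each restart costs a full EM run, and nothing guarantees that any of the starts reaches the global maximum, which is the inefficiency the text refers to.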

[0010] The present invention has been made in view of the above. Its object is to provide a maximum likelihood estimation method that resolves the problem of local optimality in the EM algorithm and can obtain a better parameter estimate than the EM algorithm even for an arbitrary initial value of the parameter.

[0011]

[Means for Solving the Problems] To achieve the above object, the invention according to claim 1 is characterized in that a joint probability density function of an observed data set and an unobserved data set is given parametrically, a log-likelihood function based on the observed data is calculated using the joint probability density function, and the parameter that maximizes the function is obtained by a nonlinear numerical solution method.

[0012] The invention according to claim 2 is characterized in that, in the invention according to claim 1, the optimal parameter value of the original log-likelihood function is obtained by carrying out the following steps: a first step of constructing, for an appropriate initial parameter value, a first-order smoothed log-likelihood function obtained by smoothing the log-likelihood function so that it has a unique local maximum; a second step of obtaining, by the nonlinear numerical solution method, the parameter that maximizes the first-order smoothed log-likelihood function and setting it as the first-order optimal parameter; a third step of constructing a second-order smoothed log-likelihood function whose degree of smoothing is weaker than in the first step, using the first-order optimal parameter value as the initial value; and a fourth step of obtaining the second-order optimal parameter on the second-order smoothed log-likelihood function in the same manner as above, and thereafter recursively repeating the same smoothing process while weakening the degree of smoothing until the degree of smoothing becomes zero, that is, until the original log-likelihood function is reached.

[0013] Further, the invention according to claim 3 is characterized in that, in the invention according to claim 2, the smoothed log-likelihood function is obtained in the first to fourth steps using a posterior probability density function calculated from the joint probability density function of the observed data and the unobserved data.

[0014] The invention according to claim 4 is characterized in that, in the invention according to claim 1, 2 or 3, a successive iteration method is used as the nonlinear numerical solution method.

[0015] The invention according to claim 5 is characterized in that, in the invention according to claim 1, 2, 3 or 4, the degree of smoothing is decreased in a geometric progression from one smoothed log-likelihood function to the next.

[0016] In the maximum likelihood estimation method of the present invention, the smoothing is realized by using, when the Q function is calculated in the E step of the EM algorithm, a new posterior probability density function defined by the following equation in place of equation (7).

[0017]

[Equation 6] is obtained. Comparing the first term on the right-hand side of equation (10) with equation (4) shows that this term is the Q function with the posterior probability density function $p(X_{\mathrm{mis}} \mid X_{\mathrm{obs}}; \Theta^{(t)})$ replaced by $f(X_{\mathrm{mis}} \mid X_{\mathrm{obs}})$. Since, in addition, the second term on the right-hand side of equation (10) does not depend on $\Theta$, maximizing the Q function with respect to $\Theta$ is, apart from the difference between $p(X_{\mathrm{mis}} \mid X_{\mathrm{obs}}; \Theta^{(t)})$ and $f(X_{\mathrm{mis}} \mid X_{\mathrm{obs}})$, equivalent to maximizing the right-hand side of equation (10) with respect to $\Theta$. Furthermore, estimating $\Theta$ successively with $f(X_{\mathrm{mis}} \mid X_{\mathrm{obs}})$ in place of $p(X_{\mathrm{mis}} \mid X_{\mathrm{obs}}; \Theta^{(t)})$ means that, instead of seeking the parameter that maximizes the likelihood function of equation (2), one is successively estimating the parameter that maximizes the β-controlled likelihood function $L_\beta$ defined by the left-hand side of equation (10):

[Equation 7]

Setting β to a positive value smaller than 1 smooths $L$. Clearly, when β = 1, $L_1(\Theta; X_{\mathrm{obs}}) \equiv L(\Theta; X_{\mathrm{obs}})$, that is, equation (11) coincides with equation (2), and the closer β is to zero, the greater the degree of smoothing.
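Equations 6 and 7 of the original are not reproduced in this text. One reconstruction that is consistent with the description above, offered as an assumption rather than as the original equations, takes the new posterior to be the joint density raised to the power β and renormalized. The identity playing the role of equation (10) then follows in two lines.

```latex
\begin{align*}
% Assumed form of the new posterior (stands in for the unreproduced definition of f)
f(X_{\mathrm{mis}} \mid X_{\mathrm{obs}})
  &= \frac{p(X_{\mathrm{obs}}, X_{\mathrm{mis}}; \Theta)^{\beta}}
          {\int p(X_{\mathrm{obs}}, X'_{\mathrm{mis}}; \Theta)^{\beta}\, dX'_{\mathrm{mis}}} \\[4pt]
% Take logarithms, divide by beta, and rearrange
\frac{1}{\beta} \log \int p(X_{\mathrm{obs}}, X'_{\mathrm{mis}}; \Theta)^{\beta}\, dX'_{\mathrm{mis}}
  &= \log p(X_{\mathrm{obs}}, X_{\mathrm{mis}}; \Theta)
     - \frac{1}{\beta} \log f(X_{\mathrm{mis}} \mid X_{\mathrm{obs}}) \\[4pt]
% Multiply by f and integrate over X_mis (f integrates to 1): plays the role of equation (10)
\frac{1}{\beta} \log \int p(X_{\mathrm{obs}}, X_{\mathrm{mis}}; \Theta)^{\beta}\, dX_{\mathrm{mis}}
  &= \int f(X_{\mathrm{mis}} \mid X_{\mathrm{obs}}) \log p(X_{\mathrm{obs}}, X_{\mathrm{mis}}; \Theta)\, dX_{\mathrm{mis}}
     - \frac{1}{\beta} \int f(X_{\mathrm{mis}} \mid X_{\mathrm{obs}}) \log f(X_{\mathrm{mis}} \mid X_{\mathrm{obs}})\, dX_{\mathrm{mis}} \\[4pt]
% The left-hand side as the beta-controlled likelihood: plays the role of equation (11)
L_{\beta}(\Theta; X_{\mathrm{obs}})
  &= \frac{1}{\beta} \log \int p(X_{\mathrm{obs}}, X_{\mathrm{mis}}; \Theta)^{\beta}\, dX_{\mathrm{mis}},
  \qquad L_{1}(\Theta; X_{\mathrm{obs}}) = L(\Theta; X_{\mathrm{obs}})
\end{align*}
```

Under this assumed form, setting β = 1 recovers equation (2) and the posterior of equation (7), as the text requires. If f is evaluated at $\Theta^{(t)}$ and held fixed during the M step, the second term is constant in $\Theta$, and the right-hand side becomes a lower bound on $L_\beta$ that is tight at $\Theta = \Theta^{(t)}$.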

[0018] Now suppose that the value of β is varied as

[Equation 8]

yielding a family of smoothed likelihood functions $L_{\beta_1}, L_{\beta_2}, \ldots, L_{\beta_{\max}}$ whose degrees of smoothing differ according to the value of β. If $\beta_1$ is chosen so that $L_{\beta_1}$ is unimodal, the globally optimal parameter that gives its maximum (denoted $\Theta_1^*$) is easily found. Next, because $L_{\beta_1}$ and $L_{\beta_2}$ are similar in shape, the optimal parameter $\Theta_2^*$ of $L_{\beta_2}$ can be expected to lie near $\Theta_1^*$, so $\Theta_2^*$ is easily found by using $\Theta_1^*$ as the initial value. Carrying out the same procedure successively for $\beta_3, \beta_4, \ldots$ finally yields the optimal parameter that maximizes $L_{\beta_{\max}}$, that is, the original likelihood function. As a rule of thumb, it is known that the estimated parameters are obtained efficiently when the value of β is changed in a geometric progression.
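A geometric schedule for β can be sketched as below. Since Equation 8 is not reproduced here, the ordering (β starting small and ending at 1, so that the degree of smoothing decreases), the starting value and the ratio are assumptions; `beta_schedule` is a hypothetical helper name.

```python
def beta_schedule(beta_1, ratio):
    """Return beta_1, beta_1 * ratio, ... increasing geometrically and ending at 1."""
    if not 0.0 < beta_1 <= 1.0:
        raise ValueError("beta_1 must lie in (0, 1]")
    if ratio <= 1.0:
        raise ValueError("ratio must be greater than 1 so that beta increases")
    betas = []
    beta = beta_1
    while beta < 1.0:
        betas.append(beta)
        beta *= ratio
    betas.append(1.0)  # the final stage is always the original likelihood
    return betas

print(beta_schedule(0.1, 2.0))  # [0.1, 0.2, 0.4, 0.8, 1.0]
```

Clamping the last value to exactly 1 ensures that the final search runs on the original, unsmoothed likelihood function.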

[0019]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the drawings.

[0020] FIG. 1 is a block diagram showing the functional configuration of an apparatus that carries out the maximum likelihood estimation method according to an embodiment of the present invention. In the figure, a parameter estimation unit 1 receives training observation data from outside and estimates the parameters using these data. The parameter estimation unit 1 obtains the estimated parameters by executing procedures 1 to 3 below in order.

[0021] (Procedure 1) Set the initial values of the parameters and the value of β appropriately. Set t ← 0.

[0022] (Procedure 2) Repeat procedures 2-1 and 2-2 until an appropriate convergence condition is satisfied.

[0023]

[Equation 9]

By procedures 1 to 3 above, the parameters that maximize the likelihood function of equation (2) are estimated.
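Procedures 2-1, 2-2 and 3 are contained in Equation 9, which is not reproduced in this text. The sketch below assumes that procedure 2-1 is an E step using the β-powered posterior, that procedure 2-2 is the usual M step, and that procedure 3 raises β and returns to procedure 2 until β reaches 1. The two-Gaussian model with weights (0.3, 0.7) and unit variances, the β schedule, the tolerance and the synthetic data are likewise illustrative assumptions.

```python
import numpy as np

WEIGHTS = np.array([0.3, 0.7])  # assumed mixing weights, held fixed

def log_joint(x, means):
    """log p(x_n, k; Theta) for every sample n and component k (unit variance assumed)."""
    return np.log(WEIGHTS) - 0.5 * (x[:, None] - means[None, :]) ** 2 - 0.5 * np.log(2.0 * np.pi)

def tempered_posterior(x, means, beta):
    """f(k | x_n): joint density raised to the power beta, renormalized over k."""
    a = beta * log_joint(x, means)
    a -= a.max(axis=1, keepdims=True)  # stabilize the exponentials
    w = np.exp(a)
    return w / w.sum(axis=1, keepdims=True)

def annealed_em(x, means, betas, tol=1e-8, max_iter=500):
    means = np.asarray(means, dtype=float)
    for beta in betas:                  # assumed procedure 3: weaken the smoothing
        for _ in range(max_iter):       # procedure 2: iterate until convergence
            f = tempered_posterior(x, means, beta)                    # assumed 2-1
            new_means = (f * x[:, None]).sum(axis=0) / f.sum(axis=0)  # assumed 2-2
            converged = np.max(np.abs(new_means - means)) < tol
            means = new_means
            if converged:
                break
    return means

rng = np.random.default_rng(0)
x = np.where(rng.random(100) < 0.3, rng.normal(-2.0, 1.0, 100), rng.normal(2.0, 1.0, 100))
estimate = annealed_em(x, means=[3.0, -3.0], betas=[0.1, 0.2, 0.4, 0.8, 1.0])
print(estimate)
```

With `betas=[1.0]` the routine reduces to the ordinary EM algorithm, because the β-powered posterior then equals the posterior of equation (7).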

[0024] FIGS. 2 to 4 demonstrate the effectiveness of the present invention experimentally. The experiment used a two-parameter estimation problem, for which the search for the solution can be visualized (in this case the search takes place in a two-dimensional parameter plane). Specifically, the parameter estimation problem for a mixture of two one-dimensional Gaussian distributions,

[Equation 10]

was used. In this case the parameters to be estimated are $m_1$ and $m_2$. The training data $X_{\mathrm{obs}}$ used in the experiment consisted of 100 samples generated artificially from the mixture distribution of the above equation with $m_1^* = -2$ and $m_2^* = 2$.

[0025] FIG. 2 shows contour lines of the β-controlled likelihood function $L_\beta(\Theta; X_{\mathrm{obs}})$ of equation (11) for the model of equation (13), for various values of β. The x marks in the figure indicate the $(m_1, m_2)$ coordinates at which the smoothed likelihood function for each value of β attains its maximum. As noted above, when β = 1 equation (11) is equivalent to equation (2), so the contour lines for β = 1 in FIG. 2 correspond to those of the original likelihood function. FIG. 2 confirms that the smaller the value of β, the more the likelihood function is smoothed, and that between smoothed likelihood functions whose values of β differ only slightly, the globally optimal parameters that give the maxima are close to each other.
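Surfaces of the kind plotted in FIG. 2 can be computed as sketched below. Equations (11) and (13) are not reproduced in this text, so the form of $L_\beta$ used here (a per-sample sum of $(1/\beta)\log\sum_k (w_k N(x_n; m_k, 1))^\beta$), the mixing weights (0.3, 0.7), the unit variances and the synthetic data are all assumptions. The resulting `surface` array could be passed to any contour-plotting routine.

```python
import numpy as np

WEIGHTS = np.array([0.3, 0.7])  # assumed mixing weights

def smoothed_log_likelihood(x, m1, m2, beta):
    """Assumed L_beta: sum over n of (1/beta) * log sum_k (w_k N(x_n; m_k, 1))**beta."""
    means = np.array([m1, m2])
    log_joint = np.log(WEIGHTS) - 0.5 * (x[:, None] - means) ** 2 - 0.5 * np.log(2.0 * np.pi)
    a = beta * log_joint
    peak = a.max(axis=1, keepdims=True)
    # Log-sum-exp over the two components, computed stably.
    log_sum = peak[:, 0] + np.log(np.exp(a - peak).sum(axis=1))
    return float(log_sum.sum() / beta)

rng = np.random.default_rng(0)
x = np.where(rng.random(100) < 0.3, rng.normal(-2.0, 1.0, 100), rng.normal(2.0, 1.0, 100))

axis = np.linspace(-4.0, 4.0, 41)
for beta in (0.1, 0.3, 1.0):
    surface = np.array([[smoothed_log_likelihood(x, m1, m2, beta) for m2 in axis]
                        for m1 in axis])
    i, j = np.unravel_index(np.argmax(surface), surface.shape)
    print(beta, axis[i], axis[j])  # grid location of the maximum for this beta
```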

[0026] In the conventional method (the EM algorithm), parameter estimation (the search for the maximum of the likelihood function) is carried out on the likelihood function for β = 1 in FIG. 2(f). In the method of the present invention, by contrast, the maximum is first searched for in FIG. 2(a); then, with the resulting parameter estimate as the new initial value, the maximum of the likelihood function in FIG. 2(b) is searched for. The same search is repeated up to FIG. 2(f), finally yielding the optimal parameter that gives the maximum of the original likelihood function.

[0027] FIGS. 3 and 4 compare the estimation process and the estimation result of the method according to the present invention with those of the conventional method for the above problem. As described above, the method according to the present invention searches successively with different values of β as shown in FIG. 2, but for convenience the estimation process is displayed for β = 1, that is, on the original likelihood function. FIG. 3 shows the case where the initial value is

[Equation 11]

and the estimation process and the estimation result are shown for each case. With the conventional method, the estimate converges in every case to the neighborhood of the local optimum

[Equation 12]

whereas with the method according to the present invention it converges to the neighborhood of the global optimum

[Equation 13]

[0028] A typical application of the maximum likelihood estimation method of the present invention is cluster analysis. Cluster analysis means finding a number of "clusters" in a set of multidimensional data, which allows the data set to be divided broadly into groups of similar items. For example, for a two-dimensional data set such as that shown in FIG. 5 (each data item consists of two attributes), the elliptical clusters shown in the figure are found. When each cluster is expressed parametrically by an analytic function such as a Gaussian function and the data set is expressed as a linear sum of these functions, the parameters and the linear weight of each function can be estimated automatically from the given data alone by the present invention.
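The clustering use case can be sketched with a two-dimensional Gaussian mixture whose weights, means and full covariance matrices (the elliptical clusters) are all estimated from the data. For brevity this sketch uses the ordinary EM algorithm rather than the smoothed procedure of the invention; the synthetic data, the initial means and the small regularization added to each covariance are assumptions.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate Gaussian evaluated at each row of x."""
    d = x.shape[1]
    diff = x - mean
    expo = -0.5 * np.einsum("ni,ij,nj->n", diff, np.linalg.inv(cov), diff)
    return np.exp(expo) / np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))

def fit_mixture(x, means, n_iter=100):
    k, d = means.shape
    weights = np.full(k, 1.0 / k)
    covs = np.array([np.eye(d) for _ in range(k)])
    for _ in range(n_iter):
        dens = np.stack([weights[j] * gaussian_pdf(x, means[j], covs[j])
                         for j in range(k)], axis=1)
        post = dens / dens.sum(axis=1, keepdims=True)   # E step: responsibilities
        nk = post.sum(axis=0)                           # M step: update all parameters
        weights = nk / len(x)
        means = (post.T @ x) / nk[:, None]
        for j in range(k):
            diff = x - means[j]
            covs[j] = (post[:, j, None] * diff).T @ diff / nk[j] + 1e-6 * np.eye(d)
    # Responsibilities from the last E step give a hard cluster label per sample.
    return weights, means, covs, post.argmax(axis=1)

rng = np.random.default_rng(0)
a = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=150)
b = rng.multivariate_normal([6.0, 6.0], [[1.0, -0.4], [-0.4, 1.0]], size=150)
x = np.vstack([a, b])
weights, means, covs, labels = fit_mixture(x, means=np.array([[1.0, 1.0], [5.0, 5.0]]))
print(weights, means)
```

Each fitted covariance matrix defines the orientation and extent of one ellipse, and the fitted weights are the linear weights of the mixture.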

[0029]

[Effects of the Invention] As described above, according to the present invention, a joint probability density function of an observed data set and an unobserved data set is given parametrically, a log-likelihood function based on the observed data is calculated using the joint probability density function, and the parameter that maximizes the function is obtained by a nonlinear numerical solution method. As a result, a good parameter estimate can be obtained even for initial values from which the EM algorithm would reach only a locally optimal solution.

[Brief description of drawings]

FIG. 1 is a block diagram showing the functional configuration of an apparatus that carries out the maximum likelihood estimation method according to an embodiment of the present invention.

FIG. 2 is a diagram illustrating the effect of the present invention, showing contour lines of the β-controlled likelihood function of equation (11) for the model of equation (13) for various values of β.

FIG. 3 is a diagram comparing the estimation process and the estimation result of the method according to the present invention with those of the conventional method.

FIG. 4 is a diagram comparing the estimation process and the estimation result of the method according to the present invention with those of the conventional method.

FIG. 5 is a diagram showing the elliptical clusters found in a cluster analysis of a two-dimensional data set.

[Explanation of symbols]

1: Parameter estimation unit

Claims

[Claims]

1. A maximum likelihood estimation method, characterized in that a joint probability density function of an observed data set and an unobserved data set is given parametrically, a log-likelihood function based on the observed data is calculated using the joint probability density function, and the parameter that maximizes the function is obtained by a nonlinear numerical solution method.

2. The maximum likelihood estimation method according to claim 1, characterized in that the optimal parameter value of the original log-likelihood function is obtained by carrying out: a first step of constructing, for an appropriate initial parameter value, a first-order smoothed log-likelihood function obtained by smoothing the log-likelihood function so that it has a unique local maximum; a second step of obtaining, by the nonlinear numerical solution method, the parameter that maximizes the first-order smoothed log-likelihood function and setting it as the first-order optimal parameter; a third step of constructing a second-order smoothed log-likelihood function whose degree of smoothing is weaker than in the first step, using the first-order optimal parameter value as the initial value; and a fourth step of obtaining the second-order optimal parameter on the second-order smoothed log-likelihood function in the same manner as above, and thereafter recursively repeating the same smoothing process while weakening the degree of smoothing until the degree of smoothing becomes zero, that is, until the original log-likelihood function is reached.

3. The maximum likelihood estimation method according to claim 2, characterized in that, in the first to fourth steps, the smoothed log-likelihood function is obtained using a posterior probability density function calculated from the joint probability density function of the observed data and the unobserved data.

4. The maximum likelihood estimation method according to claim 1, 2 or 3, characterized in that a successive iteration method is used as the nonlinear numerical solution method.

5. The maximum likelihood estimation method according to claim 1, 2, 3 or 4, characterized in that the degree of smoothing is decreased in a geometric progression from one smoothed log-likelihood function to the next.