US20140214747A1 - Multivariate data mixture model estimation device, mixture model estimation method, and mixture model estimation program - Google Patents


Info

Publication number: US20140214747A1 (published as US 2014/0214747 A1)
Authority: US (United States)
Prior art keywords: mixture, model, mixture model, components, denoted
Legal status: Abandoned
Application number: US14/242,915
Inventors: Ryohei Fujimaki, Satoshi Morinaga
Assignee (original and current): NEC Corp
Application filed by NEC Corp

Classifications

    • G06N 7/00: Computing arrangements based on specific mathematical models
    • G06N 7/01: Probabilistic graphical models, e.g. probabilistic networks
    • G06N 20/00: Machine learning
    • G06F 17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/18: Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G06F 18/295: Markov models or related models, e.g. semi-Markov models; Markov random fields; networks embedding Markov models


Abstract

With respect to the model selection issue of mixture models, the present invention performs high-speed model selection under an appropriate criterion, even though the number of model candidates increases exponentially as the number and types of components to be mixed increase. A mixture model estimation device comprises: a data input unit to which are input the data whose mixture model is to be estimated, candidate values of the mixture number required for estimating the mixture model, and the types of the components constituting the mixture model and their parameters; a processing unit which sets the mixture number from the candidate values, calculates, for the set mixture number, a variation probability of a hidden variable for the random variable targeted by the mixture model estimation, and estimates the optimal mixture model by optimizing the types of the components and their parameters using the calculated variation probability of the hidden variable so that the lower bound of the model posterior probability, separated for each component of the mixture model, is maximized; and a model estimation result output unit which outputs the model estimation result obtained by the processing unit.

Description

  • This application is a Continuation application of U.S. application Ser. No. 13/824,857 filed May 1, 2013, which is a National Stage of PCT/JP2012/056862, filed Mar. 16, 2012, which claims the benefit of priority of Japanese Patent Application No. 2011-060732, filed Mar. 18, 2011, the disclosures of which are incorporated by reference in their entirety.
  • TECHNICAL FIELD
  • The present invention relates to a multivariate data mixture model estimation device, a mixture model estimation method, and a mixture model estimation program, and more particularly, to a multivariate data mixture model estimation device, a mixture model estimation method, and a mixture model estimation program for estimating the number, types, and parameters of models to be mixed.
  • BACKGROUND ART
  • A mixture model (a mixture of distributions) for representing data using a plurality of models is important in industrial applications; examples include the mixture normal distribution model and the mixture hidden Markov model. For example, mixture models are used industrially for detecting fraudulent medical bills from observed outliers (Non Patent Literature 1) and for detecting network failures (Non Patent Literature 2). Other important applications of mixture models include customer behavior clustering in marketing (analysis on the assumption that similar customers belong to the same model) and topic analysis of articles (analysis on the assumption that articles on the same topic belong to the same model).
  • Generally, when the mixture number (the number of models, also called components, constituting a mixture model) and the types of the components are specified, well-known methods such as the EM algorithm (Non Patent Literature 3) and the variational Bayesian method (Non Patent Literature 4) can be used to estimate the parameters of the distributions (models). The mixture number and the component types must themselves be determined before such parameters can be estimated. The problem of specifying such models is generally called the "model selection issue" or "system identification issue," and is considered an important issue for constructing reliable models. Many techniques relating to this issue have therefore been proposed.
  • For example, methods that select the model with the maximum posterior probability are known for determining the number of models to be mixed. Methods proposed for this purpose include: 1) a method based on the Bayesian information criterion; 2) a method based on the variational Bayesian method (for example, Non Patent Literature 4); and 3) a method based on nonparametric Bayesian estimation using a Dirichlet process (for example, Non Patent Literature 5).
  • CITATION LIST Non Patent Literature
    • {NPL 1} Kenji Yamanishi, Jun-ichi Takeuchi, Graham Williams, and Peter Milne, “Online Unsupervised Outlier Detection Using Finite Mixtures with Discounting Learning Algorithms”, Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD2000), ACM Press, 2000, pp. 320-324.
    • {NPL 2} Kenji Yamanishi, and Yuko Maruyama, “Dynamic Syslog Mining for Network Failure Monitoring”, Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD2005), ACM Press, 2005, pp. 499-508.
    • {NPL 3} A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum Likelihood from Incomplete Data via the EM Algorithm", Journal of the Royal Statistical Society, Series B (Methodological), Vol. 39, No. 1, 1977, pp. 1-38.
    • {NPL 4} Adrian Corduneanu and Christopher M. Bishop, “Variational Bayesian Model Selection for Mixture Distributions”, In Artificial Intelligence and Statistics 2001, T. Jaakkola and T. Richardson (eds.), Morgan Kaufmann, pp. 27-34.
    • {NPL 5} Carl Edward Rasmussen, “The Infinite Gaussian Mixture Model”, in Advances in Neural Information Processing Systems 12, S. A. Solla, T. K. Leen and K.-R. Muller (eds.), MIT Press (2000), pp. 554-560.
    • {NPL 6} Ryohei Fujimaki, Satoshi Morinaga, Michinari Monmma, Kenji Aoki and Takayuki Nakata, “Linear Time Model Selection for Mixture of Heterogeneous Components”, Proceedings of the 1st Asian Conference on Machine Learning, 2009.
    SUMMARY OF INVENTION Technical Problem
  • According to method 1), model selection is possible without making an assumption about a prior distribution over models. However, in this case the Fisher information matrix of a mixture model becomes singular, so the criterion cannot be correctly defined and a proper mixture number cannot be selected.
  • According to methods 2) and 3), the mixture number is determined using a Dirichlet distribution or a Dirichlet process as the prior distribution for the mixture ratios. However, in this case it is difficult to select the optimal mixture number, compared with general methods that select the model yielding a high model posterior probability.
  • In addition, with methods 1) to 3) it is practically impossible to optimize the types of the models to be mixed, because of the amount of calculation required. As an example illustrating this amount of calculation, selection of a mixture of polynomial curves is explained below.
  • A polynomial curve includes terms from the first order upward: a linear (first-order) term, a second-order term, a third-order term, and so on. Therefore, if an optimal model is selected by searching mixture numbers from 1 to Cmax and curve orders from first to Dmax using the above-described methods, information criteria must be calculated for every model candidate, e.g., one line and two second-order curves (mixture number=3), three third-order curves and two fourth-order curves (mixture number=5), and so on. For example, if Cmax=10 and Dmax=10, the number of model candidates is about 100,000, and if Cmax=20 and Dmax=20, the number of model candidates is about ten billion; the number of model candidates to be searched increases exponentially.
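  • The scale of this search can be checked with a short count. Treating each candidate as a multiset of curve orders (one order per component, with the order of the components irrelevant), the number of candidates for mixture numbers 1 to Cmax and curve orders 1 to Dmax is the sum over c of C(Dmax + c - 1, c). This counting convention is an assumption (the text does not fix one), so the exact totals need not match the rounded figures above, but it reproduces the explosive growth they describe:

```python
from math import comb

def candidate_count(c_max: int, d_max: int) -> int:
    """Number of mixture-model candidates when each of c components takes one
    of d_max curve orders and the order of the components does not matter
    (multisets of orders, summed over the mixture numbers 1..c_max)."""
    return sum(comb(d_max + c - 1, c) for c in range(1, c_max + 1))

print(candidate_count(10, 10))   # 184755, on the order of 10**5
print(candidate_count(20, 20))   # 137846528819, on the order of 10**11
```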
  • In addition to the above-mentioned methods, methods based on other model selection criteria, such as Akaike's information criterion and cross-validation, have been proposed. However, none of these methods avoids searching over combinations of component types.
  • Non Patent Literature 6 proposes a method that efficiently searches the number and types of the models to be mixed by minimizing an expected information criterion of a hidden variable under the minimum description length, which is known to be equivalent to the Bayesian information criterion. However, in this method the Fisher information matrix of a mixture model is singular for the same reason as in method 1), and thus the criterion itself is improper and optimal model selection is impossible.
  • An object of the present invention is to solve this problem by providing a mixture model estimation device, a mixture model estimation method, and a mixture model estimation program with which, for the model selection issue of mixture models, model selection can be done rapidly under a proper criterion even though the number of model candidates increases exponentially as the number and types of components to be mixed increase.
  • Solution to Problem
  • A first aspect of the present invention provides a mixture model estimation device including: a data input unit for inputting data of a mixture model to be estimated, and candidate values for a mixture number, and types and parameters of components constituting the mixture model that are necessary for estimating the mixture model of the data; a processing unit which sets the mixture number from the candidate values, calculates a variation probability of a hidden variable for a random variable which is a target for estimating the mixture model of the data with respect to the set mixture number, and optimally estimates the mixture model by optimizing the types and parameters of the components using the calculated variation probability of the hidden variable so that a lower bound of a model posterior probability separated for each of the components of the mixture model is maximized; and a model estimation result output unit which outputs a model estimation result obtained by the processing unit.
  • A second aspect of the present invention provides a mixture model estimation method including: by using a data input unit, inputting data of a mixture model to be estimated, and candidate values for a mixture number, and types and parameters of components constituting the mixture model that are necessary for estimating the mixture model of the data; causing a processing unit to set the mixture number from the candidate values, calculate a variation probability of a hidden variable for a random variable which is a target for estimating the mixture model of the data, and optimally estimate the mixture model by optimizing the types and parameters of the components using the calculated variation probability of the hidden variable so that a lower bound of a model posterior probability separated for each of the components of the mixture model is maximized; and causing a model estimation result output unit to output a model estimation result obtained by the processing unit.
  • A third aspect of the present invention provides a mixture model estimation program for operating a computer as a mixture model estimation device including: a data input unit for inputting data of a mixture model to be estimated, and candidate values for a mixture number, and types and parameters of components constituting the mixture model that are necessary for estimating the mixture model of the data; a processing unit which sets the mixture number from the candidate values, calculates a variation probability of a hidden variable for a random variable which is a target for estimating the mixture model of the data with respect to the set mixture number, and optimally estimates the mixture model by optimizing the types and parameters of the components using the calculated variation probability of the hidden variable so that a lower bound of a model posterior probability separated for each of the components of the mixture model is maximized; and a model estimation result output unit which outputs a model estimation result obtained by the processing unit.
  • Advantageous Effect of the Invention
  • According to the present invention, for the model selection issue of mixture models, model selection can be done rapidly under a proper criterion even though the number of model candidates increases exponentially as the number and types of components to be mixed increase.
  • BRIEF DESCRIPTION OF DRAWINGS
  • FIG. 1 is a block diagram showing the structure of a mixture model estimation device according to an embodiment of the present invention.
  • FIG. 2 is a flowchart showing operations of the mixture model estimation device illustrated in FIG. 1.
  • DESCRIPTION OF EMBODIMENTS
  • Hereinafter, a mixture model estimation device, a mixture model estimation method, and a mixture model estimation program will be described in detail according to embodiments of the present invention with reference to the drawings.
  • The embodiments of the present invention propose a device and method for estimating a mixture model represented by P(X|θ) of equation 1 for input data (observed values).
  • [Math. 1]
  • $$P(X \mid \theta) = \sum_{c=1}^{C} \pi_c P_c(X; \phi_c^{S_c}) \qquad (1)$$
  • In equation 1, C denotes the mixture number, X denotes the random variable that is the target of mixture model estimation for the input data, θ = (π_1, . . . , π_C, φ_1^{S_1}, . . . , φ_C^{S_C}) denotes the parameters of the models (components), and S_1, . . . , S_C denote the types of the components (π_1, . . . , π_C are the mixture ratios of components 1 to C, and φ_1^{S_1}, . . . , φ_C^{S_C} are the distribution parameters of the components under types S_1 to S_C). For example, the component candidates for S_1 to S_C may be {normal distribution, logarithmic normal distribution, exponential distribution} in the case of a mixture of distributions, or {zeroth- to third-order curves} in the case of a mixture polynomial curve model. In addition, θ is a function of the mixture number C and the component types S_1, . . . , S_C, but this dependence is omitted from the notation for conciseness.
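  • As a concrete illustration, the density of equation 1 can be sketched in a few lines. This is a minimal sketch: the normal/exponential candidate set and the (weight, pdf, parameters) layout are illustrative assumptions, not the notation of the embodiment.

```python
import math

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def exponential_pdf(x, rate):
    return rate * math.exp(-rate * x) if x >= 0 else 0.0

# Each component is a triple (mixture ratio pi_c, type S_c, parameters phi_c).
components = [
    (0.7, normal_pdf, (0.0, 1.0)),
    (0.3, exponential_pdf, (2.0,)),
]

def mixture_density(x, components):
    """P(x | theta) = sum_c pi_c * P_c(x; phi_c), as in equation 1."""
    return sum(pi * pdf(x, *phi) for pi, pdf, phi in components)
```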
  • Next, a hidden variable Z = (Z_1, . . . , Z_C) is defined for the random variable X. Z_c = 1 means that X is data from the c-th component, and Z_c = 0 means that it is not; accordingly, Σ_{c=1}^{C} Z_c = 1. A pair of X and Z is called "a complete variable" (by contrast, X alone is called "an incomplete variable"). The joint distribution of the complete variable is defined by P(X, Z|θ) as shown in equation 2.
  • [Math. 2]
  • $$P(X, Z \mid \theta) = \prod_{c=1}^{C} \left( \pi_c P_c(X; \phi_c^{S_c}) \right)^{Z_c} \qquad (2)$$
  • In the following description, the N observed values (data) of the random variable X are denoted by x_n (n = 1, . . . , N), and the N values of the hidden variable Z for the observed values x_n are denoted by z_n (n = 1, . . . , N). The posterior probability of a value z_n of the hidden variable Z is expressed by P(z_n|x_n, θ) as shown in equation 3.

  • [Math. 3]
  • $$P(z_n \mid x_n, \theta) \propto \pi_c P_c(x_n; \phi_c^{S_c}) \qquad (3)$$
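  • Concretely, normalizing the right-hand side of equation 3 over the components yields the posterior probability of each component for a datum. A minimal sketch, assuming normal components (an illustrative choice):

```python
import math

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def hidden_posteriors(x, pis, mus, varis):
    """P(z_n | x_n, theta) from equation 3: proportional to pi_c * P_c(x_n),
    normalized so that the probabilities over the components sum to one."""
    w = [pi * normal_pdf(x, mu, v) for pi, mu, v in zip(pis, mus, varis)]
    s = sum(w)
    return [wi / s for wi in w]
```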
  • Although a mixture model is described in the embodiment, the present invention is not limited thereto. For example, the present invention may easily be applied to similar models such as a hidden Markov model derived by expanding a mixture model. Similarly, although distribution of a target random variable X is described in the embodiment, the present invention is not limited thereto. For example, the present invention may be applied to a conditional model P(Y|X) (Y is a target random variable) such as a mixture regression model and a mixture classification model.
  • Referring to FIG. 1, according to an embodiment of the present invention, data (input data) 111 expressed by a plurality of models constituting a mixture model are input to a mixture model estimation device 110, and the mixture model estimation device 110 optimizes a mixture number and types of components for the input data 111 and outputs a model estimation result 112. The mixture model estimation device 110 includes a data input device (data input unit) 101, a mixture number setting unit 102, an initialization unit 103, a hidden variable variation probability calculation unit 104, a hidden variable variation probability storage unit 105, a model optimization unit 106, an optimization assessment unit 107, an optimal model selection unit 108, and a model estimation result output device (model estimation result output unit) 109.
  • The mixture number setting unit 102, the initialization unit 103, the hidden variable variation probability calculation unit 104, the model optimization unit 106, the optimization assessment unit 107, and the optimal model selection unit 108 are the processing units of the present invention, which are constructed, for example, by a computer (a central processing unit (CPU), a processor, a data processing device, etc.) operating under the control of a program. The hardware and software structures thereof may be any structures as long as their functions can be realized.
  • The data input device 101 is provided to input data 111, and when data are input to the data input device 101, parameters necessary for model estimation such as types and parameters of components and candidate values for mixture number are also input. As long as the data 111 and parameters necessary for model estimation can be input, the data input device 101 may be constructed in any structure. For example, the data input device 101 may be constructed using a device such as a communication device, a storage device, and a computer.
  • The mixture number setting unit 102 sets a model mixture number by selecting from input candidate values. Hereinafter, the set mixture number will be denoted by C.
  • The initialization unit 103 performs an initialization process for estimation. Initialization may be performed by any method. For example, the type of each component may be set randomly; according to the set types, the parameters of each component may be set randomly; and the variation probability of the hidden variable may be set randomly.
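  • Such an initialization can be sketched as follows. The candidate type names, parameter ranges, and fixed seed are illustrative assumptions; any randomization scheme of this shape fits the description above.

```python
import random

def initialize(n_data, C, component_types):
    """Randomly choose a type per component, random parameters for each, and a
    random row-normalized variation probability table for the hidden variables."""
    rng = random.Random(0)   # fixed seed so the sketch is reproducible
    types = [rng.choice(component_types) for _ in range(C)]
    params = [{"mu": rng.uniform(-1.0, 1.0), "var": rng.uniform(0.5, 2.0)}
              for _ in range(C)]
    q = []
    for _ in range(n_data):
        row = [rng.random() for _ in range(C)]
        s = sum(row)
        q.append([r / s for r in row])   # each row sums to one
    return types, params, q
```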
  • The hidden variable variation probability calculation unit 104 calculates the variation probability of a hidden variable. The parameters θ are calculated by the initialization unit 103 or the model optimization unit 106, and the hidden variable variation probability calculation unit 104 uses the calculated values.
  • A variation probability q(Z) of a hidden variable is calculated by solving an optimization problem expressed by equation 4.
  • [Math. 4]
  • $$q^{(t)} = \arg\max_{q(Z^N)} \left\{ \max_{\bar{q}(Z^N) \in Q^{(t-1)}} G\left(H^{(t-1)}, \theta^{(t-1)}, q(Z^N), \bar{q}(Z^N)\right) \right\} \qquad (4)$$
  • Z^N = (Z_1, . . . , Z_N) denotes the hidden variables of the data, and a superscript (t) denotes a value obtained after t iterations of the calculation. A model is defined as H = (S_1, . . . , S_C). The objective G is the lower bound of the Bayesian posterior probability, given by equation 5. In addition, the hidden variable variation probability storage unit 105 stores Q^(t-1) = {q^(0), q^(1), . . . , q^(t-1)}, the set of hidden variable variation probabilities calculated in the previous iterations.
  • [Math. 5]
  • $$G(H, \theta, q(Z^N), \bar{q}(Z^N)) = \sum_{Z^N} q(Z^N) \left\{ \log P(X^N, Z^N \mid \theta) - \frac{C-1}{2} \log N - \sum_{c=1}^{C} \frac{J_c}{2} \left( \log \left( \sum_{n=1}^{N} \bar{q}(Z_{nc}) \right) + \frac{\sum_{n=1}^{N} Z_{nc} - \sum_{n=1}^{N} \bar{q}(Z_{nc})}{\sum_{n=1}^{N} \bar{q}(Z_{nc})} \right) - \log q(Z^N) \right\} \qquad (5)$$
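  • Numerically, equation 5 simplifies when q̄ = q: the expectation of Σ_n Z_nc under q equals Σ_n q(Z_nc), so the linear correction term vanishes, leaving the expected complete-data log-likelihood plus the hidden-variable entropy minus the dimension penalties. The sketch below evaluates this simplified lower bound; the Gaussian components and the q̄ = q choice are simplifying assumptions.

```python
import math

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def lower_bound(data, pis, mus, varis, dims):
    """Simplified G of equation 5 with q-bar = q and q the exact posterior:
    expected complete log-likelihood + entropy - dimension penalties.
    dims[c] plays the role of J_c, the parameter dimension of component c."""
    C, N = len(pis), len(data)
    g = 0.0
    col = [0.0] * C                        # col[c] = sum_n q(Z_nc)
    for x in data:
        w = [pis[c] * normal_pdf(x, mus[c], varis[c]) for c in range(C)]
        s = sum(w)
        for c in range(C):
            r = w[c] / s                   # posterior of component c
            g += r * (math.log(w[c]) - math.log(r))   # E[log P] + entropy
            col[c] += r
    g -= (C - 1) / 2 * math.log(N)         # mixture-ratio penalty
    for c in range(C):
        g -= dims[c] / 2 * math.log(col[c])   # per-component J_c penalty
    return g
```

With q set to the exact posterior, the expectation-plus-entropy part collapses to the incomplete-data log-likelihood, so the returned value is that log-likelihood minus the penalty terms.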
  • The hidden variable variation probability storage unit 105 stores hidden variable variation probabilities calculated by the hidden variable variation probability calculation unit 104 for respective data (Q(t-1) mentioned in the previous paragraph is updated to Q(t)). As long as the hidden variable variation probability storage unit 105 is a storage device such as a memory capable of storing hidden variable variation probabilities calculated for respective data, the hidden variable variation probability storage unit 105 may have any structure. For example, the hidden variable variation probability storage unit 105 may be provided in or outside a computer.
  • The model optimization unit 106 reads the hidden variable variation probabilities Q(t) stored in the hidden variable variation probability storage unit 105 and calculates an optimal model H(t) and parameters θ(t) after t iterations by using equation 6.
  • [Math. 6]
  • $$H^{(t)}, \theta^{(t)} = \arg\max_{H, \theta} \left\{ \max_{\bar{q}(Z^N) \in Q^{(t)}} G\left(H, \theta, q^{(t)}(Z^N), \bar{q}(Z^N)\right) \right\} \qquad (6)$$
  • An important point of the above-described processing is that, because the optimization function separates by component, the types S_1 to S_C and the parameters φ_1^{S_1} to φ_C^{S_C} in G of equation 5 can be optimized individually, without considering combinations of component types (that is, without considering which joint assignment of S_1 to S_C is designated). Therefore, when the types of components are optimized, the optimization proceeds without combinatorial explosion.
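  • This separability can be made concrete: each component runs its own argmax over the candidate types, so the cost is C × (number of candidates) weighted fits rather than (number of candidates)^C joint combinations. In the sketch below, the normal/exponential candidate set and the penalized-likelihood score (a stand-in for the per-component terms of equation 5) are illustrative assumptions.

```python
import math

def type_score(xs, weights, dist_type):
    """Weighted ML fit of one candidate type for one component. Returns a
    penalized score: weighted log-likelihood minus 0.5 * (#free parameters)
    * log(effective sample size)."""
    w = sum(weights)
    if dist_type == "normal":
        mu = sum(wi * x for wi, x in zip(weights, xs)) / w
        var = max(1e-6, sum(wi * (x - mu) ** 2 for wi, x in zip(weights, xs)) / w)
        ll = sum(wi * (-0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var))
                 for wi, x in zip(weights, xs))
        return ll - 0.5 * 2 * math.log(w)
    if dist_type == "exponential":
        rate = 1.0 / max(1e-6, sum(wi * x for wi, x in zip(weights, xs)) / w)
        ll = sum(wi * (math.log(rate) - rate * x) for wi, x in zip(weights, xs))
        return ll - 0.5 * 1 * math.log(w)
    raise ValueError(dist_type)

def optimize_types(xs, resp, candidates=("normal", "exponential")):
    """Each component picks its own best type independently: C * |candidates|
    fits, with no search over joint type assignments."""
    chosen = []
    for c in range(len(resp[0])):
        weights = [r[c] for r in resp]
        chosen.append(max(candidates, key=lambda t: type_score(xs, weights, t)))
    return chosen
```

For C components and T candidate types this performs C*T fits, where a joint search would need T**C.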
  • The optimization assessment unit 107 determines whether the lower bound of the model posterior probability has converged by checking the difference given by equation 7.
  • [Math. 7]
  • $$\max_{\bar{q}(Z^N) \in Q^{(t)}} G\left(H^{(t)}, \theta^{(t)}, q^{(t)}(Z^N), \bar{q}(Z^N)\right) - \max_{\bar{q}(Z^N) \in Q^{(t-1)}} G\left(H^{(t-1)}, \theta^{(t-1)}, q^{(t-1)}(Z^N), \bar{q}(Z^N)\right) \qquad (7)$$
  • If it is determined that the lower bound of the model posterior probability has not converged, the processes from the hidden variable variation probability calculation unit 104 through the optimization assessment unit 107 are repeated.
  • In this way, the processes from the hidden variable variation probability calculation unit 104 through the optimization assessment unit 107 are repeated to optimize the model and parameters, so that a model maximizing the lower bound of the model posterior probability can be selected. The monotonic increase of the lower bound of the model posterior probability under these repeated processes is shown by equation 8.
  • [Math. 8]
  • $$\max_{\bar{q}(Z^N) \in Q^{(t)}} G\left(H^{(t)}, \theta_H^{(t)}, q^{(t)}(Z^N), \bar{q}(Z^N)\right) \ge \max_{\bar{q}(Z^N) \in Q^{(t)}} G\left(H^{(t-1)}, \theta_H^{(t-1)}, q^{(t)}(Z^N), \bar{q}(Z^N)\right) \ge \max_{\bar{q}(Z^N) \in Q^{(t-1)}} G\left(H^{(t-1)}, \theta_H^{(t-1)}, q^{(t)}(Z^N), \bar{q}(Z^N)\right) \ge \max_{\bar{q}(Z^N) \in Q^{(t-1)}} G\left(H^{(t-1)}, \theta_H^{(t-1)}, q^{(t-1)}(Z^N), \bar{q}(Z^N)\right) \qquad (8)$$
  • The types of components and parameters are optimized through the processes performed from the hidden variable variation probability calculation unit 104 to the optimization assessment unit 107 by using the mixture number C set by the mixture number setting unit 102.
  • If the maximized lower bound of the model posterior probability (the first term of equation 7) is greater than that of the currently-set model, the optimal model selection unit 108 sets the model as the optimal model. Once the lower bound of the model posterior probability (together with the component types and parameters) has been calculated for all the candidate mixture numbers and the optimal mixture number has been determined, the procedure goes to the model estimation result output device 109; if there remains a mixture number candidate for which optimization has not yet been performed, the procedure returns to the mixture number setting unit 102.
  • The model estimation result output device 109 outputs a model estimation result 112 such as the optimized mixture number, types of components, and parameters. As long as the model estimation result output device 109 can output the model estimation result 112, the model estimation result output device 109 may have any structure. For example, the model estimation result output device 109 may be constructed using a device such as a communication device, a storage device, and a computer.
  • Referring to FIG. 2, operations of the mixture model estimation device 110 are briefly explained according to the embodiment.
  • First, data 111 is input to the data input device 101 (Step S100).
  • Next, the mixture number setting unit 102 selects a non-optimized mixture number from input candidate values for mixture number (Step S101).
  • Next, the initialization unit 103 initializes the parameters and the hidden variable variation probabilities for estimation with respect to the designated mixture number (Step S102).
  • Next, the hidden variable variation probability calculation unit 104 calculates hidden variable variation probabilities and stores the calculated variation probabilities in the hidden variable variation probability storage unit 105 (Step S103).
  • Next, the model optimization unit 106 estimates types and parameters of respective components (Step S104).
  • Next, the optimization assessment unit 107 determines whether the lower bound of a model posterior probability converges (Steps S105 and S106).
  • If it is determined that the lower bound of the model posterior probability has not converged (Step S106: NO), Steps S103 to S106 are repeated by the hidden variable variation probability calculation unit 104, the model optimization unit 106, and the optimization assessment unit 107.
  • If it is determined that the lower bound of the model posterior probability has converged (Step S106: YES), the optimal model selection unit 108 compares the lower bound of the model posterior probability of the currently-set optimal model (mixture number, types, and parameters) with the lower bound of the model posterior probability of the model obtained through Step S106, and sets whichever model has the larger lower bound as the optimal model (Step S107).
  • Next, it is determined whether a non-estimated mixture number candidate remains (Step S108).
  • If there remains a non-estimated mixture number candidate (Step S108: YES), the procedures at Steps S101 to S108 are repeated at the mixture number setting unit 102, the initialization unit 103, the hidden variable variation probability calculation unit 104, the model optimization unit 106, the optimization assessment unit 107, and the optimal model selection unit 108.
  • If no non-estimated mixture number candidate remains (Step S108: NO), the model estimation result output device 109 outputs a model estimation result 112, and the procedure ends (Step S109).
  • Therefore, according to the embodiment, the number, types, and parameters of the models to be mixed can all be estimated efficiently by maximizing the lower bound of the model posterior probability. That is, the lower bound of the model posterior probability, separated for the respective components, is maximized by repeated optimization processes that optimize the types and parameters of the components and the number of components.
  • In this way, for the model selection issue of mixture models, model selection can be done rapidly under a proper criterion even though the number of model candidates increases exponentially as the number and types of components to be mixed increase.
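  • The overall procedure (Steps S101 to S108) can be sketched for a one-dimensional Gaussian mixture. This is a hedged sketch, not the exact method of the embodiment: the quantile initialization, the fixed iteration count in place of the convergence test of equation 7, and the BIC-style penalty standing in for the equation 5 lower bound are all simplifying assumptions.

```python
import math

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gaussian_mixture(data, C, iters=50):
    """EM for a 1-D Gaussian mixture with a fixed mixture number C.
    Returns the final log-likelihood and its per-iteration history."""
    srt, n = sorted(data), len(data)
    mus = [srt[(2 * c + 1) * n // (2 * C)] for c in range(C)]   # quantile init
    varis = [1.0] * C
    pis = [1.0 / C] * C
    history = []
    for _ in range(iters):
        # E-step: variation probabilities (responsibilities) of hidden variables
        resp, ll = [], 0.0
        for x in data:
            w = [pis[c] * normal_pdf(x, mus[c], varis[c]) for c in range(C)]
            s = sum(w)
            ll += math.log(s)
            resp.append([wi / s for wi in w])
        history.append(ll)
        # M-step: per-component parameter updates
        for c in range(C):
            nc = sum(r[c] for r in resp)
            pis[c] = nc / n
            mus[c] = sum(r[c] * x for r, x in zip(resp, data)) / nc
            varis[c] = max(1e-6, sum(r[c] * (x - mus[c]) ** 2
                                     for r, x in zip(resp, data)) / nc)
    return history[-1], history

def select_mixture_number(data, candidates):
    """Loop over mixture-number candidates (Steps S101-S108) and keep the one
    with the best penalized log-likelihood (a BIC-style stand-in criterion)."""
    n, best = len(data), None
    for C in candidates:
        ll, _ = em_gaussian_mixture(data, C)
        score = ll - 0.5 * (3 * C - 1) * math.log(n)   # 3C-1 free parameters
        if best is None or score > best[1]:
            best = (C, score)
    return best[0]
```

On well-separated two-cluster data the penalized score selects a mixture number of 2, and the per-iteration objective is non-decreasing, mirroring the monotonic improvement shown in equation 8.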
  • Hereinafter, models to which the mixture model estimation device of the embodiment is applicable, and application examples thereof will be specifically described.
  • Example 1 Mixture Distributions Having Different Independence Characteristics
  • If the mixture model estimation device of the embodiment is used, the mixture number and the independence structure of each component can be rapidly optimized for mixture distributions of multidimensional data whose components have different independence characteristics.
  • For example, in the case of a three-dimensional normal distribution, there are eight combinations of independence and dependence among the dimensions, determined by the positions of the non-zero off-diagonal elements of the covariance matrix; normal distributions with these different independence structures can serve as the component candidates.
  • For example, if distribution estimation is performed on input data of check values (weights, blood pressures, blood sugar values, etc.) from medical examinations of people of different ages, genders, and life habits, the dependence of the check values on age, gender, and life habits can be modeled automatically. In addition to modeling such dependence, groups of check items with different dependence relationships can be extracted (clustering) by inspecting the posterior probability values of the hidden variable to match each data point to its originating component.
  • In addition, the mixture model estimation device of the embodiment can be used for any multidimensional distributions as well as multidimensional normal distributions.
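As an illustration of how the eight candidates arise, the following sketch (the function name is an assumption) enumerates the 2^(D(D-1)/2) independence patterns over the off-diagonal covariance entries of a D-dimensional normal distribution; for D=3 this yields the eight combinations mentioned above.

```python
from itertools import product

def covariance_structures(D):
    """Enumerate all independence/dependence patterns for a D-dim normal.

    Each pattern maps a dimension pair (i, j) to True when the covariance
    between dimensions i and j is allowed to be non-zero (dependent),
    and to False when the two dimensions are forced to be independent.
    """
    pairs = [(i, j) for i in range(D) for j in range(i + 1, D)]
    return [{pair: bool(bit) for pair, bit in zip(pairs, mask)}
            for mask in product([0, 1], repeat=len(pairs))]
```

Each returned pattern corresponds to one candidate component type whose parameters and posterior score would then be evaluated by the device.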
  • Example 2 Various Mixture Distributions
  • If the mixture model estimation device of the embodiment is used, a mixture number and types of component distributions can be optimized for a plurality of different mixture distributions.
  • For example, in the case of distribution candidates each including a normal distribution, a logarithmic normal distribution, and an exponential distribution, a mixture distribution in which the number and parameters of distributions are optimized can be calculated.
  • For example, application to operational risk estimation will be explained. Generally, in a risk distribution, a plurality of event groups having low risks (for example, clerical errors, which may be modeled by a logarithmic normal distribution) are mixed with a low-frequency event group having high risks (for example, erroneous stock orders, which may be modeled by a normal distribution having a high mean value).
  • Although there are a plurality of types of risks (multivariate), the present invention can be used to automatically and properly determine the types, number, and parameters of distributions and thus estimate a risk distribution.
  • The mixture model estimation device of the embodiment is not limited to applications to particular distributions such as a normal distribution, a logarithmic normal distribution, and an exponential distribution, but can be applied to any types of distributions.
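As one concrete candidate family from this example, a logarithmic normal component can be fitted in closed form by taking maximum-likelihood estimates on log-transformed data. This is a minimal sketch with assumed function names; in the framework above it would simply be one more entry in the table of candidate component types alongside normal and exponential distributions.

```python
import math
import random

def fit_lognormal(xs):
    """Closed-form MLE of a log-normal: normal MLE on log-transformed data."""
    logs = [math.log(x) for x in xs]          # requires x > 0
    mu = sum(logs) / len(logs)
    sigma2 = sum((l - mu) ** 2 for l in logs) / len(logs)
    return mu, sigma2

def lognormal_logpdf(x, mu, sigma2):
    """Log-density of the log-normal distribution at x > 0."""
    return (-math.log(x) - 0.5 * math.log(2 * math.pi * sigma2)
            - (math.log(x) - mu) ** 2 / (2 * sigma2))
```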
  • Example 3 Mixture Distributions of Different Stochastic Regression Functions
  • If the mixture model estimation device of the embodiment is used, a mixture number and the type of the regression function of each component can be rapidly optimized for mixture distributions of different stochastic regression functions.
  • For example, a regression-curve mixture model having a polynomial curve (or a curved surface in the case of multidimensional data) will now be explained. In this case, a polynomial curve having terms of different orders may be selected as a candidate of each component. If the mixture model estimation device of the present invention is used, a mixture number and orders of a polynomial curve of each component can be optimized.
  • The mixture model estimation device of the embodiment is not limited to applications to a polynomial curve, but can be applied to a mixture model having any regression functions of a plurality of types.
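The order selection within one regression component can be sketched as follows, assuming a BIC-style penalty on the residual likelihood (the helper names and the exact penalty are illustrative, not taken from this document): each candidate polynomial order is fitted by least squares, and the penalized score decides.

```python
import math

def polyfit(xs, ys, order):
    """Least-squares polynomial fit via the normal equations (Gaussian elim.)."""
    m = order + 1
    A = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(m)]
    for col in range(m):                        # forward elimination w/ pivoting
        piv = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for k in range(col, m):
                A[r][k] -= f * A[col][k]
            b[r] -= f * b[col]
    coef = [0.0] * m
    for r in range(m - 1, -1, -1):              # back substitution
        coef[r] = (b[r] - sum(A[r][k] * coef[k]
                              for k in range(r + 1, m))) / A[r][r]
    return coef

def select_order(xs, ys, max_order=4):
    """Pick the order maximizing a BIC-style penalized residual score."""
    N, best = len(xs), None
    for order in range(max_order + 1):
        c = polyfit(xs, ys, order)
        rss = sum((y - sum(cj * x ** j for j, cj in enumerate(c))) ** 2
                  for x, y in zip(xs, ys))
        score = (-0.5 * N * math.log(rss / N + 1e-12)
                 - 0.5 * (order + 1) * math.log(N))
        if best is None or score > best[0]:
            best = (score, order)
    return best[1]
```

In the mixture setting, a score of this kind would be computed per component with data weighted by the hidden-variable probabilities.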
  • Example 4 Mixture Distributions of Different Stochastic Discriminant Functions
  • If the mixture model estimation device of the embodiment is used, a mixture number and the classifier function of each component can be optimized for mixture distributions of different stochastic discriminant functions.
  • For example, an explanation will be given of a failure diagnosis for identifying the type of failure of an automobile using sensor values obtained from automobile data. Since the sensors to be monitored depend on the failure, the automobile, and the running conditions, the sensor values on which a classifier function operates change accordingly.
  • If the mixture model estimation device of the embodiment is used, although various data are involved, a classifier function using a plurality of sensor values can be automatically estimated (for example, sensor values for a component candidate can be determined).
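A much-simplified, hypothetical sketch of the sensor-selection idea follows: nearest-centroid classifiers stand in for the stochastic discriminant functions, the accuracy penalty is an assumption, and all function names are illustrative. Each candidate sensor subset defines one classifier candidate, and a penalized training score picks the subset, mirroring how the device would pick the sensor values used by a component candidate.

```python
from itertools import combinations

def centroid_classify(train, labels, subset):
    """Nearest-class-mean classifier restricted to the chosen sensor subset."""
    means = {}
    for lab in set(labels):
        pts = [[x[i] for i in subset] for x, l in zip(train, labels) if l == lab]
        means[lab] = [sum(col) / len(pts) for col in zip(*pts)]
    def predict(x):
        xs = [x[i] for i in subset]
        return min(means, key=lambda lab: sum((a - b) ** 2
                                              for a, b in zip(xs, means[lab])))
    return predict

def select_sensors(train, labels, penalty=0.01):
    """Score every sensor subset by penalized training accuracy."""
    D, best = len(train[0]), None
    for r in range(1, D + 1):
        for subset in combinations(range(D), r):
            pred = centroid_classify(train, labels, subset)
            acc = sum(pred(x) == l for x, l in zip(train, labels)) / len(train)
            score = acc - penalty * r          # prefer fewer sensors on ties
            if best is None or score > best[0]:
                best = (score, subset)
    return best[1]
```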
  • Example 5 Mixture Distributions of Hidden Markov Model Having Different Output Probabilities
  • If the mixture model estimation device of the embodiment is used, the number of hidden states, the types of output probability distributions, and their parameters can be optimized for a hidden Markov model having different output probabilities.
  • For example, even when the output probability of each hidden state follows a different distribution such as a normal distribution, a logarithmic normal distribution, or an exponential distribution, a hidden Markov model in which the number and parameters of the distributions are optimized can be learned.
  • For example, although estimation of hidden states and output probabilities is important in voice recognition, voices measured under different environments result in different output probabilities due to different noise conditions. However, according to the embodiment, efficient model estimation is possible under such conditions.
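To make the setting concrete, the sketch below evaluates the likelihood of a two-state hidden Markov model whose states use different output-probability families (state 0 normal, state 1 exponential) with a scaled forward algorithm. The parameter values and function names are illustrative assumptions; estimating those parameters and distribution types is what the embodiment addresses.

```python
import math

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def exp_pdf(x, rate):
    return rate * math.exp(-rate * x) if x >= 0 else 0.0

def hmm_loglik(obs, init, trans, emit):
    """Scaled forward algorithm; emit[s] may be any density callable,
    so the output probability family can differ from state to state."""
    S = len(init)
    alpha = [init[s] * emit[s](obs[0]) for s in range(S)]
    ll = 0.0
    for x in obs[1:]:
        norm = sum(alpha)            # scale to avoid numerical underflow
        ll += math.log(norm)
        alpha = [a / norm for a in alpha]
        alpha = [sum(alpha[sp] * trans[sp][s] for sp in range(S)) * emit[s](x)
                 for s in range(S)]
    return ll + math.log(sum(alpha))
```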
  • The mixture model estimation device may be provided in the form of hardware, software, or a combination thereof. In this case, the structure of hardware or software is not limited to a particular structure but can be any form as long as the above-described functions can be provided.
  • The above-described embodiments and Examples are partially or entirely expressed in the following Supplementary Notes. However, the present invention is not limited thereto.
  • (Supplementary Note 1) A mixture model estimation device includes: a data input unit that inputs data of a mixture model to be estimated, and candidate values for a mixture number, and types and parameters of components constituting the mixture model that are necessary for estimating the mixture model of the data; a processing unit that sets the mixture number from the candidate values, calculates a variation probability of a hidden variable for a random variable which is a target for estimating the mixture model of the data with respect to the set mixture number, and optimally estimates the mixture model by optimizing the types and parameters of the components using the calculated variation probability of the hidden variable so that a lower bound of a model posterior probability separated for each of the components of the mixture model is maximized; and a model estimation result output unit that outputs a model estimation result obtained by the processing unit.
  • (Supplementary Note 2) In the mixture model estimation device of Supplementary Note 1, the processing unit obtains the mixture number of the mixture model optimally by calculating the lower bound of the model posterior probability and the types and parameters of the components for all the candidate values for the mixture number.
  • (Supplementary Note 3) In the mixture model estimation device of Supplementary Note 1 or 2, if the mixture number is denoted by C, the random variable is denoted by X, the types of the components are denoted by S1, . . . , SC, and the parameters of the components are denoted by θ=(π1, . . . , πC, φ1 S1, . . . , φC SC) (π1, . . . , πC are mixture ratios when the mixture number is 1 to C, and φ1 S1, . . . , φC SC are parameters of distributions of components S1 to SC when the mixture number is 1 to C), the mixture model is expressed by equation 1,
  • if the hidden variable for the random variable X is denoted by Z=(Z1, . . . , ZC), a joint distribution of a complete variable that is a pair of the random variable X and the hidden variable Z is defined by equation 2,
    if N data values of the random variable X are denoted by xn (n=1, . . . , N), and N values of the hidden variable Z for the values xn are denoted by zn (n=1, . . . , N), a posterior probability of the hidden variable Z is expressed by equation 3,
    wherein the processing unit calculates the variation probability of the hidden variable by solving an optimization problem expressed by equation 4 where ZN=(Z1, . . . , ZN) denotes the hidden variable, Q(t)={q(0), q(1), . . . , q(t)} (a superscript (t) means a value calculated after t iterations) denotes the variation probability of the hidden variable, H=(S1, . . . , SC) denotes the mixture model, and G denotes the lower bound of the model posterior probability; the processing unit calculates the lower bound of the model posterior probability by equation 5; the processing unit calculates an optimal mixture model H(t) and parameters θ(t) of components of the optimal mixture model after t iterations by using the variation probability of the hidden variable and equation 6; the processing unit determines whether the lower bound of the model posterior probability converges by using equation 7; if the processing unit determines that the lower bound of the model posterior probability does not converge, the processing unit repeats the processes of equation 4 to equation 7, and if the processing unit determines that the lower bound converges, the processing unit compares the lower bound of the model posterior probability of a currently-set optimal mixture model with the lower bound of the model posterior probability obtained through the calculation, and sets the model with the larger value as the optimal mixture model, wherein the processing unit repeats the processes of equation 4 to equation 7 for all the candidate values for the mixture number so as to estimate the mixture model optimally.
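Equations 1 to 3 referenced above can be made concrete for a toy case of C=2 one-dimensional normal components (all names and numbers below are illustrative, not part of the claimed method): equation 1 is the mixture density, equation 2 the joint distribution of the complete variable, and equation 3 the posterior of the hidden variable, obtained by normalizing the per-component terms.

```python
import math

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def mixture_density(x, pis, comps):                      # equation (1)
    return sum(pi * normal_pdf(x, *p) for pi, p in zip(pis, comps))

def complete_joint(x, z, pis, comps):                    # equation (2)
    # z is a one-hot indicator (Z1, ..., ZC) of the origin component
    return math.prod((pi * normal_pdf(x, *p)) ** zc
                     for zc, pi, p in zip(z, pis, comps))

def hidden_posterior(x, pis, comps):                     # equation (3)
    w = [pi * normal_pdf(x, *p) for pi, p in zip(pis, comps)]
    s = sum(w)
    return [wi / s for wi in w]
```

Summing the complete-variable joint over all one-hot values of the hidden variable recovers the mixture density, which is the consistency that equations 1 and 2 express.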
  • (Supplementary Note 4) In the mixture model estimation device of any one of Supplementary Notes 1 to 3, the mixture model includes a plurality of mixture distributions having different independence characteristics.
  • (Supplementary Note 5) In the mixture model estimation device of any one of Supplementary Notes 1 to 3, the mixture model includes a plurality of various mixture distributions.
  • (Supplementary Note 6) In the mixture model estimation device of any one of Supplementary Notes 1 to 3, the mixture model includes mixture distributions of different stochastic regression functions.
  • (Supplementary Note 7) In the mixture model estimation device of any one of Supplementary Notes 1 to 3, the mixture model includes mixture distributions of different stochastic discriminant functions.
  • (Supplementary Note 8) In the mixture model estimation device of any one of Supplementary Notes 1 to 3, the mixture model includes mixture distributions of a hidden Markov model having different output probabilities.
  • (Supplementary Note 9) A mixture model estimation method includes: by using an input unit, inputting data of a mixture model to be estimated, and candidate values for a mixture number, and types and parameters of components constituting the mixture model that are necessary for estimating the mixture model of the data; causing a processing unit to set the mixture number from the candidate values, calculate a variation probability of a hidden variable for a random variable which is a target for estimating the mixture model of the data, and optimally estimate the mixture model by optimizing the types and parameters of the components using the calculated variation probability of the hidden variable so that a lower bound of a model posterior probability separated for each of the components of the mixture model is maximized; and causing a model estimation result output unit to output a model estimation result obtained by the processing unit.
  • (Supplementary Note 10) In the mixture model estimation method of Supplementary Note 9, the processing unit obtains the mixture number of the mixture model optimally by calculating the lower bound of the model posterior probability and the types and parameters of the components for all the candidate values for the mixture number.
  • (Supplementary Note 11) In the mixture model estimation method of Supplementary Note 9 or 10, if the mixture number is denoted by C, the random variable is denoted by X, the types of the components are denoted by S1, . . . , SC, and the parameters of the components are denoted by θ=(π1, . . . , πC, φ1 S1, . . . , φC SC) (π1, . . . , πC are mixture ratios when the mixture number is 1 to C, and φ1 S1, . . . , φC SC are parameters of distributions of components S1 to SC when the mixture number is 1 to C), the mixture model is expressed by equation 1,
  • if the hidden variable for the random variable X is denoted by Z=(Z1, . . . , ZC), a joint distribution of a complete variable that is a pair of the random variable X and the hidden variable Z is defined by equation 2,
    if N data values of the random variable X are denoted by xn (n=1, . . . , N), and N values of the hidden variable Z for the values xn are denoted by zn (n=1, . . . , N), a posterior probability of the hidden variable Z is expressed by equation 3,
    wherein the processing unit calculates the variation probability of the hidden variable by solving an optimization problem expressed by equation 4 where ZN=(Z1, . . . , ZN) denotes the hidden variable, Q(t)={q(0), q(1), . . . , q(t)} (a superscript (t) means a value calculated after t iterations) denotes the variation probability of the hidden variable, H=(S1, . . . , SC) denotes the mixture model, and G denotes the lower bound of the model posterior probability; the processing unit calculates the lower bound of the model posterior probability by equation 5; the processing unit calculates an optimal mixture model H(t) and parameters θ(t) of components of the optimal mixture model after t iterations by using the variation probability of the hidden variable and equation 6; the processing unit determines whether the lower bound of the model posterior probability converges by using equation 7; if the processing unit determines that the lower bound of the model posterior probability does not converge, the processing unit repeats the processes of equation 4 to equation 7, and if the processing unit determines that the lower bound converges, the processing unit compares a lower bound of a model posterior probability of a currently-set optimal mixture model with the lower bound of the model posterior probability obtained after the current iteration and sets the model with the larger value as the optimal mixture model, wherein the processing unit repeats the processes of equation 4 to equation 7 for all the candidate values for the mixture number so as to estimate the mixture model optimally.
  • (Supplementary Note 12) In the mixture model estimation method of any one of Supplementary Notes 9 to 11, the mixture model includes a plurality of mixture distributions having different independence characteristics.
  • (Supplementary Note 13) In the mixture model estimation method of any one of Supplementary Notes 9 to 11, the mixture model includes a plurality of various mixture distributions.
  • (Supplementary Note 14) In the mixture model estimation method of any one of Supplementary Notes 9 to 11, the mixture model includes mixture distributions of different stochastic regression functions.
  • (Supplementary Note 15) In the mixture model estimation method of any one of Supplementary Notes 9 to 11, the mixture model includes mixture distributions of different stochastic discriminant functions.
  • (Supplementary Note 16) In the mixture model estimation method of any one of Supplementary Notes 9 to 11, the mixture model includes mixture distributions of a hidden Markov model having different output probabilities.
  • (Supplementary Note 17) A mixture model estimation program operates a computer as a mixture model estimation device including: a data input unit that inputs data of a mixture model to be estimated, and candidate values for a mixture number, and types and parameters of components constituting the mixture model that are necessary for estimating the mixture model of the data; a processing unit that sets the mixture number from the candidate values, calculates a variation probability of a hidden variable for a random variable which is a target for estimating the mixture model of the data with respect to the set mixture number, and optimally estimates the mixture model by optimizing the types and parameters of the components using the calculated variation probability of the hidden variable so that a lower bound of a model posterior probability separated for each of the components of the mixture model is maximized; and a model estimation result output unit that outputs a model estimation result obtained by the processing unit.
  • (Supplementary Note 18) In the mixture model estimation program of Supplementary Note 17, the mixture number of the mixture model is optimally obtained by calculating the lower bound of the model posterior probability and the types and parameters of the components for all the candidate values for the mixture number.
  • (Supplementary Note 19) In the mixture model estimation program of Supplementary Note 17 or 18, if the mixture number is denoted by C, the random variable is denoted by X, the types of the components are denoted by S1, . . . , SC, and the parameters of the components are denoted by θ=(π1, . . . , πC, φ1 S1, . . . , φC SC) (π1, . . . , πC are mixture ratios when the mixture number is 1 to C, and φ1 S1, . . . , φC SC are parameters of distributions of components S1 to SC when the mixture number is 1 to C), the mixture model is expressed by equation 1,
  • if the hidden variable for the random variable X is denoted by Z=(Z1, . . . , ZC), a joint distribution of a complete variable that is a pair of the random variable X and the hidden variable Z is defined by equation 2,
    if N data values of the random variable X are denoted by xn (n=1, . . . , N), and N values of the hidden variable Z for the values xn are denoted by zn (n=1, . . . , N), a posterior probability of the hidden variable Z is expressed by equation 3,
    wherein the processing unit calculates the variation probability of the hidden variable by solving an optimization problem expressed by equation 4 where ZN=(Z1, . . . , ZN) denotes the hidden variable, Q(t)={q(0), q(1), . . . , q(t)} (a superscript (t) means a value calculated after t iterations) denotes the variation probability of the hidden variable, H=(S1, . . . , SC) denotes the mixture model, and G denotes the lower bound of the model posterior probability; the processing unit calculates the lower bound of the model posterior probability by equation 5; the processing unit calculates an optimal mixture model H(t) and parameters θ(t) of components of the optimal mixture model after t iterations by using the variation probability of the hidden variable and equation 6; the processing unit determines whether the lower bound of the model posterior probability converges by using equation 7; if the processing unit determines that the lower bound of the model posterior probability does not converge, the processing unit repeats the processes of equation 4 to equation 7, and if the processing unit determines that the lower bound converges, the processing unit compares a lower bound of a model posterior probability of a currently-set optimal mixture model with the lower bound of the model posterior probability obtained after the current iteration and sets the model with the larger value as the optimal mixture model, wherein the processing unit repeats the processes of equation 4 to equation 7 for all the candidate values for the mixture number so as to estimate the mixture model optimally.
  • (Supplementary Note 20) In the mixture model estimation program of any one of Supplementary Notes 17 to 19, the mixture model includes a plurality of mixture distributions having different independence characteristics.
  • (Supplementary Note 21) In the mixture model estimation program of any one of Supplementary Notes 17 to 19, the mixture model includes a plurality of various mixture distributions.
  • (Supplementary Note 22) In the mixture model estimation program of any one of Supplementary Notes 17 to 19, the mixture model includes mixture distributions of different stochastic regression functions.
  • (Supplementary Note 23) In the mixture model estimation program of any one of Supplementary Notes 17 to 19, the mixture model includes mixture distributions of different stochastic discriminant functions.
  • (Supplementary Note 24) In the mixture model estimation program of any one of Supplementary Notes 17 to 19, the mixture model includes mixture distributions of a hidden Markov model having different output probabilities.
  • While the present invention has been described with reference to the embodiments and Examples thereof, the present invention is not limited to the embodiments and Examples. It will be understood by those of ordinary skill in the art that the structure and details of the present invention may be variously changed within the scope of the present invention.
  • The present application claims priority to Japanese Patent Application No.: 2011-060732, filed on Mar. 18, 2011, which is hereby incorporated by reference in its entirety.
  • INDUSTRIAL APPLICABILITY
  • As described above, the present invention can be used as a multivariate data mixture model estimation device, a mixture model estimation method, or a mixture model estimation program. For example, the present invention can be used as a device, a method, or a program for estimating a mixture model for a plurality of mixture distributions having different independence characteristics, a plurality of various mixture distributions, mixture distributions of different types of stochastic regression functions, mixture distributions of different types of stochastic discriminant functions, a hidden Markov model having different output probabilities, etc.
  • REFERENCE SIGNS LIST
    • 101 data input device (data input unit)
    • 102 mixture number setting unit
    • 103 initialization unit
    • 104 hidden variable variation probability calculation unit
    • 105 hidden variable variation probability storage unit
    • 106 model optimization unit
    • 107 optimization assessment unit
    • 108 optimal model selection unit
    • 109 model estimation result output device (model estimation result output unit)
    • 110 mixture model estimation device
    • 111 input data
    • 112 model estimation result

Claims (20)

1. A mixture model estimation device comprising:
a data input unit that inputs data of a mixture model to be estimated, and candidate values for a mixture number, and types and parameters of components constituting the mixture model that are necessary for estimating the mixture model of the data;
a processing unit comprising a computer hardware processor that sets the mixture number from the candidate values, calculates a variation probability of a hidden variable for a random variable which is a target for estimating the mixture model of the data with respect to the set mixture number, and optimally estimates the mixture model by optimizing the types and parameters of the components using the calculated variation probability of the hidden variable so that a lower bound of a model posterior probability separated for each of the components of the mixture model is maximized; and
a model estimation result output unit that outputs a model estimation result obtained by the processing unit.
2. The mixture model estimation device according to claim 1, wherein the processing unit obtains the mixture number of the mixture model optimally by calculating the lower bound of the model posterior probability and the types and parameters of the components for all the candidate values for the mixture number.
3. The mixture model estimation device according to claim 1, wherein:
the mixture number is denoted by C, the random variable is denoted by X, the types of the components are denoted by S1, . . . , SC, and the parameters of the components are denoted by θ=(π1, . . . , πC, φ1 S1, . . . , φC SC) (π1, . . . , πC are mixture ratios when the mixture number is 1 to C, and φ1 S1, . . . , φC SC are parameters of distributions of components S1 to SC when the mixture number is 1 to C),
the mixture model is expressed by equation 1:
[Math. 1]

P(X | θ) = Σ_{c=1}^{C} π_c P_c(X; φ_c^{S_c})  (1)
when the hidden variable for the random variable X is denoted by Z=(Z1, . . . , ZC), a joint distribution of a complete variable that is a pair of the random variable X and the hidden variable Z is defined by equation 2:
[Math. 2]

P(X, Z | θ) = Π_{c=1}^{C} (π_c P_c(X; φ_c^{S_c}))^{Z_c}  (2)
when N data values of the random variable X are denoted by xn (n=1, . . . , N), and N values of the hidden variable Z for the values xn are denoted by zn (n=1, . . . , N), a posterior probability of the hidden variable Z is expressed by equation 3:

[Math. 3]

P(z_n | x_n, θ) ∝ π_c P_c(x_n; φ_c^{S_c})  (3)
4. The mixture model estimation device according to claim 1, wherein the mixture model comprises a plurality of mixture distributions having different independence characteristics.
5. The mixture model estimation device according to claim 1, wherein the mixture model comprises a plurality of various mixture distributions.
6. The mixture model estimation device according to claim 1, wherein the mixture model comprises mixture distributions of different stochastic regression functions.
7. The mixture model estimation device according to claim 1, wherein the mixture model comprises mixture distributions of different stochastic discriminant functions.
8. The mixture model estimation device according to claim 1, wherein the mixture model comprises mixture distributions of a hidden Markov model having different output probabilities.
9. A mixture model estimation method comprising:
by using an input unit, inputting data of a mixture model to be estimated, and candidate values for a mixture number, and types and parameters of components constituting the mixture model that are necessary for estimating the mixture model of the data;
causing a processing unit to set the mixture number from the candidate values, calculate a variation probability of a hidden variable for a random variable which is a target for estimating the mixture model of the data, and optimally estimate the mixture model by optimizing the types and parameters of the components using the calculated variation probability of the hidden variable so that a lower bound of a model posterior probability separated for each of the components of the mixture model is maximized; and
causing a model estimation result output unit to output a model estimation result obtained by the processing unit.
10. The mixture model estimation method according to claim 9, wherein the processing unit obtains the mixture number of the mixture model optimally by calculating the lower bound of the model posterior probability and the types and parameters of the components for all the candidate values for the mixture number.
11. The mixture model estimation method according to claim 9, wherein:
the mixture number is denoted by C, the random variable is denoted by X, the types of the components are denoted by S1, . . . , SC, and the parameters of the components are denoted by θ=(π1, . . . , πC, φ1 S1, . . . , φC SC) (π1, . . . , πC are mixture ratios when the mixture number is 1 to C, and φ1 S1, . . . , φC SC are parameters of distributions of components S1 to SC when the mixture number is 1 to C), the mixture model is expressed by equation 1:
[Math. 1]

P(X | θ) = Σ_{c=1}^{C} π_c P_c(X; φ_c^{S_c})  (1)
when the hidden variable for the random variable X is denoted by Z=(Z1, . . . , ZC), a joint distribution of a complete variable that is a pair of the random variable X and the hidden variable Z is defined by equation 2:
[Math. 2]

P(X, Z | θ) = Π_{c=1}^{C} (π_c P_c(X; φ_c^{S_c}))^{Z_c}  (2)
when N data values of the random variable X are denoted by xn (n=1, . . . , N), and N values of the hidden variable Z for the values xn are denoted by zn (n=1, . . . , N), a posterior probability of the hidden variable Z is expressed by equation 3:

[Math. 3]

P(z_n | x_n, θ) ∝ π_c P_c(x_n; φ_c^{S_c})  (3)
12. The mixture model estimation method according to claim 9, wherein the mixture model includes a plurality of mixture distributions having different independence characteristics.
13. The mixture model estimation method according to claim 9, wherein the mixture model includes a plurality of various mixture distributions.
14. The mixture model estimation method according to claim 9, wherein the mixture model includes mixture distributions of different stochastic regression functions.
15. The mixture model estimation method according to claim 9, wherein the mixture model includes mixture distributions of different stochastic discriminant functions.
16. The mixture model estimation method according to claim 9, wherein the mixture model includes mixture distributions of a hidden Markov model having different output probabilities.
17. A non-transitory computer-readable medium storing a computer-readable mixture model estimation program for operating a computer as a mixture model estimation device comprising:
an input unit that inputs data of a mixture model to be estimated, and candidate values for a mixture number, and types and parameters of components constituting the mixture model that are necessary for estimating the mixture model of the data;
a processing unit comprising a processor that sets the mixture number from the candidate values, calculates a variation probability of a hidden variable for a random variable which is a target for estimating the mixture model of the data with respect to the set mixture number, and optimally estimates the mixture model by optimizing the types and parameters of the components using the calculated variation probability of the hidden variable so that a lower bound of a model posterior probability separated for each of the components of the mixture model is maximized; and
a model estimation result output unit that outputs a model estimation result obtained by the processing unit.
18. The non-transitory computer-readable medium according to claim 17, wherein the mixture number of the mixture model is optimally obtained by calculating the lower bound of the model posterior probability and the types and parameters of the components for all the candidate values for the mixture number.
19. The non-transitory computer-readable medium according to claim 17, wherein:
the mixture number is denoted by C, the random variable is denoted by X, the types of the components are denoted by S1, . . . , SC, and the parameters of the components are denoted by θ=(π1, . . . , πC, φ1 S1, . . . , φC SC) (π1, . . . , πC are mixture ratios when the mixture number is 1 to C, and φ1 S1, . . . , φC SC are parameters of distributions of components S1 to SC when the mixture number is 1 to C), the mixture model is expressed by equation 1:
[Math. 1]

P(X | θ) = Σ_{c=1}^{C} π_c P_c(X; φ_c^{S_c})  (1)
when the hidden variable for the random variable X is denoted by Z=(Z1, . . . , ZC), a joint distribution of a complete variable that is a pair of the random variable X and the hidden variable Z is defined by equation 2:
[Math. 2]

P(X, Z | θ) = Π_{c=1}^{C} (π_c P_c(X; φ_c^{S_c}))^{Z_c}  (2)
when N data values of the random variable X are denoted by xn (n=1, . . . , N), and N values of the hidden variable Z for the values xn are denoted by zn (n=1, . . . , N), a posterior probability of the hidden variable Z is expressed by equation 3:

[Math. 3]

P(z_n | x_n, θ) ∝ π_c P_c(x_n; φ_c^{S_c})  (3)
20. The mixture model estimation device according to claim 1, wherein the processing unit calculates the variation probability of the hidden variable by solving an optimization problem expressed by a first equation, calculates the lower bound of the model posterior probability by a second equation, calculates an optimal mixture model H(t) and parameters θ(t) of components of the optimal mixture model after t iterations by using the variation probability of the hidden variable and a third equation, and determines whether the lower bound of the model posterior probability converges by using a fourth equation, wherein when the processing unit determines that the lower bound of the model posterior probability does not converge, the processing unit repeats the processes of the first to fourth equations, and when the processing unit determines that the lower bound converges, the processing unit compares a lower bound of a model posterior probability of a currently-set optimal mixture model with the lower bound of the model posterior probability obtained through the calculations, and sets the model with the larger value as the optimal mixture model, and the processing unit repeats the processes of the first to fourth equations for all the candidate values for the mixture number so as to estimate the mixture model optimally.

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/242,915 US20140214747A1 (en) 2011-03-18 2014-04-02 Multivariate data mixture model estimation device, mixture model estimation method, and mixture model estimation program

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
JP2011-060732 2011-03-18
JP2011060732 2011-03-18
PCT/JP2012/056862 WO2012128207A1 (en) 2011-03-18 2012-03-16 Multivariate data mixture model estimation device, mixture model estimation method, and mixture model estimation program
US201313824857A 2013-05-01 2013-05-01
US14/242,915 US20140214747A1 (en) 2011-03-18 2014-04-02 Multivariate data mixture model estimation device, mixture model estimation method, and mixture model estimation program

Related Parent Applications (2)

Application Number Title Priority Date Filing Date
US13/824,857 Continuation US8731881B2 (en) 2011-03-18 2012-03-16 Multivariate data mixture model estimation device, mixture model estimation method, and mixture model estimation program
PCT/JP2012/056862 Continuation WO2012128207A1 (en) 2011-03-18 2012-03-16 Multivariate data mixture model estimation device, mixture model estimation method, and mixture model estimation program

Publications (1)

Publication Number Publication Date
US20140214747A1 true US20140214747A1 (en) 2014-07-31

Family

ID=46879354

Family Applications (2)

Application Number Title Priority Date Filing Date
US13/824,857 Active US8731881B2 (en) 2011-03-18 2012-03-16 Multivariate data mixture model estimation device, mixture model estimation method, and mixture model estimation program
US14/242,915 Abandoned US20140214747A1 (en) 2011-03-18 2014-04-02 Multivariate data mixture model estimation device, mixture model estimation method, and mixture model estimation program


Country Status (7)

Country Link
US (2) US8731881B2 (en)
EP (1) EP2687994A4 (en)
JP (1) JP5403456B2 (en)
KR (1) KR101329904B1 (en)
CN (1) CN103221945B (en)
SG (1) SG189314A1 (en)
WO (1) WO2012128207A1 (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9043261B2 (en) 2012-05-31 2015-05-26 Nec Corporation Latent variable model estimation apparatus, and method
US9324026B2 (en) 2013-09-20 2016-04-26 Nec Corporation Hierarchical latent variable model estimation device, hierarchical latent variable model estimation method, supply amount prediction device, supply amount prediction method, and recording medium
US9355196B2 (en) * 2013-10-29 2016-05-31 Nec Corporation Model estimation device and model estimation method
US9489632B2 (en) * 2013-10-29 2016-11-08 Nec Corporation Model estimation device, model estimation method, and information storage medium
JP6380404B2 (en) * 2013-11-05 2018-08-29 日本電気株式会社 Model estimation apparatus, model estimation method, and model estimation program
JP6525002B2 (en) * 2014-04-28 2019-06-05 日本電気株式会社 Maintenance time determination apparatus, deterioration prediction system, deterioration prediction method, and recording medium
CN104200090B (en) * 2014-08-27 2017-07-14 百度在线网络技术(北京)有限公司 Forecasting Methodology and device based on multi-source heterogeneous data
CN106156856A * 2015-03-31 2016-11-23 日本电气株式会社 Method and apparatus for mixture model selection
CN108369584B (en) 2015-11-25 2022-07-08 圆点数据公司 Information processing system, descriptor creation method, and descriptor creation program
JP6636883B2 (en) 2016-09-06 2020-01-29 株式会社東芝 Evaluation apparatus, evaluation method, and evaluation program
EP3605363A4 (en) 2017-03-30 2020-02-26 Nec Corporation Information processing system, feature value explanation method and feature value explanation program
JPWO2019069507A1 2017-10-05 2020-11-05 ドットデータ インコーポレイテッド Feature generation device, feature generation method, and feature generation program
CN110298478B (en) * 2019-05-16 2022-12-30 中国人民解放军海军工程大学 Optimization method and device for supplementary storage scheme in modular storage mode
JP7388230B2 (en) * 2020-02-17 2023-11-29 富士通株式会社 Mixture performance optimization device, mixture performance optimization program, mixture performance optimization method, and mixed refrigerant
CN111612102B (en) * 2020-06-05 2023-02-07 华侨大学 Satellite image data clustering method, device and equipment based on local feature selection

Citations (1)

Publication number Priority date Publication date Assignee Title
US20070255542A1 (en) * 2003-02-18 2007-11-01 Nec Corporation Method of detecting abnormal behavior

Family Cites Families (8)

Publication number Priority date Publication date Assignee Title
JP2001202358A (en) 2000-01-21 2001-07-27 Nippon Telegr & Teleph Corp <Ntt> Bayesian inference method for mixed model and recording medium with recorded bayesian inference program for mixed model
US7657102B2 (en) * 2003-08-27 2010-02-02 Microsoft Corp. System and method for fast on-line learning of transformed hidden Markov models
JP4902378B2 (en) 2007-02-06 2012-03-21 日本放送協会 Mixed model initial value calculation device and mixed model initial value calculation program
JP5068228B2 (en) 2008-08-04 2012-11-07 日本電信電話株式会社 Non-negative matrix decomposition numerical calculation method, non-negative matrix decomposition numerical calculation apparatus, program, and storage medium
JP5332647B2 (en) 2009-01-23 2013-11-06 日本電気株式会社 Model selection apparatus, model selection apparatus selection method, and program
JP5170698B2 (en) 2009-04-27 2013-03-27 独立行政法人産業技術総合研究所 Stochastic reasoner
JP5704162B2 (en) * 2010-03-03 2015-04-22 日本電気株式会社 Model selection device, model selection method, and model selection program
US8504491B2 (en) * 2010-05-25 2013-08-06 Microsoft Corporation Variational EM algorithm for mixture modeling with component-dependent partitions

Non-Patent Citations (1)

Title
Tianbing Xu, et al. “Evolutionary Clustering by Hierarchical Dirichlet Process with Hidden Markov State” IEEE 8th Int’l Conf. on Data Mining, pp. 658-667 (2008). *

Also Published As

Publication number Publication date
KR20130080060A (en) 2013-07-11
SG189314A1 (en) 2013-05-31
JPWO2012128207A1 (en) 2014-07-24
WO2012128207A1 (en) 2012-09-27
KR101329904B1 (en) 2013-11-14
US20130211801A1 (en) 2013-08-15
US8731881B2 (en) 2014-05-20
EP2687994A4 (en) 2014-08-06
EP2687994A1 (en) 2014-01-22
JP5403456B2 (en) 2014-01-29
CN103221945A (en) 2013-07-24
CN103221945B (en) 2016-09-14

Similar Documents

Publication Publication Date Title
US8731881B2 (en) Multivariate data mixture model estimation device, mixture model estimation method, and mixture model estimation program
Chen et al. Efficient ant colony optimization for image feature selection
US8417648B2 (en) Change analysis
Zapranis et al. Principles of neural model identification, selection and adequacy: with applications to financial econometrics
US9311729B2 (en) Information processing apparatus, information processing method, and program
WO2017139147A1 (en) Ranking causal anomalies via temporal and dynamic analysis on vanishing correlations
US9852378B2 (en) Information processing apparatus and information processing method to estimate cause-effect relationship between variables
Bello et al. Data quality measures based on granular computing for multi-label classification
US20140343903A1 (en) Factorial hidden markov models estimation device, method, and program
Turhan et al. A multivariate analysis of static code attributes for defect prediction
Fu et al. Quasi-Newton Hamiltonian Monte Carlo.
Little et al. A multiscale spectral method for learning number of clusters
Lee et al. Classification of high dimensionality data through feature selection using Markov blanket
Matilla-Garcia Nonlinear dynamics in energy futures
Maua et al. Hidden Markov models with set-valued parameters
Luca et al. Point process models for novelty detection on spatial point patterns and their extremes
Morton et al. Variational Bayesian learning for mixture autoregressive models with uncertain-order
Liu et al. Maximum likelihood evidential reasoning-based hierarchical inference with incomplete data
Blachnik Comparison of various feature selection methods in application to prototype best rules
Luebke et al. Linear dimension reduction in classification: adaptive procedure for optimum results
US9489632B2 (en) Model estimation device, model estimation method, and information storage medium
Njah et al. A new equilibrium criterion for learning the cardinality of latent variables
Sedghi et al. Simultaneous estimation of sub-model number and parameters for mixture probability principal component regression
Epstein The Relationship between Common Feature Selection Metrics and Accuracy
US11822564B1 (en) Graphical user interface enabling interactive visualizations using a meta-database constructed from autonomously scanned disparate and heterogeneous sources

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION