WO2011108632A1 - Model selection device, model selection method, and model selection program - Google Patents
- Publication number
- WO2011108632A1 (PCT/JP2011/054883)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- model
- distribution
- criterion
- complete data
- component
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
Definitions
- The present invention relates to a model selection device for data, and more particularly to a model selection device, a model selection method, and a model selection program that achieve high-speed model selection for complex mixture distribution models by optimizing the expected value of a conditional information criterion.
- A mixture distribution is a model that expresses the distribution of data as a combination of several distributions, and it is an important model for industrial data modeling.
- Such models include, for example, normal (Gaussian) mixture models and mixtures of hidden Markov models.
- When the number of mixtures and the type of each component are specified, the parameters of the distribution can be determined using a known technique such as the EM algorithm (see, for example, Non-Patent Document 1).
- Determining the number of mixtures and the type of each component is called the "model selection problem" or the "system identification problem." It is an extremely important problem for constructing a reliable model, and a number of related techniques have been proposed.
- MDL: minimum description length
- AIC: Akaike's Information Criterion
- A model selection method using an information criterion selects, from among the model candidates, the model that optimizes the value of the information criterion for the data. A model that optimizes the information criterion is known to have good statistical properties: consistency with the true distribution in the case of MDL, and minimum prediction error in the case of AIC.
- With such a method, it is possible in principle to perform model selection over any set of model candidates by calculating the information criterion value for every candidate.
- However, when the number of model candidates becomes enormous, this calculation is practically infeasible.
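The exhaustive procedure described above can be sketched as follows. This is a minimal illustration, not the patent's own formulation: the MDL score uses the common asymptotic two-part form (negative log-likelihood plus half the parameter count times ln N), and the candidate names, log-likelihoods, and parameter counts are hypothetical values chosen only for the example.

```python
import math

def aic(log_likelihood: float, num_params: int) -> float:
    """Akaike's Information Criterion (smaller is better)."""
    return -2.0 * log_likelihood + 2.0 * num_params

def mdl(log_likelihood: float, num_params: int, num_samples: int) -> float:
    """Asymptotic two-part description length in nats (smaller is better)."""
    return -log_likelihood + 0.5 * num_params * math.log(num_samples)

def select_model(candidates, num_samples, criterion="mdl"):
    """Score every candidate and return the name with the smallest value.

    candidates maps a model name to (maximized log-likelihood, parameter count).
    """
    def score(item):
        _, (loglik, k) = item
        if criterion == "mdl":
            return mdl(loglik, k, num_samples)
        return aic(loglik, k)
    return min(candidates.items(), key=score)[0]

# Hypothetical fitted results for three polynomial orders on N = 100 points.
fits = {"linear": (-150.0, 3), "quadratic": (-120.0, 4), "cubic": (-119.5, 5)}
best_mdl = select_model(fits, num_samples=100, criterion="mdl")
best_aic = select_model(fits, num_samples=100, criterion="aic")
print(best_mdl, best_aic)
```

Every candidate must be fitted before it can be scored, which is why this approach stops being practical once the candidate set grows combinatorially.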
- For example, polynomial curves come in many orders: a straight line (first-order curve), a quadratic curve, a cubic curve, and so on.
- Patent Document 1 discloses a technique that performs information-criterion-based model selection at high speed for various mixture distribution models by repeatedly optimizing the expected information criterion for complete data, which includes the hidden variables.
- An object of the present invention is to provide a model selection device, a model selection method, and a model selection program that solve the above-described problems and realize high-speed model selection even for a model having dependencies between components.
- Another object of the present invention is to provide a model selection device, a model selection method, and a model selection program that realize high-speed model selection even when the number of component candidates increases rapidly with respect to parameters.
- The first model selection device of the present invention includes model optimization means for optimizing a model for a mixture distribution. With respect to the information criterion for complete data, the model optimization means optimizes the expected information criterion of the complete data, taken over the posterior distribution of the hidden variables of the complete data, for a set of component models and parameters satisfying a predetermined condition.
- The first model selection method of the present invention includes a model optimization step for optimizing a model for a mixture distribution. With respect to the information criterion for complete data, the model optimization step optimizes the expected information criterion of the complete data, taken over the posterior distribution of the hidden variables of the complete data, for a set of component models and parameters satisfying a predetermined condition.
- The first model selection program of the present invention causes a computer to execute a model optimization process for optimizing a model for a mixture distribution. With respect to the information criterion for complete data, the model optimization process optimizes the expected information criterion of the complete data, taken over the posterior distribution of the hidden variables of the complete data, for a set of component models and parameters satisfying a predetermined condition.
- According to the present invention, high-speed model selection can be realized in the estimation of a mixture distribution even for a model having dependencies between components.
- FIG. 1 is a block diagram showing a configuration of a model selection device 100 according to the first embodiment of the present invention.
- The model selection device 100 includes a data input unit 101, a mixture number setting unit 102, a distribution initialization processing unit 103, a model optimization processing unit 104, a mixture number loop end determination processing unit 105, an optimum distribution selection processing unit 106, and a model selection result output unit 107.
- the model selection apparatus 100 optimizes the number of mixtures, the types and parameters of each component, etc., with respect to the input data 108, and outputs the result as a model selection result 109.
- the data input unit 101 is a functional unit for inputting the input data 108.
- the input data 108 includes information necessary for model selection, such as the types and parameters of each component to be mixed, and candidate values for the number of mixtures.
- The mixture number setting unit 102 has a function of selecting and setting the number of mixtures of the model from the input candidate values for the number of mixtures.
- The number of mixtures set by the mixture number setting unit 102 is denoted K.
- the distribution initialization processing unit 103 has a function of performing initialization processing for estimation. Note that initialization can be performed by any method. For example, a method of setting a value of a hidden variable corresponding to data at random can be considered.
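The random initialization mentioned above might look like the following minimal sketch. The one-hot representation of the hidden variables and the function name `init_hidden_variables` are assumptions made for illustration; the text only states that any initialization method may be used.

```python
import random

def init_hidden_variables(num_points, num_components, seed=None):
    """Assign each data point to one component chosen uniformly at random.

    Returns a list of one-hot rows: row i has a 1 in the column of the
    component that point i is initially assigned to.
    """
    rng = random.Random(seed)
    assignments = []
    for _ in range(num_points):
        row = [0] * num_components
        row[rng.randrange(num_components)] = 1
        assignments.append(row)
    return assignments

z = init_hidden_variables(num_points=6, num_components=3, seed=0)
```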
- the model optimization processing unit 104 has a function of optimizing the model with respect to the mixture distribution of the mixture number set by the mixture number setting unit 102.
- the model optimization processing unit 104 is configured as a model optimization processing unit 200 shown in FIG. 2 or a model optimization processing unit 300 shown in FIG. 3, and details thereof will be described later.
- The mixture number loop end determination processing unit 105 has a function of determining whether the optimal information criterion value has been calculated for all input candidate values of the number of mixtures.
- The optimum distribution selection processing unit 106 has a function of comparing the information criterion values calculated for all candidate values of the number of mixtures and selecting the number of mixtures for which the information criterion is optimal. The optimal information criterion value for each number of mixtures is calculated by the model optimization processing unit 104, as described later. Because the model optimization processing unit 104 has also optimized the type and parameters of each component for the selected number of mixtures, the resulting distribution is selected as the optimal distribution.
- the model selection result output unit 107 has a function of outputting an optimal number of mixtures, component types, parameters, and the like as a model selection result 109.
- The model optimization processing unit 200 includes a hidden variable posterior distribution calculation processing unit 201, an update parameter setting unit 202, a conditional expected information criterion minimization processing unit 203, an update parameter setting loop end determination unit 204, an information criterion calculation processing unit 205, and an optimality determination processing unit 206.
- The hidden variable posterior distribution calculation processing unit 201 has a function of calculating the posterior distribution of the hidden variables, which indicate the component of the mixture distribution to which each input data point belongs.
- The update parameter setting unit 202 stores rules for partially updating the model and parameter candidates of each component, and has a function of selecting, from among these partial models and parameters, the one to be optimized.
- The conditional expected information criterion minimization processing unit 203 has a function of optimizing, with respect to the posterior distribution calculated by the hidden variable posterior distribution calculation processing unit 201, the expected information criterion of the complete data for the model and parameters selected by the update parameter setting unit 202.
- Complete data refers to the pair of the input data and the corresponding hidden variables; the input data alone is called incomplete data. Any optimization method may be used.
- The update parameter setting loop end determination unit 204 has a function of determining whether the conditional expected information criterion minimization processing has been performed for all of the partially updated model and parameter sets stored in the update parameter setting unit 202.
- the information criterion calculation processing unit 205 has a function of calculating an information criterion value for incomplete data for the updated model.
- The optimality determination processing unit 206 has a function of comparing the information criterion value calculated in the current loop with the value calculated in the previous loop to determine whether the optimization processing has converged.
- A feature of this configuration is that the update parameter setting unit 202 sets partial models and parameters and the conditional expected information criterion is optimized for them, which prevents the number of candidates from becoming enormous even for complex model candidates.
- It is necessary to perform parameter estimation for $\sum_{d=0}^{D(D-1)/2} {}_{D(D-1)/2}C_{d}$ component candidates and select an optimal component.
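To give a sense of how quickly this count grows, the sum can be evaluated directly. The sketch below assumes that D is the number of data dimensions and that the sum counts the ways of choosing which of the D(D-1)/2 dimension pairs are treated as dependent; under that reading the sum equals 2 raised to the power D(D-1)/2.

```python
from math import comb

def num_independence_candidates(dim: int) -> int:
    """Sum of binomial coefficients C(P, d) for d = 0..P, with P = dim*(dim-1)/2."""
    pairs = dim * (dim - 1) // 2
    return sum(comb(pairs, d) for d in range(pairs + 1))

for dim in (2, 3, 5, 10):
    print(dim, num_independence_candidates(dim))
```

For D = 10 there are already 2^45 candidates per component, which is why fitting every candidate is impractical.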
- The model optimization processing unit 300 differs from the model optimization processing unit 200 in that the connection order of the hidden variable posterior distribution calculation processing unit 201 and the update parameter setting unit 202 is reversed and the update parameter setting loop end determination unit 204 is not provided. In this configuration, the processing ends when the optimality determination processing unit 206 determines that the optimization is complete, regardless of whether the update parameter setting unit 202 has visited all update parameters.
- FIG. 4 is a flowchart showing the processing operation of the model selection device 100 according to the present embodiment.
- model selection apparatus 100 generally operates as follows.
- The mixture number setting unit 102 selects and sets, from among the input candidate values for the number of mixtures, a value that has not yet been optimized (step S402).
- The distribution initialization processing unit 103 performs the initialization needed for optimization for the specified number of mixtures (step S403).
- The model optimization processing unit 104 estimates an optimal model for the specified number of mixtures (step S404). Details of the processing when the model optimization processing unit 200 or the model optimization processing unit 300 is used as the model optimization processing unit 104 will be described later.
- The mixture number loop end determination processing unit 105 determines whether optimization has been completed and the information criterion value has been calculated for all candidate values of the number of mixtures (step S405).
- If not ("NO" in step S405), the processing from step S401 to step S404 is repeated.
- The optimum distribution selection processing unit 106 compares the optimized information criterion values for the respective numbers of mixtures and selects the number of mixtures whose value is optimal as the optimal model (step S406). For the selected model, the component types and parameters have already been optimized in the processing from step S402 to step S405, so a distribution having the optimal number of mixtures and component types is obtained.
- the model selection result output unit 107 outputs the model selection result 109 (step S407).
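The outer loop over candidate numbers of mixtures can be sketched as follows. This is an illustrative skeleton under stated assumptions: the `initialize` and `optimize` callables stand in for units 103 and 104, the criterion is assumed to be of the "smaller is better" kind (as with MDL), and the toy stand-ins at the bottom are hypothetical and exist only to make the example runnable.

```python
from typing import Any, Callable, Iterable, Optional, Tuple

def select_mixture_number(
    data: Any,
    candidate_ks: Iterable[int],
    initialize: Callable[[Any, int], Any],
    optimize: Callable[[Any, Any], Tuple[Any, float]],
) -> Optional[Tuple[int, Any, float]]:
    """Optimize a model for each candidate K and keep the best criterion value."""
    best = None
    for k in candidate_ks:                          # S402: next unprocessed K
        state = initialize(data, k)                 # S403: initialization
        model, criterion = optimize(data, state)    # S404: model optimization
        if best is None or criterion < best[2]:     # bookkeeping used in S406
            best = (k, model, criterion)
    return best                                     # S406: optimal K and model

# Hypothetical stand-ins: the criterion is smallest at K = 3.
def toy_initialize(data, k):
    return k

def toy_optimize(data, state):
    return f"model-with-{state}-components", float((state - 3) ** 2 + 10)

best_k, best_model, best_value = select_mixture_number(
    None, [1, 2, 3, 4, 5], toy_initialize, toy_optimize
)
```

Because the inner optimizer already returns the optimized component types and parameters for each K, the outer loop only needs to compare one scalar per candidate.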
- FIG. 5 is a flowchart showing the processing operation of the model optimization processing unit 200 according to this embodiment.
- The model optimization processing unit 200 generally operates as follows.
- Receiving the output of the distribution initialization processing unit 103, the hidden variable posterior distribution calculation processing unit 201 calculates the posterior distribution of the hidden variables (step S501).
- The update parameter setting unit 202 selects, from the sets it stores, a set of model and parameters that is independent of the other models and parameters (step S502).
- The conditional expected information criterion minimization processing unit 203 estimates the model and parameters that minimize the conditional expected information criterion for the model and parameters selected by the update parameter setting unit 202 (step S503).
- The update parameter setting loop end determination unit 204 determines whether all of the independent model and parameter sets stored in the update parameter setting unit 202 have been updated (step S504).
- If sets that have not been updated remain ("NO" in step S504), the processing from step S501 to step S504 is repeated.
- If no such set remains ("YES" in step S504), the information criterion calculation processing unit 205 calculates the information criterion value for the updated model (step S505).
- The optimality determination processing unit 206 compares the information criterion value calculated in the current loop with the value calculated in the previous loop to determine whether the optimization processing has converged (step S506). If the information criterion value has converged ("YES" in step S507), the processing of the model optimization processing unit 200 ends. If it has not converged ("NO" in step S507), the processing from step S501 to step S506 is repeated.
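The loop of steps S501 to S506 can be illustrated with a small one-dimensional sketch. Everything specific here is an assumption made for the example rather than the patent's own formulas: the candidate set contains only normal and exponential components, the data are non-negative, each component's expected description length is its posterior-weighted negative log-likelihood plus (M_k / 2) ln N_k, and the mixture ratios are refreshed once per sweep.

```python
import math

KINDS = ("normal", "exponential")  # hypothetical candidate set of component types

def log_pdf(x, kind, params):
    if kind == "normal":
        mean, var = params
        return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
    (rate,) = params  # exponential; this sketch assumes x >= 0
    return math.log(rate) - rate * x

def weighted_fit(xs, ws, kind):
    """Posterior-weighted maximum-likelihood parameters for one component."""
    n_k = max(sum(ws), 1e-12)
    mean = sum(w * x for w, x in zip(ws, xs)) / n_k
    if kind == "normal":
        var = sum(w * (x - mean) ** 2 for w, x in zip(ws, xs)) / n_k
        return (mean, max(var, 1e-6))
    return (1.0 / max(mean, 1e-6),)

def expected_length(xs, ws, kind, params):
    """Assumed expected complete-data description length of one component (nats)."""
    nll = -sum(w * log_pdf(x, kind, params) for w, x in zip(ws, xs))
    return nll + 0.5 * len(params) * math.log(max(sum(ws), 1.0))

def posterior(xs, weights, comps):
    """Per-point posteriors of the hidden variables and the data log-likelihood."""
    post, loglik = [], 0.0
    for x in xs:
        logs = [math.log(w) + log_pdf(x, kind, p) for w, (kind, p) in zip(weights, comps)]
        top = max(logs)
        norm = top + math.log(sum(math.exp(v - top) for v in logs))
        post.append([math.exp(v - norm) for v in logs])
        loglik += norm
    return post, loglik

def optimize(xs, weights, comps, tol=1e-6, max_iter=100):
    n, prev, criterion = len(xs), math.inf, math.inf
    comps = list(comps)
    for _ in range(max_iter):
        for k in range(len(comps)):                      # S502/S504: visit each update set
            post, _ = posterior(xs, weights, comps)      # S501: posterior of hidden variables
            ws = [row[k] for row in post]
            fits = [(kind, weighted_fit(xs, ws, kind)) for kind in KINDS]
            comps[k] = min(fits, key=lambda f: expected_length(xs, ws, *f))  # S503
        post, _ = posterior(xs, weights, comps)
        weights = [max(sum(row[k] for row in post) / n, 1e-12) for k in range(len(comps))]
        _, loglik = posterior(xs, weights, comps)
        n_params = sum(len(p) for _, p in comps) + len(comps) - 1
        criterion = -loglik + 0.5 * n_params * math.log(n)   # S505: incomplete-data criterion
        if abs(prev - criterion) < tol:                  # S506: convergence check
            break
        prev = criterion
    return weights, comps, criterion

xs = [0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.4, 0.6, 0.8, 1.2,
      9.5, 9.7, 9.8, 9.9, 10.0, 10.0, 10.1, 10.2, 10.3, 10.5]
weights, comps, criterion = optimize(
    xs, [0.5, 0.5], [("normal", (0.5, 1.0)), ("normal", (10.0, 1.0))]
)
kinds = [kind for kind, _ in comps]
```

The point of the sketch is that each component's type is chosen on its own, using only that component's posterior weights, so the two-type choice costs two fits per component per sweep instead of one fit per joint combination of types.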
- FIG. 6 is a flowchart showing the operation of the model optimization processing unit 300 according to the present embodiment.
- The processing of the model optimization processing unit 300 according to the present embodiment differs from that shown in FIG. 5 in that the order of steps S501 and S502 is reversed (steps S601 and S602) and the processing of step S504 is not included.
- the mixture distribution to be learned is expressed by the following equation (1) with respect to the random variable X corresponding to the input data.
- $\pi_k$ represents the mixture ratio of the k-th component.
- $\theta = (\theta_1, \ldots, \theta_K)$.
- The distribution $P(X; \theta_k)$ of each component is an element of a set $S$ of component candidates.
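Equation (1) is not reproduced in this text. A form consistent with the surrounding definitions, assuming the component parameters are written $\theta_k$ and that the mixture ratios are non-negative and sum to one (the usual finite-mixture convention, not a statement taken from the source), would be:

```latex
P(X; \theta) = \sum_{k=1}^{K} \pi_k \, P(X; \theta_k),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1
```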
- The components in equation (1) may be distributions of different types, for example a mixture of normal and exponential distributions. Equation (1) is a framework for modeling the data distribution, but the following configuration holds equally for model selection with supervised (labeled) data, such as regression and classification.
- The MDL criterion selects, as the optimal model, the model that minimizes the sum of the data description length and the model description length expressed by equation (2).
- The method for calculating the MDL criterion is stored in the information criterion calculation processing unit 205, and the MDL criterion value of a distribution is calculated by equation (2).
- $l$ represents a description length function.
- $M$ represents a model.
- $x_i$ represents one data point.
- $X$ is the random variable corresponding to the data.
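Equation (2) is not reproduced in this text. Read literally, "the sum of the data description length and the model description length" suggests a two-part form such as the following, where the left-hand notation $l(x^N; M)$ and the term $l(M)$ are assumed names introduced only for this sketch:

```latex
l(x^N; M) = l(x^N \mid M) + l(M),
\qquad x^N = (x_1, \ldots, x_N)
```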
- $l(x^N \mid M)$ can be calculated as in equation (3) or equation (4).
- log is the logarithm to base 2, and ln is the natural logarithm.
- The hat in $\hat{\theta}$ indicates that the parameter is a maximum likelihood estimator.
- $I(\theta)$ is the Fisher information matrix. Note that various description methods have been proposed for the description length functions l(x^N | …
- The hidden variable posterior distribution calculation processing unit 201 stores a method for calculating expected values with respect to the posterior probability of the hidden variables given the data $x^N$. The posterior probability depends on $P(X; \theta)$ and can be calculated by any known method. In the following, $E_z[A]$ denotes the expected value of the argument $A$ with respect to the posterior probability of the hidden variables.
- $l(x^N, z^N; M)$ can use an arbitrary description length function, similarly to l(x^N | …
- For example, equation (6) corresponds to equation (3).
- $M_k$ is the dimension of $\theta_k$.
- $N_k$ is the number of data points belonging to the k-th cluster and can be calculated by equation (7).
- $P(z_i; \theta_k)$ represents the probability that the cluster assignment of the i-th data point takes the value 1 or 0.
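The quantities above can be made concrete with a short sketch. It assumes a one-dimensional normal mixture and assumes that, under the posterior, the cluster size is the soft count N_k = sum over i of E_z[z_ik]; equation (7) itself is not reproduced in this text, so this reading is an assumption. The data and parameter values are hypothetical.

```python
import math

def normal_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def responsibilities(xs, weights, means, stds):
    """Posterior probability that each point belongs to each component."""
    post = []
    for x in xs:
        joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
        total = sum(joint)
        post.append([j / total for j in joint])
    return post

def expected_counts(post):
    """Soft cluster sizes: N_k is the column sum of the posterior matrix."""
    return [sum(row[k] for row in post) for k in range(len(post[0]))]

xs = [-2.1, -1.9, -2.0, 3.0, 3.2, 2.8, 3.1]
post = responsibilities(xs, weights=[0.5, 0.5], means=[-2.0, 3.0], stds=[0.5, 0.5])
n_k = expected_counts(post)
```

With well-separated clusters the soft counts are close to the hard counts (three points near -2 and four near 3), and they always sum to the total number of data points.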
- The expected description length calculated by the conditional expected information criterion minimization processing unit 203 is obtained from $l(x^N, z^N; M)$ by treating the model and parameters selected by the update parameter setting unit 202 as variables, fixing all other parameters, and taking the expected value with respect to the posterior probability of the hidden variables; it is calculated as $E_z[l(x^N, z^N; M)]$. Because the models and parameters set as update targets are independent of one another, the conditional expected description length can be optimized separately for each selected model and parameter set. Any known technique, such as maximum likelihood estimation or the method of moments, can be used to estimate the parameters of each component's distribution.
- The update parameter setting unit 202 stores, as the model to be determined, whether each pair of dimensions of each component is independent. By deciding the independence of each pair of dimensions in turn, learning can be performed at high speed even though the number of independence combinations grows rapidly with the number of dimensions.
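The pair-by-pair decision can be sketched for a single normal component as follows. This is a simplified stand-in, not the patent's procedure: it ignores posterior weights, treats each pair in isolation with a bivariate normal, and uses the assumed penalty of (1/2) ln N for the one extra correlation parameter. For a bivariate normal, allowing correlation raises the maximized log-likelihood by -(N/2) ln(1 - r^2), where r is the sample correlation.

```python
import math

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

def pair_is_dependent(xs, ys):
    """Keep the correlation parameter only if it shortens the description length."""
    n = len(xs)
    r = pearson_r(xs, ys)
    gain = -0.5 * n * math.log(1.0 - r * r)   # likelihood improvement in nats
    cost = 0.5 * math.log(n)                  # assumed cost of one extra parameter
    return gain > cost

def select_dependencies(columns):
    """Visit each pair of dimensions once and decide independence pair by pair."""
    dims = len(columns)
    return {(a, b): pair_is_dependent(columns[a], columns[b])
            for a in range(dims) for b in range(a + 1, dims)}

# Hypothetical 3-dimensional data: dimension 1 tracks dimension 0, dimension 2 alternates.
x0 = [float(i) for i in range(20)]
x2 = [1.0 if i % 2 == 0 else -1.0 for i in range(20)]
x1 = [2.0 * a - b for a, b in zip(x0, x2)]
structure = select_dependencies([x0, x1, x2])
```

Only D(D-1)/2 pairwise decisions are made here, in contrast to enumerating every joint independence pattern.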
- The optimal distribution may differ for each dimension. When considering a mixture distribution, the joint distribution must therefore be considered, which raises the combinatorial problem of which dimension follows which distribution. By having the update parameter setting unit 202 specify which model the marginal distribution of each dimension of each component follows, and updating the optimal distribution one dimension at a time, model selection can be performed at high speed.
- With the model selection device 100 proposed in the present invention, it is possible to estimate simultaneously not only the marginal distributions but also the copula expressing the correlation between them.
- Model and attribute selection for a mixed discrimination model using different attributes: by using the model selection device 100 proposed in the present invention, high-speed model selection can be performed for the model and attribute selection of a mixed discrimination model that uses different attributes.
- attribute 1 has valid information for data identification
- attribute 2 has valid information for identification
- different valid attributes may be estimated for each component.
- The number of combinations of which attributes are used and which are not becomes enormous.
- By treating whether each attribute is used in each component as the model and parameters to be set by the update parameter setting unit 202, the attributes effective for discrimination in each component can be estimated at high speed.
- FIG. 7 is a block diagram illustrating a hardware configuration example of the model selection device 100.
- The model selection device 100 can be realized with the same hardware configuration as a general computer, and includes a CPU (Central Processing Unit) 801; a main storage unit 802, which is main memory such as RAM (Random Access Memory) used as a data work area and a temporary data save area; a communication unit 803 that transmits and receives data via a network; an input/output interface unit 804 that connects to an input device 805, an output device 806, and a storage device 807 to transmit and receive data; and a system bus 808 that interconnects the above components.
- The storage device 807 is realized by, for example, a hard disk device or the like comprising non-volatile memory such as a ROM (Read Only Memory), a magnetic disk, or a semiconductor memory.
- The operation of the functional units, including the model selection result output unit 107, can of course be realized in hardware by mounting circuit components, such as an LSI (Large Scale Integration), in which a program is incorporated. It can also be realized in software by storing a program that provides each function in the storage device 807, loading the program into the main storage unit 802, and executing it on the CPU 801.
- A plurality of components may be formed as a single member, a single component may be formed of a plurality of members, a component may be part of another component, part of one component may overlap part of another, and so on.
- The plurality of procedures of the method and the computer program of the present invention are not limited to being executed at mutually different times. Another procedure may therefore occur during the execution of a given procedure, and the execution timing of one procedure may overlap partly or wholly with that of another.
- A model selection device in which the model optimization means, with respect to the information criterion of complete data, optimizes the expected information criterion of the complete data, taken over the posterior distribution of the hidden variables of the complete data, for a set of component models and parameters satisfying a predetermined condition.
- The model optimization means comprises: hidden variable posterior distribution calculation means for calculating a posterior distribution of the hidden variables of the data; update parameter setting means for selecting a set of component models and parameters satisfying a predetermined condition; conditional expected information criterion minimization means for optimizing, with respect to the posterior distribution calculated by the hidden variable posterior distribution calculation means, the expected information criterion for complete data for the set of models and parameters selected by the update parameter setting means; information criterion calculation means for calculating an information criterion value for incomplete data for the model updated through optimization by the conditional expected information criterion minimization means; determining the optimality of …
- The model selection device according to appendix 2, wherein the model optimization means includes update parameter setting loop end determination means that, when there are a plurality of sets of component models and parameters satisfying the predetermined condition, causes the processing by the update parameter setting means and the conditional expected information criterion minimization means to be repeated until the expected information criterion for complete data has been optimized for all of the sets.
- Appendix 4: The model selection device according to any one of appendices 1 to 3, wherein the information criterion is the MDL criterion.
- Appendix 7: The model selection device according to any one of appendices 1 to 4, wherein the number of mixtures and the type of the marginal distribution of each component are optimized for a mixture distribution of a plurality of different marginal distributions.
- Appendix 8: The model selection device according to any one of appendices 1 to 4, wherein the attributes effective for discrimination in each component are optimized for the model and attribute selection of a mixed discrimination model using different attributes.
- A model selection method in which the model optimization step for optimizing the model for the mixture distribution, with respect to the information criterion of complete data, optimizes the expected information criterion of the complete data, taken over the posterior distribution of the hidden variables of the complete data, for a set of component models and parameters satisfying a predetermined condition.
- The model optimization step comprises: a hidden variable posterior distribution calculation step for calculating a posterior distribution of the hidden variables of the data; an update parameter setting step for selecting a set of component models and parameters satisfying a predetermined condition; a conditional expected information criterion minimization step for optimizing, with respect to the posterior distribution calculated in the hidden variable posterior distribution calculation step, the expected information criterion for complete data for the set of models and parameters selected in the update parameter setting step; an information criterion calculation step for calculating an information criterion value for incomplete data for the model updated through optimization by the conditional expected information criterion minimization step; determining the optimality of the information criterion …
- The model optimization step includes an update parameter setting loop end determination step in which, when there are a plurality of sets of component models and parameters satisfying the predetermined condition, the update parameter setting step and the conditional expected information criterion minimization step are repeated until the expected information criterion for complete data has been optimized for all of the sets.
- Appendix 12: The model selection method according to any one of appendices 9 to 11, wherein the information criterion is the MDL criterion.
- Appendix 14: The model selection method according to any one of appendices 9 to 12, wherein the number of mixtures and the independence of each component are optimized for a mixture distribution of a plurality of distributions having different independence with respect to multidimensional data.
- Appendix 15: The model selection method according to any one of appendices 9 to 12, wherein the number of mixtures and the type of the marginal distribution of each component are optimized for a mixture distribution of a plurality of different marginal distributions.
- Appendix 16: The model selection method according to any one of appendices 9 to 12, wherein the attributes effective for discrimination in each component are optimized for the model and attribute selection of a mixed discrimination model using different attributes.
- A model selection program in which the model optimization process, with respect to the information criterion of complete data, optimizes the expected information criterion of the complete data, taken over the posterior distribution of the hidden variables of the complete data, for a set of component models and parameters satisfying a predetermined condition.
- The model optimization process comprises: a hidden variable posterior distribution calculation process for calculating a posterior distribution of the hidden variables of the data; an update parameter setting process for selecting a set of component models and parameters satisfying a predetermined condition; a conditional expected information criterion minimization process for optimizing, with respect to the posterior distribution calculated by the hidden variable posterior distribution calculation process, the expected information criterion for complete data for the set of models and parameters selected in the update parameter setting process; an information criterion calculation process for calculating an information criterion value for incomplete data for the model updated through optimization by the conditional expected information criterion minimization process …
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Algebra (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Optimization (AREA)
- Mathematical Analysis (AREA)
- Probability & Statistics with Applications (AREA)
- Computational Mathematics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Description
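An object of the present invention is to solve the above-described problems and to provide a model selection device, a model selection method, and a model selection program that realize high-speed model selection even for a model having dependencies between components.
A first embodiment of the present invention will be described in detail with reference to the drawings. In the following figures, the configuration of parts not related to the essence of the present invention is omitted and not shown.
Next, the operation of the present embodiment will be described in detail with reference to the drawings.
Using the model selection device 100 proposed in the present invention, the number of mixtures and the independence of each component can be optimized at high speed for a mixture distribution of a plurality of distributions having different independence with respect to multidimensional data.
Using the model selection device proposed in the present invention, the number of mixtures and the type of the marginal distribution of each component can be optimized for a mixture distribution of a plurality of different marginal distributions.
Using the model selection device 100 proposed in the present invention, high-speed model selection can be performed for the model and attribute selection of a mixed discrimination model using different attributes.
Next, the effects of the present embodiment will be described.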
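A model selection device comprising model optimization means for optimizing a model for a mixture distribution,
wherein the model optimization means,
with respect to the information criterion of complete data, optimizes the expected information criterion of the complete data, taken over the posterior distribution of the hidden variables of the complete data, for a set of component models and parameters satisfying a predetermined condition.
The model selection device according to appendix 1, further comprising:
mixture number setting means for selecting, from the candidate values for the number of mixtures, a candidate value that has not yet been optimized;
distribution initialization means for performing data initialization processing using the number of mixtures selected by the mixture number setting means;
mixture number loop end determination means for determining whether the optimal information criterion value has been calculated for all candidate values of the number of mixtures and, if it determines that it has not, causing the processing by the mixture number setting means, the distribution initialization means, and the optimization means to be performed again; and
optimum distribution selection means for comparing the information criterion values calculated for all candidate values of the number of mixtures and selecting the number of mixtures for which the information criterion is optimal,
wherein the model optimization means includes:
hidden variable posterior distribution calculation means for calculating a posterior distribution of the hidden variables of the data;
update parameter setting means for selecting a set of component models and parameters satisfying a predetermined condition;
conditional expected information criterion minimization means for optimizing, with respect to the posterior distribution calculated by the hidden variable posterior distribution calculation means, the expected information criterion for complete data for the set of models and parameters selected by the update parameter setting means;
information criterion calculation means for calculating an information criterion value for incomplete data for the model updated through optimization by the conditional expected information criterion minimization means; and
optimality determination means for determining the optimality of the information criterion value calculated by the information criterion calculation means and, if it is determined not to be optimal, performing the optimization processing again.
The model selection device according to appendix 2,
wherein the model optimization means includes update parameter setting loop end determination means that, when there are a plurality of sets of component models and parameters satisfying the predetermined condition, causes the processing by the update parameter setting means and the conditional expected information criterion minimization means to be repeated until the expected information criterion for complete data has been optimized for all of the sets.
The model selection device according to any one of appendices 1 to 3, wherein the information criterion is the MDL criterion.
The model selection device according to any one of appendices 1 to 4, wherein the predetermined condition is independence from other models and parameters.
The model selection device according to any one of appendices 1 to 4, wherein the number of mixtures and the independence of each component are optimized for a mixture distribution of a plurality of distributions having different independence with respect to multidimensional data.
The model selection device according to any one of appendices 1 to 4, wherein the number of mixtures and the type of the marginal distribution of each component are optimized for a mixture distribution of a plurality of different marginal distributions.
The model selection device according to any one of appendices 1 to 4, wherein the attributes effective for discrimination in each component are optimized for the model and attribute selection of a mixed discrimination model using different attributes.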
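A model selection method comprising a model optimization step of optimizing a model for a mixture distribution,
wherein the model optimization step,
with respect to an information criterion for complete data, optimizes, for the posterior distribution of the hidden variables of the complete data, the expected information criterion of the complete data for a set of a model and parameters of a component that satisfies a predetermined condition.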
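The model selection method according to Supplementary Note 9, further comprising:
a mixture number setting step of selecting, from candidate values for the number of mixture components, a candidate value for which optimization has not yet been performed;
a distribution initialization step of performing initialization processing of the data using the number of mixture components selected in the mixture number setting step;
a mixture number loop end determination step of determining whether the optimal value of the information criterion has been calculated for all candidate values for the number of mixture components and, when it is determined that it has not, causing the processing of the mixture number setting step, the distribution initialization step, and the model optimization step to be performed again; and
an optimal distribution selection step of comparing the information criterion values calculated for all candidate values for the number of mixture components and selecting the number of mixture components for which the information criterion is optimal,
wherein the model optimization step includes:
a hidden variable posterior distribution calculation step of calculating the posterior distribution of the hidden variables of the data;
an update parameter setting step of selecting a set of a model and parameters of a component that satisfies the predetermined condition;
a conditional expected information criterion minimization step of optimizing, with respect to the posterior distribution calculated in the hidden variable posterior distribution calculation step, the expected information criterion for the complete data for the set of the model and parameters selected in the update parameter setting step;
an information criterion calculation step of calculating the value of the information criterion for the incomplete data for the model updated through the optimization in the conditional expected information criterion minimization step; and
an optimality determination step of determining the optimality of the information criterion value calculated in the information criterion calculation step and, when it is determined that the value is not optimal, performing the optimization processing again.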
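The model selection method according to Supplementary Note 10, wherein the model optimization step
includes an update parameter setting loop end determination step that, when there are a plurality of sets of a model and parameters of components satisfying the predetermined condition, causes the processing of the update parameter setting step and the conditional expected information criterion minimization step to be repeated until the expected information criterion for the complete data has been optimized for all of those sets.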
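The model selection method according to any one of Supplementary Notes 9 to 11, wherein the information criterion is the MDL criterion.
The model selection method according to any one of Supplementary Notes 9 to 12, wherein the predetermined condition is independence from the other models and parameters.
The model selection method according to any one of Supplementary Notes 9 to 12, wherein, for a mixture of a plurality of distributions that differ in independence over multidimensional data, the number of mixture components and the independence of each component are optimized.
The model selection method according to any one of Supplementary Notes 9 to 12, wherein, for a mixture of a plurality of different marginal distributions, the number of mixture components and the type of marginal distribution of each component are optimized.
The model selection method according to any one of Supplementary Notes 9 to 12, wherein, for model and attribute selection in a mixture discriminative model that uses different attributes, the attributes effective for discrimination in each component are optimized.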
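A model selection program that causes a computer to execute model optimization processing for optimizing a model for a mixture distribution,
wherein the model optimization processing,
with respect to an information criterion for complete data, optimizes, for the posterior distribution of the hidden variables of the complete data, the expected information criterion of the complete data for a set of a model and parameters of a component that satisfies a predetermined condition.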
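A model selection program that causes a computer to execute:
mixture number setting processing for selecting, from candidate values for the number of mixture components, a candidate value for which optimization has not yet been performed;
distribution initialization processing for performing initialization processing of the data using the number of mixture components selected in the mixture number setting processing;
mixture number loop end determination processing for determining whether the optimal value of the information criterion has been calculated for all candidate values for the number of mixture components and, when it is determined that it has not, causing the mixture number setting processing, the distribution initialization processing, and the model optimization processing to be performed again; and
optimal distribution selection processing for comparing the information criterion values calculated for all candidate values for the number of mixture components and selecting the number of mixture components for which the information criterion is optimal,
wherein, as the model optimization processing, the program causes the computer to execute:
hidden variable posterior distribution calculation processing for calculating the posterior distribution of the hidden variables of the data;
update parameter setting processing for selecting a set of a model and parameters of a component that satisfies the predetermined condition;
conditional expected information criterion minimization processing for optimizing, with respect to the posterior distribution calculated in the hidden variable posterior distribution calculation processing, the expected information criterion for the complete data for the set of the model and parameters selected in the update parameter setting processing;
information criterion calculation processing for calculating the value of the information criterion for the incomplete data for the model updated through the optimization in the conditional expected information criterion minimization processing; and
optimality determination processing for determining the optimality of the information criterion value calculated in the information criterion calculation processing and, when it is determined that the value is not optimal, performing the optimization processing again.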
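The processing enumerated above (mixture number setting, distribution initialization, posterior calculation for the hidden variables, model update, information criterion calculation, optimality check, and final selection of the best mixture number) can be pictured with a small Python sketch. It is a simplified stand-in and not the procedure of this application: an ordinary EM update replaces the conditional expected information criterion minimization, components are one-dimensional Gaussians, and the score is an assumed BIC/MDL-style penalty. The names `fit_mixture`, `criterion`, and `select_mixture_number` are hypothetical.

```python
import math


def log_normal_pdf(x, mean, var):
    """Log density of a 1-D normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def fit_mixture(data, k, max_iter=200, tol=1e-8):
    """Run plain EM for a k-component 1-D Gaussian mixture.

    Returns the incomplete-data log-likelihood from the last E-step.
    """
    n = len(data)
    ordered = sorted(data)
    # Distribution initialization: means at evenly spaced quantiles.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(data) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in data) / n, 1e-6)
    variances = [grand_var] * k
    weights = [1.0 / k] * k
    previous = float("-inf")
    loglik = previous
    for _ in range(max_iter):
        # Posterior of the hidden variables and incomplete-data log-likelihood.
        resp, loglik = [], 0.0
        for x in data:
            logs = [math.log(weights[j]) + log_normal_pdf(x, means[j], variances[j])
                    for j in range(k)]
            top = max(logs)
            total = top + math.log(sum(math.exp(v - top) for v in logs))
            loglik += total
            resp.append([math.exp(v - total) for v in logs])
        if loglik - previous < tol:  # optimality check: stop when no longer improving
            break
        previous = loglik
        # Parameter update for each component (ordinary M-step, an assumption here).
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj, 1e-6)
    return loglik


def criterion(loglik, k, n):
    """Assumed BIC/MDL-style score: negative log-likelihood plus (d/2) log n."""
    dim = 3 * k - 1  # k means, k variances, k - 1 free mixing weights
    return -loglik + 0.5 * dim * math.log(n)


def select_mixture_number(data, candidates):
    """Score every candidate mixture number and return the best one with all scores."""
    scores = {}
    for k in candidates:  # loop ends once every candidate has a score
        scores[k] = criterion(fit_mixture(data, k), k, len(data))
    best = min(scores, key=scores.get)  # optimal distribution selection
    return best, scores
```

Calling `select_mixture_number(data, [1, 2, 3, 4])` on a list of floats returns the candidate with the smallest score together with the score of every candidate, so the comparison across mixture numbers stays inspectable.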
Claims (10)
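- A model selection device comprising model optimization means for optimizing a model for a mixture distribution,
wherein the model optimization means,
with respect to an information criterion for complete data, optimizes, for the posterior distribution of the hidden variables of the complete data, the expected information criterion of the complete data for a set of a model and parameters of a component that satisfies a predetermined condition.
- The model selection device according to claim 1, further comprising:
mixture number setting means for selecting, from candidate values for the number of mixture components, a candidate value for which optimization has not yet been performed;
distribution initialization means for performing initialization processing of the data using the number of mixture components selected by the mixture number setting means;
mixture number loop end determination means for determining whether the optimal value of the information criterion has been calculated for all candidate values for the number of mixture components and, when it determines that it has not, causing the processing by the mixture number setting means, the distribution initialization means, and the model optimization means to be performed again; and
optimal distribution selection means for comparing the information criterion values calculated for all candidate values for the number of mixture components and selecting the number of mixture components for which the information criterion is optimal,
wherein the model optimization means includes:
hidden variable posterior distribution calculation means for calculating the posterior distribution of the hidden variables of the data;
update parameter setting means for selecting a set of a model and parameters of a component that satisfies the predetermined condition;
conditional expected information criterion minimization means for optimizing, with respect to the posterior distribution calculated by the hidden variable posterior distribution calculation means, the expected information criterion for the complete data for the set of the model and parameters selected by the update parameter setting means;
information criterion calculation means for calculating the value of the information criterion for the incomplete data for the model updated through the optimization by the conditional expected information criterion minimization means; and
optimality determination means for determining the optimality of the information criterion value calculated by the information criterion calculation means and, when it determines that the value is not optimal, causing the optimization processing to be performed again.
- The model selection device according to claim 2, wherein the model optimization means
includes update parameter setting loop end determination means that, when there are a plurality of sets of a model and parameters of components satisfying the predetermined condition, causes the processing by the update parameter setting means and the conditional expected information criterion minimization means to be repeated until the expected information criterion for the complete data has been optimized for all of those sets.
- The model selection device according to any one of claims 1 to 3, wherein the information criterion is the MDL criterion.
- The model selection device according to any one of claims 1 to 4, wherein the predetermined condition is independence from the other models and parameters.
- The model selection device according to any one of claims 1 to 4, wherein, for a mixture of a plurality of distributions that differ in independence over multidimensional data, the number of mixture components and the independence of each component are optimized.
- The model selection device according to any one of claims 1 to 4, wherein, for a mixture of a plurality of different marginal distributions, the number of mixture components and the type of marginal distribution of each component are optimized.
- The model selection device according to any one of claims 1 to 4, wherein, for model and attribute selection in a mixture discriminative model that uses different attributes, the attributes effective for discrimination in each component are optimized.
- A model selection method comprising a model optimization step of optimizing a model for a mixture distribution,
wherein the model optimization step,
with respect to an information criterion for complete data, optimizes, for the posterior distribution of the hidden variables of the complete data, the expected information criterion of the complete data for a set of a model and parameters of a component that satisfies a predetermined condition.
- A model selection program that causes a computer to execute model optimization processing for optimizing a model for a mixture distribution,
wherein the model optimization processing,
with respect to an information criterion for complete data, optimizes, for the posterior distribution of the hidden variables of the complete data, the expected information criterion of the complete data for a set of a model and parameters of a component that satisfies a predetermined condition.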
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/582,385 US9208436B2 (en) | 2010-03-03 | 2011-03-03 | Model selection device, model selection method and model selection program |
JP2012503237A JP5704162B2 (ja) | 2010-03-03 | 2011-03-03 | Model selection device, model selection method and model selection program |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2010-046725 | 2010-03-03 | ||
JP2010046725 | 2010-03-03 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2011108632A1 (ja) | 2011-09-09 |
Family
ID=44542279
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2011/054883 WO2011108632A1 (ja) | 2010-03-03 | 2011-03-03 | Model selection device, model selection method and model selection program |
Country Status (3)
Country | Link |
---|---|
US (1) | US9208436B2 (ja) |
JP (1) | JP5704162B2 (ja) |
WO (1) | WO2011108632A1 (ja) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
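JP5403456B2 (ja) * | 2011-03-18 | 2014-01-29 | 日本電気株式会社 | Mixture model estimation device for multivariate data, mixture model estimation method, and mixture model estimation program |
WO2015068330A1 (ja) * | 2013-11-05 | 2015-05-14 | 日本電気株式会社 | Model estimation device, model estimation method, and model estimation program |
CN106156856A (zh) * | 2015-03-31 | 2016-11-23 | 日本电气株式会社 | Method and apparatus for mixture model selection |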
US11360873B2 (en) | 2016-09-06 | 2022-06-14 | Kabushiki Kaisha Toshiba | Evaluation device, evaluation method, and evaluation program |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8909582B2 (en) * | 2013-02-04 | 2014-12-09 | Nec Corporation | Hierarchical latent variable model estimation device, hierarchical latent variable model estimation method, and recording medium |
US9489346B1 (en) * | 2013-03-14 | 2016-11-08 | The Mathworks, Inc. | Methods and systems for early stop simulated likelihood ratio test |
US20140344183A1 (en) * | 2013-05-20 | 2014-11-20 | Nec Corporation | Latent feature models estimation device, method, and program |
US9489632B2 (en) * | 2013-10-29 | 2016-11-08 | Nec Corporation | Model estimation device, model estimation method, and information storage medium |
US9355196B2 (en) * | 2013-10-29 | 2016-05-31 | Nec Corporation | Model estimation device and model estimation method |
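JP6669075B2 (ja) | 2014-10-28 | 2020-03-18 | 日本電気株式会社 | Region linear model optimization system, method, and program |
TWI735001B (zh) * | 2019-06-28 | 2021-08-01 | 鴻海精密工業股份有限公司 | Data model selection and optimization method, apparatus, computer device, and storage medium |
CN112149708A (zh) * | 2019-06-28 | 2020-12-29 | 富泰华工业(深圳)有限公司 | Data model selection and optimization method, apparatus, computer device, and storage medium |
CN111177657B (zh) * | 2019-12-31 | 2023-09-08 | 北京顺丰同城科技有限公司 | Demand determination method, system, electronic device, and storage medium |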
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
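JP2005234214A (ja) * | 2004-02-19 | 2005-09-02 | Nippon Telegr & Teleph Corp <Ntt> | Method and apparatus for generating an acoustic model for speech recognition, and recording medium storing a program for generating an acoustic model for speech recognition |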
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7474997B2 (en) * | 2003-04-16 | 2009-01-06 | Sony Corporation | Construction and selection of a finite mixture model for use in clustering and vector quantization |
JP2009013503A (ja) | 2008-09-29 | 2009-01-22 | Showa Denko Kk | Aluminum alloy extruded material for machining, machined aluminum alloy product, and valve material for automobile parts |
-
2011
- 2011-03-03 US US13/582,385 patent/US9208436B2/en active Active
- 2011-03-03 JP JP2012503237A patent/JP5704162B2/ja active Active
- 2011-03-03 WO PCT/JP2011/054883 patent/WO2011108632A1/ja active Application Filing
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
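JP2005234214A (ja) * | 2004-02-19 | 2005-09-02 | Nippon Telegr & Teleph Corp <Ntt> | Method and apparatus for generating an acoustic model for speech recognition, and recording medium storing a program for generating an acoustic model for speech recognition |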
Non-Patent Citations (2)
Title |
---|
HIROSHI TENMOTO ET AL.: "Optimal Selection of the Number of Components in Classifiers Based on Mixture Models", IEICE TECHNICAL REPORT, vol. 98, no. 127, 19 June 1998 (1998-06-19), pages 39 - 43 * |
RYOHEI FUJIMAKI ET AL.: "Senkei Jikan Ishu Kongo Model Sentaku no Tameno Kitai Johoryo Kijun Saishoka-ho", TECHNICAL REPORT ON INFORMATION-BASED INDUCTION SCIENCES 2009, 19 October 2009 (2009-10-19), pages 312 - 319, XP008171188 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
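JP5403456B2 (ja) * | 2011-03-18 | 2014-01-29 | 日本電気株式会社 | Mixture model estimation device for multivariate data, mixture model estimation method, and mixture model estimation program |
WO2015068330A1 (ja) * | 2013-11-05 | 2015-05-14 | 日本電気株式会社 | Model estimation device, model estimation method, and model estimation program |
JPWO2015068330A1 (ja) * | 2013-11-05 | 2017-03-09 | 日本電気株式会社 | Model estimation device, model estimation method, and model estimation program |
CN106156856A (zh) * | 2015-03-31 | 2016-11-23 | 日本电气株式会社 | Method and apparatus for mixture model selection |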
US11360873B2 (en) | 2016-09-06 | 2022-06-14 | Kabushiki Kaisha Toshiba | Evaluation device, evaluation method, and evaluation program |
Also Published As
Publication number | Publication date |
---|---|
US9208436B2 (en) | 2015-12-08 |
JPWO2011108632A1 (ja) | 2013-06-27 |
US20120323834A1 (en) | 2012-12-20 |
JP5704162B2 (ja) | 2015-04-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5704162B2 (ja) | Model selection device, model selection method and model selection program | |
Bartoli et al. | Adaptive modeling strategy for constrained global optimization with application to aerodynamic wing design | |
Ilievski et al. | Efficient hyperparameter optimization for deep learning algorithms using deterministic rbf surrogates | |
Di Marzio et al. | Kernel density estimation on the torus | |
Stute et al. | Nonparametric checks for single-index models | |
Angelikopoulos et al. | X-TMCMC: Adaptive kriging for Bayesian inverse modeling | |
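KR20160143548A (ko) | Method and apparatus for automatically tuning an artificial neural network | |
JPH06348292A (ja) | Speech recognition system | |
JP2011034177A (ja) | Information processing device, information processing method, and program | |
CN114202072A (zh) | Method and system for estimating expectation values in a quantum system | |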
Scrucca | Genetic algorithms for subset selection in model-based clustering | |
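JP2007200302A (ja) | Combination of model-based and genetic-based offspring generation for multi-objective optimization using a convergence criterion | |
JP5332647B2 (ja) | Model selection device, selection method for model selection device, and program | |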
US11797507B2 (en) | Relation-enhancement knowledge graph embedding method and system | |
JP2022549844A (ja) | Learning weighted-average neighbor embeddings | |
JP2012181579A (ja) | Learning device for pattern classification | |
Bosman et al. | IDEAs based on the normal kernels probability density function | |
US7133811B2 (en) | Staged mixture modeling | |
Markov et al. | Implementation and learning of quantum hidden markov models | |
Braun | trustOptim: An R package for trust region optimization with sparse Hessians | |
CN114399025A (zh) | 一种图神经网络解释方法、系统、终端以及存储介质 | |
Huang et al. | Evaluating aleatoric uncertainty via conditional generative models | |
Li et al. | Sparse model identification and learning for ultra-high-dimensional additive partially linear models | |
US20230289501A1 (en) | Reducing Resources in Quantum Circuits | |
Mitchell et al. | Self expanding neural networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 11750741; Country of ref document: EP; Kind code of ref document: A1 |
| WWE | Wipo information: entry into national phase | Ref document number: 2012503237; Country of ref document: JP |
| WWE | Wipo information: entry into national phase | Ref document number: 13582385; Country of ref document: US |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 11750741; Country of ref document: EP; Kind code of ref document: A1 |