CN111369002A - Gibbs parameter sampling method applied to a random point pattern finite mixture model - Google Patents

Gibbs parameter sampling method applied to a random point pattern finite mixture model Download PDF

Info

Publication number
CN111369002A
CN111369002A
Authority
CN
China
Prior art keywords
distribution
model
parameter
random point
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010105441.3A
Other languages
Chinese (zh)
Inventor
刘伟峰
王志
黄梓龙
丁禹心
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University filed Critical Hangzhou Dianzi University
Priority to CN202010105441.3A
Publication of CN111369002A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Software Systems (AREA)
  • Algebra (AREA)
  • Probability & Statistics with Applications (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Operations Research (AREA)
  • Databases & Information Systems (AREA)
  • Complex Calculations (AREA)

Abstract

The invention relates to a Gibbs parameter sampling method applied to a random point pattern finite mixture model. First, a random point pattern finite mixture model and a random point pattern likelihood function are constructed; then the parameter prior distributions of the random point pattern finite mixture model are constructed, and the posterior distributions of the model parameters are obtained from the priors; finally, the number of mixture components and the model parameter values are estimated by a sampling algorithm combining the Gibbs sampling algorithm with the Bayesian information criterion. Whereas the traditional FMM describes only the feature randomness of the data, the random point pattern distribution function also describes the cardinality randomness of the data. Sampling the data with the Gibbs sampling algorithm on the basis of the RPP-FMM to obtain the model parameters avoids the situation in which the parameter estimate always falls into a local extreme point and the global extreme point cannot be obtained. The method effectively improves the modeling precision and the parameter estimation precision.

Description

Gibbs parameter sampling method applied to a random point pattern finite mixture model
Technical Field
The invention belongs to the technical field of pattern recognition, and particularly relates to a Gibbs parameter sampling method applied to a random point pattern finite mixture model.
Background
Finite mixture modeling (FMM) is a statistical modeling tool that provides an efficient mathematical method for modeling complex densities with simple densities. The finite mixture model has two core problems: selection of the mixture component density and parameter estimation of the mixture model. By virtue of its simple form and convenient computation, the Gaussian mixture model has become the most commonly applied finite mixture model. However, most actual data exhibit nonlinear, non-Gaussian characteristics, and, limited by the fitting capability of the Gaussian distribution, the Gaussian mixture model cannot describe such complex data completely, accurately, and effectively. According to whether the number of mixture components and the distribution parameters are known, problems concerning finite mixture models can be classified into supervised learning, unsupervised learning, and nonparametric model problems. Current mainstream learning algorithms fall into deterministic and non-deterministic algorithms: deterministic learning algorithms are maximum likelihood estimation algorithms represented by the expectation maximization (EM) algorithm, while non-deterministic learning algorithms are mainly Bayesian learning algorithms represented by Markov chain methods. Research on FMM mainly covers two aspects: estimating the number of mixture components and the corresponding model parameters. Since actually obtained data are mostly non-Gaussian, a Gaussian mixture model is generally adopted for approximation, and the parameters of the finite mixture model are obtained by a parameter learning algorithm.
It is worth mentioning that the conventional FMM assumes that the data points are independent, so the data likelihood function is obtained by multiplying the likelihoods of all data points. Such a model cannot characterize the random nature of the data cardinality (the number of data points) and, in some cases, may even produce contradictory estimation results.
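As a minimal illustration of this limitation, the sketch below (in Python, with an assumed univariate Gaussian mixture and an illustrative Poisson cardinality term; all parameter values are hypothetical) contrasts a conventional FMM log-likelihood, a product over the points only, with a point-pattern log-likelihood that also scores the number of points:

```python
import numpy as np
from scipy.stats import norm, poisson

def fmm_loglik(points, weights, means, stds):
    """Conventional FMM log-likelihood: a product over the points only,
    so the number of points carries no statistical information."""
    dens = sum(w * norm.pdf(points, m, s)
               for w, m, s in zip(weights, means, stds))
    return np.sum(np.log(dens))

def rpp_loglik(points, weights, means, stds, rate):
    """Point-pattern log-likelihood sketch: the same feature term plus an
    illustrative Poisson term on the cardinality |X| (assumed form)."""
    return poisson.logpmf(len(points), rate) + fmm_loglik(points, weights, means, stds)

X = np.array([0.1, -0.4, 2.3, 1.9])  # hypothetical observed point pattern
print(fmm_loglik(X, [0.5, 0.5], [0.0, 2.0], [1.0, 1.0]))
print(rpp_loglik(X, [0.5, 0.5], [0.0, 2.0], [1.0, 1.0], rate=4.0))
```

Two patterns with identical points but different sizes receive different scores only under the second form.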
Disclosure of Invention
The invention aims to provide a Gibbs parameter sampling method applied to a random point pattern finite mixture model.
In order to characterize the randomness of the data cardinality (the number of data points), the method introduces a random point pattern finite mixture model (RPP-FMM). Whereas a conventional FMM describes only the feature randomness of the data, the random point pattern distribution function also describes the cardinality of the data. The EM algorithm and the Gibbs sampling algorithm have previously been proposed to solve related problems, but the EM algorithm is easily affected by the initial value. Moreover, since the EM algorithm is deterministic, for a given initial value the parameter estimate may always fall into a local extreme point, and the global extreme point cannot be obtained. The Gibbs sampling algorithm is a random sampling algorithm and is comparatively insensitive to the initial value.
The invention provides a Gibbs parameter sampling algorithm based on the random point pattern finite mixture model (RPP-FMM). Its basic idea is to obtain the model parameters by constructing a Markov chain over the random point pattern, thereby further improving the modeling precision and the parameter estimation precision.
The method specifically comprises the following steps:
Step (1), constructing a random point pattern finite mixture model;
The point-pattern mixture model with K random sources is represented as:
f(X_n|Θ) = π_1 f(X_n|θ_1) + π_2 f(X_n|θ_2) + … + π_K f(X_n|θ_K),
where X_n denotes the nth random point pattern observation, n = 1, 2, …, N, N is the number of random point pattern observations, and X_n ∈ F(R), F(R) denoting the space of finite subsets of the real space R.
The parameter set of the point-pattern mixture model is Θ = {π_1, π_2, …, π_K, θ_1, θ_2, …, θ_K} ∈ (R^+ × Θ)^K, where R^+ denotes the positive real space; {θ_1, θ_2, …, θ_K} are the parameter variables of the random point pattern distribution functions, and {π_1, π_2, …, π_K} are the mixing weights, π_k being the mixing weight of the kth mixture component and satisfying π_k ≥ 0 and Σ_{k=1}^K π_k = 1.
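A minimal sketch of evaluating this mixture density follows; it assumes that each source density f(X|θ_k) combines a Poisson cardinality term with i.i.d. Gaussian features, using the parameter set θ_k = {ρ_k, μ_k, Σ_k} defined in step (2) below. This is one concrete choice for illustration, not necessarily the patent's exact component form.

```python
import numpy as np
from scipy.stats import multivariate_normal, poisson

def component_density(X, rho, mu, cov):
    """f(X | theta_k) for one random source, sketched as a Poisson
    cardinality term times i.i.d. Gaussian features (assumed form)."""
    card = poisson.pmf(len(X), rho)
    feats = np.prod([multivariate_normal.pdf(x, mu, cov) for x in X])
    return card * feats

def mixture_density(X, pis, thetas):
    """f(X_n | Theta) = sum_k pi_k f(X_n | theta_k)."""
    return sum(pi * component_density(X, *th) for pi, th in zip(pis, thetas))

rng = np.random.default_rng(0)
Xn = rng.normal(0.0, 1.0, size=(4, 2))  # a hypothetical pattern of 4 points in R^2
thetas = [(4.0, np.zeros(2), np.eye(2)), (6.0, np.ones(2), np.eye(2))]
print(mixture_density(Xn, [0.5, 0.5], thetas))
```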
Step (2), constructing a random point pattern likelihood function;
For the nth independent random point pattern observation X_n, the likelihood of the N observations factorizes as
p(X_{1:N}|Θ) = Π_{n=1}^N f(X_n|Θ),
where the factors f(X_n|Θ) are the likelihood functions of the mutually independent X_n, and f(x_n|θ_k) denotes the distribution function of a single point x_n in X_n. The missing variable e_n = {e_{n,1}, e_{n,2}, …, e_{n,K}}, whose kth dimension e_{n,k}, k = 1, 2, …, K, indicates the point pattern category of the single point x_n; e_n and X_n form the complete data (X_n, e_n). X_{1:N} denotes the set of N random point pattern observations and e_{1:N} the set of N missing variables.
The parameter posterior distribution of the point-pattern mixture model is represented as:
p(Θ|X) = p(X|Θ) p(Θ) / p(X),
where p(Θ) is the parameter prior distribution, p(Θ|X) is the parameter posterior distribution, the normalizing constant is p(X) = ∫ p(X|Θ) p(Θ) dΘ, and p(X|Θ) denotes the likelihood function of the random point pattern.
For the ith point x_{n,i} ∈ X_n of the nth random point pattern observation, a K-dimensional indicator variable is defined, each dimension indicating one component of the mixture distribution; exactly one dimension of the K-dimensional missing variable equals 1 and the others equal 0, denoted e_n = [e_{n,1}, …, e_{n,K}]^T with e_{n,k} ∈ {0,1} and Σ_{k=1}^K e_{n,k} = 1, where e_{n,k} = 1 indicates that the data x_n is generated from the component f(x_n|θ_k). A Gaussian mixture distribution is taken as the feature distribution of the point pattern, and the corresponding point pattern parameter set is θ_k = {ρ_k, μ_k, Σ_k}, where ρ_k is the cardinality parameter, μ_k the mean, and Σ_k the covariance matrix.
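Drawing such a one-hot indicator can be sketched as below, assuming responsibilities proportional to π_k f(x_n|θ_k); the function name and the toy numbers are illustrative only.

```python
import numpy as np

def sample_indicator(point_lik, weights, rng):
    """Draw a one-hot indicator e_n = [e_{n,1}, ..., e_{n,K}]^T, in which
    exactly one entry is 1. point_lik[k] stands for f(x_n | theta_k);
    responsibilities are taken proportional to pi_k * f(x_n | theta_k)."""
    r = np.asarray(weights) * np.asarray(point_lik)
    r = r / r.sum()
    return rng.multinomial(1, r)

rng = np.random.default_rng(0)
print(sample_indicator([0.2, 0.7, 0.1], [1 / 3, 1 / 3, 1 / 3], rng))  # e.g. [0 1 0]
```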
Step (3), establishing the parameter prior distributions of the random point pattern finite mixture model;
In the Gaussian mixture case, the prior parameters are Θ = {π_{1:K}, ρ_{1:K}, μ_{1:K}, Σ_{1:K}}. The prior distribution p(Θ) is decomposed according to the Bayesian formula:
p(Θ) = p(π_{1:K}) p(ρ_{1:K}|π_{1:K}) p(Σ_{1:K}|ρ_{1:K}, π_{1:K}) p(μ_{1:K}|Σ_{1:K}, ρ_{1:K}, π_{1:K}).
A Dirichlet distribution is adopted as the conjugate prior of the categorical distribution.
If the proportion of each mixture component is unknown, an equivalent (symmetric) Dirichlet distribution is adopted:
p(π_1, π_2, …, π_K) = Dir(α, α, …, α).
The cardinality l_k of the kth random point pattern has a prior distribution that follows a Poisson distribution.
The prior of the inverse covariance matrix follows a Wishart distribution,
p(Σ_k^{-1}) = W(V, β),
where W(V, β) denotes the Wishart distribution with parameters V and β, V being a positive definite matrix and β the degrees of freedom.
For the mean, a Gaussian distribution is used as the conjugate prior of the RPP-FMM mean, p(μ_k) = N(μ_k; μ_k^(0), Σ_k), where μ_k^(0) is the known mean of each random point pattern.
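These priors can all be sampled with standard library routines, as in the sketch below; the hyperparameter values (unit Dirichlet concentrations, Poisson rate 5.0, identity Wishart scale with d + 1 degrees of freedom, zero prior means) are placeholders, not values from the patent.

```python
import numpy as np
from scipy.stats import wishart

rng = np.random.default_rng(1)
K, d = 3, 2  # hypothetical: 3 components in 2 dimensions

# equivalent (symmetric) Dirichlet prior on the mixing weights
pis = rng.dirichlet(np.ones(K))
# Poisson prior on each cardinality l_k (rate 5.0 is a placeholder)
cards = rng.poisson(lam=5.0, size=K)
# Wishart prior W(V, beta) on each inverse covariance, V = I, beta = d + 1
precs = wishart.rvs(df=d + 1, scale=np.eye(d), size=K, random_state=42)
# Gaussian conjugate prior on each mean, centered at an assumed mu_k^(0) = 0
mus = [rng.multivariate_normal(np.zeros(d), np.linalg.inv(P)) for P in precs]
print(pis, cards, mus, sep="\n")
```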
Step (4), obtaining the posterior distributions of the model parameters from their prior distributions;
The posterior mixing weights follow a Dirichlet distribution, the posterior mean follows a Gaussian distribution, and the posterior variance follows a Wishart distribution; the parameter posterior p(Θ|X) is sampled conditionally as follows.
The mixing weights {π_1, π_2, …, π_K} satisfy a Dirichlet distribution:
p(π_1, π_2, …, π_K | ·) = Dir(α_1 + l_1, α_2 + l_2, …, α_K + l_K),
with constants α_k > 0, where l_k, k = 1, 2, …, K, is the number of observations belonging to the kth mixture component.
The missing variables {e_{n,1}, …, e_{n,K}} are estimated according to the Bayes formula from the mixing weights and the component densities, where f(x_n|θ_k) denotes the distribution function of a single point x_n in the observation X_n, ρ_k denotes the cardinality parameter of the kth random point pattern, and l_k = Σ_{n=1}^N e_{n,k} is the sum of the kth-dimension indicator values over the observations.
Cardinality distribution: the posterior of each cardinality is obtained from its Poisson prior and the observed component counts.
Covariance: the inverse of the covariance Σ_k follows a Wishart distribution, p(Σ_k^{-1} | ·) = W(V_k, β_k), where α_0 and β_0 are positive constants and the two regulating parameters satisfy M_0 > 0, N_0 > 0.
The mean μ_k satisfies a Gaussian distribution with parameters ξ_k, Σ_k and is sampled from
p(μ_k) = N(μ_k; ξ_k, Σ_k),
where ξ_k is the mean and Σ_k the covariance of the Gaussian distribution.
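Of these conditionals, the Dirichlet weight update is fully specified above; a direct sketch:

```python
import numpy as np

def sample_weights(alpha, counts, rng):
    """Conditional draw of the mixing weights:
    (pi_1, ..., pi_K) ~ Dir(alpha_1 + l_1, ..., alpha_K + l_K),
    where counts[k] = l_k is the number of points assigned to component k."""
    return rng.dirichlet(np.asarray(alpha, dtype=float) + np.asarray(counts, dtype=float))

rng = np.random.default_rng(2)
print(sample_weights(alpha=[1.0, 1.0, 1.0], counts=[12, 5, 3], rng=rng))
```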
Step (5), estimating the number of mixture components and the model parameter values by a sampling algorithm combining the Gibbs sampling algorithm with the Bayesian information criterion;
The Bayesian information criterion is defined as:
BIC(m_k, Θ_k, X_k) = -2 log L(Θ_k, m_k | X_k) + M_k ln n_k,
where M_k = 3 m_k + 2 is the number of independent parameters and log L(Θ_k, m_k | X_k) denotes the log-likelihood function of the parameter set Θ_k and the number of components m_k.
Among the candidate model class, the model minimizing the Bayesian information criterion is the preferred model, and its parameter estimate is derived as:
(m_k*, Θ_k*) = arg min_{m_k, Θ_k} BIC(m_k, Θ_k, X_k);
the number of components m_k and the parameter set Θ_k are obtained at the minimum of BIC(m_k, Θ_k).
With the parameter set Θ_k and the number of components m_k, the model parameter values are estimated according to the Gibbs sampling algorithm, specifically:
a. Initialization: set Θ^(0) = {θ_1^(0), θ_2^(0), …, θ_K^(0)}, with θ_k = {ρ_k, μ_k, Σ_k}, each θ_i sampled from the conditional density p(θ_i | θ_{-i});
b. Sample θ_1^(t+1) from p(θ_1 | θ_2^(t), …, θ_K^(t)), and so on, sampling θ_K^(t+1) from p(θ_K | θ_1^(t+1), …, θ_{K-1}^(t+1));
c. This realizes the jump from Θ^(t) to Θ^(t+1);
d. Repeat a-c to obtain a Markov chain.
The resulting Markov chain reflects the probability characteristics of the posterior distribution; the stable points of the chain are typically extreme points of the distribution and serve as the final model parameter estimates.
The beneficial effects of the invention include: in order to characterize the random nature of the data cardinality (the number of data points), the invention introduces a random point pattern finite mixture model (RPP-FMM); whereas the traditional FMM describes only the feature randomness of the data, the random point pattern distribution function also describes the cardinality randomness of the data. On the basis of the RPP-FMM, the data are sampled with the Gibbs sampling algorithm to obtain the model parameters; since the Gibbs sampling algorithm is a random sampling algorithm and comparatively insensitive to the initial value, it effectively avoids the situation in which, for a given initial value, the EM algorithm's parameter estimate may always fall into a local extreme point and the global extreme point cannot be obtained. The basic idea of the proposed Gibbs parameter sampling method applied to the random point pattern finite mixture model (RPP-FMM) is to obtain the model parameters by constructing a Markov chain over the random point pattern, thereby further improving the modeling precision and the parameter estimation precision.
Detailed Description
A Gibbs parameter sampling method applied to a random point pattern finite mixture model comprises the following specific steps:
Step (1), constructing a random point pattern finite mixture model according to the characteristics of the random point pattern:
The point-pattern mixture model with K random sources is represented as:
f(X_n|Θ) = π_1 f(X_n|θ_1) + π_2 f(X_n|θ_2) + … + π_K f(X_n|θ_K),
where X_n denotes the nth random point pattern observation, n = 1, 2, …, N, N is the number of random point pattern observations, and X_n ∈ F(R), F(R) denoting the space of finite subsets of the real space R.
The parameter set of the point-pattern mixture model is Θ = {π_1, π_2, …, π_K, θ_1, θ_2, …, θ_K} ∈ (R^+ × Θ)^K, where R^+ denotes the positive real space; {θ_1, θ_2, …, θ_K} are the parameter variables of the random point pattern distribution functions, and {π_1, π_2, …, π_K} are the mixing weights, π_k being the mixing weight of the kth mixture component and satisfying π_k ≥ 0 and Σ_{k=1}^K π_k = 1.
Step (2), for the nth independent random point pattern observation X_n, the likelihood of the N observations factorizes as
p(X_{1:N}|Θ) = Π_{n=1}^N f(X_n|Θ),
where the factors f(X_n|Θ) are the likelihood functions of the mutually independent X_n, and f(x_n|θ_k) denotes the distribution function of a single point x_n in X_n. The missing variable e_n = {e_{n,1}, e_{n,2}, …, e_{n,K}}, whose kth dimension e_{n,k}, k = 1, 2, …, K, indicates the point pattern category of the single point x_n; e_n and X_n form the complete data (X_n, e_n). X_{1:N} denotes the set of N random point pattern observations and e_{1:N} the set of N missing variables.
The parameter posterior distribution of the point-pattern mixture model is represented as:
p(Θ|X) = p(X|Θ) p(Θ) / p(X),
where p(Θ) is the parameter prior distribution, p(Θ|X) is the parameter posterior distribution, the normalizing constant is p(X) = ∫ p(X|Θ) p(Θ) dΘ, and p(X|Θ) denotes the likelihood function of the random point pattern.
For the ith point x_{n,i} ∈ X_n of the nth random point pattern observation, it is not known from which mixture component the observation was generated; therefore a K-dimensional indicator variable is defined, each dimension indicating one component of the mixture distribution. Since an observation can only be generated from one component, exactly one dimension of the K-dimensional missing variable equals 1 and the others equal 0, denoted e_n = [e_{n,1}, …, e_{n,K}]^T with e_{n,k} ∈ {0,1} and Σ_{k=1}^K e_{n,k} = 1, where e_{n,k} = 1 indicates that the data x_n is generated from the component f(x_n|θ_k). Because the Gaussian mixture distribution has good fitting performance, it is taken as the feature distribution of the point pattern, and the corresponding point pattern parameter set is θ_k = {ρ_k, μ_k, Σ_k}, where ρ_k is the cardinality parameter, μ_k the mean, and Σ_k the covariance matrix.
Step (3), establishing the parameter prior distributions of the random point pattern finite mixture model;
In the Gaussian mixture case, the prior parameters are Θ = {π_{1:K}, ρ_{1:K}, μ_{1:K}, Σ_{1:K}}. The prior distribution p(Θ) is decomposed according to the Bayesian formula:
p(Θ) = p(π_{1:K}) p(ρ_{1:K}|π_{1:K}) p(Σ_{1:K}|ρ_{1:K}, π_{1:K}) p(μ_{1:K}|Σ_{1:K}, ρ_{1:K}, π_{1:K}).
Because the mixing weights reflect the proportion of observations from each component, the component assignments follow a categorical distribution, and a Dirichlet distribution is adopted as its conjugate prior. If the proportion of each component is unknown, the simplest prior is an equivalent (symmetric) Dirichlet distribution:
p(π_1, π_2, …, π_K) = Dir(α, α, …, α).
The cardinality l_k of the kth random point pattern has a prior distribution that follows a Poisson distribution.
The prior of the inverse covariance matrix follows a Wishart distribution,
p(Σ_k^{-1}) = W(V, β),
where W(V, β) denotes the Wishart distribution with parameters V and β, V being a positive definite matrix and β the degrees of freedom.
For the mean, a Gaussian distribution is used as the conjugate prior of the RPP-FMM mean, p(μ_k) = N(μ_k; μ_k^(0), Σ_k), where μ_k^(0) is the known mean of each random point pattern.
Step (4), obtaining the posterior distributions of the model parameters from their prior distributions;
The posterior mixing weights follow a Dirichlet distribution, the posterior mean follows a Gaussian distribution, and the posterior variance follows a Wishart distribution; the parameter posterior p(Θ|X) is sampled conditionally as follows.
The mixing weights {π_1, π_2, …, π_K} satisfy a Dirichlet distribution:
p(π_1, π_2, …, π_K | ·) = Dir(α_1 + l_1, α_2 + l_2, …, α_K + l_K),
with constants α_k > 0, where l_k, k = 1, 2, …, K, is the number of observations belonging to the kth mixture component.
The missing variables {e_{n,1}, …, e_{n,K}} are estimated according to the Bayes formula from the mixing weights and the component densities, where f(x_n|θ_k) denotes the distribution function of a single point x_n in the observation X_n, ρ_k denotes the cardinality parameter of the kth random point pattern, and l_k = Σ_{n=1}^N e_{n,k} is the sum of the kth-dimension indicator values over the observations.
Cardinality distribution: the posterior of each cardinality is obtained from its Poisson prior and the observed component counts.
Covariance: the inverse of the covariance Σ_k follows a Wishart distribution, p(Σ_k^{-1} | ·) = W(V_k, β_k), where α_0 and β_0 are positive constants and the two regulating parameters satisfy M_0 > 0, N_0 > 0.
The mean μ_k satisfies a Gaussian distribution with parameters ξ_k, Σ_k and is sampled from
p(μ_k) = N(μ_k; ξ_k, Σ_k),
where ξ_k is the mean and Σ_k the covariance of the Gaussian distribution.
Step (5), estimating the number of mixture components in the mixed distribution by a model estimation algorithm combined with the Bayesian information criterion (BIC);
In the parameter estimation problem of mixture models, how to estimate the number of components (the model order) is one of the important topics of inferential statistics. The reversible-jump MCMC method samples the model order and the parameters simultaneously, determining the order by a maximum a posteriori criterion; it is a non-deterministic method. The Gibbs parameter sampling algorithm based on the random point pattern finite mixture model is a sampling algorithm combining Gibbs sampling with BIC. The posterior distribution p(Θ|X) is approximated by a Markov chain Monte Carlo method using Gibbs sampling. Gibbs sampling generally requires the conditional probability of one attribute of a sample given all the other attributes, and then derives the values of the other attributes along this conditional chain; thus, given the sample data and the prior distribution of each parameter, Gibbs sampling draws samples from the parameter posterior distribution. The resulting Markov chain reflects the probability characteristics of the posterior distribution, and the stable points of the chain are typically extreme points of the distribution, serving as the final estimates.
The Gibbs sampling algorithm is specifically as follows:
a. Initialization: set Θ^(0) = {θ_1^(0), θ_2^(0), …, θ_K^(0)}, with θ_k = {ρ_k, μ_k, Σ_k}, each θ_i sampled from the conditional density p(θ_i | θ_{-i});
b. Sample θ_1^(t+1) from p(θ_1 | θ_2^(t), …, θ_K^(t)), and so on, sampling θ_K^(t+1) from p(θ_K | θ_1^(t+1), …, θ_{K-1}^(t+1));
c. This realizes the jump from Θ^(t) to Θ^(t+1);
d. Repeat a-c to obtain a Markov chain.
On the basis of Gibbs sampling, the degree of fit between the RPP-FMM and the true data distribution is evaluated with the Bayesian information criterion (BIC), so that a simple model can express more information.
The resulting Markov chain reflects the probability characteristics of the posterior distribution; the stable points of the chain are typically extreme points of the distribution and serve as the final model parameter estimates.
The Bayesian information criterion is defined as:
BIC(m_k, Θ_k, X_k) = -2 log L(Θ_k, m_k | X_k) + M_k ln n_k,
where M_k = 3 m_k + 2 is the number of independent parameters and log L(Θ_k, m_k | X_k) denotes the log-likelihood function of the parameter set Θ_k and the number of components m_k.
Among the candidate model class, the model minimizing the Bayesian information criterion is the preferred model, and its parameter estimate is derived as:
(m_k*, Θ_k*) = arg min_{m_k, Θ_k} BIC(m_k, Θ_k, X_k);
the parameters m_k and Θ_k are obtained at the minimum of BIC(m_k, Θ_k).
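Order selection by minimizing this criterion can be sketched as follows; the crude_fit stand-in replaces the Gibbs run of step (5) and exists only to keep the example self-contained, and the data are synthetic.

```python
import numpy as np
from scipy.stats import norm

def mixture_loglik(x, pi, mu):
    """Log-likelihood of a 1-D Gaussian mixture with unit variances."""
    dens = (pi * norm.pdf(x[:, None], mu, 1.0)).sum(axis=1)
    return np.log(dens).sum()

def crude_fit(x, K, rng):
    """Stand-in for the Gibbs run: uniform weights, quantile-spaced means."""
    qs = np.quantile(x, (np.arange(K) + 0.5) / K)
    return np.full(K, 1.0 / K), qs

def select_k(x, candidates, fit, rng):
    """Score each candidate order m_k with BIC = -2 log L + (3 m_k + 2) ln n
    and keep the minimiser, as in step (5)."""
    n = len(x)
    scores = {}
    for K in candidates:
        pi, mu = fit(x, K, rng)
        scores[K] = -2.0 * mixture_loglik(x, pi, mu) + (3 * K + 2) * np.log(n)
    return min(scores, key=scores.get), scores

rng = np.random.default_rng(5)
data = np.concatenate([rng.normal(-2, 1, 80), rng.normal(2, 1, 80)])
best, scores = select_k(data, [1, 2, 3, 4], crude_fit, rng)
print(best, scores)
```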

Claims (2)

1. A Gibbs parameter sampling method applied to a random point pattern finite mixture model, characterized by comprising the following steps:
step (1), constructing a random point pattern finite mixture model;
the point-pattern mixture model with K random sources is represented as:
f(X_n|Θ) = π_1 f(X_n|θ_1) + π_2 f(X_n|θ_2) + … + π_K f(X_n|θ_K),
where X_n denotes the nth random point pattern observation, n = 1, 2, …, N, N is the number of random point pattern observations, and X_n ∈ F(R), F(R) denoting the space of finite subsets of the real space R;
the parameter set of the point-pattern mixture model is Θ = {π_1, π_2, …, π_K, θ_1, θ_2, …, θ_K} ∈ (R^+ × Θ)^K, where R^+ denotes the positive real space; {θ_1, θ_2, …, θ_K} are the parameter variables of the random point pattern distribution functions, and {π_1, π_2, …, π_K} are the mixing weights, π_k being the mixing weight of the kth mixture component and satisfying π_k ≥ 0 and Σ_{k=1}^K π_k = 1;
step (2), constructing a random point pattern likelihood function;
for the nth independent random point pattern observation X_n, the likelihood of the N observations factorizes as
p(X_{1:N}|Θ) = Π_{n=1}^N f(X_n|Θ),
where the factors f(X_n|Θ) are the likelihood functions of the mutually independent X_n, and f(x_n|θ_k) denotes the distribution function of a single point x_n in X_n; the missing variable e_n = {e_{n,1}, e_{n,2}, …, e_{n,K}}, whose kth dimension e_{n,k}, k = 1, 2, …, K, indicates the point pattern category of the single point x_n; e_n and X_n form the complete data (X_n, e_n); X_{1:N} denotes the set of N random point pattern observations and e_{1:N} the set of N missing variables;
the parameter posterior distribution of the point-pattern mixture model is represented as:
p(Θ|X) = p(X|Θ) p(Θ) / p(X),
where p(Θ) is the parameter prior distribution, p(Θ|X) is the parameter posterior distribution, the normalizing constant is p(X) = ∫ p(X|Θ) p(Θ) dΘ, and p(X|Θ) denotes the likelihood function of the random point pattern;
for the ith point x_{n,i} ∈ X_n of the nth random point pattern observation, a K-dimensional indicator variable is defined, each dimension indicating one component of the mixture distribution; exactly one dimension of the K-dimensional missing variable equals 1 and the others equal 0, denoted e_n = [e_{n,1}, …, e_{n,K}]^T with e_{n,k} ∈ {0,1} and Σ_{k=1}^K e_{n,k} = 1, where e_{n,k} = 1 indicates that the data x_n is generated from the component f(x_n|θ_k); a Gaussian mixture distribution is taken as the feature distribution of the point pattern, and the corresponding point pattern parameter set is θ_k = {ρ_k, μ_k, Σ_k}, where ρ_k is the cardinality parameter, μ_k the mean, and Σ_k the covariance matrix;
step (3), establishing the parameter prior distributions of the random point pattern finite mixture model;
in the Gaussian mixture case, the prior parameters are Θ = {π_{1:K}, ρ_{1:K}, μ_{1:K}, Σ_{1:K}}; the prior distribution p(Θ) is decomposed according to the Bayesian formula:
p(Θ) = p(π_{1:K}) p(ρ_{1:K}|π_{1:K}) p(Σ_{1:K}|ρ_{1:K}, π_{1:K}) p(μ_{1:K}|Σ_{1:K}, ρ_{1:K}, π_{1:K});
a Dirichlet distribution is adopted as the conjugate prior of the categorical distribution;
if the proportion of each mixture component is unknown, an equivalent (symmetric) Dirichlet distribution is adopted:
p(π_1, π_2, …, π_K) = Dir(α, α, …, α);
the cardinality l_k of the kth random point pattern has a prior distribution that follows a Poisson distribution;
the prior of the inverse covariance matrix follows a Wishart distribution,
p(Σ_k^{-1}) = W(V, β),
where W(V, β) denotes the Wishart distribution with parameters V and β, V being a positive definite matrix and β the degrees of freedom;
for the mean, a Gaussian distribution is used as the conjugate prior of the RPP-FMM mean, p(μ_k) = N(μ_k; μ_k^(0), Σ_k), where μ_k^(0) is the known mean of each random point pattern;
step (4), obtaining the posterior distributions of the model parameters from their prior distributions;
the posterior mixing weights follow a Dirichlet distribution, the posterior mean follows a Gaussian distribution, and the posterior variance follows a Wishart distribution; the parameter posterior p(Θ|X) is sampled conditionally as follows:
the mixing weights {π_1, π_2, …, π_K} satisfy a Dirichlet distribution:
p(π_1, π_2, …, π_K | ·) = Dir(α_1 + l_1, α_2 + l_2, …, α_K + l_K),
with constants α_k > 0, where l_k, k = 1, 2, …, K, is the number of observations belonging to the kth mixture component;
the missing variables {e_{n,1}, …, e_{n,K}} are estimated according to the Bayes formula from the mixing weights and the component densities, where f(x_n|θ_k) denotes the distribution function of a single point x_n in the observation X_n, ρ_k denotes the cardinality parameter of the kth random point pattern, and l_k = Σ_{n=1}^N e_{n,k} is the sum of the kth-dimension indicator values over the observations;
cardinality distribution: the posterior of each cardinality is obtained from its Poisson prior and the observed component counts;
covariance: the inverse of the covariance Σ_k follows a Wishart distribution, p(Σ_k^{-1} | ·) = W(V_k, β_k), where α_0 and β_0 are positive constants and the two regulating parameters satisfy M_0 > 0, N_0 > 0;
the mean μ_k satisfies a Gaussian distribution with parameters ξ_k, Σ_k and is sampled from
p(μ_k) = N(μ_k; ξ_k, Σ_k),
where ξ_k is the mean and Σ_k the covariance of the Gaussian distribution;
step (5), estimating the number of mixture components and the model parameter values by a sampling algorithm combining the Gibbs sampling algorithm with the Bayesian information criterion;
the Bayesian information criterion is defined as:
BIC(m_k, Θ_k, X_k) = -2 log L(Θ_k, m_k | X_k) + M_k ln n_k,
where M_k = 3 m_k + 2 is the number of independent parameters and log L(Θ_k, m_k | X_k) denotes the log-likelihood function of the parameter set Θ_k and the number of components m_k;
among the candidate model class, the model minimizing the Bayesian information criterion is the preferred model, and its parameter estimate is derived as:
(m_k*, Θ_k*) = arg min_{m_k, Θ_k} BIC(m_k, Θ_k, X_k);
the number of components m_k and the parameter set Θ_k are obtained at the minimum of BIC(m_k, Θ_k);
with the parameter set Θ_k and the number of components m_k, the model parameter values are estimated according to the Gibbs sampling algorithm.
2. The Gibbs parameter sampling method applied to the random point pattern finite mixture model according to claim 1, characterized in that the Gibbs sampling algorithm in step (5) estimates the model parameter values as follows:
a. initialization: set Θ^(0) = {θ_1^(0), θ_2^(0), …, θ_K^(0)}, each θ_i sampled from the conditional density p(θ_i | θ_{-i});
b. sample θ_1^(t+1) from p(θ_1 | θ_2^(t), …, θ_K^(t)), and so on, sampling θ_K^(t+1) from p(θ_K | θ_1^(t+1), …, θ_{K-1}^(t+1));
c. this realizes the jump from Θ^(t) to Θ^(t+1);
d. repeat a-c to obtain a Markov chain;
the stable points of the Markov chain are extreme points of the distribution and serve as the final model parameter estimates.
CN202010105441.3A 2020-02-20 2020-02-20 Gibbs parameter sampling method applied to a random point pattern finite mixture model Pending CN111369002A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010105441.3A CN111369002A (en) 2020-02-20 2020-02-20 Gibbs parameter sampling method applied to a random point pattern finite mixture model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010105441.3A CN111369002A (en) 2020-02-20 2020-02-20 Gibbs parameter sampling method applied to a random point pattern finite mixture model

Publications (1)

Publication Number Publication Date
CN111369002A true CN111369002A (en) 2020-07-03

Family

ID=71206202

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010105441.3A Pending CN111369002A (en) Gibbs parameter sampling method applied to a random point pattern finite mixture model

Country Status (1)

Country Link
CN (1) CN111369002A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111814342A (en) * 2020-07-16 2020-10-23 中国人民解放军空军工程大学 Complex equipment reliability hybrid model and construction method thereof
CN111814342B (en) * 2020-07-16 2022-10-11 中国人民解放军空军工程大学 Complex equipment reliability hybrid model and construction method thereof
CN115508624A (en) * 2022-11-23 2022-12-23 中国人民解放军国防科技大学 Electromagnetic spectrum map construction method, device and equipment based on residual Kriging method

Similar Documents

Publication Publication Date Title
Møller Shot noise Cox processes
Baek et al. Mixtures of factor analyzers with common factor loadings: Applications to the clustering and visualization of high-dimensional data
Bühlmann et al. Analyzing bagging
Gaffney et al. Curve clustering with random effects regression mixtures
Feelders Credit scoring and reject inference with mixture models
CN111369002A (en) Gibbs parameter sampling method applied to random point mode finite hybrid model
Scott et al. Nonparametric Bayesian testing for monotonicity
Trapp et al. Learning deep mixtures of gaussian process experts using sum-product networks
Wu et al. A bayesian method for guessing the extreme values in a data set?
Jank Stochastic variants of EM: Monte Carlo, quasi-Monte Carlo and more
Li et al. Bayesian classification of multiclass functional data
Lu et al. Likelihood based confidence intervals for the tail index
Bernardo et al. Non-centered parameterisations for hierarchical models and data augmentation
Legried et al. Rates of convergence in the two-island and isolation-with-migration models
Bordes et al. EM and stochastic EM algorithms for reliability mixture models under random censoring
Terejanu Tutorial on Monte Carlo Techniques
Trianasari et al. Bivariate beta mixture model with correlations
Quintana et al. Nonparametric bayesian assessment of the order of dependence for binary sequences
Biau et al. Density estimation by the penalized combinatorial method
Wang et al. Gibbs Parameter Sampling Algorithm Based on Finite Mixture Model of Random Point Pattern
Kaygusuz et al. Bootstrap in Gaussian Mixture models and Performance assesement
Eideh Parametric prediction of finite population total under Informative sampling and nonignorable nonresponse
Liu et al. SpAM: Sparse additive models
CN108268469B (en) Text classification algorithm based on mixed polynomial distribution

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200703)