
Controlling the reinforcement in Bayesian non-parametric mixture models

Document type:

Article

Authors:

Lijoi, Antonio; Mena, Ramses H.; Prunster, Igor

Affiliations:

University of Pavia; Universidad Nacional Autonoma de Mexico; Collegio Carlo Alberto; University of Turin; University of Turin

Journal:

JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY

ISSN/ISBN:

1369-7412

DOI:

10.1111/j.1467-9868.2007.00609.x

Publication date:

2007

Pages:

715-740

Keywords:

inference

Abstract:

The paper deals with the problem of determining the number of components in a mixture model. We take a Bayesian non-parametric approach and adopt a hierarchical model with a suitable non-parametric prior for the latent structure. A commonly used model for such a problem is the mixture of Dirichlet process model. Here, we replace the Dirichlet process with a more general non-parametric prior obtained from a generalized gamma process. The basic feature of this model is that it yields a partition structure for the latent variables which is of Gibbs type. This relates to the well-known (exchangeable) product partition models. If compared with the usual mixture of Dirichlet process model the advantage of the generalization that we are examining relies on the availability of an additional parameter σ belonging to the interval (0,1): it is shown that such a parameter greatly influences the clustering behaviour of the model. A value of σ that is close to 1 generates a large number of clusters, most of which are of small size. Then, a reinforcement mechanism which is driven by σ acts on the mass allocation by penalizing clusters of small size and favouring those few groups containing a large number of elements. These features turn out to be very useful in the context of mixture modelling. Since it is difficult to specify a priori the reinforcement rate, it is reasonable to specify a prior for σ. Hence, the strength of the reinforcement mechanism is controlled by the data.
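
The role of σ described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's model: it assumes the Pitman-Yor process, another Gibbs-type prior with a parameter σ in (0,1), as a stand-in because its predictive weights have a simple closed form. In that rule an existing cluster of size n_j is chosen with weight (n_j - σ), and a new cluster is opened with weight (θ + σk), where k is the current number of clusters. The function names, the concentration parameter `theta` and the seed are hypothetical choices made for illustration only.

```python
import random


def sample_partition(n, sigma, theta=1.0, seed=0):
    """Sequentially allocate n items with the Pitman-Yor predictive rule.

    Assumption: Pitman-Yor is used as a simple Gibbs-type stand-in,
    not the generalized gamma prior studied in the paper.
    Returns the list of cluster sizes.
    """
    rng = random.Random(seed)
    sizes = []
    for _ in range(n):
        k = len(sizes)
        # Existing cluster j has weight (n_j - sigma);
        # a new cluster has weight (theta + sigma * k).
        weights = [s - sigma for s in sizes] + [theta + sigma * k]
        j = rng.choices(range(k + 1), weights=weights)[0]
        if j == k:
            sizes.append(1)   # open a new cluster
        else:
            sizes[j] += 1     # join an existing cluster
    return sizes


def reinforcement_ratio(n_big, n_small, sigma):
    """Ratio of the weights of two existing clusters of given sizes."""
    return (n_big - sigma) / (n_small - sigma)


if __name__ == "__main__":
    for sigma in (0.1, 0.5, 0.9):
        sizes = sample_partition(500, sigma)
        singletons = sum(1 for s in sizes if s == 1)
        print(sigma, len(sizes), singletons, max(sizes),
              reinforcement_ratio(10, 1, sigma))
```

Under this rule, the weight of a cluster of size 10 relative to a singleton is 10 when σ = 0 and 19 when σ = 0.5, which is one way to see how a larger σ penalizes small clusters while also making new clusters more likely.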
