AN MCMC APPROACH TO EMPIRICAL BAYES INFERENCE AND BAYESIAN SENSITIVITY ANALYSIS VIA EMPIRICAL PROCESSES

Document type:

Article

Authors:

Doss, Hani; Park, Yeonhee

Affiliations:

State University System of Florida; University of Florida; University of Texas System; UTMD Anderson Cancer Center

Journal:

ANNALS OF STATISTICS

ISSN/ISBN:

0090-5364

DOI:

10.1214/17-AOS1597

Publication date:

2018

Pages:

1630-1663

Keywords:

chain Monte-Carlo; variable selection; regression; estimators

Abstract:

Consider a Bayesian situation in which we observe $Y \sim p_\theta$, where $\theta \in \Theta$ and we have a family $\{\nu_h, h \in H\}$ of potential prior distributions on $\Theta$. Let $g$ be a real-valued function of $\theta$, and let $I_g(h)$ be the posterior expectation of $g(\theta)$ when the prior is $\nu_h$. We are interested in two problems: (i) selecting a particular value of $h$, and (ii) estimating the family of posterior expectations $\{I_g(h), h \in H\}$. Let $m_y(h)$ be the marginal likelihood of the hyperparameter $h$: $m_y(h) = \int p_\theta(y)\,\nu_h(d\theta)$. The empirical Bayes estimate of $h$ is, by definition, the value of $h$ that maximizes $m_y(h)$. It turns out that it is typically possible to use Markov chain Monte Carlo to form point estimates for $m_y(h)$ and $I_g(h)$ for each individual $h$ in a continuum, and also confidence intervals for $m_y(h)$ and $I_g(h)$ that are valid pointwise. However, we are interested in forming estimates, with confidence statements, of the entire families of integrals $\{m_y(h), h \in H\}$ and $\{I_g(h), h \in H\}$: we need estimates of the first family in order to carry out empirical Bayes inference, and we need estimates of the second family in order to do Bayesian sensitivity analysis. We establish strong consistency and functional central limit theorems for estimates of these families by using tools from empirical process theory. We give two applications, one to latent Dirichlet allocation, which is used in topic modeling, and the other is to a model for Bayesian variable selection in linear regression.
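
To make the two families concrete, the sketch below uses a standard importance-sampling identity: if $\theta_1, \dots, \theta_n$ are draws from the posterior under one fixed hyperparameter $h_1$, and the priors have densities, then the average of the prior density ratios $\nu_h(\theta_i)/\nu_{h_1}(\theta_i)$ estimates $m_y(h)/m_y(h_1)$, and the correspondingly weighted average of $g(\theta_i)$ estimates $I_g(h)$. This is only an illustration and not necessarily the estimator studied in the paper. Everything specific in it is an assumption made for the example: the toy model $Y \sim N(\theta, 1)$ with prior $\nu_h = N(0, h)$, the values `y = 2.0` and `h1 = 2.0`, the grid of $h$ values, and the use of exact i.i.d. posterior draws in place of a Markov chain. It gives point estimates only and says nothing about confidence statements that hold uniformly in $h$.

```python
import numpy as np


def normal_pdf(x, var):
    """Density of N(0, var) evaluated at x (broadcasts over arrays)."""
    return np.exp(-0.5 * x**2 / var) / np.sqrt(2.0 * np.pi * var)


def estimate_families(theta, h_grid, h1, g=lambda t: t):
    """Estimate m_y(h)/m_y(h1) and I_g(h) on a grid of h values.

    theta  : draws from the posterior under the prior N(0, h1)
    h_grid : 1-D array of hyperparameter values (prior variances)
    """
    # Importance weights nu_h(theta_i) / nu_h1(theta_i), shape (len(h_grid), n).
    w = normal_pdf(theta[None, :], h_grid[:, None]) / normal_pdf(theta, h1)[None, :]
    ratio_hat = w.mean(axis=1)                                  # m_y(h) / m_y(h1)
    post_exp_hat = (w * g(theta)[None, :]).sum(axis=1) / w.sum(axis=1)  # I_g(h)
    return ratio_hat, post_exp_hat


# Toy setup (all values are illustrative assumptions).
rng = np.random.default_rng(0)
y, h1, n = 2.0, 2.0, 50_000

# In this conjugate model the posterior under h1 is N(y*h1/(1+h1), h1/(1+h1)),
# so exact i.i.d. draws stand in for MCMC output here.
post_var = h1 / (1.0 + h1)
theta = rng.normal(loc=y * post_var, scale=np.sqrt(post_var), size=n)

h_grid = np.linspace(0.5, 10.0, 39)
ratio_hat, post_mean_hat = estimate_families(theta, h_grid, h1)

# Empirical Bayes choice of h on the grid: maximize the estimated ratio,
# which has the same maximizer as the marginal likelihood m_y(h).
h_hat = h_grid[np.argmax(ratio_hat)]

# Closed-form answers, available only because the toy model is conjugate.
ratio_exact = normal_pdf(y, 1.0 + h_grid) / normal_pdf(y, 1.0 + h1)
post_mean_exact = y * h_grid / (1.0 + h_grid)

print("grid maximizer of estimated marginal likelihood:", h_hat)
print("max relative error of ratio estimates:",
      np.max(np.abs(ratio_hat / ratio_exact - 1.0)))
print("max absolute error of posterior-mean estimates:",
      np.max(np.abs(post_mean_hat - post_mean_exact)))
```

In this toy model the marginal likelihood is $N(y; 0, 1 + h)$, which is maximized at $h = y^2 - 1$ when $y^2 > 1$, so the grid maximizer can be checked against a known answer. The same draws are reused for every $h$, which is what makes estimating an entire family cheap; the weights become heavy-tailed when $h$ is far from $h_1$, which is one reason the choice of $h_1$ matters in practice.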