
Prior-Preconditioned Conjugate Gradient Method for Accelerated Gibbs Sampling in "Large n, Large p" Bayesian Sparse Regression

Document type:

Article

Authors:

Nishimura, Akihiko; Suchard, Marc A.

Affiliations:

Johns Hopkins University; University of California System; University of California Los Angeles

Journal:

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION

ISSN/ISBN:

0162-1459

DOI:

10.1080/01621459.2022.2057859

Publication date:

2023

Pages:

2468-2481

Keywords:

Variable selection; horseshoe; inference; iterations; equations; models

Abstract:

In a modern observational study based on healthcare databases, the numbers of observations and of predictors typically range on the order of 10^5 to 10^6 and 10^4 to 10^5, respectively. Despite the large sample size, data rarely provide sufficient information to reliably estimate such a large number of parameters. Sparse regression techniques provide potential solutions, one notable approach being the Bayesian method based on shrinkage priors. In the "large n and large p" setting, however, the required posterior computation encounters a bottleneck at repeated sampling from a high-dimensional Gaussian distribution, whose precision matrix Φ is expensive to compute and factorize. In this article, we present a novel algorithm to speed up this bottleneck based on the following observation: we can cheaply generate a random vector b such that the solution to the linear system Φβ = b has the desired Gaussian distribution. We can then solve the linear system by the conjugate gradient (CG) algorithm through matrix-vector multiplications by Φ; this involves no explicit factorization or calculation of Φ itself. Rapid convergence of CG in this context is guaranteed by the theory of prior-preconditioning we develop. We apply our algorithm to a clinically relevant large-scale observational study with n = 72,489 patients and p = 22,175 clinical covariates, designed to assess the relative risk of adverse events from two alternative blood anti-coagulants. Our algorithm demonstrates an order of magnitude speed-up in posterior inference, in our case cutting the computation time from two weeks to less than a day. Supplementary materials for this article are available online.
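The idea in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not spell out the model, so everything specific here is an assumption made for illustration: a weighted Gaussian likelihood with diagonal weights `omega`, a zero-mean Gaussian prior with per-coefficient standard deviations `prior_sd`, so that Φ = XᵀΩX + diag(prior_sd)⁻², and the hypothetical function names `draw_target_vector` and `prior_preconditioned_cg`. Under those assumptions, the vector b built below has mean XᵀΩy and covariance Φ, so the solution of Φβ = b is distributed as N(Φ⁻¹XᵀΩy, Φ⁻¹). The solver is a textbook preconditioned CG that uses the prior covariance as the preconditioner and touches Φ only through matrix-vector products.

```python
import numpy as np


def draw_target_vector(X, y, omega, prior_sd, rng):
    """Draw b ~ N(X^T Omega y, Phi), Phi = X^T Omega X + diag(prior_sd)^-2.

    Assumed model (not stated in the abstract): weighted Gaussian likelihood
    with diagonal weights `omega` and independent N(0, prior_sd^2) priors.
    """
    n, p = X.shape
    eta = rng.standard_normal(n)    # noise contributing covariance X^T Omega X
    delta = rng.standard_normal(p)  # noise contributing covariance diag(prior_sd)^-2
    return X.T @ (omega * y) + X.T @ (np.sqrt(omega) * eta) + delta / prior_sd


def prior_preconditioned_cg(X, omega, prior_sd, b, tol=1e-10, max_iter=1000):
    """Solve Phi beta = b by preconditioned CG without ever forming Phi."""

    def phi_matvec(v):
        # Phi v computed from two products with X plus a diagonal term.
        return X.T @ (omega * (X @ v)) + v / prior_sd**2

    def precondition(r):
        # Prior preconditioner: multiply by the prior covariance diag(prior_sd^2).
        return prior_sd**2 * r

    beta = np.zeros_like(b)
    r = b.copy()                    # residual b - Phi beta for beta = 0
    z = precondition(r)
    d = z.copy()                    # search direction
    rz = r @ z
    b_norm = np.linalg.norm(b)
    n_iter = 0
    while n_iter < max_iter and np.linalg.norm(r) > tol * b_norm:
        phi_d = phi_matvec(d)
        step = rz / (d @ phi_d)
        beta += step * d
        r -= step * phi_d
        z = precondition(r)
        rz_new = r @ z
        d = z + (rz_new / rz) * d
        rz = rz_new
        n_iter += 1
    return beta, n_iter
```

One conditional draw would then be `b = draw_target_vector(X, y, omega, prior_sd, rng)` followed by `beta, n_iter = prior_preconditioned_cg(X, omega, prior_sd, b)`, with `rng = np.random.default_rng()`. Inside a Gibbs sampler, `omega` and `prior_sd` would be refreshed at every iteration by the other conditional updates, which this sketch does not cover.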