An imputation-regularized optimization algorithm for high dimensional missing data problems and beyond

Publication type:

Article

Authors:

Liang, Faming; Jia, Bochao; Xue, Jingnan; Li, Qizhai; Luo, Ye

Affiliations:

Purdue University System; Purdue University; State University System of Florida; University of Florida; Chinese Academy of Sciences

Journal:

Journal of the Royal Statistical Society Series B: Statistical Methodology

ISSN:

1369-7412

DOI:

10.1111/rssb.12279

Publication date:

2018

Pages:

899-926

Keywords:

variable selection; EM algorithm; maximum likelihood; gene expression; regression models; linear regression; data augmentation; graphical models; microarray data; incomplete data

Abstract:

Missing data are frequently encountered in high dimensional problems, but they are usually difficult to deal with by using standard algorithms, such as the expectation-maximization algorithm and its variants. To tackle this difficulty, some problem-specific algorithms have been developed in the literature, but a general algorithm is still lacking. This work fills the gap: we propose a general algorithm for high dimensional missing data problems. The algorithm works by iterating between an imputation step and a regularized optimization step. At the imputation step, the missing data are imputed conditionally on the observed data and the current estimates of parameters and, at the regularized optimization step, a consistent estimate is found via the regularization approach for the minimizer of a Kullback-Leibler divergence defined on the pseudocomplete data. For high dimensional problems, the consistent estimate can be found under sparsity constraints. The consistency of the averaged estimate for the true parameter can be established under quite general conditions. The algorithm is illustrated by using high dimensional Gaussian graphical models, high dimensional variable selection and a random-coefficient model.
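
The two alternating steps described in the abstract can be sketched on a toy problem. The following is a minimal illustration, not the authors' implementation. It assumes a sparse linear regression whose covariates are independent standard normal and missing completely at random, and it uses a lasso penalty as the regularizer. The function names (`simulate`, `impute`, `lasso_cd`, `iro_lasso`), the penalty level, the iteration counts and the burn-in length are all hypothetical choices made for this example.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, beta0, n_sweeps=50):
    """Coordinate descent for (1/2n)||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = beta0.copy()
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y - X @ beta
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            resid += X[:, j] * beta[j]          # remove coordinate j from the fit
            rho = X[:, j] @ resid / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
            resid -= X[:, j] * beta[j]          # put the updated coordinate back
    return beta

def impute(X_obs, mask, y, beta, sigma2, rng):
    """Imputation step: draw each row's missing covariates from their
    conditional law given y, the observed covariates and (beta, sigma2).
    Assumption of this sketch: covariates are i.i.d. N(0, 1)."""
    X = X_obs.copy()
    for i in np.flatnonzero(mask.any(axis=1)):
        m = mask[i]
        b_m = beta[m]
        r = y[i] - X_obs[i, ~m] @ beta[~m]      # part of y not explained by observed x
        prec = np.eye(b_m.size) + np.outer(b_m, b_m) / sigma2
        cov = np.linalg.inv(prec)
        mean = (cov @ b_m) * (r / sigma2)
        chol = np.linalg.cholesky(cov)
        X[i, m] = mean + chol @ rng.standard_normal(b_m.size)
    return X

def iro_lasso(X_obs, mask, y, lam, n_iter=30, burn_in=10, seed=0):
    """Alternate imputation and lasso steps, then average post-burn-in estimates."""
    rng = np.random.default_rng(seed)
    beta = np.zeros(X_obs.shape[1])
    sigma2 = float(np.var(y))
    kept = []
    for t in range(n_iter):
        X_imp = impute(X_obs, mask, y, beta, sigma2, rng)      # imputation step
        beta = lasso_cd(X_imp, y, lam, beta)                   # regularized optimization step
        sigma2 = max(float(np.mean((y - X_imp @ beta) ** 2)), 1e-6)
        if t >= burn_in:
            kept.append(beta.copy())
    return np.mean(kept, axis=0)

def simulate(n=200, p=50, miss_rate=0.1, seed=1):
    """Toy data: three nonzero coefficients, entries of X missing at random (NaN)."""
    rng = np.random.default_rng(seed)
    beta_true = np.zeros(p)
    beta_true[:3] = [2.0, -1.5, 1.0]
    X = rng.standard_normal((n, p))
    y = X @ beta_true + 0.5 * rng.standard_normal(n)
    mask = rng.random((n, p)) < miss_rate
    return np.where(mask, np.nan, X), mask, y, beta_true

if __name__ == "__main__":
    X_obs, mask, y, beta_true = simulate()
    beta_hat = iro_lasso(X_obs, mask, y, lam=0.1)
    print("largest coefficients at indices:", np.argsort(-np.abs(beta_hat))[:3])
```

Averaging the estimates after a burn-in period mirrors the "averaged estimate" mentioned in the abstract. The closed-form conditional draw in `impute` is available only because of the simplifying Gaussian covariate assumption made here; other models would need their own imputation step.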