STRUCTURED SUBCOMPOSITION SELECTION IN REGRESSION AND ITS APPLICATION TO MICROBIOME DATA ANALYSIS

Publication type:
Article
Authors:
Wang, Tao; Zhao, Hongyu
Affiliations:
Shanghai Jiao Tong University; Shanghai Jiao Tong University; Yale University
Journal:
ANNALS OF APPLIED STATISTICS
ISSN:
1932-6157
DOI:
10.1214/16-AOAS1017
Publication date:
2017
Pages:
771-791
Keywords:
variable selection; compositional data; gut microbiota; obesity; regularization; sparsity
Abstract:
Compositional data arise naturally in many practical problems and the analysis of such data presents many statistical challenges, especially in high dimensions. In this article, we consider the problem of subcomposition selection in regression with compositional covariates, where the relationships among the covariates can be represented by a tree with leaf nodes corresponding to covariates. Assuming that the tree structure is available as prior knowledge, we adopt a symmetric version of the linear log contrast model, and propose a tree-guided regularization method for this structured subcomposition selection. Our method is based on a novel penalty function that incorporates the tree structure information node-by-node, encouraging the selection of subcompositions at subtree levels. We show that this optimization problem can be formulated as a generalized lasso problem, the solution of which can be computed efficiently using existing algorithms. An application to a human gut microbiome study and simulations are presented to compare the performance of the proposed method with an ℓ1 regularization method where the tree structure information is not utilized.
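Illustrative note: the symmetric linear log contrast model mentioned in the abstract regresses the response on log-proportions with coefficients constrained to sum to zero, which makes the fit invariant to rescaling each sample. The following is a minimal Python sketch of that constraint only, on simulated data. It does not reproduce the paper's tree-guided penalty or its generalized lasso formulation, which the abstract does not specify; the function name, the simulated data, and the omission of an intercept are all assumptions made for illustration.

```python
import numpy as np

def fit_zero_sum_least_squares(X, y):
    """Unpenalized least squares on log(X) subject to sum(beta) == 0.

    Hypothetical helper for illustration; not the paper's estimator.
    """
    Z = np.log(X)
    p = Z.shape[1]
    ones = np.ones((p, 1))
    # KKT system for: minimize ||y - Z b||^2 subject to 1'b = 0
    kkt = np.block([[Z.T @ Z, ones], [ones.T, np.zeros((1, 1))]])
    rhs = np.concatenate([Z.T @ y, [0.0]])
    solution = np.linalg.lstsq(kkt, rhs, rcond=None)[0]
    return solution[:p]  # drop the Lagrange multiplier

# Simulated compositions: positive values normalized so each row sums to 1.
rng = np.random.default_rng(0)
n, p = 50, 6
raw = rng.gamma(shape=2.0, size=(n, p))
X = raw / raw.sum(axis=1, keepdims=True)

# Assumed true coefficients; they sum to zero, as the model requires.
beta_true = np.array([1.0, -1.0, 0.5, -0.5, 0.0, 0.0])
y = np.log(X) @ beta_true  # noiseless response, for a clean check

beta_hat = fit_zero_sum_least_squares(X, y)
# Fitting on the unnormalized values gives the same coefficients,
# because the zero-sum constraint cancels any per-sample scale factor.
beta_raw = fit_zero_sum_least_squares(raw, y)
```

Under the zero-sum constraint, multiplying a sample by a constant adds the same amount to every log-covariate, and that shift is annihilated by the coefficients. This is why the sketch returns the same estimate from the normalized and unnormalized inputs.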