Biomarker Detection in Association Studies: Modeling SNPs Simultaneously via Logistic ANOVA
Document type:
Article
Authors:
Jung, Yoonsuh; Huang, Jianhua Z.; Hu, Jianhua
Affiliations:
University of Waikato; Texas A&M University System; Texas A&M University College Station; Capital University of Economics & Business; University of Texas System; UTMD Anderson Cancer Center
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1080/01621459.2014.928217
Publication date:
2014
Pages:
1355-1367
Keywords:
gene-environment interactions
multiple-sclerosis
MM algorithms
selection
susceptibility
glioma
Lasso
GWAS
Abstract:
In genome-wide association studies, the primary task is to detect biomarkers in the form of single nucleotide polymorphisms (SNPs) that have nontrivial associations with a disease phenotype and some other important clinical/environmental factors. However, the extremely large number of SNPs compared to the sample size inhibits application of classical methods such as the multiple logistic regression. Currently, the most commonly used approach is still to analyze one SNP at a time. In this article, we propose to consider the genotypes of the SNPs simultaneously via a logistic analysis of variance (ANOVA) model, which expresses the logit transformed mean of SNP genotypes as the summation of the SNP effects, effects of the disease phenotype and/or other clinical variables, and the interaction effects. We use a reduced-rank representation of the interaction-effect matrix for dimensionality reduction, and employ the L1 penalty in a penalized likelihood framework to filter out the SNPs that have no associations. We develop a majorization-minimization algorithm for computational implementation. In addition, we propose a modified BIC criterion to select the penalty parameters and determine the rank number. The proposed method is applied to a multiple sclerosis dataset and simulated datasets and shows promise in biomarker detection.
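
Illustrative sketch (not from the article): the abstract describes the model only in words, so the following Python fragment is one possible reading of it rather than the authors' formulation. It assumes genotypes are coded as minor-allele counts in {0, 1, 2} and treated as Binomial(2, p), that the interaction matrix is factored as `U @ V.T` with rank `r`, and that the L1 penalty is placed on the SNP-side factor `V`. The names `penalized_nll`, `selected_snps`, `alpha`, `beta`, `U`, `V`, and `lam` are hypothetical. The majorization-minimization algorithm and the modified BIC mentioned in the abstract are not implemented here; only the objective being minimized is shown.

```python
import numpy as np


def penalized_nll(G, X, alpha, beta, U, V, lam):
    """L1-penalized negative log-likelihood for a logistic ANOVA sketch.

    G     : (n, p) genotype counts in {0, 1, 2} (assumed coding)
    X     : (n, q) disease phenotype / clinical covariates
    alpha : (p,)   SNP main effects
    beta  : (q,)   covariate main effects
    U, V  : (q, r) and (p, r) factors of the rank-r interaction matrix
    lam   : weight of the L1 penalty on V (assumed placement)
    """
    # Logit of the allele probability for subject i and SNP j:
    # SNP effect + covariate effect + reduced-rank interaction.
    eta = alpha[None, :] + (X @ beta)[:, None] + X @ U @ V.T
    # Binomial(2, p) log-likelihood up to an additive constant:
    # G * eta - 2 * log(1 + exp(eta)), computed stably with logaddexp.
    loglik = np.sum(G * eta - 2.0 * np.logaddexp(0.0, eta))
    return -loglik + lam * np.abs(V).sum()


def selected_snps(V, tol=1e-8):
    """Indices of SNPs whose interaction loadings are not all (near) zero."""
    return np.flatnonzero(np.abs(V).max(axis=1) > tol)


# Tiny synthetic example (random data, for shape checking only).
rng = np.random.default_rng(0)
n, p, q, r = 20, 5, 2, 1
G = rng.integers(0, 3, size=(n, p)).astype(float)
X = rng.normal(size=(n, q))
V = np.zeros((p, r))
V[[1, 3], 0] = [0.5, -0.7]  # only SNPs 1 and 3 interact with covariates
value = penalized_nll(G, X, np.zeros(p), np.zeros(q), np.zeros((q, r)), V, lam=0.1)
kept = selected_snps(V)
```

Under this reading, a SNP whose row of `V` is shrunk entirely to zero contributes no interaction term, which is how the L1 penalty would filter out SNPs with no association.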