An improved and explicit surrogate variable analysis procedure by coefficient adjustment
Publication type:
Article
Authors:
Lee, Seunggeun; Sun, Wei; Wright, Fred A.; Zou, Fei
Affiliations:
University of Michigan System; University of Michigan; Fred Hutchinson Cancer Center; North Carolina State University; State University System of Florida; University of Florida
Journal:
BIOMETRIKA
ISSN/ISBN:
0006-3444
DOI:
10.1093/biomet/asx018
Publication date:
2017
Pages:
303-316
Keywords:
principal-components-analysis
gene-expression data
unwanted variation
array data
convergence
methylation
consistency
prediction
dependence
scores
Abstract:
Unobserved environmental, demographic and technical factors can adversely affect the estimation and testing of the effects of primary variables. Surrogate variable analysis, proposed to tackle this problem, has been widely used in genomic studies. To estimate hidden factors that are correlated with the primary variables, surrogate variable analysis performs principal component analysis either on a subset of features or on all features, but weighting each differently. However, existing approaches may fail to identify hidden factors that are strongly correlated with the primary variables, and the extra step of feature selection and weight calculation makes the theoretical investigation of surrogate variable analysis challenging. In this paper, we propose an improved surrogate variable analysis, using all measured features, that has a natural connection with restricted least squares, which allows us to study its theoretical properties. Simulation studies and real-data analysis show that the method is competitive with state-of-the-art methods.
Source URL: