A Geometric Perspective on the Power of Principal Component Association Tests in Multiple Phenotype Studies

Publication type:
Article
Authors:
Liu, Zhonghua; Lin, Xihong
Affiliations:
University of Hong Kong; Harvard University; Harvard T.H. Chan School of Public Health
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1080/01621459.2018.1513363
Publication date:
2019
Pages:
975-990
Keywords:
asymptotic optimality; genetic variants; Fisher's method; genome; traits; loci; meta-analysis; hypotheses; statistics; pleiotropy
Abstract:
Joint analysis of multiple phenotypes can increase statistical power in genetic association studies. Principal component analysis (PCA), a popular dimension reduction method, has been proposed for analyzing multiple correlated phenotypes, especially when the number of phenotypes is large. It has been empirically observed that the first principal component (PC), which summarizes the largest amount of variance, can be less powerful than higher-order PCs and other commonly used methods in detecting genetic association signals. In this article, we investigate the properties of PCA-based multiple phenotype analysis from a geometric perspective by introducing a novel concept called principal angle. A particular PC is powerful if its principal angle is 0 degrees and is powerless if its principal angle is 90 degrees. Without prior knowledge about the true principal angle, each PC can be powerless. We propose linear, nonlinear, and data-adaptive omnibus tests by combining PCs. We demonstrate that the Wald test is a special quadratic PC-based test. We show that the omnibus PC test is robust and powerful in a wide range of scenarios. We study the properties of the proposed methods using power analysis and eigen-analysis. The subtle differences and close connections between these combined PC methods are illustrated graphically in terms of their rejection boundaries. Our proposed tests have convex acceptance regions and hence are admissible. The p-values for the proposed tests can be efficiently calculated analytically and the proposed tests have been implemented in a publicly available R package MPAT. We conduct simulation studies in both low- and high-dimensional settings with various signal vectors and correlation structures. We apply the proposed tests to the joint analysis of metabolic syndrome-related phenotypes with datasets collected from four international consortia to demonstrate the effectiveness of the proposed combined PC testing procedures.
Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
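
Illustrative sketch (not part of the original record): the geometric idea in the abstract can be made concrete with a few lines of NumPy. The code assumes that the "principal angle" of a PC is the angle between that PC's eigenvector and the true signal (effect) vector, which is our reading of the abstract rather than a definition quoted from the article. The function names `pc_decompose`, `principal_angles_deg`, `pc_chi2_stats`, and `wald_stat` are hypothetical and are not the API of the MPAT package. The sketch also checks the algebraic identity behind the statement that the Wald test is a quadratic PC-based test: the Wald statistic equals the sum of the squared PC projections divided by their eigenvalues.

```python
import numpy as np


def pc_decompose(sigma):
    """Eigen-decompose a symmetric correlation matrix, largest eigenvalue first."""
    eigvals, eigvecs = np.linalg.eigh(sigma)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # reorder to descending
    return eigvals[order], eigvecs[:, order]


def principal_angles_deg(eigvecs, mu):
    """Angle (degrees) between each eigenvector and the signal vector mu.

    Assumption: this is what the abstract calls the principal angle.
    """
    cosines = np.abs(eigvecs.T @ mu) / np.linalg.norm(mu)
    return np.degrees(np.arccos(np.clip(cosines, 0.0, 1.0)))


def pc_chi2_stats(eigvals, eigvecs, z):
    """Per-PC statistics (u_k^T z)^2 / lambda_k for a vector z of Z-scores."""
    return (eigvecs.T @ z) ** 2 / eigvals


def wald_stat(sigma, z):
    """Wald statistic z^T Sigma^{-1} z."""
    return float(z @ np.linalg.solve(sigma, z))


# Hypothetical example: four phenotypes with exchangeable correlation 0.5.
sigma = 0.5 * np.ones((4, 4)) + 0.5 * np.eye(4)
eigvals, eigvecs = pc_decompose(sigma)

# A signal orthogonal to the first eigenvector (which is proportional to
# the all-ones vector here) gives the first PC no power to detect it.
mu_orthogonal = np.array([1.0, -1.0, 0.0, 0.0])
print("first-PC angle, orthogonal signal:",
      principal_angles_deg(eigvecs, mu_orthogonal)[0])

# A signal aligned with the first eigenvector is captured by the first PC.
mu_aligned = np.ones(4)
print("first-PC angle, aligned signal:",
      principal_angles_deg(eigvecs, mu_aligned)[0])

# The Wald statistic decomposes into the sum of the per-PC statistics.
z = np.array([2.0, -1.0, 0.5, 1.5])
print("sum of PC statistics:", pc_chi2_stats(eigvals, eigvecs, z).sum())
print("Wald statistic:      ", wald_stat(sigma, z))
```

In this toy setting the first PC sees the aligned signal fully and the orthogonal signal not at all, which is the motivation the abstract gives for combining PCs instead of relying on any single one. The omnibus and data-adaptive tests described in the abstract are not reproduced here.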