An Incomplete-Data Quasi-Likelihood Approach to Haplotype-Based Genetic Association Studies on Related Individuals

Document type:
Article
Authors:
Wang, Zuoheng; McPeek, Mary Sara
Affiliations:
University of Chicago; University of Chicago
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1198/jasa.2009.tm08507
Publication date:
2009
Pages:
1251-1260
Keywords:
longitudinal data analysis; linkage disequilibrium; estimating equations; asymptotics
Abstract:
We propose an incomplete-data, quasi-likelihood framework for estimation and score tests that accommodates both dependent and partially observed data. The motivation comes from genetic association studies, where we address the problems of estimating haplotype frequencies and testing association between a disease and haplotypes of multiple, tightly linked genetic markers, using case-control samples containing related individuals. We consider a more general setting in which the complete data are dependent, with marginal distributions following a generalized linear model. We form a vector, Z, whose elements are conditional expectations of the elements of the complete-data vector, given selected functions of the incomplete data. Assuming that the covariance matrix of Z is available, we create an optimal linear estimating function based on Z, which we solve by an iterative method. This approach addresses key difficulties in haplotype frequency estimation and testing problems in related individuals: (a) dependence that is known but can be complicated; (b) data that are incomplete for structural reasons, as well as possibly missing, with different amounts of information for different observations; (c) the need for computational speed to analyze large numbers of markers; and (d) a well-established null model but an alternative model that is unknown and is difficult to specify fully in related individuals. For haplotype analysis, we give sufficient conditions for consistency and asymptotic normality of the estimator and an asymptotic chi-squared null distribution of the score test. We apply the method to test for association of haplotypes with alcoholism in the GAW 14 COGA data set.
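
To make the phrase "optimal linear estimating function ... which we solve by an iterative method" more concrete, the following is a minimal generic sketch, not the authors' haplotype-specific procedure. It assumes a fixed observed vector z, a logistic mean model mu(beta), and a known covariance matrix V, and solves D(beta)^T V^{-1} (z - mu(beta)) = 0 by a Fisher-scoring-type iteration. In the abstract's setting, the elements of Z are conditional expectations given the incomplete data; that step is not modeled here. The function name, the logistic link, and the toy data are all assumptions introduced for illustration.

```python
import numpy as np


def solve_linear_estimating_equation(z, X, V, n_iter=50, tol=1e-10):
    """Solve D(beta)^T V^{-1} (z - mu(beta)) = 0 for beta.

    Hypothetical helper: assumes a logistic mean mu = expit(X @ beta)
    and a known, fixed covariance matrix V for the dependent data.
    """
    beta = np.zeros(X.shape[1])
    V_inv = np.linalg.inv(V)
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
        # D is the derivative of mu with respect to beta (n x p).
        D = (mu * (1.0 - mu))[:, None] * X
        score = D.T @ V_inv @ (z - mu)   # estimating function
        info = D.T @ V_inv @ D           # its expected negative derivative
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Toy data (assumed): four observations, the first two correlated at 0.5,
# as might arise for a pair of close relatives.
z = np.array([0.2, 0.4, 0.3, 0.5])
X = np.ones((4, 1))  # intercept-only model: a single common mean
V = np.array([
    [1.0, 0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])

beta_hat = solve_linear_estimating_equation(z, X, V)
mu_hat = 1.0 / (1.0 + np.exp(-beta_hat[0]))
```

In this intercept-only case the solution coincides with the covariance-weighted mean (1^T V^{-1} z) / (1^T V^{-1} 1), so the correlated pair is down-weighted relative to the two independent observations. This is the sense in which using the known dependence structure changes the estimate compared with a simple average.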