Exact Decoding of a Sequentially Markov Coalescent Model in Genetics

Publication type:
Article
Authors:
Ki, Caleb; Terhorst, Jonathan
Affiliations:
University of Michigan System; University of Michigan
Journal:
Journal of the American Statistical Association
ISSN/ISBN:
0162-1459
DOI:
10.1080/01621459.2023.2252570
Publication date:
2024
Pages:
2242-2255
Keywords:
genome-wide association; population history; linkage disequilibrium; Bayesian analysis; inference; recombination; imputation; dynamics; rates; size
Abstract:
In statistical genetics, the sequentially Markov coalescent (SMC) is an important family of models for approximating the distribution of genetic variation data under complex evolutionary models. Methods based on SMC are widely used in genetics and evolutionary biology, with significant applications to genotype phasing and imputation, recombination rate estimation, and inferring population history. SMC allows for likelihood-based inference using hidden Markov models (HMMs), where the latent variable represents a genealogy. Because genealogies are continuous, while HMMs are discrete, SMC requires discretizing the space of trees in a way that is awkward and creates bias. In this work, we propose a method that circumvents this requirement, enabling SMC-based inference to be performed in the natural setting of a continuous state space. We derive fast, exact procedures for frequentist and Bayesian inference using SMC. Compared to existing methods, ours requires minimal user intervention or parameter tuning, no numerical optimization or E-M, and is faster and more accurate. Supplementary materials for this article are available online.