Efficiently Inferring the Demographic History of Many Populations With Allele Count Data
Publication type:
Article
Authors:
Kamm, Jack; Terhorst, Jonathan; Durbin, Richard; Song, Yun S.
Affiliations:
Wellcome Trust Sanger Institute; University of Cambridge; University of Michigan System; University of Michigan; University of California System; University of California Berkeley; Chan Zuckerberg Initiative (CZI)
Journal:
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
ISSN/ISBN:
0162-1459
DOI:
10.1080/01621459.2019.1635482
Publication date:
2020
Pages:
1472-1487
Keywords:
genome sequence
multiple populations
frequency-spectrum
size histories
dna-sequences
coalescent
inference
models
neanderthal
diversity
Abstract:
The sample frequency spectrum (SFS), or histogram of allele counts, is an important summary statistic in evolutionary biology, and is often used to infer the history of population size changes, migrations, and other demographic events affecting a set of populations. The expected multipopulation SFS under a given demographic model can be efficiently computed when the populations in the model are related by a tree, and this computation scales to hundreds of populations. However, admixture, back-migration, and introgression are common natural processes that violate the assumption of a tree-like population history, and until now the expected SFS could be computed for only a handful of populations when the demographic history is not a tree. In this article, we present a new method for efficiently computing the expected SFS, and linear functionals of it, for demographies described by general directed acyclic graphs. This method can scale to more populations than previously possible for complex demographic histories including admixture. We apply our method to an 8-population SFS to estimate the timing and strength of a proposed basal Eurasian admixture event in human history. We implement and release our method in a new open-source software package, momi2.
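To make the central object of the abstract concrete, the following is a minimal sketch of how an observed multipopulation SFS, and a linear functional of it, could be tabulated from per-site derived-allele counts. It does not use momi2 and does not compute the expected SFS under a demographic model; the function names `joint_sfs` and `linear_functional` and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter


def joint_sfs(derived_counts):
    """Tabulate an observed multipopulation SFS.

    derived_counts: iterable of tuples, one per site, giving the number of
    derived alleles observed in each population at that site.
    Returns a Counter mapping each count configuration to the number of
    sites that show it (a histogram of allele counts).
    """
    return Counter(tuple(site) for site in derived_counts)


def linear_functional(sfs, weight):
    """Sum weight(configuration) * site count over all SFS entries."""
    return sum(weight(config) * n_sites for config, n_sites in sfs.items())


# Hypothetical toy data: derived-allele counts at 6 sites in 2 populations.
sites = [(1, 0), (0, 2), (1, 0), (3, 1), (0, 2), (1, 0)]
sfs = joint_sfs(sites)

# One possible linear functional: total derived alleles seen in population 0.
derived_in_pop0 = linear_functional(sfs, lambda config: config[0])
```

Here each key of `sfs` is one entry of the joint spectrum, so the number of possible entries grows quickly with the number of populations. This is one reason why summaries expressed as linear functionals of the SFS can be attractive when many populations are analyzed.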