DISTANCE-BASED SPECIES TREE ESTIMATION UNDER THE COALESCENT: INFORMATION-THEORETIC TRADE-OFF BETWEEN NUMBER OF LOCI AND SEQUENCE LENGTH

Publication type:
Article
Authors:
Mossel, Elchanan; Roch, Sebastien
Affiliations:
Massachusetts Institute of Technology (MIT); University of Wisconsin System; University of Wisconsin Madison
Journal:
ANNALS OF APPLIED PROBABILITY
ISSN/ISBN:
1050-5164
DOI:
10.1214/16-AAP1273
Publication date:
2017
Pages:
2926-2955
Keywords:
higher criticism; logs suffice; gene trees; reconstruction; phylogenies; mixtures; models; alignment; history; bounds
Abstract:
We consider the reconstruction of a phylogeny from multiple genes under the multispecies coalescent. We establish a connection with the sparse signal detection problem, where one seeks to distinguish between a distribution and a mixture of the distribution and a sparse signal. Using this connection, we derive an information-theoretic trade-off between the number of genes, $m$, needed for an accurate reconstruction and the sequence length, $k$, of the genes. Specifically, we show that to detect a branch of length $f$, one needs $m = \Theta(1/[f^2 \sqrt{k}])$ genes.
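The scaling stated in the abstract can be illustrated with a small Python sketch. It evaluates only the term 1/(f^2 sqrt(k)). The Theta notation hides constant factors, and the abstract does not state the range of f and k over which the bound holds, so the values below are relative scalings rather than actual gene counts. The function name `loci_scaling` and the sample values of f and k are hypothetical, chosen for illustration only.

```python
import math


def loci_scaling(f: float, k: int) -> float:
    """Return 1 / (f^2 * sqrt(k)), the scaling term from the abstract.

    This is not a gene count: Theta(...) hides constant factors, so only
    ratios between two calls are meaningful.
    """
    if f <= 0 or k <= 0:
        raise ValueError("f and k must be positive")
    return 1.0 / (f ** 2 * math.sqrt(k))


# Hypothetical baseline: branch length f = 0.01, sequence length k = 100.
base = loci_scaling(f=0.01, k=100)

# Quadrupling the sequence length halves the scaling term.
ratio_longer_sequences = loci_scaling(f=0.01, k=400) / base

# Halving the branch length quadruples the scaling term.
ratio_shorter_branch = loci_scaling(f=0.005, k=100) / base

print(ratio_longer_sequences, ratio_shorter_branch)
```

Under this scaling, longer sequences reduce the number of genes needed only at a square-root rate, while shorter branches increase it quadratically.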
Source URL: