Quantifying constraint in the human mitochondrial genome

Document type:
Article
Authors:
Lake, Nicole J.; Ma, Kaiyue; Liu, Wei; Battle, Stephanie L.; Laricchia, Kristen M.; Tiao, Grace; Puiu, Daniela; Ng, Kenneth K.; Cohen, Justin; Compton, Alison G.; Cowie, Shannon; Christodoulou, John; Thorburn, David R.; Zhao, Hongyu; Arking, Dan E.; Sunyaev, Shamil R.; Lek, Monkol
Affiliations:
Yale University; Murdoch Children's Research Institute; Royal Children's Hospital Melbourne; Yale University; Johns Hopkins University; University System of Maryland; Bowie State University; Harvard University; Massachusetts Institute of Technology (MIT); Broad Institute; Harvard University; Harvard University Medical Affiliates; Massachusetts General Hospital; Johns Hopkins University; University of Melbourne; Murdoch Children's Research Institute; Victorian Clinical Genetics Services; Royal Children's Hospital Melbourne; Yale University; Harvard University; Harvard Medical School; Harvard University; Harvard Medical School; Harvard University; Harvard Medical School
Journal:
Nature
ISSN/ISBN:
0028-0836
DOI:
10.1038/s41586-024-08048-x
Publication date:
2024-11-14
Keywords:
dna mutations heteroplasmy bottleneck selection evolution sequence
Abstract:
Mitochondrial DNA (mtDNA) has an important yet often overlooked role in health and disease. Constraint models quantify the removal of deleterious variation from the population by selection and represent powerful tools for identifying genetic variation that underlies human phenotypes [1-4]. However, nuclear constraint models are not applicable to mtDNA, owing to its distinct features. Here we describe the development of a mitochondrial genome constraint model and its application to the Genome Aggregation Database (gnomAD), a large-scale population dataset that reports mtDNA variation across 56,434 human participants [5]. Specifically, we analyse constraint by comparing the observed variation in gnomAD to that expected under neutrality, which was calculated using an mtDNA mutational model and observed maximum heteroplasmy-level data. Our results highlight strong depletion of expected variation, which suggests that many deleterious mtDNA variants remain undetected. To aid their discovery, we compute constraint metrics for every mitochondrial protein, tRNA and rRNA gene, which revealed a range of intolerance to variation. We further characterize the most constrained regions within genes through regional constraint and identify the most constrained sites within the entire mitochondrial genome through local constraint, which showed enrichment of pathogenic variation. Constraint also clustered in three-dimensional structures, which provided insight into functionally important domains and their disease relevance. Notably, we identify constraint at often overlooked sites, including in rRNA and noncoding regions. Last, we demonstrate that these metrics can improve the discovery of deleterious variation that underlies rare and common phenotypes. A constraint model developed specifically for mitochondrial DNA and applied to data from the Genome Aggregation Database provides insights into which sites in the mitochondrial genome are important for health and disease.
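The comparison of observed variation with that expected under neutrality can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `observed_expected_ratio`, the input layout, and the toy numbers are all assumptions made for illustration, and the mutational model used to derive the expected values is not reproduced here. The sketch only shows the general idea that a ratio well below 1 indicates depletion of variation.

```python
from typing import Iterable


def observed_expected_ratio(
    observed_max_het: Iterable[float],
    expected_neutral: Iterable[float],
) -> float:
    """Return the observed/expected ratio for a set of possible variants.

    observed_max_het: hypothetical per-variant maximum heteroplasmy
        observed in a population dataset (0.0 if the variant is unseen).
    expected_neutral: hypothetical per-variant value expected under
        neutrality, e.g. as produced by some mutational model.
    """
    observed = sum(observed_max_het)
    expected = sum(expected_neutral)
    if expected <= 0:
        raise ValueError("expected variation must be positive")
    return observed / expected


# Toy values only (not gnomAD data): four possible variants in one region.
ratio = observed_expected_ratio(
    observed_max_het=[0.0, 0.0, 1.0, 0.5],
    expected_neutral=[0.5, 0.75, 1.0, 0.75],
)
print(f"observed/expected = {ratio:.2f}")
```

In this toy case the region shows half of the variation expected under neutrality. A real analysis would also need an uncertainty estimate around the ratio, which this sketch omits.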