Integrating gene annotation with orthology inference at scale
Publication type:
Article
Authors:
Kirilenko, Bogdan M.; Munegowda, Chetan; Osipova, Ekaterina; Jebb, David; Sharma, Virag; Blumer, Moritz; Morales, Ariadna E.; Ahmed, Alexis-Walid; Kontopoulos, Dimitrios-Georgios; Hilgers, Leon; Lindblad-Toh, Kerstin; Karlsson, Elinor K.; Hiller, Michael
Affiliations:
Max Planck Society; Leibniz Association; Senckenberg Gesellschaft für Naturforschung (SGN); Goethe University Frankfurt; Uppsala University; Harvard University; Massachusetts Institute of Technology (MIT); Broad Institute; University of Massachusetts System; University of Massachusetts Worcester; UMass Chan Medical School; University of Limerick
Journal:
SCIENCE
ISSN/ISBN:
0036-8075
DOI:
10.1126/science.abn3107
Publication date:
2023-04-28
Pages:
368-+
Keywords:
evolutionary
duplication
alignments
prediction
improves
database
genomes
eggnog
Abstract:
Annotating coding genes and inferring orthologs are two classical challenges in genomics and evolutionary biology that have traditionally been approached separately, limiting scalability. We present TOGA (Tool to infer Orthologs from Genome Alignments), a method that integrates structural gene annotation and orthology inference. TOGA implements a different paradigm to infer orthologous loci, improves ortholog detection and annotation of conserved genes compared with state-of-the-art methods, and handles even highly fragmented assemblies. TOGA scales to hundreds of genomes, which we demonstrate by applying it to 488 placental mammal and 501 bird assemblies, creating the largest comparative gene resources so far. Additionally, TOGA detects gene losses, enables selection screens, and automatically provides a superior measure of mammalian genome quality. TOGA is a powerful and scalable method to annotate and compare genes in the genomic era.