
TULIP: A transformer-based unsupervised language model for interacting peptides and T cell receptors that generalizes to unseen epitopes

Type:

Article

Authors:

Meynard-Piganeau, Barthelemy; Feinauer, Christoph; Weigt, Martin; Walczak, Aleksandra M.; Mora, Thierry

Affiliations:

Centre National de la Recherche Scientifique (CNRS); Sorbonne Universite; Bocconi University; Centre National de la Recherche Scientifique (CNRS); Universite Paris Cite; Universite PSL; Ecole Normale Superieure (ENS); Sorbonne Universite

Journal:

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA

ISSN/ISBN:

0027-8424

DOI:

10.1073/pnas.2316401121

Publication date:

2024-06-11

Keywords:

specificity; antigen; repertoire; activation

Abstract:

The accurate prediction of binding between T cell receptors (TCR) and their cognate epitopes is key to understanding the adaptive immune response and developing immunotherapies. Current methods face two significant limitations: the shortage of comprehensive high-quality data and the bias introduced by the selection of the negative training data commonly used in the supervised learning approaches. We propose a method, Transformer-based Unsupervised Language model for Interacting Peptides and T cell receptors (TULIP), that addresses both limitations by leveraging incomplete data and unsupervised learning and using the transformer architecture of language models. Our model is flexible and integrates all possible data sources, regardless of their quality or completeness. We demonstrate the existence of a bias introduced by the sampling procedure used in previous supervised approaches, emphasizing the need for an unsupervised approach. TULIP recognizes the specific TCRs binding an epitope, performing well on unseen epitopes. Our model outperforms state-of-the-art models and offers a promising direction for the development of more accurate TCR epitope recognition models.