A NOVEL FRAMEWORK TO QUANTIFY UNCERTAINTY IN PEPTIDE-TANDEM MASS SPECTRUM MATCHES WITH APPLICATION TO NANOBODY PEPTIDE IDENTIFICATION
Document type:
Article
Authors:
McKennan, Chris; Sang, Zhe; Shi, Yi
Affiliations:
Pennsylvania Commonwealth System of Higher Education (PCSHE); University of Pittsburgh; Icahn School of Medicine at Mount Sinai
Journal:
ANNALS OF APPLIED STATISTICS
ISSN/ISBN:
1932-6157
DOI:
10.1214/24-AOAS1975
Publication date:
2025
Pages:
614-636
Keywords:
nonparametric-estimation
SEQUENCES
versatile
VALUES
Abstract:
Nanobodies are small antibody fragments derived from camelids that selectively bind to antigens. These proteins have marked physicochemical properties that support advanced therapeutics, including possible treatments for SARS-CoV-2. To realize their potential, bottom-up proteomics via liquid chromatography-tandem mass spectrometry (LC-MS/MS) has been proposed to identify antigen-specific nanobodies at the proteome scale, where a critical component of this pipeline is matching nanobody peptides to their begotten tandem mass spectra. While peptide-spectrum matching is a well-studied problem, we show the sequence similarity between nanobody peptides violates key assumptions necessary to infer nanobody peptide-spectrum matches (PSMs) with the standard target-decoy paradigm and prove these violations beget inflated error rates. To address these issues, we develop a novel framework and method that treats peptide-spectrum matching as a Bayesian model selection problem with an incomplete model space, which are, to our knowledge, the first to account for all sources of PSM error without relying on the aforementioned assumptions. Our work also demonstrates how to leverage novel retention time and spectrum prediction tools to develop accurate and discriminating data-generating models and, to our knowledge, provides the first rigorous description of MS/MS spectrum noise. We illustrate our method's superior performance on simulated and real nanobody data.
Source URL: