A SPARSE NEGATIVE BINOMIAL CLASSIFIER WITH COVARIATE ADJUSTMENT FOR RNA-SEQ DATA

Document type:
Article
Authors:
Rahman, Tanbin; Huang, Hsin-En; Li, Yujia; Tai, An-Shun; Hsieh, Wen-Ping; McClung, Colleen A.; Tseng, George
Affiliations:
Pennsylvania Commonwealth System of Higher Education (PCSHE); University of Pittsburgh; National Tsing Hua University; Pennsylvania Commonwealth System of Higher Education (PCSHE); University of Pittsburgh
Journal:
ANNALS OF APPLIED STATISTICS
ISSN/ISBN:
1932-6157
DOI:
10.1214/21-AOAS1532
Publication date:
2022
Pages:
1071-1089
Keywords:
differential expression analysis; selection
Abstract:
Supervised machine learning methods are increasingly used in biomedical research and clinical practice. In transcriptomic applications, RNA-seq has become the dominant technology and has gradually replaced traditional microarrays, owing to its reduced background noise and increased digital precision. Most existing machine learning methods, however, are designed for the continuous intensities of microarrays and are not suitable for RNA-seq count data. In this paper we develop a negative binomial model, within a generalized linear model framework, with double regularization for gene and covariate sparsity. The model accommodates three key elements: adequate modeling of overdispersed count data, gene selection, and adjustment for covariate effects. The proposed sparse negative binomial classifier (snbClass) is evaluated in simulations and in two real applications, multidisease postmortem brain tissue RNA-seq data and cervical tumor miRNA-seq data, to demonstrate its superior performance in prediction accuracy and feature selection.
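To make the three elements named in the abstract concrete, the following is a minimal sketch of a penalized negative binomial objective. It is not the authors' snbClass implementation: this record does not give the exact penalty form, parameterization, or estimation procedure. The function names, the log-link mean `beta0[g] + beta_class[g][k] + gamma[g] * x`, the use of L1 penalties for both gene and covariate effects, the single scalar covariate, and the omission of library-size normalization are all assumptions made for illustration.

```python
import math


def nb_logpmf(y, mu, phi):
    """Log-probability of count y under a negative binomial with mean mu
    and dispersion phi, so that variance = mu + phi * mu**2."""
    r = 1.0 / phi  # "size" parameter of the negative binomial
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu))
            + y * math.log(mu / (r + mu)))


def penalized_neg_loglik(counts, labels, covariate,
                         beta0, beta_class, gamma, phi,
                         lam_gene, lam_cov):
    """Hypothetical objective: NB negative log-likelihood plus two L1 penalties.

    counts[i][g]     : read count of gene g in sample i
    labels[i]        : class index k of sample i
    covariate[i]     : one scalar covariate per sample (assumed)
    beta0[g]         : baseline log-mean of gene g
    beta_class[g][k] : class effect of gene g (penalized for gene sparsity)
    gamma[g]         : covariate effect on gene g (penalized for covariate sparsity)
    phi[g]           : dispersion of gene g
    """
    nll = 0.0
    for y_row, k, x in zip(counts, labels, covariate):
        for g, y in enumerate(y_row):
            # Log link: class effect and covariate effect are additive on the log scale.
            mu = math.exp(beta0[g] + beta_class[g][k] + gamma[g] * x)
            nll -= nb_logpmf(y, mu, phi[g])
    gene_penalty = lam_gene * sum(abs(b) for row in beta_class for b in row)
    cov_penalty = lam_cov * sum(abs(c) for c in gamma)
    return nll + gene_penalty + cov_penalty
```

In this sketch, a gene whose class effects are all shrunk to zero no longer discriminates between classes, which is one way an L1 penalty can act as gene selection; minimizing the objective would require an optimizer that is not shown here.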
Source URL: