-
Authors: Motwani, Keshav; Bacher, Rhonda; Molstad, Aaron J.
Affiliations: University of Washington; University of Washington Seattle; State University System of Florida; University of Florida
Abstract: Categorizing individual cells into one of many known cell-type categories, also known as cell-type annotation, is a critical step in the analysis of single-cell genomics data. The current process of annotation is time-intensive and subjective, which has led to different studies describing cell types with labels of varying degrees of resolution. While supervised learning approaches have provided automated solutions to annotation, there remains a significant challenge in fitting a unified mo...
-
Authors: Pal, Suvra; Aselisewine, Wisdom
Affiliations: University of Texas System; University of Texas Arlington
Abstract: The promotion time cure rate model (PCM) is an extensively studied model for the analysis of time-to-event data in the presence of a cured subgroup. There are several strategies proposed in the literature to model the latency part of PCM. However, there aren't many strategies proposed to investigate the effects of covariates on the incidence part of PCM. In this regard, most existing studies assume the boundary separating the cured and noncured subjects with respect to the covariates to be lin...
-
Authors: Chen, Xinyuan; Li, Yiwei; Feng, Xiangnan; Chang, Joseph T.
Affiliations: Mississippi State University; Lingnan University; Fudan University; Yale University
Abstract: Nonhomogeneous hidden Markov models (NHMMs) are useful in modeling sequential and autocorrelated data. Bayesian approaches, particularly Markov chain Monte Carlo (MCMC) methods, are principal statistical inference tools for NHMMs. However, MCMC sampling is computationally demanding, especially for long observation sequences. We develop a variational Bayes (VB) method for NHMMs, which utilizes a structured variational family of Gaussian distributions with factorized covariance matrices to a...
-
Authors: Peskoe, Sarah B.; Zhang, Ning; Spiegelman, Donna; Wang, Molin
Affiliations: Duke University; Harvard University; Harvard T.H. Chan School of Public Health; Yale University
Abstract: Researchers are often interested in estimating the effects of time-varying exposures on health outcomes. The latency period, defined as the critical period of susceptibility, can be an important component of exposure effect assessment. Although it is widely known that many environmental, nutritional, and other exposure measurements are prone to error and are also likely to act only during a critical time window of susceptibility, no one has yet considered the impact of this on the estimation...
-
Authors: Yao, Yujing; Ogden, R. Todd; Zeng, Chubing; Chen, Qixuan
Affiliations: Columbia University; University of Southern California
Abstract: It is often of interest to combine available estimates of a similar quantity from multiple data sources. When the corresponding variances of each estimate are also available, a model should take into account the uncertainty of the estimates themselves as well as the uncertainty in the estimation of variances. In addition, if there exists a strong association between estimates and their variances, the correlation between these two quantities should also be considered. In this paper we propose...
-
Authors: Li, Yan; Chen, Kun; Yan, Jun; Zhang, Xuebin
Affiliations: University of Connecticut; Environment & Climate Change Canada
Abstract: Detection and attribution analyses play a central role in establishing the causal effect of human activities on global warming. The most commonly used method in such analyses, optimal fingerprinting, is a multiple regression where each covariate has a measurement error whose covariance matrix is the same as that of the regression error up to a known scale. Inferences about the regression coefficients are critical not only for making statements about detection and attribution but also for quant...
-
Authors: Pedone, Matteo; Amedei, Amedeo; Stingo, Francesco C.
Affiliations: University of Florence
Abstract: Many environments within the human body host a collection of microorganisms called microbiota. Recent findings have linked the composition of the microbiota to the development of different human diseases, including cancer. Motivated by a recent colorectal cancer (CRC) study, we investigate the effect of clinical factors and diet-related covariates on the microbiota compositions; for the patients enrolled in this study, microbiota abundance counts are collected from three different distri...
-
Authors: Sun, Hong; Xu, Maochao; Zhao, Peng
Affiliations: Lanzhou University; Illinois State University; Jiangsu Normal University
Abstract: Data breaches in healthcare have become a substantial concern in recent years and cause millions of dollars in financial losses each year. It is fundamental for government regulators, insurance companies, and stakeholders to understand the breach frequency and the number of affected individuals in each state, as these are directly related to the federal Health Insurance Portability and Accountability Act (HIPAA) and state data breach laws. However, an obstacle to studying data breaches in heal...
-
Authors: Dempsey, Walter
Affiliations: University of Michigan System; University of Michigan
Abstract: Coronavirus case-count data has influenced government policies and drives most epidemiological forecasts. Limited testing is cited as the key driver behind minimal information on the COVID-19 pandemic. While expanded testing is laudable, measurement error and selection bias are the two greatest problems limiting our understanding of the COVID-19 pandemic; neither can be fully addressed by increased testing capacity. In this paper we demonstrate their impact on estimation of point prevalence an...
-
Authors: Elmasri, Mohamad; Labbe, Aurelie; Larocque, Denis; Charlin, Laurent
Affiliations: University of Toronto; Universite de Montreal; HEC Montreal
Abstract: Recent statistical methods fitted on large-scale GPS data can provide accurate estimations of the expected travel time between two points. However, little is known about the distribution of travel time, which is key to decision-making across a number of logistic problems. With sufficient data, single road-segment travel time can be well approximated. The challenge lies in understanding how to aggregate such information over a route to arrive at the route-distribution of travel time. We develop...