-
Authors: Lei, Jing; G'Sell, Max; Rinaldo, Alessandro; Tibshirani, Ryan J.; Wasserman, Larry
Affiliations: Carnegie Mellon University
Abstract: We develop a general framework for distribution-free predictive inference in regression, using conformal inference. The proposed methodology allows for the construction of a prediction band for the response variable using any estimator of the regression function. The resulting prediction band preserves the consistency properties of the original estimator under standard assumptions, while guaranteeing finite-sample marginal coverage even when these assumptions do not hold. We analyze and compar...
-
Authors: Sarkar, Abhra; Chabout, Jonathan; Macopson, Joshua Jones; Jarvis, Erich D.; Dunson, David B.
Affiliations: University of Texas System; University of Texas Austin; Howard Hughes Medical Institute; Rockefeller University; Duke University
Abstract: Studying the neurological, genetic, and evolutionary basis of human vocal communication mechanisms using animal vocalization models is an important field of neuroscience. The datasets typically comprise structured sequences of syllables or songs produced by animals from different genotypes under different social contexts. It has been difficult to come up with sophisticated statistical methods that appropriately model animal vocal communication syntax. We address this need by developing a novel...
-
Authors: Sun, BaoLuo; Tchetgen Tchetgen, Eric J.
Affiliations: Harvard University; Harvard T.H. Chan School of Public Health
Abstract: The development of coherent missing data models to account for nonmonotone missing at random (MAR) data by inverse probability weighting (IPW) remains to date largely unresolved. As a consequence, IPW has essentially been restricted for use only in monotone MAR settings. We propose a class of models for nonmonotone missing data mechanisms that spans the MAR model, while allowing the underlying full data law to remain unrestricted. For parametric specifications within the proposed class, we int...
-
Authors: Swihart, Bruce J.; Fay, Michael P.; Miura, Kazutoyo
Affiliations: National Institutes of Health (NIH) - USA; NIH National Institute of Allergy & Infectious Diseases (NIAID)
Abstract: Transmission blocking vaccines for malaria are not designed to directly protect vaccinated people from malaria disease, but to reduce the probability of infecting other people by interfering with the growth of the malaria parasite in mosquitoes. Standard membrane-feeding assays compare the growth of parasites in mosquitoes from a test sample (using antibodies from a vaccinated person) with that in a control sample. There is debate about whether to estimate the transmission reducing activity (TR...
-
Authors: Kim, Sungduk; Albert, Paul S.
Affiliations: National Institutes of Health (NIH) - USA; NIH National Cancer Institute (NCI); NIH National Cancer Institute - Division of Cancer Epidemiology & Genetics
Abstract: Many researchers in biology and medicine have focused on trying to understand biological rhythms and their potential impact on disease. A common biological rhythm is circadian, where the cycle repeats itself every 24 hours. However, a disturbance of the circadian pattern may be indicative of future disease. In this article, we develop new statistical methodology for assessing the degree of disturbance or irregularity in a circadian pattern for count sequences that are observed over time in a p...
-
Authors: Shi, Chengchun; Lu, Wenbin; Song, Rui
Affiliations: North Carolina State University
Abstract: The divide and conquer method is a common strategy for handling massive data. In this article, we study the divide and conquer method for cubic-rate estimators under the massive data framework. We develop a general theory for establishing the asymptotic distribution of the aggregated M-estimators using a weighted average with weights depending on the subgroup sample sizes. Under a certain condition on the growth rate of the number of subgroups, the resulting aggregated estimators are shown to h...
-
Authors: Wager, Stefan; Athey, Susan
Affiliations: Stanford University
Abstract: Many scientific and engineering challenges, ranging from personalized medicine to customized marketing recommendations, require an understanding of treatment effect heterogeneity. In this article, we develop a nonparametric causal forest for estimating heterogeneous treatment effects that extends Breiman's widely used random forest algorithm. In the potential outcomes framework with unconfoundedness, we show that causal forests are pointwise consistent for the true treatment effect and have an asy...
-
Authors: Zhao, Qingyuan; Small, Dylan S.; Rosenbaum, Paul R.
Affiliations: University of Pennsylvania
Abstract: We discuss observational studies that test many causal hypotheses, either hypotheses about many outcomes or many treatments. To be credible, an observational study that tests many causal hypotheses must demonstrate that its conclusions are neither artifacts of multiple testing nor of small biases from nonrandom treatment assignment. In a sense that needs to be defined carefully, hidden within a sensitivity analysis for nonrandom assignment is an enormous correction for multiple testing: In the ...
-
Authors: Backenroth, Daniel; Goldsmith, Jeff; Harran, Michelle D.; Cortes, Juan C.; Krakauer, John W.; Kitago, Tomoko
Affiliations: Columbia University; Johns Hopkins University
Abstract: We propose a novel method for estimating population-level and subject-specific effects of covariates on the variability of functional data. We extend the functional principal components analysis framework by modeling the variance of principal component scores as a function of covariates and subject-specific random effects. In a setting where principal components are largely invariant across subjects and covariate values, modeling the variance of these scores provides a flexible and interpretab...
-
Authors: Chen, Xi; Irie, Kaoru; Banks, David; Haslinger, Robert; Thomas, Jewell; West, Mike
Affiliations: Duke University; University of Tokyo
Abstract: Traffic flow count data in networks arise in many applications, such as automobile or aviation transportation, certain directed social network contexts, and Internet studies. Using an example of Internet browser traffic flow through site-segments of an international news website, we present Bayesian analyses of two linked classes of models which, in tandem, allow fast, scalable, and interpretable Bayesian inference. We first develop flexible state-space models for streaming count data, able to...