-
Authors: Oates, Chris J.; Karvonen, Toni; Teckentrup, Aretha L.; Strocchi, Marina; Niederer, Steven A.
Affiliations: Newcastle University - UK; Lappeenranta-Lahti University of Technology LUT; University of Helsinki; University of Edinburgh; Heriot Watt University; University of Edinburgh; Imperial College London; University of London; King's College London
Abstract: For over a century, extrapolation methods have provided a powerful tool to improve the convergence order of a numerical method. However, these tools are not well-suited to modern computer codes, where multiple continua are discretized and convergence orders are not easily analysed. To address this challenge, we present a probabilistic perspective on Richardson extrapolation, a point of view that unifies classical extrapolation methods with modern multi-fidelity modelling, and handles uncertain...
-
Authors: Hore, Rohan; Barber, Rina Foygel
Affiliations: University of Chicago
Abstract: In this work, we consider the problem of building distribution-free prediction intervals with finite-sample conditional coverage guarantees. Conformal prediction (CP) is an increasingly popular framework for building such intervals with distribution-free guarantees, but these guarantees only ensure marginal coverage: the probability of coverage is averaged over both the training and test data, meaning that there might be substantial undercoverage within certain subpopulations. Instead, ideally...
-
Authors: Dai, Xiaowu
Affiliations: University of California System; University of California Los Angeles
Abstract: Traditional nonparametric estimation methods often lead to a slow convergence rate in large dimensions and require unrealistically large dataset sizes for reliable conclusions. We develop an approach based on partial derivatives, either observed or estimated, to effectively estimate the function at near-parametric convergence rates. This novel approach and computational algorithm could lead to methods useful to practitioners in many areas of science and engineering. Our theoretical results rev...
-
Authors: Miao, Wang; Li, Xinyu; Zhang, Ping; Sun, Baoluo
Affiliations: Peking University; National University of Singapore
Abstract: Nonresponse arises frequently in surveys, and follow-ups are routinely made to increase the response rate. In order to monitor the follow-up process, callback data have been used in social sciences and survey studies for decades. In modern surveys, the availability of callback data is increasing because the response rate is decreasing, and follow-ups are essential to collect maximum information. Although callback data are helpful to reduce the bias in surveys, such data have not been widely us...
-
Authors: Imbens, Guido; Kallus, Nathan; Mao, Xiaojie; Wang, Yuhao
Affiliations: Stanford University; Cornell University; Tsinghua University; Tsinghua University; Shanghai Qi Zhi Institute
Abstract: We study the identification and estimation of long-term treatment effects by combining short-term experimental data and long-term observational data subject to unobserved confounding. This problem arises often when concerned with long-term treatment effects since experiments are often short-term due to operational necessity while observational data can be more easily collected over longer time frames but may be subject to confounding. In this paper, we tackle the challenge of persistent confou...
-
Authors: Jiang, Binyan; Lv, Jing; Li, Jialiang; Cheng, Ming-Yen
Affiliations: Hong Kong Polytechnic University; Southwest University - China; National University of Singapore; Hong Kong Baptist University
Abstract: Model averaging is an attractive ensemble technique to construct fast and accurate predictions. Despite having been widely practiced in cross-sectional data analysis, its application to longitudinal data is rather limited so far. We consider model averaging for longitudinal response when the number of covariates is ultrahigh. To this end, we propose a novel two-stage procedure in which variable screening is first conducted and then followed by model averaging. In both stages, a robust rank-b...
-
Authors: Li, Sai; Ye, Ting
Affiliations: Renmin University of China; University of Washington; University of Washington Seattle
Abstract: Mendelian randomization (MR) is a powerful method that uses genetic variants as instrumental variables to infer the causal effect of a modifiable exposure on an outcome. We study inference for bi-directional causal relationships and causal directions with possibly pleiotropic genetic variants. We show that assumptions for common MR methods are often impossible or too stringent given the potential bi-directional relationships. We propose a new focusing framework for testing bi-directional causa...
-
Authors: Koenig, Claudia; Munk, Axel; Werner, Frank
Affiliations: University of Gottingen; University of Gottingen; University of Wurzburg
Abstract: We develop a multiscale scanning method to find anomalies in a d-dimensional random field in the presence of nuisance parameters. This covers the common situation that either the baseline-level or additional parameters such as the variance are unknown and have to be estimated from the data. We argue that state-of-the-art approaches to determine asymptotically correct critical values for multiscale scanning statistics will in general fail when such parameters are naively replaced by plug-in est...
-
Authors: Bellec, Pierre C.; Du, Jin-Hong; Koriyama, Takuya; Patil, Pratik; Tan, Kai
Affiliations: Rutgers University System; Rutgers University New Brunswick; Carnegie Mellon University; Carnegie Mellon University; University of Chicago; University of California System; University of California Berkeley
Abstract: Generalized cross-validation (GCV) is a widely used method for estimating the squared out-of-sample prediction risk that employs scalar degrees of freedom adjustment (in a multiplicative sense) to the squared training error. In this paper, we examine the consistency of GCV for estimating the prediction risk of arbitrary ensembles of penalized least-squares estimators. We show that GCV is inconsistent for any finite ensemble of size greater than one. Towards repairing this shortcoming, we ident...
-
Authors: Kallus, Nathan; Mao, Xiaojie
Affiliations: Cornell University; Tsinghua University
Abstract: In many experimental and observational studies, the outcome of interest is often difficult or expensive to observe, reducing effective sample sizes for estimating average treatment effects (ATEs) even when identifiable. We study how incorporating data on units for which only surrogate outcomes not of primary interest are observed can increase the precision of ATE estimation. We refrain from imposing stringent surrogacy conditions, which permit surrogates as perfect replacements for the target ...