-
Authors: Wang, Xiaoqin; Yin, Li
Affiliations: University of Gavle; Karolinska Institutet
Abstract: In sequential causal inference, two types of causal effects are of practical interest, namely, the causal effect of the treatment regime (called the sequential causal effect) and the blip effect of treatment on the potential outcome after the last treatment. The well-known G-formula expresses these causal effects in terms of the standard parameters. In this article, we obtain a new G-formula that expresses these causal effects in terms of the point observable effects of treatments similar to t...
-
Authors: Rockova, Veronika; Van der Pas, Stephanie
Affiliations: University of Chicago; Leiden University - Excl LUMC; Leiden University
Abstract: Since their inception in the 1980s, regression trees have been one of the more widely used nonparametric prediction methods. Tree-structured methods yield a histogram reconstruction of the regression surface, where the bins correspond to terminal nodes of recursive partitioning. Trees are powerful, yet susceptible to overfitting. Strategies against overfitting have traditionally relied on pruning greedily grown trees. The Bayesian framework offers an alternative remedy against overfitting thro...
-
Authors: Zhao, Qingyuan; Wang, Jingshu; Hemani, Gibran; Bowden, Jack; Small, Dylan S.
Affiliations: University of Cambridge; University of Chicago; University of Bristol; University of Exeter; University of Pennsylvania
Abstract: Mendelian randomization (MR) is a method of exploiting genetic variation to unbiasedly estimate a causal effect in the presence of unmeasured confounding. MR is being widely used in epidemiology and other related areas of population science. In this paper, we study statistical inference in the increasingly popular two-sample summary-data MR design. We show that a linear model for the observed associations approximately holds in a wide variety of settings when all the genetic variants satisfy the exclus...
-
Authors: Chong, Carsten
Affiliations: Swiss Federal Institutes of Technology Domain; Ecole Polytechnique Federale de Lausanne
Abstract: We consider the problem of estimating stochastic volatility for a class of second-order parabolic stochastic PDEs. Assuming that the solution is observed at high temporal frequency, we use limit theorems for multipower variations and related functionals to construct consistent nonparametric estimators and asymptotic confidence bounds for the integrated volatility process. As a byproduct of our analysis, we also obtain feasible estimators for the regularity of the spatial covariance function of...
-
Authors: Hopkins, Samuel B.
Affiliations: University of California System; University of California Berkeley
Abstract: We study polynomial time algorithms for estimating the mean of a heavy-tailed multivariate random vector. We assume only that the random vector X has finite mean and covariance. In this setting, the radius of confidence intervals achieved by the empirical mean is large compared to the case that X is Gaussian or sub-Gaussian. We offer the first polynomial time algorithm to estimate the mean with sub-Gaussian-size confidence intervals under such mild assumptions. Our algorithm is based on a new...
-
Authors: Lecue, Guillaume; Lerasle, Matthieu
Affiliations: Institut Polytechnique de Paris; ENSAE Paris; Ecole Polytechnique; Universite Paris Saclay; Centre National de la Recherche Scientifique (CNRS)
Abstract: Median-of-means (MOM) based procedures have been recently introduced in learning theory (Lugosi and Mendelson (2019); Lecue and Lerasle (2017)). These estimators outperform classical least-squares estimators when data are heavy-tailed and/or are corrupted. None of these procedures can be implemented, which is the major issue of current MOM procedures (Ann. Statist. 47 (2019) 783-794). In this paper, we introduce minmax MOM estimators and show that they achieve the same sub-Gaussian deviation b...
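For readers unfamiliar with the building block named in this abstract, the following is a minimal sketch of the basic univariate median-of-means estimate of a mean. It is not the minmax MOM estimator introduced in the paper; the function name `median_of_means` and the parameters `n_blocks` and `seed` are hypothetical names chosen for illustration.

```python
import random
import statistics


def median_of_means(xs, n_blocks, seed=0):
    """Basic median-of-means estimate of the mean of xs (illustrative only).

    Assumption: this is the textbook univariate construction, not the
    minmax MOM procedure described in the abstract above.
    """
    if not 1 <= n_blocks <= len(xs):
        raise ValueError("n_blocks must be between 1 and len(xs)")
    # Shuffle a copy so that block membership does not depend on input order.
    rng = random.Random(seed)
    shuffled = list(xs)
    rng.shuffle(shuffled)
    # Split the sample into n_blocks disjoint blocks of nearly equal size.
    blocks = [shuffled[i::n_blocks] for i in range(n_blocks)]
    # Average within each block, then take the median across blocks.
    block_means = [statistics.fmean(block) for block in blocks]
    return statistics.median(block_means)


if __name__ == "__main__":
    # 99 clean observations plus one gross outlier.
    data = [1.0] * 99 + [1e6]
    print("empirical mean: ", statistics.fmean(data))
    print("median of means:", median_of_means(data, n_blocks=5))
```

With a single outlier and five blocks, at most one block mean is contaminated, so the median across blocks is unaffected, whereas the empirical mean is pulled far from 1.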
-
Authors: Schweinberger, Michael; Stewart, Jonathan
Affiliations: Rice University
Abstract: Statistical inference for exponential-family models of random graphs with dependent edges is challenging. We stress the importance of additional structure and show that additional structure facilitates statistical inference. A simple example of a random graph with additional structure is a random graph with neighborhoods and local dependence within neighborhoods. We develop the first concentration and consistency results for maximum likelihood and M-estimators of a wide range of canonical and ...
-
Authors: Comte, Fabienne; Genon-Catalot, Valentine
Affiliations: Universite Paris Cite; Centre National de la Recherche Scientifique (CNRS); CNRS - National Institute for Mathematical Sciences (INSMI)
Abstract: We consider N independent stochastic processes $(X_i(t), t \in [0, T])$, $i = 1, \ldots, N$, defined by a one-dimensional stochastic differential equation, which are continuously observed throughout a time interval $[0, T]$ where T is fixed. We study nonparametric estimation of the drift function on a given subset A of $\mathbb{R}$. Projection estimators are defined on finite dimensional subsets of $L^2(A, dx)$. We stress that the set A may be compact or not and the diffusion coefficient may be bounde...
-
Authors: Kuchibhotla, Arun K.; Brown, Lawrence D.; Buja, Andreas; Cai, Junhui; George, Edward I.; Zhao, Linda H.
Affiliations: University of Pennsylvania
Abstract: Modern data-driven approaches to modeling make extensive use of covariate/model selection. Such selection incurs a cost: it invalidates classical statistical inference. A conservative remedy to the problem was proposed by Berk et al. (Ann. Statist. 41 (2013) 802-837) and further extended by Bachoc, Preinerstorfer and Steinberger (2016). These proposals, labeled PoSI methods, provide valid inference after arbitrary model selection. They are computationally NP-hard and have limitations in their...
-
Authors: Mourtada, Jaouad; Gaiffas, Stephane; Scornet, Erwan
Affiliations: Institut Polytechnique de Paris; Ecole Polytechnique; Universite Paris Cite
Abstract: Introduced by Breiman (Mach. Learn. 45 (2001) 5-32), Random Forests are widely used classification and regression algorithms. While being initially designed as batch algorithms, several variants have been proposed to handle online learning. One particular instance of such forests is the Mondrian forest (In Adv. Neural Inf. Process. Syst. (2014) 3140-3148; In Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS) (2016)), whose trees are built usin...