-
Authors: Buecher, Axel; Pakzad, Cambyse
Affiliations: Ruhr University Bochum; Inria; Centre National de la Recherche Scientifique (CNRS); Communaute Universite Grenoble Alpes; Institut National Polytechnique de Grenoble; Universite Grenoble Alpes (UGA)
Abstract: Testing for pairwise independence when the number of variables may be of the same size as, or even larger than, the sample size has received increasing attention in recent years. We contribute to this branch of the literature by considering tests that can detect higher-order dependencies. The proposed methods are based on connecting the problem to copulas and making use of the Moebius transformation of the empirical copula process; an approach that is related to Lancaster int...
-
Authors: Nickl, Richard; Titi, Edriss S.
Affiliations: University of Cambridge; University of Cambridge
Abstract: We consider a nonlinear Bayesian data assimilation model for the periodic two-dimensional Navier-Stokes equations with initial condition modelled by a Gaussian process prior. We show that if the system is updated with sufficiently many discrete noisy measurements of the velocity field, then the posterior distribution eventually concentrates near the ground truth solution of the time evolution equation, and in particular that the initial condition is recovered consistently by the posterior mean...
-
Authors: Zhang, Zhengxin; Goldfeld, Ziv; Mroueh, Youssef; Sriperumbudur, Bharath K.
Affiliations: Cornell University; Cornell University; International Business Machines (IBM); IBM USA; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park
Abstract: The Gromov-Wasserstein (GW) distance, rooted in optimal transport (OT) theory, quantifies dissimilarity between metric measure spaces and provides a framework for aligning heterogeneous datasets. While computational aspects of the GW problem have been widely studied, a duality theory and fundamental statistical questions concerning empirical convergence rates remained obscure. This work closes these gaps for the quadratic GW distance over Euclidean spaces of different dimensions $d_x$ and $d_y$....
-
Authors: Hagrass, Omar; Sriperumbudur, Bharath K.; Li, Bing
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park
Abstract: Over the last decade, an approach that has gained a lot of popularity to tackle nonparametric testing problems on general (i.e., non-Euclidean) domains is based on the notion of reproducing kernel Hilbert space (RKHS) embedding of probability distributions. The main goal of our work is to understand the optimality of two-sample tests constructed based on this approach. First, we show the popular MMD (maximum mean discrepancy) two-sample test to be not optimal in terms of the separation boundary measured in Hellinger distance. Second, we propose a modification to the...
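For readers unfamiliar with the MMD two-sample test mentioned in this abstract, the following is a minimal sketch of the plain (unregularized) statistic with a permutation p-value. It is not the modified test the authors propose. The Gaussian kernel, the scalar data, the bandwidth, and all function names are assumptions chosen for illustration.

```python
import random
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel on scalars (kernel choice is an assumption)."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, bandwidth=1.0):
    """Unbiased U-statistic estimate of the squared MMD between two samples."""
    n, m = len(xs), len(ys)
    # Within-sample averages exclude the diagonal terms i == j.
    kxx = sum(gaussian_kernel(xs[i], xs[j], bandwidth)
              for i in range(n) for j in range(n) if i != j) / (n * (n - 1))
    kyy = sum(gaussian_kernel(ys[i], ys[j], bandwidth)
              for i in range(m) for j in range(m) if i != j) / (m * (m - 1))
    kxy = sum(gaussian_kernel(x, y, bandwidth)
              for x in xs for y in ys) / (n * m)
    return kxx + kyy - 2.0 * kxy


def permutation_p_value(xs, ys, num_permutations=100, bandwidth=1.0, seed=0):
    """Permutation p-value: reshuffle the pooled sample and recompute the statistic."""
    rng = random.Random(seed)
    observed = mmd2_unbiased(xs, ys, bandwidth)
    pooled = list(xs) + list(ys)
    n = len(xs)
    exceed = 0
    for _ in range(num_permutations):
        rng.shuffle(pooled)
        if mmd2_unbiased(pooled[:n], pooled[n:], bandwidth) >= observed:
            exceed += 1
    return (1 + exceed) / (1 + num_permutations)
```

Calling `permutation_p_value(xs, ys)` on two lists of floats returns a value in (0, 1]; small values suggest the two samples come from different distributions.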
-
Authors: Altmeyer, Randolf; Tiepner, Anton; Wahl, Martin
Affiliations: University of Cambridge; Aarhus University; University of Bielefeld
Abstract: The coefficients in a second-order parabolic linear stochastic partial differential equation (SPDE) are estimated from multiple spatially localised measurements. Assuming that the spatial resolution tends to zero and the number of measurements is nondecreasing, the rate of convergence for each coefficient depends on its differential order and is faster for higher-order coefficients. Based on an explicit analysis of the reproducing kernel Hilbert space of a general stochastic evolution equation...
-
Authors: Doss, Charles R.; Weng, Guangwei; Wang, Lan; Moscovice, Ira; Chantarat, Tongtan
Affiliations: University of Minnesota System; University of Minnesota Twin Cities; University of Miami; University of Minnesota System; University of Minnesota Twin Cities
Abstract: The vast majority of literature on evaluating the significance of a treatment effect based on observational data has been confined to discrete treatments. These methods are not applicable to drawing inference for a continuous treatment, which arises in many important applications. To adjust for confounders when evaluating a continuous treatment, existing inference methods often rely on discretizing the treatment or using (possibly misspecified) parametric models for the effect curve. Recently,...
-
Authors: Thepaut, Solene; Verzelen, Nicolas
Affiliations: Safran S.A.; INRAE; Universite de Montpellier
Abstract: We consider the twin problems of estimating the effective rank and the Schatten norms $\|A\|_s$ of a rectangular $p \times q$ matrix $A$ from noisy observations. When $s$ is an even integer, we introduce a polynomial-time estimator of $\|A\|_s$ that achieves the minimax rate $(pq)^{1/4}$. Interestingly, this optimal rate does not depend on the underlying rank of the matrix $A$. When $s$ is not an even integer, the optimal rate is much slower. A simple thresholding estimator of the singular values achieves the rate (q boo...
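As background on why even integers are a special case, recall that the Schatten s-norm is the l_s norm of the singular values, so for even s its s-th power equals the trace of (A^T A)^(s/2), a polynomial in the entries of A. The sketch below computes this quantity for a noiseless matrix. It is not the authors' estimator, which has to handle noisy observations, and the function names are hypothetical.

```python
def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    """Plain triple-loop matrix product for lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def schatten_norm_even(a, s):
    """Schatten s-norm for even s via trace((A^T A)^(s/2)); no SVD needed."""
    if s <= 0 or s % 2 != 0:
        raise ValueError("this shortcut only applies to positive even s")
    gram = matmul(transpose(a), a)      # q x q Gram matrix A^T A
    power = gram
    for _ in range(s // 2 - 1):         # raise the Gram matrix to the power s/2
        power = matmul(power, gram)
    trace = sum(power[i][i] for i in range(len(power)))
    return trace ** (1.0 / s)
```

For s = 2 this reduces to the Frobenius norm; for odd or non-integer s no such polynomial identity is available, which is consistent with the abstract's remark that those cases are harder.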
-
Authors: Ki, Dohyeong; Fang, Billy; Guntuboyina, Adityanand
Affiliations: University of California System; University of California Berkeley; Alphabet Inc.; Google Incorporated
Abstract: Multivariate adaptive regression splines (MARS) is a popular method for nonparametric regression introduced by Friedman in 1991. MARS fits simple nonlinear and non-additive functions to regression data. We propose and study a natural lasso variant of the MARS method. Our method is based on least squares estimation over a convex class of functions obtained by considering infinite-dimensional linear combinations of functions in the MARS basis and imposing a variation-based complexity constraint....
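As a reminder of what the MARS basis looks like, the sketch below evaluates products of hinge functions of the form max(0, x_j - t) or max(0, t - x_j), which is the basis used in Friedman's original method. The function names and the tuple encoding of a basis function are hypothetical, and the lasso variant studied in the paper is not implemented here.

```python
def hinge(x, knot, sign=1):
    """Hinge function: max(0, x - knot) for sign=1, max(0, knot - x) for sign=-1."""
    return max(0.0, sign * (x - knot))


def mars_basis_function(point, terms):
    """Evaluate a product of hinges at a point.

    `terms` is a list of (coordinate_index, knot, sign) tuples; an empty
    list gives the constant function 1.
    """
    value = 1.0
    for index, knot, sign in terms:
        value *= hinge(point[index], knot, sign)
    return value
```

A product over two different coordinates, such as `[(0, 1.0, 1), (1, 1.0, 1)]`, gives an interaction term, which is how MARS represents non-additive structure.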
-
Authors: Pilipovic, Predrag; Samson, Adeline; Ditlevsen, Susanne
Affiliations: University of Copenhagen; Centre National de la Recherche Scientifique (CNRS); Communaute Universite Grenoble Alpes; Institut National Polytechnique de Grenoble; Universite Grenoble Alpes (UGA)
Abstract: The likelihood functions for discretely observed nonlinear continuous-time models based on stochastic differential equations are not available except for a few cases. Various parameter estimation techniques have been proposed, each with advantages, disadvantages and limitations depending on the application. Most applications still use the Euler-Maruyama discretization, despite many proofs of its bias. More sophisticated methods, such as Hermite expansions or MCMC methods, might be complex to i...
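To make the Euler-Maruyama baseline concrete, here is a minimal sketch for a scalar SDE dX = b(X) dt + sigma dW with constant diffusion. It simulates a path and evaluates the Gaussian pseudo-likelihood implied by the discretization. The scalar setting, the constant sigma, and the function names are assumptions for illustration; this is the biased baseline the abstract refers to, not the estimator the authors propose.

```python
import math
import random


def euler_maruyama_path(drift, sigma, x0, step, n_steps, rng):
    """Simulate X_{k+1} = X_k + b(X_k) * h + sigma * sqrt(h) * Z_k with Z_k ~ N(0, 1)."""
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        noise = rng.gauss(0.0, 1.0)
        path.append(x + drift(x) * step + sigma * math.sqrt(step) * noise)
    return path


def euler_log_likelihood(path, drift, sigma, step):
    """Pseudo-log-likelihood: each transition is treated as N(x + b(x) h, sigma^2 h)."""
    var = sigma ** 2 * step
    total = 0.0
    for x_prev, x_next in zip(path, path[1:]):
        mean = x_prev + drift(x_prev) * step
        total += -0.5 * math.log(2.0 * math.pi * var) - (x_next - mean) ** 2 / (2.0 * var)
    return total
```

Maximizing `euler_log_likelihood` over the parameters of `drift` gives the Euler-Maruyama estimator; the Gaussian transition is only an approximation for a nonlinear drift, which is the source of the bias mentioned in the abstract.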
-
Authors: Lundborg, Anton Rask; Kim, Ilmun; Shah, Rajen D.; Samworth, Richard J.
Affiliations: University of Copenhagen; Yonsei University; University of Cambridge
Abstract: Testing the significance of a variable or group of variables X for predicting a response Y, given additional covariates Z, is a ubiquitous task in statistics. A simple but common approach is to specify a linear model, and then test whether the regression coefficient for X is nonzero. However, when the model is misspecified, the test may have poor power, for example, when X is involved in complex interactions, or lead to many false rejections. In this work, we study the problem of testing the m...