-
Authors: Zhang, Zhengxin; Goldfeld, Ziv; Mroueh, Youssef; Sriperumbudur, Bharath K.
Affiliations: Cornell University; Cornell University; International Business Machines (IBM); IBM USA; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park
Abstract: The Gromov-Wasserstein (GW) distance, rooted in optimal transport (OT) theory, quantifies dissimilarity between metric measure spaces and provides a framework for aligning heterogeneous datasets. While computational aspects of the GW problem have been widely studied, a duality theory and fundamental statistical questions concerning empirical convergence rates remained obscure. This work closes these gaps for the quadratic GW distance over Euclidean spaces of different dimensions $d_x$ and $d_y$....
-
Authors: Hagrass, Omar; Sriperumbudur, Bharath K.; Li, Bing
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park
Abstract: Over the last decade, an approach that has gained a lot of popularity to tackle nonparametric testing problems on general (i.e., non-Euclidean) domains is based on the notion of reproducing kernel Hilbert space (RKHS) embedding of probability distributions. The main goal of our work is to understand the optimality of two-sample tests constructed based on this approach. First, we show the popular MMD (maximum mean discrepancy) two-sample test to be not optimal in terms of the separation boundary measured in Hellinger distance. Second, we propose a modification to the...
-
Authors: Altmeyer, Randolf; Tiepner, Anton; Wahl, Martin
Affiliations: University of Cambridge; Aarhus University; University of Bielefeld
Abstract: The coefficients in a second order parabolic linear stochastic partial differential equation (SPDE) are estimated from multiple spatially localised measurements. Assuming that the spatial resolution tends to zero and the number of measurements is nondecreasing, the rate of convergence for each coefficient depends on its differential order and is faster for higher order coefficients. Based on an explicit analysis of the reproducing kernel Hilbert space of a general stochastic evolution equation...
-
Authors: Doss, Charles R.; Weng, Guangwei; Wang, Lan; Moscovice, Ira; Chantarat, Tongtan
Affiliations: University of Minnesota System; University of Minnesota Twin Cities; University of Miami; University of Minnesota System; University of Minnesota Twin Cities
Abstract: The vast majority of literature on evaluating the significance of a treatment effect based on observational data has been confined to discrete treatments. These methods are not applicable to drawing inference for a continuous treatment, which arises in many important applications. To adjust for confounders when evaluating a continuous treatment, existing inference methods often rely on discretizing the treatment or using (possibly misspecified) parametric models for the effect curve. Recently,...
-
Authors: Thepaut, Solene; Verzelen, Nicolas
Affiliations: Safran S.A.; INRAE; Universite de Montpellier
Abstract: We consider the twin problems of estimating the effective rank and the Schatten norms $\|A\|_s$ of a rectangular $p \times q$ matrix $A$ from noisy observations. When $s$ is an even integer, we introduce a polynomial-time estimator of $\|A\|_s$ that achieves the minimax rate $(pq)^{1/4}$. Interestingly, this optimal rate does not depend on the underlying rank of the matrix $A$. When $s$ is not an even integer, the optimal rate is much slower. A simple thresholding estimator of the singular values achieves the rate (q boo...
-
Authors: Ki, Dohyeong; Fang, Billy; Guntuboyina, Adityanand
Affiliations: University of California System; University of California Berkeley; Alphabet Inc.; Google Incorporated
Abstract: Multivariate adaptive regression splines (MARS) is a popular method for nonparametric regression introduced by Friedman in 1991. MARS fits simple nonlinear and non-additive functions to regression data. We propose and study a natural lasso variant of the MARS method. Our method is based on least squares estimation over a convex class of functions obtained by considering infinite-dimensional linear combinations of functions in the MARS basis and imposing a variation-based complexity constraint....
-
Authors: Pilipovic, Predrag; Samson, Adeline; Ditlevsen, Susanne
Affiliations: University of Copenhagen; Centre National de la Recherche Scientifique (CNRS); Communaute Universite Grenoble Alpes; Institut National Polytechnique de Grenoble; Universite Grenoble Alpes (UGA)
Abstract: The likelihood functions for discretely observed nonlinear continuous time models based on stochastic differential equations are not available except for a few cases. Various parameter estimation techniques have been proposed, each with advantages, disadvantages and limitations depending on the application. Most applications still use the Euler-Maruyama discretization, despite many proofs of its bias. More sophisticated methods, such as Hermite expansions or MCMC methods, might be complex to i...
-
Authors: Lundborg, Anton Rask; Kim, Ilmun; Shah, Rajen D.; Samworth, Richard J.
Affiliations: University of Copenhagen; Yonsei University; University of Cambridge
Abstract: Testing the significance of a variable or group of variables X for predicting a response Y, given additional covariates Z, is a ubiquitous task in statistics. A simple but common approach is to specify a linear model, and then test whether the regression coefficient for X is nonzero. However, when the model is misspecified, the test may have poor power, for example, when X is involved in complex interactions, or lead to many false rejections. In this work, we study the problem of testing the m...
-
Authors: Tang, Yin; Li, Bing
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park
Abstract: Elliptical distribution is a basic assumption underlying many multivariate statistical methods. For example, in sufficient dimension reduction and statistical graphical models, this assumption is routinely imposed to simplify the data dependence structure. Before applying such methods, we need to decide whether the data are elliptically distributed. Currently existing tests either focus exclusively on spherical distributions, or rely on bootstrap to determine the null distribution, or require ...
-
Authors: Zhang, Yunyi; Paparoditis, Efstathios; Politis, Dimitris N.
Affiliations: The Chinese University of Hong Kong, Shenzhen; University of California System; University of California San Diego; University of California System; University of California San Diego
Abstract: Strict stationarity is an assumption commonly used in time-series analysis in order to derive asymptotic distributional results for second-order statistics, like sample autocovariances and sample autocorrelations. Focusing on weak stationarity, this paper derives the asymptotic distribution of the maximum of sample autocovariances and sample autocorrelations under weak conditions by using Gaussian approximation techniques. The asymptotic theory for parameter estimators obtained by fitting a (l...