-
Authors: Berrett, Thomas B.; Samworth, Richard J.
Affiliations: University of Warwick; University of Cambridge
Abstract: Given a set of incomplete observations, we study the nonparametric problem of testing whether data are Missing Completely At Random (MCAR). Our first contribution is to characterise precisely the set of alternatives that can be distinguished from the MCAR null hypothesis. This reveals interesting and novel links to the theory of Fréchet classes (in particular, compatible distributions) and linear programming, that allow us to propose MCAR tests that are consistent against all detectable altern...
-
Authors: Bhattacharjee, Satarupa; Müller, Hans-Georg
Affiliations: University of California System; University of California Davis
Abstract: Single index models provide an effective dimension reduction tool in regression, especially for high-dimensional data, by projecting a general multivariate predictor onto a direction vector. We propose a novel single-index model for regression models where metric space-valued random object responses are coupled with multivariate Euclidean predictors. The responses in this regression model include complex, non-Euclidean data, including covariance matrices, graph Laplacians of networks and univa...
-
Authors: Wang, Jiayi; Qi, Zhengling; Wong, Raymond K. W.
Affiliations: University of Texas System; University of Texas Dallas; George Washington University; Texas A&M University System; Texas A&M University College Station
Abstract: Off-policy evaluation is considered a fundamental and challenging problem in reinforcement learning (RL). This paper focuses on value estimation of a target policy based on pre-collected data generated from a possibly different policy, under the framework of infinite-horizon Markov decision processes. Motivated by the recently developed marginal importance sampling method in RL and the covariate balancing idea in causal inference, we propose a novel estimator with approximately projected state...
-
Authors: Fujiwara, Akio; Yamagata, Koichi
Affiliations: University of Osaka; Research Organization of Information & Systems (ROIS); National Institute of Informatics (NII) - Japan
Abstract: We herein establish an asymptotic representation theorem for locally asymptotically normal quantum statistical models. This theorem enables us to study the asymptotic efficiency of quantum estimators, such as quantum regular estimators and quantum minimax estimators, leading to a universal tight lower bound beyond the i.i.d. assumption. This formulation complements the theory of quantum contiguity developed in the previous paper [Fujiwara and Yamagata, Bernoulli 26 (2020) 2105-2141], providing...
-
Authors: Doss, Natalie; Wu, Yihong; Yang, Pengkun; Zhou, Harrison H.
Affiliations: Yale University; Tsinghua University
Abstract: This paper studies the optimal rate of estimation in a finite Gaussian location mixture model in high dimensions without separation conditions. We assume that the number of components k is bounded and that the centers lie in a ball of bounded radius, while allowing the dimension d to be as large as the sample size n. Extending the one-dimensional result of Heinrich and Kahn (Ann. Statist. 46 (2018) 2844-2870), we show that the minimax rate of estimating the mixing distribution in Wasserstein d...
-
Authors: Awan, Jordan; Vadhan, Salil
Affiliations: Purdue University System; Purdue University
Abstract: f-DP has recently been proposed as a generalization of differential privacy allowing a lossless analysis of composition, post-processing, and privacy amplification via subsampling. In the setting of f-DP, we propose the concept of a canonical noise distribution (CND), the first mechanism designed for an arbitrary f-DP guarantee. The notion of CND captures whether an additive privacy mechanism perfectly matches the privacy guarantee of a given f. We prove that a CND always exists, and give ...
-
Authors: Ma, Cong; Pathak, Reese; Wainwright, Martin J.
Affiliations: University of Chicago; University of California System; University of California Berkeley; Massachusetts Institute of Technology (MIT)
Abstract: We study the covariate shift problem in the context of nonparametric regression over a reproducing kernel Hilbert space (RKHS). We focus on two natural families of covariate shift problems defined using the likelihood ratios between the source and target distributions. When the likelihood ratios are uniformly bounded, we prove that the kernel ridge regression (KRR) estimator with a carefully chosen regularization parameter is minimax rate-optimal (up to a log factor) for a large family of RKHS...
-
Authors: Duan, Yaqi; Wang, Kaizheng
Affiliations: New York University; Columbia University
Abstract: We study the multitask learning problem that aims to simultaneously analyze multiple data sets collected from different sources and learn one model for each of them. We propose a family of adaptive methods that automatically utilize possible similarities among those tasks while carefully handling their differences. We derive sharp statistical guarantees for the methods and prove their robustness against outlier tasks. Numerical experiments on synthetic and real data sets demonstrate the effic...
-
Authors: Mies, Fabian; Podolskij, Mark
Affiliations: Delft University of Technology; University of Luxembourg
Abstract: The linear fractional stable motion generalizes two prominent classes of stochastic processes, namely stable Lévy processes and fractional Brownian motion. For this reason, it may be regarded as a basic building block for continuous time models. We study a stylized model consisting of a superposition of independent linear fractional stable motions and our focus is on parameter estimation of the model. Applying an estimating equations approach, we construct estimators for the whole set of pa...
-
Authors: Fan, Jianqing; Masini, Ricardo P.; Medeiros, Marcelo C.
Affiliations: Princeton University; University of California System; University of California Davis; University of Illinois System; University of Illinois Urbana-Champaign
Abstract: Factor and sparse models are widely used to impose a low-dimensional structure in high dimensions. However, they are seemingly mutually exclusive. We propose a lifting method that combines the merits of these two models in a supervised learning methodology that allows for efficiently exploring all the information in high-dimensional datasets. The method is based on a flexible model for high-dimensional panel data with observable and/or latent common factors and idiosyncratic components. The mo...