-
Authors: Pananjady, Ashwin; Mao, Cheng; Muthukumar, Vidya; Wainwright, Martin J.; Courtade, Thomas A.
Affiliations: University of California System; University of California Berkeley; University of California System; University of California Berkeley; Yale University
Abstract: Pairwise comparison data arises in many domains, including tournament rankings, web search and preference elicitation. Given noisy comparisons of a fixed subset of pairs of items, we study the problem of estimating the underlying comparison probabilities under the assumption of strong stochastic transitivity (SST). We also consider the noisy sorting subclass of the SST model. We show that when the assignment of items to the topology is arbitrary, these permutation-based models, unlike their pa...
-
Authors: Tian, Xiaoying
Affiliations: Stanford University
Abstract: Estimation of the prediction error of a linear estimation rule is difficult if the data analyst also uses data to select a set of variables and constructs the estimation rule using only the selected variables. In this work, we propose an asymptotically unbiased estimator for the prediction error after model search. Under some additional mild assumptions, we show that our estimator converges to the true prediction error in L_2 at the rate of O(n^{-1/2}), with n being the number of data points. O...
-
Authors: Dubey, Paromita; Mueller, Hans-Georg
Affiliations: University of California System; University of California Davis
Abstract: We propose a method to infer the presence and location of change-points in the distribution of a sequence of independent data taking values in a general metric space, where change-points are viewed as locations at which the distribution of the data sequence changes abruptly in terms of either its Fréchet mean, Fréchet variance or both. The proposed method is based on comparisons of Fréchet variances before and after putative change-point locations and does not require a tuning parameter, excep...
-
Authors: Kohler, Michael; Langer, Sophie
Affiliations: Technical University of Darmstadt
-
Authors: Abbe, Emmanuel; Fan, Jianqing; Wang, Kaizheng; Zhong, Yiqiao
Affiliations: Princeton University; Princeton University; Princeton University
Abstract: Recovering low-rank structures via eigenvector perturbation analysis is a common problem in statistical machine learning, such as in factor analysis, community detection, ranking, matrix completion, among others. While a large variety of bounds are available for average errors between empirical and population statistics of eigenvectors, few results are tight for entrywise analyses, which are critical for a number of problems such as community detection. This paper investigates entrywise behavi...
-
Authors: Paindaveine, Davy; Verdebout, Thomas
Affiliations: Universite Libre de Bruxelles; Universite de Toulouse; Universite Toulouse 1 Capitole
Abstract: Motivated by the fact that circular or spherical data are often much concentrated around a location theta, we consider inference about theta under high concentration asymptotic scenarios for which the probability of any fixed spherical cap centered at theta converges to one as the sample size n diverges to infinity. Rather than restricting to Fisher-von Mises-Langevin distributions, we consider a much broader, semiparametric, class of rotationally symmetric distributions indexed by the locatio...
-
Authors: Cannings, Timothy I.; Berrett, Thomas B.; Samworth, Richard J.
Affiliations: University of Edinburgh; University of Cambridge
Abstract: We derive a new asymptotic expansion for the global excess risk of a local-k-nearest neighbour classifier, where the choice of k may depend upon the test point. This expansion elucidates conditions under which the dominant contribution to the excess risk comes from the decision boundary of the optimal Bayes classifier, but we also show that if these conditions are not satisfied, then the dominant contribution may arise from the tails of the marginal distribution of the features. Moreover, we p...
-
Authors: Xue, Kaijie; Yao, Fang
Affiliations: Nankai University; Peking University
Abstract: We propose a two-sample test for high-dimensional means that requires neither distributional nor correlational assumptions, besides some weak conditions on the moments and tail properties of the elements in the random vectors. This two-sample test based on a nontrivial extension of the one-sample central limit theorem (Ann. Probab. 45 (2017) 2309-2352) provides a practically useful procedure with rigorous theoretical guarantees on its size and power assessment. In particular, the proposed test...
-
Authors: Schmidt-Hieber, Johannes
Affiliations: University of Twente
Abstract: Consider the multivariate nonparametric regression model. It is shown that estimators based on sparsely connected deep neural networks with ReLU activation function and properly chosen network architecture achieve the minimax rates of convergence (up to log n-factors) under a general composition assumption on the regression function. The framework includes many well-studied structural constraints such as (generalized) additive models. While there is a lot of flexibility in the network architec...
-
Authors: Cai, T. Tony; Wu, Yihong
Affiliations: University of Pennsylvania; Yale University
Abstract: This paper investigates the fundamental limits for detecting a high-dimensional sparse matrix contaminated by white Gaussian noise from both the statistical and computational perspectives. We consider p × p matrices whose rows and columns are individually k-sparse. We provide a tight characterization of the statistical and computational limits for sparse matrix detection, which precisely describe when achieving optimal detection is easy, hard or impossible, respectively. Although the sparse ma...