-
Authors: Brownlees, Christian; Joly, Emilien; Lugosi, Gabor
Affiliations: Pompeu Fabra University; ICREA; Pompeu Fabra University; Pompeu Fabra University
Abstract: The purpose of this paper is to discuss empirical risk minimization when the losses are not necessarily bounded and may have a distribution with heavy tails. In such situations, usual empirical averages may fail to provide reliable estimates and empirical risk minimization may provide large excess risk. However, some robust mean estimators proposed in the literature may be used to replace empirical means. In this paper, we investigate empirical risk minimization based on a robust estimate prop...
-
Authors: Spokoiny, Vladimir; Zhilova, Mayya
Affiliations: Leibniz Association; Weierstrass Institute for Applied Analysis & Stochastics; Humboldt University of Berlin; Moscow Institute of Physics & Technology; Russian Academy of Sciences; HSE University (National Research University Higher School of Economics)
Abstract: A multiplier bootstrap procedure for construction of likelihood-based confidence sets is considered for finite samples and a possible model misspecification. Theoretical results justify the bootstrap validity for a small or moderate sample size and allow one to control the impact of the parameter dimension $p$: the bootstrap approximation works if $p^3/n$ is small. The main result about bootstrap validity continues to apply even if the underlying parametric model is misspecified under the so-called s...
-
Authors: Castillo, Ismael
Affiliations: Sorbonne Universite; Centre National de la Recherche Scientifique (CNRS); Centre National de la Recherche Scientifique (CNRS); Universite Paris Cite
-
Authors: Mai, Qing; Zou, Hui
Affiliations: State University System of Florida; Florida State University; University of Minnesota System; University of Minnesota Twin Cities
Abstract: A new model-free screening method called the fused Kolmogorov filter is proposed for high-dimensional data analysis. This new method is fully nonparametric and can work with many types of covariates and response variables, including continuous, discrete and categorical variables. We apply the fused Kolmogorov filter to deal with variable screening problems emerging from a wide range of applications, such as multiclass classification, nonparametric regression and Poisson regression, among other...
-
Authors: Yang, Yanrong; Pan, Guangming
Affiliations: Monash University; Nanyang Technological University
Abstract: This paper proposes a new statistic to test independence between two high-dimensional random vectors $X$: $p_1 \times 1$ and $Y$: $p_2 \times 1$. The proposed statistic is based on the sum of regularized sample canonical correlation coefficients of $X$ and $Y$. The asymptotic distribution of the statistic under the null hypothesis is established as a corollary of general central limit theorems (CLT) for the linear statistics of classical and regularized sample canonical correlation coefficients when $p_1$ and $p_2$...
-
Authors: Bao, Zhigang; Pan, Guangming; Zhou, Wang
Affiliations: Zhejiang University; Nanyang Technological University; National University of Singapore
Abstract: This paper is aimed at deriving the universality of the largest eigenvalue of a class of high-dimensional real or complex sample covariance matrices of the form $W_N = \Sigma^{1/2} X X^{*} \Sigma^{1/2}$. Here, $X = (x_{ij})_{M,N}$ is an $M \times N$ random matrix with independent entries $x_{ij}$, $1 \le i \le M$, $1 \le j \le N$, such that $\mathbb{E} x_{ij} = 0$, $\mathbb{E} |x_{ij}|^2 = 1/N$. On dimensionality, we assume that $M = M(N)$ and $N/M \to d \in (0, \infty)$ as $N \to \infty$. For a class of general deterministi...
-
Authors: Hoffmann, Marc; Rousseau, Judith; Schmidt-Hieber, Johannes
Affiliations: Universite PSL; Universite Paris-Dauphine; Leiden University - Excl LUMC; Leiden University
Abstract: We investigate the problem of deriving posterior concentration rates under different loss functions in nonparametric Bayes. We first provide a lower bound on posterior coverages of shrinking neighbourhoods that relates the metric or loss under which the shrinking neighbourhood is considered, and an intrinsic pre-metric linked to frequentist separation rates. In the Gaussian white noise model, we construct feasible priors based on a spike and slab procedure reminiscent of wavelet thresholding t...
-
Authors: Zheng, Qi; Peng, Limin; He, Xuming
Affiliations: Emory University; University of Michigan System; University of Michigan
Abstract: Quantile regression has become a valuable tool to analyze heterogeneous covariate-response associations that are often encountered in practice. The development of quantile regression methodology for high-dimensional covariates primarily focuses on the examination of model sparsity at a single or multiple quantile levels, which are typically prespecified ad hoc by the users. The resulting models may be sensitive to the specific choices of the quantile levels, leading to difficulties in interpre...
-
Authors: Lei, Jing; Rinaldo, Alessandro
Affiliations: Carnegie Mellon University
Abstract: We analyze the performance of spectral clustering for community extraction in stochastic block models. We show that, under mild conditions, spectral clustering applied to the adjacency matrix of the network can consistently recover hidden communities even when the order of the maximum expected degree is as small as $\log n$, with $n$ the number of nodes. This result applies to some popular polynomial time spectral clustering algorithms and is further extended to degree corrected stochastic block mo...
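Illustration (not from the paper): a minimal sketch of spectral clustering on the adjacency matrix of a simulated stochastic block model. It assumes two equal-sized communities and, as a simplification, splits nodes by the sign of the second leading eigenvector instead of running a k-means step. The function names `sample_sbm` and `spectral_two_communities`, the edge probabilities, and the seed are all hypothetical choices made for this example.

```python
import numpy as np

def sample_sbm(labels, p_in, p_out, rng):
    """Draw a symmetric 0/1 adjacency matrix from a two-parameter block model."""
    n = len(labels)
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p_in, p_out)
    # Sample the strict upper triangle, then mirror it (no self-loops).
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    return (upper | upper.T).astype(float)

def spectral_two_communities(adj):
    """Split nodes by the sign of the eigenvector with the second largest |eigenvalue|."""
    eigvals, eigvecs = np.linalg.eigh(adj)
    order = np.argsort(np.abs(eigvals))[::-1]
    v2 = eigvecs[:, order[1]]
    return (v2 > 0).astype(int)

rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 100)          # assumed ground truth: two blocks of 100 nodes
adj = sample_sbm(labels, p_in=0.5, p_out=0.05, rng=rng)
est = spectral_two_communities(adj)

# Community labels are only identified up to a swap, so score both assignments.
agreement = np.mean(est == labels)
accuracy = max(agreement, 1.0 - agreement)
print(f"fraction of nodes correctly clustered: {accuracy:.3f}")
```

The dense regime used here is far easier than the $\log n$ degree regime the abstract refers to; the sketch only shows the mechanics of the method.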
-
Authors: Chatterjee, Sabyasachi; Guntuboyina, Adityanand; Sen, Bodhisattva
Affiliations: University of Chicago; University of California System; University of California Berkeley; Columbia University
Abstract: We consider the problem of estimating an unknown $\theta \in \mathbb{R}^n$ from noisy observations under the constraint that $\theta$ belongs to certain convex polyhedral cones in $\mathbb{R}^n$. Under this setting, we prove bounds for the risk of the least squares estimator (LSE). The obtained risk bound behaves differently depending on the true sequence $\theta$, which highlights the adaptive behavior of the LSE. As special cases of our general result, we derive risk bounds for the LSE in univariate isotonic a...
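Illustration (not from the paper): a minimal sketch of the LSE in the univariate isotonic case, that is, the Euclidean projection of the observations onto the cone of nondecreasing sequences. It uses the pool-adjacent-violators algorithm, a standard way to compute this projection. The function name `isotonic_lse` and the input data are hypothetical.

```python
import numpy as np

def isotonic_lse(y):
    """Project y onto nondecreasing sequences in least squares (pool adjacent violators)."""
    blocks = []  # each block stores [sum of values, number of values]
    for value in y:
        blocks.append([float(value), 1])
        # Merge the last two blocks while their means violate monotonicity.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    # Every point in a block is fitted by that block's mean.
    return np.concatenate([np.full(count, total / count) for total, count in blocks])

fit = isotonic_lse([1.0, 3.0, 2.0, 4.0, 3.5])
print(fit)  # [1.   2.5  2.5  3.75 3.75]
```

The fit is piecewise constant, and the number of constant pieces in the true sequence is the kind of structure an adaptive risk bound can depend on.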