-
Authors: Miolane, Leo; Montanari, Andrea
Affiliations: New York University; Stanford University
Abstract: The Lasso is a popular regression method for high-dimensional problems in which the number of parameters theta_1, ..., theta_N is larger than the number n of samples: N > n. A useful heuristic relates the statistical properties of the Lasso estimator to those of a simple soft-thresholding denoiser, in a denoising problem in which the parameters (theta_i)_{i <= N} are observed in Gaussian noise with a carefully tuned variance. Earlier work confirmed this picture in the limit n, N -> infinit...
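The soft-thresholding denoiser mentioned in this abstract can be sketched as follows. This is only an illustration of the denoising problem (parameters observed in Gaussian noise), not the paper's analysis; the sparsity pattern, the noise level `tau`, and the threshold `1.5 * tau` are assumptions chosen for the example.

```python
import numpy as np

def soft_threshold(y, lam):
    """Shrink each entry of y toward zero by lam; entries with |y| <= lam become 0."""
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

# Hypothetical sparse parameter vector: 50 nonzero entries out of 1000.
rng = np.random.default_rng(0)
theta = np.zeros(1000)
theta[:50] = 3.0

# Observe the parameters in Gaussian noise of standard deviation tau (assumed value).
tau = 1.0
y = theta + tau * rng.standard_normal(theta.shape)

# Denoise with an illustrative threshold proportional to the noise level.
theta_hat = soft_threshold(y, lam=1.5 * tau)

mse_raw = np.mean((y - theta) ** 2)
mse_denoised = np.mean((theta_hat - theta) ** 2)
print(f"raw MSE: {mse_raw:.3f}, soft-thresholded MSE: {mse_denoised:.3f}")
```

For a sparse signal like this one, thresholding removes most of the noise on the zero coordinates at the cost of some bias on the nonzero ones.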
-
Authors: van der Vaart, A. W.; Wellner, J. A.
Affiliations: Leiden University; Leiden University - Excl LUMC; University of Washington; University of Washington Seattle
Abstract: We revisit a paper by Charles Stein, and discuss its follow-up.
-
Authors: Deng, Hang; Han, Qiyang; Zhang, Cun-Hui
Affiliations: Rutgers University System; Rutgers University New Brunswick
Abstract: We consider the problem of constructing pointwise confidence intervals in the multiple isotonic regression model. Recently, Han and Zhang (2020) obtained a pointwise limit distribution theory for the so-called block max-min and min-max estimators (Fokianos, Leucht and Neumann (2020); Deng and Zhang (2020)) in this model, but inference remains a difficult problem due to the nuisance parameter in the limit distribution that involves multiple unknown partial derivatives of the true regression func...
-
Authors: Lee, Donald K. K.; Chen, Ningyuan; Ishwaran, Hemant
Affiliations: Emory University; University of Toronto; University of Miami
Abstract: Given functional data from a survival process with time-dependent covariates, we derive a smooth convex representation for its nonparametric log-likelihood functional and obtain its functional gradient. From this, we devise a generic gradient boosting procedure for estimating the hazard function nonparametrically. An illustrative implementation of the procedure using regression trees is described to show how to recover the unknown hazard. The generic estimator is consistent if the model is corr...
-
Authors: Horvath, Lajos; Kokoszka, Piotr; Wang, Shixuan
Affiliations: Utah System of Higher Education; University of Utah; Colorado State University System; Colorado State University Fort Collins; University of Reading
Abstract: We propose a method for the detection of a change point in a sequence {F_i} of distributions, which are available through a large number of observations at each i >= 1. Under the null hypothesis, the distributions F_i are equal. Under the alternative hypothesis, there is a change point i* > 1 such that F_i = G for i >= i* and some unknown distribution G, which is not equal to F_1. The change point, if it exists, is unknown, and the distributions before and after the potential change point ar...
-
Authors: Samworth, Richard J.; Yuan, Ming
Affiliations: University of Cambridge; Columbia University
-
Authors: Hanneke, Steve; Kontorovich, Aryeh; Sabato, Sivan; Weiss, Roi
Affiliations: Toyota Technological Institute - Chicago; Ben-Gurion University of the Negev; Ariel University
Abstract: We extend a recently proposed 1-nearest-neighbor based multiclass learning algorithm and prove that our modification is universally strongly Bayes consistent in all metric spaces admitting any such learner, making it an optimistically universal Bayes-consistent learner. This is the first learning algorithm known to enjoy this property; by comparison, the k-NN classifier and its variants are not generally universally Bayes consistent, except under additional structural assumptions, such as an i...
-
Authors: Ye, Ting; Shao, Jun; Kang, Hyunseung
Affiliations: University of Pennsylvania; East China Normal University; University of Wisconsin System; University of Wisconsin Madison
Abstract: Mendelian randomization (MR) has become a popular approach to study the effect of a modifiable exposure on an outcome by using genetic variants as instrumental variables. A challenge in MR is that each genetic variant explains a relatively small proportion of variance in the exposure and there are many such variants, a setting known as many weak instruments. To this end, we provide a theoretical characterization of the statistical properties of two popular estimators in MR: the inverse-varianc...
-
Authors: Bellec, Pierre C.; Zhang, Cun-Hui
Affiliations: Rutgers University System; Rutgers University New Brunswick
Abstract: Stein's formula states that a random variable of the form z^T f(z) - div f(z) is mean-zero for all functions f with integrable gradient. Here, div f is the divergence of the function f and z is a standard normal vector. This paper aims to propose a second-order Stein formula to characterize the variance of such random variables for all functions f(z) with square-integrable gradient, and to demonstrate the usefulness of this second-order Stein formula in various applicat...
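The first-order Stein formula stated in this abstract can be checked numerically. The sketch below is a Monte Carlo illustration only: the choice f = tanh applied componentwise, the dimension 3, and the sample size are assumptions made for the example, and the paper's second-order formula is not reproduced here.

```python
import numpy as np

def stein_residual(z):
    """Return z^T f(z) - div f(z) for f = tanh applied componentwise.

    For this f, the divergence is sum_j (1 - tanh(z_j)^2).
    Works on a single vector or on a batch of row vectors.
    """
    f = np.tanh(z)
    div_f = np.sum(1.0 - f ** 2, axis=-1)
    return np.sum(z * f, axis=-1) - div_f

# Draw standard normal vectors and average the residual over the samples.
rng = np.random.default_rng(0)
z = rng.standard_normal((200_000, 3))
residual_mean = stein_residual(z).mean()

# Stein's formula says the expectation is zero, so the average should be
# close to zero up to Monte Carlo error.
print(residual_mean)
```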
-
Authors: Klochkov, Yegor; Kroshnin, Alexey; Zhivotovskiy, Nikita
Affiliations: University of Cambridge; HSE University (National Research University Higher School of Economics); Russian Academy of Sciences; Kharkevich Institute for Information Transmission Problems of the RAS; Alphabet Inc.; Google Incorporated
Abstract: We consider robust algorithms for the k-means clustering problem where a quantizer is constructed based on N independent observations. Our main results are median-of-means-based nonasymptotic excess distortion bounds that hold under the two bounded moments assumption in a general separable Hilbert space. In particular, our results extend the renowned asymptotic result of Pollard (Ann. Statist. 9 (1981) 135-140), who showed that the existence of two moments is sufficient for strong consistency of an...
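The median-of-means idea underlying these bounds can be illustrated in its simplest form, robust estimation of a scalar mean. This sketch is not the paper's clustering procedure; the function name `median_of_means` and the block count are hypothetical choices for the example.

```python
import numpy as np

def median_of_means(x, n_blocks):
    """Split x into n_blocks consecutive blocks, average each block,
    and return the median of the block averages."""
    x = np.asarray(x, dtype=float)
    blocks = np.array_split(x, n_blocks)
    return float(np.median([block.mean() for block in blocks]))

# One gross outlier ruins the plain average but affects only one block mean.
data = [0.0] * 9 + [1000.0]
plain_mean = float(np.mean(data))
robust_mean = median_of_means(data, n_blocks=5)
print(plain_mean, robust_mean)
```

Because the outlier contaminates a single block, the median over blocks ignores it, which is the mechanism that lets such estimators work under weak moment assumptions.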