-
Authors: Jiang, Fei; Zhou, Yeqing; Liu, Jianxuan; Ma, Yanyuan
Affiliations: University of California System; University of California San Francisco; Tongji University; Syracuse University; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park
Abstract: We study estimation and testing in the Poisson regression model with noisy high-dimensional covariates, which has wide applications in analyzing noisy big data. Correcting for the estimation bias due to the covariate noise leads to a nonconvex target function to minimize. Treating the high-dimensional issue further leads us to augment an amenable penalty term to the target function. We propose to estimate the regression parameter through minimizing the penalized target function. We derive th...
-
Authors: Celentano, Michael; Montanari, Andrea; Wei, Yuting
Affiliations: University of California System; University of California Berkeley; Stanford University; University of Pennsylvania
Abstract: The Lasso is a method for high-dimensional regression, which is now commonly used when the number of covariates p is of the same order or larger than the number of observations n. Classical asymptotic normality theory does not apply to this model due to two fundamental reasons: (1) The regularized risk is nonsmooth; (2) The distance between the estimator θ̂ and the true parameters vector θ* cannot be neglected. As a consequence, standard perturbative arguments that are the traditional basis f...
-
Authors: Bilodeau, Blair; Foster, Dylan J.; Roy, Daniel M.
Affiliations: University of Toronto; Microsoft
Abstract: We consider the task of estimating a conditional density using i.i.d. samples from a joint distribution, which is a fundamental problem with applications in both classification and uncertainty quantification for regression. For joint density estimation, minimax rates have been characterized for general density classes in terms of uniform (metric) entropy, a well-studied notion of statistical capacity. When applying these results to conditional density estimation, the use of uniform entropy-...
-
Authors: Aragam, Bryon; Yang, Ruiyi
Affiliations: University of Chicago; Princeton University
Abstract: We study uniform consistency in nonparametric mixture models as well as closely related mixture of regression (also known as mixed regression) models, where the regression functions are allowed to be nonparametric and the error distributions are assumed to be convolutions of a Gaussian density. We construct uniformly consistent estimators under general conditions while simultaneously highlighting several pain points in extending existing pointwise consistency results to uniform results. The ...
-
Authors: Zrnic, Tijana; Jordan, Michael I.
Affiliations: University of California System; University of California Berkeley
Abstract: When the target of statistical inference is chosen in a data-driven manner, the guarantees provided by classical theories vanish. We propose a solution to the problem of inference after selection by building on the framework of algorithmic stability, in particular its branch with origins in the field of differential privacy. Stability is achieved via randomization of selection and it serves as a quantitative measure that is sufficient to obtain nontrivial post-selection corrections for classic...
-
Authors: Barthelme, Simon; Amblard, Pierre-Olivier; Tremblay, Nicolas; Usevich, Konstantin
Affiliations: Communaute Universite Grenoble Alpes; Institut National Polytechnique de Grenoble; Universite Grenoble Alpes (UGA); Centre National de la Recherche Scientifique (CNRS); Universite de Lorraine; Centre National de la Recherche Scientifique (CNRS)
Abstract: Gaussian process (GP) regression is a fundamental tool in Bayesian statistics. It is also known as kriging and is the Bayesian counterpart to the frequentist kernel ridge regression. Most of the theoretical work on GP regression has focused on large-n asymptotics, characterising the behaviour of GP regression as the amount of data increases. Fixed-sample analysis is much more difficult outside of simple cases, such as locations on a regular grid. In this work, we perform a fixed-sample analys...
-
Authors: Bates, Stephen; Candes, Emmanuel; Lei, Lihua; Romano, Yaniv; Sesia, Matteo
Affiliations: University of California System; University of California Berkeley; University of California System; University of California Berkeley; Stanford University; Stanford University; Stanford University; Technion Israel Institute of Technology; Technion Israel Institute of Technology; University of Southern California
Abstract: This paper studies the construction of p-values for nonparametric outlier detection, from a multiple-testing perspective. The goal is to test whether new independent samples belong to the same distribution as a reference data set or are outliers. We propose a solution based on conformal inference, a general framework yielding p-values that are marginally valid but mutually dependent for different test points. We prove these p-values are positively dependent and enable exact false discovery r...
-
Authors: Telschow, Fabian J. E.; Cheng, Dan; Pranav, Pratyush; Schwartzman, Armin
Affiliations: Humboldt University of Berlin; Arizona State University; Arizona State University-Tempe; Universite Claude Bernard Lyon 1; Ecole Normale Superieure de Lyon (ENS de LYON); University of California System; University of California San Diego
Abstract: The expected Euler characteristic (EEC) of excursion sets of a smooth Gaussian-related random field over a compact manifold approximates the distribution of its supremum for high thresholds. Viewed as a function of the excursion threshold, the EEC of a Gaussian-related field is expressed by the Gaussian kinematic formula (GKF) as a finite sum of known functions multiplied by the Lipschitz-Killing curvatures (LKCs) of the generating Gaussian field. This paper proposes consistent estimators of...
-
Authors: Verzelen, Nicolas; Fromont, Magalie; Lerasle, Matthieu; Reynaud-Bouret, Patricia
Affiliations: INRAE; Universite de Rennes; Institut Polytechnique de Paris; ENSAE Paris; Universite Cote d'Azur
Abstract: Given a time series Y in R^n, with a piecewise constant mean and independent components, the twin problems of change-point detection and change-point localization respectively amount to detecting the existence of times where the mean varies and estimating the positions of those change-points. In this work, we tightly characterize optimal rates for both problems and uncover the phase transition phenomenon from a global testing problem to a local estimation problem. Introducing a suitable defini...
-
Authors: Celentano, Michael; Fan, Zhou; Mei, Song
Affiliations: University of California System; University of California Berkeley; Yale University
Abstract: We study mean-field variational Bayesian inference using the TAP approach, for Z2-synchronization as a prototypical example of a high-dimensional Bayesian model. We show that for any signal strength λ > 1 (the weak-recovery threshold), there exists a unique local minimizer of the TAP free energy functional near the mean of the Bayes posterior law. Furthermore, the TAP free energy in a local neighborhood of this minimizer is strongly convex. Consequently, a natural-gradient/mirror-des...