-
Authors: Abraham, Kweku; Castillo, Ismael; Roquain, Etienne
Affiliations: University of Cambridge; Universite Paris Cite; Sorbonne Universite
Abstract: This work investigates multiple testing by considering minimax separation rates in the sparse sequence model, when the testing risk is measured as the sum FDR + FNR (False Discovery Rate plus False Negative Rate). First, using the popular beta-min separation condition, with all nonzero signals separated from 0 by at least some amount, we determine the sharp minimax testing risk asymptotically and thereby explicitly describe the transition from achievable multiple testing with vanishing ris...
-
Authors: Chaudhuri, Anamitra; Fellouris, Georgios
Affiliations: University of Illinois System; University of Illinois Urbana-Champaign
Abstract: The problem of joint sequential detection and isolation is considered in the context of multiple, not necessarily independent, data streams. A multiple testing framework is proposed, where each hypothesis corresponds to a different subset of data streams, the sample size is a stopping time of the observations, and the probabilities of four kinds of error are controlled below distinct, user-specified levels. Two of these errors reflect the detection component of the formulation, whereas the oth...
-
Authors: Zhang, Anderson Y.; Zhou, Harrison Y.
Affiliations: University of Pennsylvania; Yale University
Abstract: The singular subspaces perturbation theory is of fundamental importance in probability and statistics. It has various applications across different fields. We consider two arbitrary matrices where one is a leave-one-column-out submatrix of the other one and establish a novel perturbation upper bound for the distance between the two corresponding singular subspaces. It is well suited for mixture models and results in a sharper and finer statistical analysis than classical perturbation bounds ...
-
Authors: Axelrod, Brian; Garg, Shivam; Han, Yanjun; Sharan, Vatsal; Valiant, Gregory
Affiliations: Stanford University; Microsoft; New York University; New York University; University of Southern California
Abstract: The sample amplification problem formalizes the following question: Given n i.i.d. samples drawn from an unknown distribution P, when is it possible to produce a larger set of n + m samples which cannot be distinguished from n + m i.i.d. samples drawn from P? In this work, we provide a firm statistical foundation for this problem by deriving generally applicable amplification procedures, lower bound techniques and connections to existing statistical notions. Our techniques apply to a large cla...
-
Authors: Chong, Carsten H.; Hoffmann, Marc; Liu, Yanghui; Rosenbaum, Mathieu; Szymanski, Gregoire
Affiliations: Hong Kong University of Science & Technology; Universite PSL; Universite Paris-Dauphine; City University of New York (CUNY) System; Baruch College (CUNY); Institut Polytechnique de Paris; Ecole Polytechnique
Abstract: In recent years, rough volatility models have gained considerable attention in quantitative finance. In this paradigm, the stochastic volatility of the price of an asset has quantitative properties similar to those of a fractional Brownian motion with small Hurst index H < 1/2. In this work, we provide the first rigorous statistical analysis of the problem of estimating H from historical observations of the underlying asset. We establish minimax lower bounds and design optimal procedures based ...
-
Authors: Fan, Jianqing; Gu, Yihong; Zhou, Wen-Xin
Affiliations: Princeton University; University of Illinois System; University of Illinois Chicago; University of Illinois Chicago Hospital
Abstract: This paper investigates the stability of deep ReLU neural networks for nonparametric regression under the assumption that the noise has only a finite pth moment. We unveil how the optimal rate of convergence depends on p, the degree of smoothness and the intrinsic dimension in a class of nonparametric regression functions with hierarchical composition structure when both the adaptive Huber loss and deep ReLU neural networks are used. This optimal rate of convergence cannot be obtained by the o...
-
Authors: Lee, Eun Ryung; Park, Seyoung; Mammen, Enno; Park, Byeong U.
Affiliations: Sungkyunkwan University (SKKU); Yonsei University; Ruprecht Karls University Heidelberg; Seoul National University (SNU)
Abstract: Smooth backfitting has been proposed and proved as a powerful nonparametric estimation technique for additive regression models in various settings. Existing studies are restricted to cases with a moderate number of covariates and are not directly applicable to high dimensional settings. In this paper, we develop new kernel estimators based on the idea of smooth backfitting for high dimensional additive models. We introduce a novel penalization scheme, combining the idea of functional Lasso wi...
-
Authors: Li, Huiqin; Pan, Guangming; Yin, Yanqing; Zhou, Wang
Affiliations: Chongqing University; Nanyang Technological University; National University of Singapore
Abstract: Motivated by the statistical inference using the Gram matrix in the context of missing at random observations, this paper investigates the spectral resents a Hadamard random matrix with entries determined by independent Bernoulli variables D. Operating within the high-dimensional framework, we establish the convergence of the empirical spectral distribution of Sn to a well-defined limiting distribution. In addition, we explore the impact of the missing mechanism on the second-order properties ...
-
Authors: Sell, Torben; Berrett, Thomas B.; Cannings, Timothy I.
Affiliations: University of Edinburgh; Heriot Watt University; University of Edinburgh; University of Warwick
Abstract: We introduce a new nonparametric framework for classification problems in the presence of missing data. The key aspect of our framework is that the regression function decomposes into an ANOVA-type sum of orthogonal functions, of which some (or even many) may be zero. Working under a general missingness setting, which allows features to be missing not at random, our main goal is to derive the minimax rate for the excess risk in this problem. In addition to the decomposition property, the rate ...
-
Authors: Pathak, Reese; Wainwright, Martin J.; Xiao, Lin
Affiliations: University of California System; University of California Berkeley; Massachusetts Institute of Technology (MIT)
Abstract: Estimation problems with constrained parameter spaces arise in various settings. In many of these problems, the observations available to the statistician can be modelled as arising from the noisy realization of the image of a random linear operator; an important special case is random design regression. We derive sharp rates of estimation for arbitrary compact elliptical parameter sets and demonstrate how they depend on the distribution of the random linear operator. Our main result is a func...