-
Authors: Chakraborty, Rudrasis; Vemuri, Baba C.
Affiliations: State University System of Florida; University of Florida
Abstract: A Stiefel manifold of the compact type is often encountered in many fields of engineering, including signal and image processing, machine learning, numerical optimization and others. The Stiefel manifold is a Riemannian homogeneous space but not a symmetric space. In previous work, researchers have defined probability distributions on symmetric spaces and performed statistical analysis of data residing in these spaces. In this paper, we present original work involving definition of Gaussian di...
-
Authors: Han, Qiyang; Wellner, Jon A.
Affiliations: University of Washington; University of Washington Seattle
Abstract: We study the performance of the least squares estimator (LSE) in a general nonparametric regression model, when the errors are independent of the covariates but may only have a $p$th moment ($p \geq 1$). In such a heavy-tailed regression setting, we show that if the model satisfies a standard entropy condition with exponent $\alpha \in (0, 2)$, then the $L_2$ loss of the LSE converges at a rate $O_P(n^{-1/(2+\alpha)} \vee n^{-1/2+1/(2p)})$. Such a rate cannot be improved under the entropy condi...
-
Authors: Han, Xu
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); Temple University
Abstract: Sure screening has been considered a powerful tool for handling ultrahigh-dimensional variable selection problems, where the dimensionality $p$ and the sample size $n$ can satisfy the NP dimensionality $\log p = O(n^a)$ for some $a > 0$ [J. R. Stat. Soc. Ser. B. Stat. Methodol. 70 (2008) 849-911]. The current paper aims to simultaneously tackle the universality and effectiveness of sure screening procedures. For the universality, we develop a general and unified framework for nonparametr...
-
Authors: Rothenhausler, Dominik; Buhlmann, Peter; Meinshausen, Nicolai
Affiliations: Swiss Federal Institutes of Technology Domain; ETH Zurich
Abstract: Causal inference is known to be very challenging when only observational data are available. Randomized experiments are often costly and impractical, and in instrumental variable regression the number of instruments has to exceed the number of causal predictors. It was recently shown in Peters, Buhlmann and Meinshausen (2016) (J. R. Stat. Soc. Ser. B. Stat. Methodol. 78 947-1012) that causal inference for the full model is possible when data from distinct observational environments are availabl...
-
Authors: Aamari, Eddie; Levrard, Clement
Affiliations: University of California System; University of California San Diego; Sorbonne Universite; Universite Paris Cite
Abstract: Given a noisy sample from a submanifold $M \subset \mathbb{R}^D$, we derive optimal rates for the estimation of tangent spaces $T_X M$, the second fundamental form $II_X^M$ and the submanifold $M$. After motivating their study, we introduce a quantitative class of $C^k$-submanifolds in analogy with Hölder classes. The proposed estimators are based on local polynomials and allow one to deal simultaneously with the three problems at stake. Minimax lower bounds are derived using a conditional version of Assouad's lemma w...
-
Authors: Li, Jia; Liu, Yunxiao; Xiu, Dacheng
Affiliations: Duke University; University of North Carolina; University of North Carolina Chapel Hill; University of Chicago
Abstract: We propose semiparametrically efficient estimators for general integrated volatility functionals of multivariate semimartingale processes. A plug-in method that uses nonparametric estimates of spot volatilities is known to induce high-order biases that need to be corrected to obey a central limit theorem. Such bias terms arise from boundary effects, the diffusive and jump movements of stochastic volatility and the sampling error from the nonparametric spot volatility estimation. We propose a n...
-
Authors: Doss, Charles R.; Wellner, Jon A.
Affiliations: University of Minnesota System; University of Minnesota Twin Cities; University of Washington; University of Washington Seattle
Abstract: We study a likelihood ratio test for the location of the mode of a log-concave density. Our test is based on comparison of the log-likelihoods corresponding to the unconstrained maximum likelihood estimator of a log-concave density and the constrained maximum likelihood estimator where the constraint is that the mode of the density is fixed, say at $m$. The constrained estimation problem is studied in detail in Doss and Wellner (2018). Here, the results of that paper are used to show that, under...
-
Authors: Chen, Yen-Chi
Affiliations: University of Washington; University of Washington Seattle
Abstract: In this paper we study the $\alpha$-cluster tree ($\alpha$-tree) under both singular and nonsingular measures. The $\alpha$-tree uses probability contents within a set created by the ordering of points to construct a cluster tree so that it is well defined even for singular measures. We first derive the convergence rate for a density level set around critical points, which leads to the convergence rate for estimating an $\alpha$-tree under nonsingular measures. For singular measures, we study how the kern...
-
Authors: Carpentier, Alexandra; Verzelen, Nicolas
Affiliations: Otto von Guericke University; INRAE; Institut Agro; Montpellier SupAgro; Universite de Montpellier
Abstract: Consider the Gaussian vector model with mean value $\theta$. We study the twin problems of estimating the number $\|\theta\|_0$ of nonzero components of $\theta$ and testing whether $\|\theta\|_0$ is smaller than some value. For testing, we establish the minimax separation distances for this model and introduce a minimax adaptive test. Extensions to the case of unknown variance are also discussed. Rewriting the estimation of $\|\theta\|_0$ as a multiple te...
-
Authors: Fan, Jianqing; Wang, Dong; Wang, Kaizheng; Zhu, Ziwei
Affiliations: Princeton University; University of Michigan System; University of Michigan
Abstract: Principal component analysis (PCA) is fundamental to statistical machine learning. It extracts latent principal factors that contribute to the most variation of the data. When data are stored across multiple machines, however, communication cost can prohibit the computation of PCA in a central location and distributed algorithms for PCA are thus needed. This paper proposes and studies a distributed PCA algorithm: each node machine computes the top $K$ eigenvectors and transmits them to the centr...