-
Authors: Fan, Jianqing; Liu, Han; Wang, Weichen
Affiliations: Princeton University; Fudan University
Abstract: We propose a general Principal Orthogonal complEment Thresholding (POET) framework for large-scale covariance matrix estimation based on the approximate factor model. A set of high-level sufficient conditions for the procedure to achieve optimal rates of convergence under different matrix norms is established to better understand how POET works. Such a framework allows us to recover existing results for sub-Gaussian data in a more transparent way that only depends on the concentration properti...
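As a rough illustration of the POET idea named in this abstract, the sketch below keeps the top principal components of a pilot covariance estimate and thresholds the remainder (the principal orthogonal complement). It is a minimal sketch under stated assumptions, not the authors' procedure: the sample covariance as pilot estimator, the hard-thresholding rule, and the names `poet_sketch`, `K`, and `tau` are all illustrative choices that do not come from the abstract.

```python
import numpy as np


def poet_sketch(X, K, tau):
    """Illustrative POET-style covariance estimate.

    X   : (n, p) data matrix, one observation per row.
    K   : assumed number of factors (hypothetical tuning parameter).
    tau : hard-threshold level for the residual part (hypothetical).
    """
    # Pilot estimate: the plain sample covariance (an assumption of this sketch).
    S = np.cov(X, rowvar=False)

    # Eigen-decomposition of the symmetric pilot estimate (ascending eigenvalues).
    eigvals, eigvecs = np.linalg.eigh(S)
    top = np.argsort(eigvals)[::-1][:K]
    lam, V = eigvals[top], eigvecs[:, top]

    # Low-rank part spanned by the K leading principal components.
    low_rank = (V * lam) @ V.T

    # Principal orthogonal complement: what the leading components leave over.
    R = S - low_rank

    # Hard-threshold the off-diagonal entries; keep the diagonal untouched.
    R_thr = np.where(np.abs(R) >= tau, R, 0.0)
    np.fill_diagonal(R_thr, np.diag(R))

    return low_rank + R_thr
```

With `tau = 0` nothing is thresholded and the sketch returns the pilot estimate itself; with a very large `tau` the residual part reduces to its diagonal, which corresponds to a strict factor model.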
-
Authors: Efron, Bradley
Affiliations: Stanford University
Abstract: Maximum likelihood estimates are sufficient statistics in exponential families, but not in general. The theory of statistical curvature was introduced to measure the effects of MLE insufficiency in one-parameter families. Here, we analyze curvature in the more realistic venue of multiparameter families (more exactly, curved exponential families, a broad class of smoothly defined nonexponential family models). We show that within the set of observations giving the same value for the MLE, there is...
-
Authors: Vu Dinh; Lam Si Tung Ho; Suchard, Marc A.; Matsen, Frederick A.
Affiliations: Fred Hutchinson Cancer Center; University of California System; University of California Los Angeles
Abstract: It is common in phylogenetics to have some, perhaps partial, information about the overall evolutionary tree of a group of organisms and wish to find an evolutionary tree of a specific gene for those organisms. There may not be enough information in the gene sequences alone to accurately reconstruct the correct gene tree. Although the gene tree may deviate from the species tree due to a variety of genetic processes, in the absence of evidence to the contrary it is parsimonious to assume that t...
-
Authors: Gloter, Arnaud; Loukianova, Dasha; Mai, Hilmar
Affiliations: Universite Paris Saclay; Institut Polytechnique de Paris; ENSAE Paris
Abstract: The problem of drift estimation for the solution X of a stochastic differential equation with Lévy-type jumps is considered under discrete high-frequency observations with a growing observation window. An efficient and asymptotically normal estimator for the drift parameter is constructed under minimal conditions on the jump behavior and the sampling scheme. In the case of a bounded jump measure density, these conditions reduce to $n\Delta_n^{3-\epsilon} \to 0$, where n is the number of observation...
-
Authors: Cai, T. Tony; Guo, Zijian
Affiliations: University of Pennsylvania; Rutgers University System; Rutgers University New Brunswick
Abstract: This paper considers point and interval estimation of the $\ell_q$ loss of an estimator in high-dimensional linear regression with random design. We establish the minimax rate for estimating the $\ell_q$ loss and the minimax expected length of confidence intervals for the $\ell_q$ loss of rate-optimal estimators of the regression vector, including commonly used estimators such as Lasso, scaled Lasso, square-root Lasso and Dantzig Selector. Adaptivity of confidence intervals for the $\ell_q$ loss is also studi...
-
Authors: Hirose, Masayo Yoshimori; Lahiri, Partha
Affiliations: Research Organization of Information & Systems (ROIS); Institute of Statistical Mathematics (ISM) - Japan; University System of Maryland; University of Maryland College Park
Abstract: The two-level normal hierarchical model (NHM) has played a critical role in statistical theory for the last several decades. In this paper, we propose a random effects variance estimator that simultaneously (i) improves on the estimation of the related shrinkage factors, (ii) protects empirical best linear unbiased predictors (EBLUP) [same as empirical Bayes (EB)] of the random effects from the common overshrinkage problem, (iii) avoids complex bias correction in generating strictly positive sec...
-
Authors: Sienkiewicz, Ela; Wang, Haonan
Affiliations: Colorado State University System; Colorado State University Fort Collins
Abstract: In this paper, we consider a set of unlabeled tree objects with topological and geometric properties. For each data object, two curve representations are developed to characterize its topological and geometric aspects. We further define the notions of topological and geometric medians as well as quantiles based on both representations. In addition, we take a novel approach to define the Pareto medians and quantiles through a multi-objective optimization problem. In particular, we study two dif...
-
Authors: Donoho, David; Gavish, Matan; Johnstone, Iain
Affiliations: Stanford University; Hebrew University of Jerusalem
Abstract: We show that in a common high-dimensional covariance model, the choice of loss function has a profound effect on optimal estimation. In an asymptotic framework based on the spiked covariance model and use of orthogonally invariant estimators, we show that optimal estimation of the population covariance matrix boils down to design of an optimal shrinker eta that acts elementwise on the sample eigenvalues. Indeed, to each loss function there corresponds a unique admissible eigenvalue shrinker et...
-
Authors: Doss, Hani; Park, Yeonhee
Affiliations: State University System of Florida; University of Florida; University of Texas System; UTMD Anderson Cancer Center
Abstract: Consider a Bayesian situation in which we observe $Y \sim p_\theta$, where $\theta \in \Theta$ and we have a family $\{\nu_h, h \in \mathcal{H}\}$ of potential prior distributions on $\Theta$. Let g be a real-valued function of $\theta$, and let $I_g(h)$ be the posterior expectation of $g(\theta)$ when the prior is $\nu_h$. We are interested in two problems: (i) selecting a particular value of h, and (ii) estimating the family of posterior expectations $\{I_g(h), h \in \mathcal{H}\}$. Let $m_y(h)$ b...
-
Authors: Groeneboom, Piet; Hendrickx, Kim
Affiliations: Delft University of Technology; Hasselt University
Abstract: We construct $\sqrt{n}$-consistent and asymptotically normal estimates for the finite dimensional regression parameter in the current status linear regression model, which do not require any smoothing device and are based on maximum likelihood estimates (MLEs) of the infinite dimensional parameter. We also construct estimates, again only based on these MLEs, which are arbitrarily close to efficient estimates, if the generalized Fisher information is finite. This type of efficiency is also derived ...