-
Authors: Candes, Emmanuel; Tao, Terence
Affiliations: California Institute of Technology; University of California System; University of California Los Angeles
Abstract: In many important statistical applications, the number of variables or parameters $p$ is much larger than the number of observations $n$. Suppose then that we have observations $y = X\beta + z$, where $\beta \in \mathbb{R}^p$ is a parameter vector of interest, $X$ is a data matrix with possibly far fewer rows than columns, $n \ll p$, and the $z_i$'s are i.i.d. $N(0, \sigma^2)$. Is it possible to estimate $\beta$ reliably based on the noisy data $y$? To estimate $\beta$, we introduce a new estimator, which we call the Dantzig s...
-
Authors: Ben Hariz, Samir; Wylie, Jonathan J.; Zhang, Qiang
Affiliations: Le Mans Universite; City University of Hong Kong
Abstract: Let $(X_i)_{i=1,\ldots,n}$ be a possibly nonstationary sequence such that $\mathcal{L}(X_i) = P_n$ if $i \le n\theta$ and $\mathcal{L}(X_i) = Q_n$ if $i > n\theta$, where $0 < \theta < 1$ is the location of the change-point to be estimated. We construct a class of estimators based on the empirical measures and a seminorm on the space of measures defined through a family of functions $\mathcal{F}$. We prove the consistency of the estimator and give rates of convergence under very general conditions. In particular, the $1/n$ rate is achieved ...
-
Authors: Hall, Peter; Meister, Alexander
Affiliations: University of Melbourne; University of Stuttgart
Abstract: Kernel methods for deconvolution have attractive features, and prevail in the literature. However, they have disadvantages, which include the fact that they are usually suitable only for cases where the error distribution is infinitely supported and its characteristic function does not ever vanish. Even in these settings, optimal convergence rates are achieved by kernel estimators only when the kernel is chosen to adapt to the unknown smoothness of the target distribution. In this paper we sug...
-
Authors: Nordman, Daniel J.; Lahiri, Soumendra N.
Affiliations: Iowa State University
Abstract: This paper introduces a version of empirical likelihood based on the periodogram and spectral estimating equations. This formulation handles dependent data through a data transformation (i.e., a Fourier transform) and is developed in terms of the spectral distribution rather than a time domain probability distribution. The asymptotic properties of frequency domain empirical likelihood are studied for linear time processes exhibiting both short- and long-range dependence. The method results in ...
-
Authors: Einmahl, John H. J.; De Haan, Laurens; Li, Deyuan
Affiliations: Tilburg University; University of Bern; Erasmus University Rotterdam; Erasmus University Rotterdam - Excl Erasmus MC
Abstract: Consider $n$ i.i.d. random vectors on $\mathbb{R}^2$, with unknown, common distribution function $F$. Under a sharpening of the extreme value condition on $F$, we derive a weighted approximation of the corresponding tail copula process. Then we construct a test to check whether the extreme value condition holds by comparing two estimators of the limiting extreme value distribution, one obtained from the tail copula process and the other obtained by first estimating the spectral measure which is then used as a ...
-
Authors: Geiger, Dan; Meek, Christopher; Sturmfels, Bernd
Affiliations: Technion Israel Institute of Technology; Microsoft; University of California System; University of California Berkeley
Abstract: We formulate necessary and sufficient conditions for an arbitrary discrete probability distribution to factor according to an undirected graphical model, or a log-linear model, or other more general exponential models. For decomposable graphical models these conditions are equivalent to a set of conditional independence statements similar to the Hammersley-Clifford theorem; however, we show that for nondecomposable graphical models they are not. We also show that nondecomposable models can hav...
-
Authors: Cai, T. Tony; Hall, Peter
Affiliations: University of Pennsylvania; Australian National University
Abstract: There has been substantial recent work on methods for estimating the slope function in linear regression for functional data analysis. However, as in the case of more conventional finite-dimensional regression, much of the practical interest in the slope centers on its application for the purpose of prediction, rather than on its significance in its own right. We show that the problems of slope-function estimation, and of prediction from an estimator of the slope function, have very different ...
-
Authors: Minary, Peter; Levitt, Michael
Affiliations: Stanford University
Abstract: Novel sampling algorithms can significantly impact open questions in computational biology, most notably the in silico protein folding problem. By using computational methods, protein folding aims to find the three-dimensional structure of a protein chain given the sequence of its amino acid building blocks. The complexity of the problem strongly depends on the protein representation and its energy function. The more detailed the model, the more complex its corresponding energy function and th...
-
Authors: Chen, Yuguo; Dinwoodie, Ian H.; Sullivant, Seth
Affiliations: University of Illinois System; University of Illinois Urbana-Champaign; Harvard University; Duke University
Abstract: We describe an algorithm for the sequential sampling of entries in multiway contingency tables with given constraints. The algorithm can be used for computations in exact conditional inference. To justify the algorithm, a theory relates sampling values at each step to properties of the associated toric ideal using computational commutative algebra. In particular, the property of interval cell counts at each step is related to exponents on lead indeterminates of a lexicographic Gröbner basis. A...
-
Authors: Hallin, Marc; Oja, Hannu; Paindaveine, Davy
Affiliations: Universite Libre de Bruxelles; Universite Libre de Bruxelles; Tampere University
Abstract: A class of R-estimators based on the concepts of multivariate signed ranks and the optimal rank-based tests developed in Hallin and Paindaveine [Ann. Statist. 34 (2006) 2707-2756] is proposed for the estimation of the shape matrix of an elliptical distribution. These R-estimators are root-$n$ consistent under any radial density $g$, without any moment assumptions, and semiparametrically efficient at some prespecified density $f$. When based on normal scores, they are uniformly more efficient than th...