-
Authors: Wasserman, Larry; Roeder, Kathryn
Affiliations: Carnegie Mellon University
Abstract: This paper explores the following question: what kind of statistical guarantees can be given when doing variable selection in high-dimensional models? In particular, we look at the error rates and power of some multi-stage regression methods. In the first stage we fit a set of candidate models. In the second stage we select one model by cross-validation. In the third stage we use hypothesis testing to eliminate some variables. We refer to the first two stages as screening and the last stage as...
-
Authors: Juditsky, Anatoli B.; Nemirovski, Arkadi S.
Affiliations: University System of Georgia; Georgia Institute of Technology; Communaute Universite Grenoble Alpes; Institut National Polytechnique de Grenoble; Universite Grenoble Alpes (UGA); Centre National de la Recherche Scientifique (CNRS); Inria
Abstract: The problem we concentrate on is as follows: given (1) a convex compact set X in R^n, an affine mapping x ↦ A(x), a parametric family {p_μ(·)} of probability densities and (2) N i.i.d. observations of the random variable ω, distributed with the density p_{A(x)}(·) for some (unknown) x ∈ X, estimate the value g^T x of a given linear form at x. For several families {p_μ(·)} with no additional assumptions on X and A, we develop computationally efficient estima...
-
Authors: Candes, Emmanuel J.; Plan, Yaniv
Affiliations: California Institute of Technology
Abstract: We consider the fundamental problem of estimating the mean of a vector y = Xβ + z, where X is an n x p design matrix in which one can have far more variables than observations, and z is a stochastic error term (the so-called p > n setup). When β is sparse, or, more generally, when there is a sparse subset of covariates providing a close approximation to the unknown mean vector, we ask whether or not it is possible to accurately estimate Xβ using a computationally tractable algorithm....
-
Authors: Ait-Sahalia, Yacine; Jacod, Jean
Affiliations: Princeton University; National Bureau of Economic Research; Centre National de la Recherche Scientifique (CNRS); CNRS - National Institute for Mathematical Sciences (INSMI); Universite Paris Cite; Sorbonne Universite
Abstract: We define a generalized index of jump activity, propose estimators of that index for a discretely sampled process and derive the estimators' properties. These estimators are applicable despite the presence of Brownian volatility in the process, which makes it more challenging to infer the characteristics of the small, infinite activity jumps. When the method is applied to high frequency stock returns, we find evidence of infinitely active jumps in the data and estimate their index of activity.
-
Authors: Chen, Jiahua; Li, Pengfei
Affiliations: University of British Columbia; University of Alberta
Abstract: Normal mixture distributions are arguably the most important mixture models, and also the most technically challenging. The likelihood function of the normal mixture model is unbounded based on a set of random samples, unless an artificial bound is placed on its component variance parameter. Moreover, the model is not strongly identifiable so it is hard to differentiate between overdispersion caused by the presence of a mixture and that caused by a large variance, and it has infinite Fisher...
-
Authors: Tokdar, Surya T.; Martin, Ryan; Ghosh, Jayanta K.
Affiliations: Carnegie Mellon University; Purdue University System; Purdue University; Indian Statistical Institute; Indian Statistical Institute Kolkata
Abstract: Mixture models have received considerable attention recently and Newton [Sankhya Ser A 64 (2002) 306-322] proposed a fast recursive algorithm for estimating a mixing distribution. We prove almost sure consistency of this recursive estimate in the weak topology under mild conditions on the family of densities being mixed. This recursive estimate depends on the data ordering and a permutation-invariant modification is proposed, which is an average of the original over permutations of the data se...
-
Authors: Grendar, Marian; Judge, George
Affiliations: Matej Bel University; Slovak Academy of Sciences; Institute of Measurement Science, SAS; University of California System; University of California Berkeley
Abstract: In this paper we are interested in empirical likelihood (EL) as a method of estimation, and we address the following two problems: (1) selecting among various empirical discrepancies in an EL framework and (2) demonstrating that EL has a well-defined probabilistic interpretation that would justify its use in a Bayesian context. Using the large deviations approach, a Bayesian law of large numbers is developed that implies that EL and the Bayesian maximum a posteriori probability (MAP) estimator...
-
Authors: Nan, Bin; Kalbfleisch, John D.; Yu, Menggang
Affiliations: University of Michigan System; University of Michigan; Indiana University System; Indiana University Indianapolis
Abstract: We consider a class of doubly weighted rank-based estimating methods for the transformation (or accelerated failure time) model with missing data as arise, for example, in case-cohort studies. The weights considered may not be predictable as required in a martingale stochastic process formulation. We treat the general problem as a semiparametric estimating equation problem and provide proofs of asymptotic properties for the weighted estimators, with either true weights or estimated weights....
-
Authors: Zhang, Wenyang; Fan, Jianqing; Sun, Yan
Affiliations: University of Bath; Princeton University; Shanghai University of Finance & Economics
Abstract: In the analysis of cluster data, the regression coefficients are frequently assumed to be the same across all clusters. This hampers the ability to study the varying impacts of factors on each cluster. In this paper, a semiparametric model is introduced to account for varying impacts of factors over clusters by using cluster-level covariates. It achieves the parsimony of parametrization and allows the exploration of nonlinear interactions. The random effect in the semiparametric model also a...