-
Authors: Lee, Stephen M. S.; Soleymani, Mehdi
Affiliations: University of Hong Kong
Abstract: Suppose that two estimators, $\hat{\theta}_{S,n}$ and $\hat{\theta}_{N,n}$, are available for estimating an unknown parameter $\theta$, and are known to have convergence rates $n^{1/2}$ and $r_n = o(n^{1/2})$, respectively, based on a sample of size $n$. Typically, the more efficient estimator $\hat{\theta}_{S,n}$ is less robust than $\hat{\theta}_{N,n}$, and a definitive choice cannot be easily made between them under practical circumstances. We propose a simple mixture estimator, in the form o...
-
Authors: Morgan, Kari Lock; Rubin, Donald B.
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park; Harvard University
Abstract: When conducting a randomized experiment, if an allocation yields treatment groups that differ meaningfully with respect to relevant covariates, groups should be rerandomized. The process involves specifying an explicit criterion for whether an allocation is acceptable, based on a measure of covariate balance, and rerandomizing units until an acceptable allocation is obtained. Here, we illustrate how rerandomization could have improved the design of an already conducted randomized experiment on...
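Note: The rerandomization loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the balance measure (absolute difference in group means of a single covariate), the threshold, the simulated `ages` data, and the names `mean_difference` and `rerandomize` are all assumptions made for illustration.

```python
import random
from statistics import mean


def mean_difference(covariate, assignment):
    """Absolute difference in covariate means between treated (1) and control (0)."""
    treated = [x for x, a in zip(covariate, assignment) if a == 1]
    control = [x for x, a in zip(covariate, assignment) if a == 0]
    return abs(mean(treated) - mean(control))


def rerandomize(covariate, threshold, rng, max_draws=10_000):
    """Draw random half/half allocations until the balance criterion is met.

    The criterion (mean difference <= threshold) is an assumed example of
    "an explicit criterion for whether an allocation is acceptable".
    """
    n = len(covariate)
    base = [1] * (n // 2) + [0] * (n - n // 2)
    for draw in range(1, max_draws + 1):
        assignment = base[:]
        rng.shuffle(assignment)  # one candidate randomization
        if mean_difference(covariate, assignment) <= threshold:
            return assignment, draw
    raise RuntimeError("no acceptable allocation found; loosen the threshold")


# Hypothetical data: 20 units with one simulated covariate.
rng = random.Random(0)
ages = [rng.gauss(40, 10) for _ in range(20)]
assignment, draws = rerandomize(ages, threshold=1.0, rng=rng)
print(f"accepted after {draws} draw(s); imbalance = {mean_difference(ages, assignment):.3f}")
```

A tighter threshold gives better covariate balance but requires more draws on average before an allocation is accepted.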
-
Authors: Xu, Jin; Chen, Jiajie; Qian, Peter Z. G.
Affiliations: East China Normal University; Wells Fargo Company; University of Wisconsin System; University of Wisconsin Madison
Abstract: The use of iteratively enlarged Latin hypercube designs for running computer experiments has recently gained popularity in practice. This approach conducts an initial experiment with a computer code using a Latin hypercube design and then runs a follow-up experiment with additional runs elaborately chosen so that the combined design set for the two experiments forms a larger Latin hypercube design. This augmenting process can be repeated over multiple stages, where in each stage the augmented desig...
-
Authors: Guo, Zifang; Li, Lexin; Lu, Wenbin; Li, Bing
Affiliations: Merck & Company; Merck & Company USA; Stanford University; Li Ka Shing Center; University of California System; University of California Berkeley; North Carolina State University; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park
Abstract: The family of sufficient dimension reduction (SDR) methods that produce informative combinations of predictors, or indices, is particularly useful for high-dimensional regression analysis. In many such analyses, it is increasingly common that a priori subject knowledge of the predictors is available; for example, they belong to different groups. While many recent SDR proposals have greatly expanded the scope of the methods' applicability, how to effectively incorporate the prior pr...
-
Authors: Zhou, Jing; Bhattacharya, Anirban; Herring, Amy H.; Dunson, David B.
Affiliations: University of North Carolina; University of North Carolina Chapel Hill; Texas A&M University System; Texas A&M University College Station; University of North Carolina; University of North Carolina Chapel Hill; Duke University
Abstract: It has become routine to collect data that are structured as multiway arrays (tensors). There is an enormous literature on low rank and sparse matrix factorizations, but limited consideration of extensions to the tensor case in statistics. The most common low rank tensor factorization relies on parallel factor analysis (PARAFAC), which expresses a rank k tensor as a sum of rank one tensors. In contingency table applications in which the sample size is massively less than the number of cells in...
-
Authors: Belloni, Alexandre; Chernozhukov, Victor
Affiliations: Duke University; Massachusetts Institute of Technology (MIT)
-
Authors: Sewell, Daniel K.; Chen, Yuguo
Affiliations: University of Illinois System; University of Illinois Urbana-Champaign
Abstract: Dynamic networks are used in a variety of fields to represent the structure and evolution of the relationships between entities. We present a model which embeds longitudinal network data as trajectories in a latent Euclidean space. We propose a Markov chain Monte Carlo (MCMC) algorithm to estimate the model parameters and latent positions of the actors in the network. The model yields meaningful visualization of dynamic networks, giving the researcher insight into the evolution and the structure...
-
Authors: Zhu, Ruoqing; Zeng, Donglin; Kosorok, Michael R.
Affiliations: University of North Carolina; University of North Carolina Chapel Hill
Abstract: In this article, we introduce a new type of tree-based method, reinforcement learning trees (RLT), which exhibits significantly improved performance over traditional methods such as random forests (Breiman 2001) under high-dimensional settings. The innovations are threefold. First, the new method implements reinforcement learning at each selection of a splitting variable during the tree construction processes. By splitting on the variable that brings the greatest future improvement in later sp...
-
Authors: Angrist, Joshua D.; Rokkanen, Miikka
Affiliations: Massachusetts Institute of Technology (MIT); National Bureau of Economic Research; Columbia University
-
Authors: Wang, Lan; Peng, Bo; Li, Runze
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park; University of Minnesota System; University of Minnesota Twin Cities; Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park
Abstract: This work is concerned with testing the population mean vector of nonnormal high-dimensional multivariate data. Several tests for the high-dimensional mean vector, based on modifying the classical Hotelling $T^2$ test, have been proposed in the literature. Despite their usefulness, they tend to have unsatisfactory power performance for heavy-tailed multivariate data, which frequently arise in genomics and quantitative finance. This article proposes a novel high-dimensional nonparametric test for the...