-
Authors: Wang, Minjie; Allen, Genevera I.
Affiliations: University of Minnesota System; University of Minnesota Twin Cities; Rice University
Abstract: Structural learning of Gaussian graphical models in the presence of latent variables has long been a challenging problem. Chandrasekaran et al. (2012) proposed a convex program for estimating a sparse graph plus a low-rank term that adjusts for latent variables; however, this approach poses challenges from both computational and statistical perspectives. We propose an alternative, simple solution: apply a hard-thresholding operator to existing graph selection methods. Conceptually simple and c...
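As a rough illustration of the hard-thresholding idea mentioned in this abstract, the sketch below zeroes out small off-diagonal entries of an already estimated precision matrix and reads the graph off the surviving entries. The function names `hard_threshold` and `edge_set`, the threshold `tau`, and the example matrix are all hypothetical; the abstract does not say which base estimator or threshold level the authors use, so this is not their procedure.

```python
def hard_threshold(theta, tau):
    """Keep the diagonal; zero every off-diagonal entry with |value| < tau.

    theta is a square matrix given as a list of lists (an assumed,
    already computed precision-matrix estimate); tau is an assumed cutoff.
    """
    p = len(theta)
    return [
        [theta[i][j] if (i == j or abs(theta[i][j]) >= tau) else 0.0
         for j in range(p)]
        for i in range(p)
    ]


def edge_set(theta):
    """Return the undirected edges (i, j), i < j, with a nonzero entry."""
    p = len(theta)
    return {(i, j) for i in range(p) for j in range(i + 1, p)
            if theta[i][j] != 0.0}


# Hypothetical estimate: the (0, 1) entry is small and gets removed.
theta_hat = [
    [2.00, 0.05, -0.80],
    [0.05, 1.50, 0.30],
    [-0.80, 0.30, 0.10],
]
graph = edge_set(hard_threshold(theta_hat, tau=0.2))
```

Here `graph` contains the pairs whose estimated partial dependence survives the cutoff; the choice of `tau` controls how sparse the selected graph is.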
-
Authors: Basse, Guillaume W.; Ding, Yi; Toulis, Panos
Affiliations: Stanford University; Massachusetts Institute of Technology (MIT); University of Chicago
Abstract: In many modern settings, such as an online marketplace, randomized experiments need to be executed over multiple time periods. In such temporal experiments, it has been observed that the effects of an intervention on an experimental unit may be large when the unit is first exposed to it, but then it attenuates after repeated exposures. This is typically due to units' habituation to the intervention, or some other form of learning, such as when users gradually start to ignore repeated mails sen...
-
Authors: Azriel, D.
Affiliations: Technion Israel Institute of Technology
Abstract: This work studies an experimental design problem where the values of a predictor variable, denoted by x, are to be determined with the goal of estimating a function m(x), which is observed with noise. A linear model is fitted to m(x), but it is not assumed that the model is correctly specified. It follows that the quantity of interest is the best linear approximation of m(x), which is denoted by l(x). It is shown that in this framework the ordinary least squares estimator typically leads to an...
-
Authors: Rigon, Tommaso; Herring, Amy H.; Dunson, David B.
Affiliations: University of Milano-Bicocca; Duke University
Abstract: Loss-based clustering methods, such as k-means clustering and its variants, are standard tools for finding groups in data. However, the lack of quantification of uncertainty in the estimated clusters is a disadvantage. Model-based clustering based on mixture models provides an alternative approach, but such methods face computational problems and are highly sensitive to the choice of kernel. In this article we propose a generalized Bayes framework that bridges between these paradigms through t...
-
Authors: Yan, Jian; Zhang, Xianyang
Affiliations: Texas A&M University System; Texas A&M University College Station
Abstract: Motivated by the increasing use of kernel-based metrics for high-dimensional and large-scale data, we study the asymptotic behaviour of kernel two-sample tests when the dimension and sample sizes both diverge to infinity. We focus on the maximum mean discrepancy using an isotropic kernel, which includes maximum mean discrepancy with the Gaussian kernel and the Laplace kernel, and the energy distance as special cases. We derive asymptotic expansions of the kernel two-sample statistics, based on...
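For readers unfamiliar with the statistic named in this abstract, the following is a minimal sketch of the standard unbiased estimate of the squared maximum mean discrepancy with a Gaussian kernel. The function names, the `bandwidth` parameter, and the toy samples are assumptions made for illustration; the abstract does not specify the authors' bandwidth choice or their exact form of the statistic.

```python
import math


def gaussian_kernel(u, v, bandwidth):
    """Gaussian kernel exp(-||u - v||^2 / (2 * bandwidth^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, bandwidth):
    """Unbiased estimate of squared MMD between samples xs and ys.

    xs and ys are lists of equal-length numeric vectors, each with at
    least two observations. Within-sample sums skip the i == j terms.
    """
    n, m = len(xs), len(ys)
    sum_xx = sum(gaussian_kernel(xs[i], xs[j], bandwidth)
                 for i in range(n) for j in range(n) if i != j)
    sum_yy = sum(gaussian_kernel(ys[i], ys[j], bandwidth)
                 for i in range(m) for j in range(m) if i != j)
    sum_xy = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return (sum_xx / (n * (n - 1))
            + sum_yy / (m * (m - 1))
            - 2.0 * sum_xy / (n * m))


# Toy samples (hypothetical): the second pair is far apart.
near = mmd2_unbiased([[0.0], [1.0]], [[0.0], [1.0]], bandwidth=1.0)
far = mmd2_unbiased([[0.0], [1.0]], [[10.0], [11.0]], bandwidth=1.0)
```

The unbiased estimate can be negative when the two samples look alike, as `near` is here, while well-separated samples such as those behind `far` give a clearly positive value.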
-
Authors: Fomichov, V.; Ivanovs, J.
Affiliations: Aarhus University
Abstract: There is growing empirical evidence that spherical k-means clustering performs well at identifying groups of concomitant extremes in high dimensions, thereby leading to sparse models. We provide one of the first theoretical results supporting this approach, but also demonstrate some pitfalls. Furthermore, we show that an alternative cost function may be more appropriate for identifying concomitant extremes, and it results in a novel spherical k-principal-components clustering algorithm. Our ma...
-
Authors: Lunde, Robert; Sarkar, Purnamrita
Affiliations: University of Michigan System; University of Michigan; University of Texas System; University of Texas Austin
Abstract: We study the properties of two subsampling procedures for networks, vertex subsampling and p-subsampling, under the sparse graphon model. The consistency of network subsampling is demonstrated under the minimal assumptions of weak convergence of the corresponding network statistics and an expected subsample size growing to infinity more slowly than the number of vertices in the network. Furthermore, under appropriate sparsity conditions, we derive limiting distributions for the nonzero eigenva...
-
Authors: Yu, Miao; Lu, Wenbin; Yang, Shu; Ghosh, Pulak
Affiliations: North Carolina State University; Indian Institute of Management (IIM System); Indian Institute of Management Bangalore
Abstract: Zero-inflated nonnegative outcomes are common in many applications. In this work, motivated by freemium mobile game data, we propose a class of multiplicative structural nested mean models for zero-inflated nonnegative outcomes which flexibly describes the joint effect of a sequence of treatments in the presence of time-varying confounders. The proposed estimator solves a doubly robust estimating equation, where the nuisance functions, namely the propensity score and conditional outcome means ...
-
Authors: Kwon, Yeil; Zhao, Zhigen
Affiliations: University of Central Arkansas; Pennsylvania Commonwealth System of Higher Education (PCSHE); Temple University
Abstract: We consider the problem of empirical Bayes estimation of multiple variances when provided with sample variances. Assuming an arbitrary prior on the variances, we derive different versions of the Bayes estimators using different loss functions. For one particular loss function, the resulting Bayes estimator relies on the marginal cumulative distribution function of the sample variances only. When replacing it with the empirical distribution function, we obtain an empirical Bayes version called ...
-
Authors: Marrs, F. W.; Fosdick, B. K.; McCormick, T. H.
Affiliations: United States Department of Energy (DOE); Los Alamos National Laboratory; Colorado State University System; Colorado State University Fort Collins; University of Washington; University of Washington Seattle
Abstract: Relational arrays represent measures of association between pairs of actors, often in varied contexts or over time. Trade flows between countries, financial transactions between individuals, contact frequencies between school children in classrooms and dynamic protein-protein interactions are all examples of relational arrays. Elements of a relational array are often modelled as a linear function of observable covariates. Uncertainty estimates for regression coefficient estimators, and ideally...