-
Authors: Rigon, Tommaso; Herring, Amy H.; Dunson, David B.
Affiliations: University of Milano-Bicocca; Duke University
Abstract: Loss-based clustering methods, such as k-means clustering and its variants, are standard tools for finding groups in data. However, the lack of quantification of uncertainty in the estimated clusters is a disadvantage. Model-based clustering based on mixture models provides an alternative approach, but such methods face computational problems and are highly sensitive to the choice of kernel. In this article we propose a generalized Bayes framework that bridges between these paradigms through t...
-
Authors: Yan, Jian; Zhang, Xianyang
Affiliations: Texas A&M University System; Texas A&M University College Station
Abstract: Motivated by the increasing use of kernel-based metrics for high-dimensional and large-scale data, we study the asymptotic behaviour of kernel two-sample tests when the dimension and sample sizes both diverge to infinity. We focus on the maximum mean discrepancy using an isotropic kernel, which includes maximum mean discrepancy with the Gaussian kernel and the Laplace kernel, and the energy distance as special cases. We derive asymptotic expansions of the kernel two-sample statistics, based on...
-
Authors: Fomichov, V.; Ivanovs, J.
Affiliations: Aarhus University
Abstract: There is growing empirical evidence that spherical k-means clustering performs well at identifying groups of concomitant extremes in high dimensions, thereby leading to sparse models. We provide one of the first theoretical results supporting this approach, but also demonstrate some pitfalls. Furthermore, we show that an alternative cost function may be more appropriate for identifying concomitant extremes, and it results in a novel spherical k-principal-components clustering algorithm. Our ma...
-
Authors: Lunde, Robert; Sarkar, Purnamrita
Affiliations: University of Michigan System; University of Michigan; University of Texas System; University of Texas Austin
Abstract: We study the properties of two subsampling procedures for networks, vertex subsampling and p-subsampling, under the sparse graphon model. The consistency of network subsampling is demonstrated under the minimal assumptions of weak convergence of the corresponding network statistics and an expected subsample size growing to infinity more slowly than the number of vertices in the network. Furthermore, under appropriate sparsity conditions, we derive limiting distributions for the nonzero eigenva...
-
Authors: Yu, Miao; Lu, Wenbin; Yang, Shu; Ghosh, Pulak
Affiliations: North Carolina State University; Indian Institute of Management (IIM System); Indian Institute of Management Bangalore
Abstract: Zero-inflated nonnegative outcomes are common in many applications. In this work, motivated by freemium mobile game data, we propose a class of multiplicative structural nested mean models for zero-inflated nonnegative outcomes which flexibly describes the joint effect of a sequence of treatments in the presence of time-varying confounders. The proposed estimator solves a doubly robust estimating equation, where the nuisance functions, namely the propensity score and conditional outcome means ...
-
Authors: Kwon, Yeil; Zhao, Zhigen
Affiliations: University of Central Arkansas; Pennsylvania Commonwealth System of Higher Education (PCSHE); Temple University
Abstract: We consider the problem of empirical Bayes estimation of multiple variances when provided with sample variances. Assuming an arbitrary prior on the variances, we derive different versions of the Bayes estimators using different loss functions. For one particular loss function, the resulting Bayes estimator relies on the marginal cumulative distribution function of the sample variances only. When replacing it with the empirical distribution function, we obtain an empirical Bayes version called ...
-
Authors: Marrs, F. W.; Fosdick, B. K.; McCormick, T. H.
Affiliations: United States Department of Energy (DOE); Los Alamos National Laboratory; Colorado State University System; Colorado State University Fort Collins; University of Washington; University of Washington Seattle
Abstract: Relational arrays represent measures of association between pairs of actors, often in varied contexts or over time. Trade flows between countries, financial transactions between individuals, contact frequencies between school children in classrooms and dynamic protein-protein interactions are all examples of relational arrays. Elements of a relational array are often modelled as a linear function of observable covariates. Uncertainty estimates for regression coefficient estimators, and ideally...
-
Authors: Chu, J.; Lu, W.; Yang, S.
Affiliations: North Carolina State University
Abstract: Personalized decision-making, aiming to derive optimal treatment regimes based on individual characteristics, has recently attracted increasing attention in many fields, such as medicine, social services and economics. Current literature mainly focuses on estimating treatment regimes from a single source population. In real-world applications, the distribution of a target population can be different from that of the source population. Therefore, treatment regimes learned by existing methods ma...
-
Authors: Dunn, Robin; Ramdas, Aaditya; Balakrishnan, Sivaraman; Wasserman, Larry
Affiliations: Novartis; Novartis USA; Carnegie Mellon University
Abstract: The classical likelihood ratio test based on the asymptotic chi-squared distribution of the log-likelihood is one of the fundamental tools of statistical inference. A recent universal likelihood ratio test approach based on sample splitting provides valid hypothesis tests and confidence sets in any setting for which we can compute the split likelihood ratio statistic, or, more generally, an upper bound on the null maximum likelihood. The universal likelihood ratio test is valid in finite sampl...
-
Authors: Duanmu, Haosui; Roy, Daniel M.; Smith, Aaron
Affiliations: Harbin Institute of Technology; University of California System; University of California Berkeley; University of Toronto; University of Ottawa
Abstract: A matching prior at level 1 - α is a prior such that an associated 1 - α credible region is also a 1 - α confidence set. We study the existence of matching priors for general families of credible regions. Our main result gives topological conditions under which matching priors for specific families of credible regions exist. Informally, we prove that, on compact parameter spaces, a matching prior exists if the so-called rejection-probability function is jointly continuous when we adopt the Wass...