-
Authors: Wang, Shulei; Cai, T. Tony; Li, Hongzhe
Affiliations: University of Pennsylvania; University of Pennsylvania
Abstract: The weighted UniFrac distance, a plug-in estimator of the Wasserstein distance of read counts on a tree, has been widely used to measure the microbial community difference in microbiome studies. Our investigation however shows that such a plug-in estimator, although intuitive and commonly used in practice, suffers from potential bias. Motivated by this finding, we study the problem of optimal estimation of the Wasserstein distance between two distributions on a tree from the sampled data in th...
-
Authors: Jacob, Pierre E.; Gong, Ruobin; Edlefsen, Paul T.; Dempster, Arthur P.
Affiliations: Harvard University; Rutgers University System; Rutgers University New Brunswick; Fred Hutchinson Cancer Center; ESSEC Business School
Abstract: We are very grateful to all commenters for their stimulating remarks, questions, as well as useful pointers to the literature which span a wide range of statistical methods over decades of research. We have neither the space nor the knowledge to answer many of the questions raised, and we only aim to offer some clarifications. We hope that readers will be as enthusiastic as ourselves about research on the topics discussed by the commenters. In the following, we refer to Diaconis and Wang as DW...
-
Authors: Tang, Xiwei; Xue, Fei; Qu, Annie
Affiliations: University of Virginia; University of Pennsylvania
Abstract: In this article, we propose a heterogeneous modeling framework which achieves individual-wise feature selection and heterogeneous covariates' effects subgrouping simultaneously. In contrast to conventional model selection approaches, the new approach constructs a separation penalty with multidirectional shrinkages, which facilitates individualized modeling to distinguish strong signals from noisy ones and selects different relevant variables for different individuals. Meanwhile, the proposed m...
-
Authors: Jacob, Pierre E.; Gong, Ruobin; Edlefsen, Paul T.; Dempster, Arthur P.
Affiliations: Harvard University; Rutgers University System; Rutgers University New Brunswick; Fred Hutchinson Cancer Center
Abstract: We present a Gibbs sampler for the Dempster-Shafer (DS) approach to statistical inference for categorical distributions. The DS framework extends the Bayesian approach, allows in particular the use of partial prior information, and yields three-valued uncertainty assessments representing probabilities for, against, and don't know about formal assertions of interest. The proposed algorithm targets the distribution of a class of random convex polytopes which encapsulate the DS inference. The sam...
-
Authors: Ferrari, Federico; Dunson, David B.
Affiliations: Duke University
Abstract: This article is motivated by the problem of inference on interactions among chemical exposures impacting human health outcomes. Chemicals often co-occur in the environment or in synthetic mixtures and as a result exposure levels can be highly correlated. We propose a latent factor joint model, which includes shared factors in both the predictor and response components while assuming conditional independence. By including a quadratic regression in the latent variables in the response component,...
-
Authors: Qiu, Yixuan; Wang, Xiao
Affiliations: Carnegie Mellon University; Purdue University System; Purdue University
Abstract: Latent variable models cover a broad range of statistical and machine learning models, such as Bayesian models, linear mixed models, and Gaussian mixture models. Existing methods often suffer from two major challenges in practice: (a) a proper latent variable distribution is difficult to be specified; (b) making an exact likelihood inference is formidable due to the intractable computation. We propose a novel framework for the inference of latent variable models that overcomes these two limita...
-
Authors: Delaigle, Aurore; Hall, Peter; Huang, Wei; Kneip, Alois
Affiliations: University of Melbourne; University of Melbourne; University of Bonn; University of Bonn
Abstract: We consider the problem of estimating the covariance function of functional data which are only observed on a subset of their domain, such as fragments observed on small intervals or related types of functional data. We focus on situations where the data enable to compute the empirical covariance function or smooth versions of it only on a subset of its domain which contains a diagonal band. We show that estimating the covariance function consistently outside that subset is possible as long as...
-
Authors: Sarkar, Abhra; Pati, Debdeep; Mallick, Bani K.; Carroll, Raymond J.
Affiliations: University of Texas System; University of Texas Austin; Texas A&M University System; Texas A&M University College Station; University of Technology Sydney
Abstract: Estimating the marginal and joint densities of the long-term average intakes of different dietary components is an important problem in nutritional epidemiology. Since these variables cannot be directly measured, data are usually collected in the form of 24-hr recalls of the intakes, which show marked patterns of conditional heteroscedasticity. Significantly compounding the challenges, the recalls for episodically consumed dietary components also include exact zeros. The problem of estimating ...
-
Authors: Hu, Jianwei; Zhang, Jingfei; Qin, Hong; Yan, Ting; Zhu, Ji
Affiliations: Central China Normal University; University of Miami; Zhongnan University of Economics & Law; University of Michigan System; University of Michigan
Abstract: The stochastic block model is widely used for detecting community structures in network data. How to test the goodness of fit of the model is one of the fundamental problems and has gained growing interests in recent years. In this article, we propose a novel goodness-of-fit test based on the maximum entry of the centered and rescaled adjacency matrix for the stochastic block model. One noticeable advantage of the proposed test is that the number of communities can be allowed to grow linearly ...
-
Authors: Guo, Xinzhou; He, Xuming
Affiliations: University of Michigan System; University of Michigan
Abstract: When existing clinical trial data suggest a promising subgroup, we must address the question of how good the selected subgroup really is. The usual statistical inference applied to the selected subgroup, assuming that the subgroup is chosen independent of the data, may lead to an overly optimistic evaluation of the selected subgroup. In this article, we address the issue of selection bias and develop a de-biasing bootstrap inference procedure for the best selected subgroup effect. The proposed...