-
Authors: Vansteelandt, Stijn; Dukes, Oliver
Affiliations: Ghent University; University of London; London School of Hygiene & Tropical Medicine
Abstract: Inference for the parameters indexing generalised linear models is routinely based on the assumption that the model is correct and a priori specified. This is unsatisfactory because the chosen model is usually the result of a data-adaptive model selection process, which may induce excess uncertainty that is not usually acknowledged. Moreover, the assumptions encoded in the chosen model rarely represent some a priori known, ground truth, making standard inferences prone to bias, but also failin...
-
Author: Hennig, Christian
Affiliation: University of Bologna
-
Authors: Biswas, Niloy; Bhattacharya, Anirban; Jacob, Pierre E.; Johndrow, James E.
Affiliations: Harvard University; Texas A&M University System; Texas A&M University College Station; ESSEC Business School; University of Pennsylvania
Abstract: We consider Markov chain Monte Carlo (MCMC) algorithms for Bayesian high-dimensional regression with continuous shrinkage priors. A common challenge with these algorithms is the choice of the number of iterations to perform. This is critical when each iteration is expensive, as is the case when dealing with modern data sets, such as genome-wide association studies with thousands of rows and up to hundreds of thousands of columns. We develop coupling techniques tailored to the setting of high-d...
-
Authors: Han, Rungang; Luo, Yuetian; Wang, Miaoyan; Zhang, Anru R.
Affiliations: University of Wisconsin System; University of Wisconsin Madison; Duke University; Duke University; Duke University; Duke University
Abstract: High-order clustering aims to identify heterogeneous substructures in multiway datasets that arise commonly in neuroimaging, genomics, social network studies, etc. The non-convex and discontinuous nature of this problem poses significant challenges in both statistics and computation. In this paper, we propose a tensor block model and the computationally efficient methods, high-order Lloyd algorithm (HLloyd), and high-order spectral clustering (HSC), for high-order clustering. The convergence gu...
-
Authors: Fan, Jianqing; Fan, Yingying; Han, Xiao; Lv, Jinchi
Affiliations: Princeton University; University of Southern California; Chinese Academy of Sciences; University of Science & Technology of China, CAS
Abstract: Network data are prevalent in many contemporary big data applications in which a common interest is to unveil important latent links between different pairs of nodes. Yet a simple fundamental question of how to precisely quantify the statistical uncertainty associated with the identification of latent links still remains largely unexplored. In this paper, we propose the method of statistical inference on membership profiles in large networks (SIMPLE) in the setting of degree-corrected mixed me...
-
Authors: Dong, Jinshuo; Roth, Aaron; Su, Weijie J.
Affiliations: University of Pennsylvania; University of Pennsylvania; University of Pennsylvania
Abstract: In the past decade, differential privacy has seen remarkable success as a rigorous and practical formalization of data privacy. This privacy definition and its divergence-based relaxations, however, have several acknowledged weaknesses, either in handling composition of private algorithms or in analysing important primitives like privacy amplification by subsampling. Inspired by the hypothesis testing formulation of privacy, this paper proposes a new relaxation of differential privacy, which w...
-
Author: Mateu, Jorge
Affiliation: Universitat Jaume I
-
Authors: Wang, Ruodu; Ramdas, Aaditya
Affiliations: University of Waterloo; Carnegie Mellon University; Carnegie Mellon University
Abstract: E-values have gained attention as potential alternatives to p-values as measures of uncertainty, significance and evidence. In brief, e-values are realized by random variables with expectation at most one under the null; examples include betting scores, (point null) Bayes factors, likelihood ratios and stopped supermartingales. We design a natural analogue of the Benjamini-Hochberg (BH) procedure for false discovery rate (FDR) control that utilizes e-values, called the e-BH procedure, and comp...
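
As a rough illustration of the e-BH procedure named in this abstract, the sketch below uses the commonly stated form of the rule: with K hypotheses and e-values sorted from largest to smallest, find the largest k such that k times the k-th largest e-value is at least K / alpha, then reject the k hypotheses with the largest e-values. The function name `e_bh` and the example e-values are illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence


def e_bh(e_values: Sequence[float], alpha: float) -> List[int]:
    """Return the (sorted) indices rejected by the e-BH rule at level alpha.

    Hypothetical helper for illustration only.
    """
    K = len(e_values)
    if K == 0:
        return []
    # Order hypothesis indices by e-value, largest first.
    order = sorted(range(K), key=lambda i: e_values[i], reverse=True)
    k_star = 0
    for k, idx in enumerate(order, start=1):
        # Keep the largest k with k * e_[k] >= K / alpha.
        if k * e_values[idx] >= K / alpha:
            k_star = k
    # Reject the k_star hypotheses with the largest e-values.
    return sorted(order[:k_star])


# Made-up e-values for four hypotheses, tested at level 0.1.
rejected = e_bh([50.0, 1.0, 30.0, 0.5], alpha=0.1)
print(rejected)
```

Unlike the p-value version of BH, the threshold here grows with the e-value, so larger e-values count as stronger evidence against the null.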
-
Authors: Zhao, Qingyuan; Small, Dylan S.; Ertefaie, Ashkan
Affiliations: University of Cambridge; University of Pennsylvania; University of Rochester
Abstract: Effect modification occurs when the effect of the treatment on an outcome varies according to the level of other covariates and often has important implications in decision-making. When there are tens or hundreds of covariates, it becomes necessary to use the observed data to select a simpler model for effect modification and then make valid statistical inference. We propose a two-stage procedure to solve this problem. First, we use Robinson's transformation to decouple the nuisance parameters...
-
Author: Karmakar, Bikram
Affiliations: State University System of Florida; University of Florida
Abstract: Blocked randomized designs are used to improve the precision of treatment effect estimates compared to a completely randomized design. A block is a set of units that are relatively homogeneous and consequently would tend to produce relatively similar outcomes if the treatment had no effect. The problem of finding the optimal blocking of the units into equal-sized blocks of any given size larger than two is known to be a difficult problem: there is no polynomial-time method guaranteed to find th...