-
Authors: Frot, Benjamin; Jostins, Luke; McVean, Gilean
Affiliations: University of Oxford; University of Oxford; Wellcome Centre for Human Genetics; University of Oxford; Kennedy Institute for Rheumatology; University of Oxford
Abstract: We consider the problem of learning a conditional Gaussian graphical model in the presence of latent variables. Building on recent advances in this field, we suggest a method that decomposes the parameters of a conditional Markov random field into the sum of a sparse and a low-rank matrix. We derive convergence bounds for this estimator and show that it is well-behaved in the high-dimensional regime as well as sparsistent (i.e., capable of recovering the graph structure). We then show how prox...
-
Authors: Luo, Xiangyu; Wei, Yingying
Affiliations: Chinese University of Hong Kong
Abstract: High-throughput experimental data are accumulating exponentially in public databases. Unfortunately, however, mining valid scientific discoveries from these abundant resources is hampered by technical artifacts and inherent biological heterogeneity. The former are usually termed batch effects, and the latter is often modeled by subtypes. Existing methods either tackle batch effects provided that subtypes are known or cluster subtypes assuming that batch effects are absent. Consequently, there ...
-
Authors: Zhao, Qingyuan
Affiliations: University of Pennsylvania
Abstract: This article proposes a new quantity called the sensitivity value, which is defined as the minimum strength of unmeasured confounders needed to change the qualitative conclusions of a naive analysis assuming no unmeasured confounder. We establish the asymptotic normality of the sensitivity value in pair-matched observational studies. The theoretical results are then used to approximate the power of a sensitivity analysis and select the design of a study. We explore the potential to use sensiti...
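The idea of a "minimum strength of unmeasured confounders needed to change the qualitative conclusions" can be illustrated with a small sketch. The code below is not the estimator studied in the article; it assumes Rosenbaum's sensitivity model for matched pairs (bias parameter `gamma`, which the abstract does not name) and uses the simple sign test, for which the worst-case one-sided p-value is a Binomial tail with success probability `gamma / (1 + gamma)`. The function names, the bisection search, and the cap `gamma_max` are illustrative choices, and ties (zero pair differences) are assumed to have been dropped.

```python
from math import comb


def sign_test_upper_pvalue(n_pairs: int, n_positive: int, gamma: float) -> float:
    """Worst-case one-sided sign-test p-value when hidden bias is at most gamma.

    Under Rosenbaum's model (an assumption of this sketch), the chance that the
    treated unit in a pair has the larger outcome is at most gamma / (1 + gamma).
    """
    p = gamma / (1.0 + gamma)
    return sum(
        comb(n_pairs, k) * p**k * (1.0 - p) ** (n_pairs - k)
        for k in range(n_positive, n_pairs + 1)
    )


def sensitivity_value(n_pairs: int, n_positive: int, alpha: float = 0.05,
                      gamma_max: float = 1000.0, tol: float = 1e-8) -> float:
    """Smallest gamma >= 1 at which the worst-case p-value exceeds alpha.

    Returns 1.0 if the naive analysis (gamma = 1) is already non-significant,
    and gamma_max if the conclusion survives every gamma up to that cap.
    """
    if sign_test_upper_pvalue(n_pairs, n_positive, 1.0) > alpha:
        return 1.0
    if sign_test_upper_pvalue(n_pairs, n_positive, gamma_max) <= alpha:
        return gamma_max
    lo, hi = 1.0, gamma_max
    # The worst-case p-value is increasing in gamma, so bisection applies.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sign_test_upper_pvalue(n_pairs, n_positive, mid) > alpha:
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    # Hypothetical data: 40 matched pairs, treated unit higher in 31 of them.
    print(sensitivity_value(n_pairs=40, n_positive=31))
```

A larger returned value means the naive conclusion tolerates stronger hidden bias before it is overturned. The exact Binomial sum is only suitable for a moderate number of pairs; a normal approximation would be the usual substitute for large samples.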
-
Authors: Einmahl, Jesson J.; Einmahl, John H. J.; de Haan, Laurens
Affiliations: Tilburg University; Tilburg University; Tilburg University; Erasmus University Rotterdam; Erasmus University Rotterdam - Excl Erasmus MC; Universidade de Lisboa
Abstract: There is no scientific consensus on the fundamental question whether the probability distribution of the human life span has a finite endpoint or not and, if so, whether this upper limit changes over time. Our study uses a unique dataset of the ages at death (in days) of all (about 285,000) Dutch residents, born in the Netherlands, who died in the years 1986-2015 at a minimum age of 92 years and is based on extreme value theory, the coherent approach to research problems of this type. Unlike som...
-
Authors: Nie, Xiao; Chien, Peter; Morgan, Dane; Kaczmarowski, Amy
Affiliations: University of Wisconsin System; University of Wisconsin Madison; University of Wisconsin System; University of Wisconsin Madison
Abstract: Statistical design and analysis of computer experiments is a growing area in statistics. Computer models with structural invariance properties now appear frequently in materials science, physics, biology, and other fields. These properties are consequences of dependency on structural geometry, and cannot be accommodated by standard statistical emulation methods. In this article, we propose a statistical framework for building emulators to preserve invariance. The framework uses a weighted comp...
-
Authors: Carone, Marco; Luedtke, Alexander R.; van der Laan, Mark J.
Affiliations: University of Washington; University of Washington Seattle; Fred Hutchinson Cancer Center; University of California System; University of California Berkeley
Abstract: Despite the risk of misspecification they are tied to, parametric models continue to be used in statistical practice because they are simple and convenient to use. In particular, efficient estimation procedures in parametric models are easy to describe and implement. Unfortunately, the same cannot be said of semiparametric and nonparametric models. While the latter often reflect the level of available scientific knowledge more appropriately, performing efficient inference in these models is ge...
-
Authors: Johndrow, James E.; Smith, Aaron; Pillai, Natesh; Dunson, David B.
Affiliations: Stanford University; University of Ottawa; Harvard University; Duke University
Abstract: Many modern applications collect highly imbalanced categorical data, with some categories relatively rare. Bayesian hierarchical models combat data sparsity by borrowing information, while also quantifying uncertainty. However, posterior computation presents a fundamental barrier to routine use; a single class of algorithms does not work well in all settings and practitioners waste time trying different types of Markov chain Monte Carlo (MCMC) approaches. This article was motivated by an appli...
-
Authors: Belloni, Alexandre; Chernozhukov, Victor; Kato, Kengo
Affiliations: Duke University; Massachusetts Institute of Technology (MIT); University of Tokyo
Abstract: This work proposes new inference methods for a regression coefficient of interest in a (heterogeneous) quantile regression model. We consider a high-dimensional model where the number of regressors potentially exceeds the sample size but a subset of them suffices to construct a reasonable approximation to the conditional quantile function. The proposed methods are (explicitly or implicitly) based on orthogonal score functions that protect against moderate model selection mistakes, which are oft...
-
Authors: Guerrier, Stephane; Dupuis-Lozeron, Elise; Ma, Yanyuan; Victoria-Feser, Maria-Pia
Affiliations: Pennsylvania Commonwealth System of Higher Education (PCSHE); Pennsylvania State University; Pennsylvania State University - University Park; University of Geneva
Abstract: Along with the ever-increasing data size and model complexity, an important challenge frequently encountered in constructing new estimators or in implementing a classical one such as the maximum likelihood estimator is the computational aspect of the estimation procedure. To carry out estimation, approximate methods such as pseudo-likelihood functions or approximated estimating equations are increasingly used in practice as these methods are typically easier to implement numerically although ...
-
Authors: Kong, Efang; Xia, Yingcun; Zhong, Wei
Affiliations: University of Electronic Science & Technology of China; National University of Singapore; Xiamen University; Xiamen University
Abstract: In this article, we propose to measure the dependence between two random variables through a composite coefficient of determination (CCD) of a set of nonparametric regressions. These regressions take consecutive binarizations of one variable as the response and the other variable as the predictor. The resulting measure is invariant to monotonic marginal variable transformation, rendering it robust against heavy-tailed distributions and outliers, and convenient for independent testing. Estimati...