-
Authors: Kuroki, Manabu; Pearl, Judea
Affiliations: Research Organization of Information & Systems (ROIS); Institute of Statistical Mathematics (ISM) - Japan; University of California System; University of California Los Angeles
Abstract: This paper highlights several areas where graphical techniques can be harnessed to address the problem of measurement errors in causal inference. In particular, it discusses the control of unmeasured confounders in parametric and nonparametric models and the computational problem of obtaining bias-free effect estimates in such models. We derive new conditions under which causal effects can be restored by observing proxy variables of unmeasured confounders, with or without external studies.
-
Authors: Petrone, S.; Rousseau, J.; Scricciolo, C.
Affiliations: Bocconi University; Institut Polytechnique de Paris; ENSAE Paris
Abstract: Bayesian inference is attractive due to its internal coherence and for often having good frequentist properties. However, eliciting an honest prior may be difficult, and common practice is to take an empirical Bayes approach using an estimate of the prior hyperparameters. Although not rigorous, the underlying idea is that, for a sufficiently large sample size, empirical Bayes methods should lead to similar inferential answers as a proper Bayesian inference. However, precise mathematical result...
-
Authors: Zhao, Sihai Dave; Cai, T. Tony; Li, Hongzhe
Affiliations: University of Pennsylvania
Abstract: It is often of interest to understand how the structure of a genetic network differs between two conditions. In this paper, each condition-specific network is modelled using the precision matrix of a multivariate normal random vector, and a method is proposed to directly estimate the difference of the precision matrices. In contrast to other approaches, such as separate or joint estimation of the individual matrices, direct estimation does not require those matrices to be sparse, and thus can ...
-
Authors: Jansen, Maarten
Affiliations: Université Libre de Bruxelles
Abstract: The optimization of an information criterion in a variable selection procedure leads to an additional bias, which can be substantial for sparse, high-dimensional data. One can compensate for the bias by applying shrinkage while estimating within the selected models. This paper presents modified information criteria for use in variable selection and estimation without shrinkage. The analysis motivating the modified criteria follows two routes. The first, which we explore for signal-plus-noise o...
-
Authors: Lei, Jing
Affiliations: Carnegie Mellon University
Abstract: A framework for classification is developed with a notion of confidence. In this framework, a classifier consists of two tolerance regions in the predictor space, with a specified coverage level for each class. The classifier also produces an ambiguous region where the classification needs further investigation. Theoretical analysis reveals interesting structures of the confidence-ambiguity trade-off, and the optimal solution is characterized by extending the Neyman-Pearson lemma. We provide g...
-
Authors: Fan, Yingying; Lv, Jinchi
Affiliations: University of Southern California
Abstract: Two important goals of high-dimensional modelling are prediction and variable selection. In this article, we consider regularization with combined L1 and concave penalties, and study the sampling properties of the global optimum of the suggested method in ultrahigh-dimensional settings. The L1 penalty provides the minimum regularization needed for removing noise variables in order to achieve oracle prediction risk, while a concave penalty imposes additional regularization to control model sp...
-
Authors: Tchetgen Tchetgen, E. J.; Shpitser, I.
Affiliations: Harvard University; Harvard T.H. Chan School of Public Health; University of Southampton
Abstract: Establishing cause-effect relationships is a standard goal of empirical science. Once the existence of a causal relationship is established, the precise causal mechanism involved becomes a topic of interest. A particularly popular type of mechanism analysis concerns questions of mediation, i.e., to what extent an effect is direct, and to what extent it is mediated by a third variable. A semiparametric theory has recently been proposed that allows multiply robust estimation of direct and mediat...
-
Authors: Song, Rui; Lu, Wenbin; Ma, Shuangge; Jeng, X. Jessie
Affiliations: North Carolina State University; Yale University
Abstract: In modern statistical applications, the dimension of covariates can be much larger than the sample size. In the context of linear models, correlation screening (Fan & Lv, J. R. Statist. Soc. B, 70, 849-911, 2008) has been shown to reduce the dimension of such data effectively while achieving the sure screening property, i.e., all of the active variables can be retained with high probability. However, screening based on the Pearson correlation does not perform well when applied to contaminated ...
-
Authors: Lee, Seunggeun; Zou, Fei; Wright, Fred A.
Affiliations: University of Michigan System; University of Michigan; University of North Carolina; University of North Carolina Chapel Hill; North Carolina State University
Abstract: The development of high-throughput biomedical technologies has led to increased interest in the analysis of high-dimensional data where the number of features is much larger than the sample size. In this paper, we investigate principal component analysis under the ultra-high dimensional regime, where both the number of features and the sample size increase as the ratio of the two quantities also increases. We bridge the existing results from the finite and the high-dimension low sample size re...
-
Authors: Wu, Hsin-Ping; Stufken, John
Affiliations: University System of Georgia; University of Georgia
Abstract: Finding optimal designs for generalized linear models is a challenging problem. Recent research has identified the structure of optimal designs for generalized linear models with single or multiple unrelated explanatory variables that appear as first-order terms in the predictor. We consider generalized linear models with a single-variable quadratic polynomial as the predictor under a popular family of optimality criteria. When the design region is unrestricted, our results establish that opti...