-
Authors: Fasano, Augusto; Durante, Daniele; Zanella, Giacomo
Affiliations: Bocconi University
Abstract: Modern methods for Bayesian regression beyond the Gaussian response setting are often computationally impractical or inaccurate in high dimensions. In fact, as discussed in recent literature, bypassing such a trade-off is still an open problem even in routine binary regression models, and there is limited theory on the quality of variational approximations in high-dimensional settings. To address this gap, we study the approximation accuracy of routinely used mean-field variational Bayes solut...
-
Authors: Ghodrati, Laya; Panaretos, Victor M.
Affiliations: Swiss Federal Institutes of Technology Domain; Ecole Polytechnique Federale de Lausanne
Abstract: We present a framework for performing regression when both covariate and response are probability distributions on a compact interval. Our regression model is based on the theory of optimal transportation, and links the conditional Fréchet mean of the response to the covariate via an optimal transport map. We define a Fréchet-least-squares estimator of this regression map, and establish its consistency and rate of convergence to the true map, under both full and partial observations of the reg...
-
Authors: Yin, J.; Markes, S.; Richardson, T. S.; Wang, L.
Affiliations: University of Washington; University of Washington Seattle; University of Toronto; University of Washington; University of Washington Seattle
Abstract: Generalized linear models, such as logistic regression, are widely used to model the association between a treatment and a binary outcome as a function of baseline covariates. However, the coefficients of a logistic regression model correspond to log odds ratios, while subject-matter scientists are often interested in relative risks. Although odds ratios are sometimes used to approximate relative risks, this approximation is appropriate only when the outcome of interest is rare for all levels ...
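Note: the point about odds ratios approximating relative risks only for rare outcomes can be illustrated with a small calculation. The sketch below is not taken from the paper; the function names and the risk values are hypothetical, chosen only to contrast a rare outcome with a common one.

```python
def relative_risk(p_treated, p_control):
    """Ratio of outcome probabilities (risks) between the two groups."""
    return p_treated / p_control


def odds_ratio(p_treated, p_control):
    """Ratio of outcome odds p / (1 - p) between the two groups."""
    odds_treated = p_treated / (1.0 - p_treated)
    odds_control = p_control / (1.0 - p_control)
    return odds_treated / odds_control


# Hypothetical risks: both scenarios have the same relative risk of 2.
scenarios = {
    "rare outcome": (0.02, 0.01),
    "common outcome": (0.60, 0.30),
}

for label, (p_t, p_c) in scenarios.items():
    rr = relative_risk(p_t, p_c)
    oratio = odds_ratio(p_t, p_c)
    print(f"{label}: relative risk = {rr:.3f}, odds ratio = {oratio:.3f}")
```

With the rare outcome the two measures nearly coincide, while with the common outcome the odds ratio is far from the relative risk even though the relative risk is unchanged.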
-
Authors: Liu, Molei; Katsevich, Eugene; Janson, Lucas; Ramdas, Aaditya
Affiliations: Harvard University; Harvard T.H. Chan School of Public Health; University of Pennsylvania; Harvard University; Carnegie Mellon University
Abstract: We consider the problem of conditional independence testing: given a response Y and covariates (X, Z), we test the null hypothesis that Y is independent of X given Z, i.e., Y ⊥⊥ X | Z. The conditional randomization test was recently proposed as a way to use distributional information about X | Z to exactly and nonasymptotically control Type-I error using any test statistic in any dimensionality without assuming anything about Y | (X, Z). This flexibility, in principle, allows one to derive powerful test statistics f...
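Note: the conditional randomization test described in this abstract can be sketched generically as follows. This is a minimal illustration and not the authors' procedure; it assumes the distribution of X | Z is known exactly (here a hypothetical Gaussian linear model with coefficient `gamma`), and the test statistic, sample size, and simulated data are all illustrative choices.

```python
import numpy as np


def crt_pvalue(y, x, z, sample_x_given_z, statistic, n_resamples=200, seed=0):
    """Conditional randomization test p-value.

    Resamples X from its (assumed known) conditional law given Z, recomputes
    the statistic each time, and ranks the observed value among the resamples.
    """
    rng = np.random.default_rng(seed)
    t_obs = statistic(y, x, z)
    t_null = np.array(
        [statistic(y, sample_x_given_z(z, rng), z) for _ in range(n_resamples)]
    )
    # The "+1" terms count the observed statistic among the resamples.
    return (1 + np.sum(t_null >= t_obs)) / (n_resamples + 1)


# Hypothetical setting: X | Z ~ N(Z @ gamma, 1) with gamma known.
gamma = np.array([1.0, -0.5])


def sample_x_given_z(z, rng):
    return z @ gamma + rng.standard_normal(z.shape[0])


def statistic(y, x, z):
    # Absolute sample covariance between Y and the part of X not explained by Z.
    return abs(np.mean(y * (x - z @ gamma)))


data_rng = np.random.default_rng(1)
n = 500
z = data_rng.standard_normal((n, 2))
x = sample_x_given_z(z, data_rng)
y_dependent = 2.0 * x + z[:, 0] + data_rng.standard_normal(n)  # Y depends on X
y_null = z[:, 0] + data_rng.standard_normal(n)                 # Y ignores X given Z

p_dependent = crt_pvalue(y_dependent, x, z, sample_x_given_z, statistic)
p_null = crt_pvalue(y_null, x, z, sample_x_given_z, statistic)
print("p-value when Y depends on X:", p_dependent)
print("p-value under the null:", p_null)
```

Any statistic can be passed in place of the covariance used here; the validity of the p-value rests on resampling X from its true conditional distribution, which in practice has to be known or well approximated.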
-
Authors: Loper, J. H.; Lei, L.; Fithian, W.; Tansey, W.
Affiliations: Columbia University; Stanford University; University of California System; University of California Berkeley; Memorial Sloan Kettering Cancer Center
Abstract: We consider the problem of multiple hypothesis testing when there is a logical nested structure to the hypotheses. When one hypothesis is nested inside another, the outer hypothesis must be false if the inner hypothesis is false. We model the nested structure as a directed acyclic graph, including chain and tree graphs as special cases. Each node in the graph is a hypothesis and rejecting a node requires also rejecting all of its ancestors. We propose a general framework for adjusting node-lev...
-
Authors: Da Silva, D. N.; Skinner, C. J.
Affiliations: University of London; London School Economics & Political Science
Abstract: Paradata refers to survey variables which are not of direct interest themselves, but are related to the quality of data on survey variables which are of interest. We focus on a categorical paradata variable, which reflects the presence of measurement error in a variable of interest. We propose a quasi-score test of the hypothesis of no measurement error bias in the estimation of regression coefficients under models for paradata. We also propose a regression-based test, analogous to a simple te...
-
Authors: Kuffner, T. A.; Lee, S. M. S.; Young, G. A.
Affiliations: Washington University (WUSTL); University of Hong Kong; Imperial College London
Abstract: We establish a general theory of optimality for block bootstrap distribution estimation for sample quantiles under mild strong mixing conditions. In contrast to existing results, we study the block bootstrap for varying numbers of blocks. This corresponds to a hybrid between the subsampling bootstrap and the moving block bootstrap, in which the number of blocks is between 1 and the ratio of sample size to block length. The hybrid block bootstrap is shown to give theoretical benefits, and start...
-
Authors: Nie, X.; Wager, S.
Affiliations: Stanford University; Stanford University
Abstract: Flexible estimation of heterogeneous treatment effects lies at the heart of many statistical applications, such as personalized medicine and optimal resource allocation. In this article we develop a general class of two-step algorithms for heterogeneous treatment effect estimation in observational studies. First, we estimate marginal effects and treatment propensities to form an objective function that isolates the causal component of the signal. Then, we optimize this data-adaptive objective ...
-
Authors: Fang, Junhan; Yi, Grace Y.
Affiliations: University of Waterloo; Western University (University of Western Ontario)
Abstract: Measurement error in covariates has been extensively studied in many conventional regression settings where covariate information is typically expressed in a vector form. However, there has been little work on error-prone matrix-variate data, which commonly arise from studies with imaging, spatial-temporal structures, etc. We consider analysis of error-contaminated matrix-variate data. We particularly focus on matrix-variate logistic measurement error models. We examine the biases induced from...
-
Authors: Rotnitzky, A.; Smucler, E.; Robins, J. M.
Affiliations: Universidad Torcuato Di Tella; Universidad Torcuato Di Tella; Harvard University; Harvard T.H. Chan School of Public Health
Abstract: We study a class of parameters with the so-called mixed bias property. For parameters with this property, the bias of the semiparametric efficient one-step estimator is equal to the mean of the product of the estimation errors of two nuisance functions. In nonparametric models, parameters with the mixed bias property admit so-called rate doubly robust estimators, i.e., estimators that are consistent and asymptotically normal when one succeeds in estimating both nuisance functions at sufficient...