-
Authors: Qiu, Hongxiang; Carone, Marco; Sadikova, Ekaterina; Petukhova, Maria; Kessler, Ronald C.; Luedtke, Alex
Affiliations: University of Washington; University of Washington Seattle; Harvard University; Harvard Medical School; University of Washington; University of Washington Seattle
Abstract: There is an extensive literature on the estimation and evaluation of optimal individualized treatment rules in settings where all confounders of the effect of treatment on outcome are observed. We study the development of individualized decision rules in settings where some of these confounders may not have been measured but a valid binary instrument is available for a binary treatment. We first consider individualized treatment rules, which will naturally be most interesting in settings where...
-
Authors: Ferrari, Federico; Dunson, David B.
Affiliations: Duke University
Abstract: This article is motivated by the problem of inference on interactions among chemical exposures impacting human health outcomes. Chemicals often co-occur in the environment or in synthetic mixtures and as a result exposure levels can be highly correlated. We propose a latent factor joint model, which includes shared factors in both the predictor and response components while assuming conditional independence. By including a quadratic regression in the latent variables in the response component,...
-
Authors: Chau, Joris; von Sachs, Rainer
Affiliations: Universite Catholique Louvain
Abstract: Intrinsic wavelet transforms and wavelet estimation methods are introduced for curves in the non-Euclidean space of Hermitian positive definite matrices, with in mind the application to Fourier spectral estimation of multivariate stationary time series. The main focus is on intrinsic average-interpolation wavelet transforms in the space of positive definite matrices equipped with an affine-invariant Riemannian metric, and convergence rates of linear wavelet thresholding are derived for intrins...
-
Authors: Qiu, Yixuan; Wang, Xiao
Affiliations: Carnegie Mellon University; Purdue University System; Purdue University
Abstract: Latent variable models cover a broad range of statistical and machine learning models, such as Bayesian models, linear mixed models, and Gaussian mixture models. Existing methods often suffer from two major challenges in practice: (a) a proper latent variable distribution is difficult to be specified; (b) making an exact likelihood inference is formidable due to the intractable computation. We propose a novel framework for the inference of latent variable models that overcomes these two limita...
-
Authors: Lin, Kevin Z.; Lei, Jing; Roeder, Kathryn
Affiliations: University of Pennsylvania; Carnegie Mellon University
Abstract: Scientists often embed cells into a lower-dimensional space when studying single-cell RNA-seq data for improved downstream analyses such as developmental trajectory analyses, but the statistical properties of such nonlinear embedding methods are often not well understood. In this article, we develop the exponential-family SVD (eSVD), a nonlinear embedding method for both cells and genes jointly with respect to a random dot product model using exponential-family distributions. Our estimator use...
-
Authors: Mohan, Karthika; Pearl, Judea
Affiliations: University of California System; University of California Berkeley; University of California System; University of California Los Angeles
Abstract: This article reviews recent advances in missing data research using graphical models to represent multivariate dependencies. We first examine the limitations of traditional frameworks from three different perspectives: transparency, estimability, and testability. We then show how procedures based on graphical models can overcome these limitations and provide meaningful performance guarantees even when data are missing not at random (MNAR). In particular, we identify conditions that guarantee c...
-
Authors: Eckles, Dean; Bakshy, Eytan
Affiliations: Massachusetts Institute of Technology (MIT); Massachusetts Institute of Technology (MIT); Facebook Inc
Abstract: Peer effects, in which an individual's behavior is affected by peers' behavior, are posited by multiple theories in the social sciences. Randomized field experiments that identify peer effects, however, are often expensive or infeasible, so many studies of peer effects use observational data, which is expected to suffer from confounding. Here we show, in the context of information and media diffusion, that high-dimensional adjustment of a nonexperimental control group (660 million observations...
-
Authors: Xie, Fangzheng; Xu, Yanxun
Affiliations: Johns Hopkins University
Abstract: We develop a Bayesian approach called the Bayesian projected calibration to address the problem of calibrating an imperfect computer model using observational data from an unknown complex physical system. The calibration parameter and the physical system are parameterized in an identifiable fashion via the L2-projection. The physical system is imposed a Gaussian process prior distribution, which naturally induces a prior distribution on the calibration parameter through the L2-projection con...
-
Authors: Delaigle, Aurore; Hall, Peter; Huang, Wei; Kneip, Alois
Affiliations: University of Melbourne; University of Melbourne; University of Bonn; University of Bonn
Abstract: We consider the problem of estimating the covariance function of functional data which are only observed on a subset of their domain, such as fragments observed on small intervals or related types of functional data. We focus on situations where the data enable to compute the empirical covariance function or smooth versions of it only on a subset of its domain which contains a diagonal band. We show that estimating the covariance function consistently outside that subset is possible as long as...
-
Authors: Jiang, Bei; Raftery, Adrian E.; Steele, Russell J.; Wang, Naisyin
Affiliations: University of Alberta; University of Washington; University of Washington Seattle; McGill University; University of Michigan System; University of Michigan
Abstract: There is a growing expectation that data collected by government-funded studies should be openly available to ensure research reproducibility, which also increases concerns about data privacy. A strategy to protect individuals' identity is to release multiply imputed (MI) synthetic datasets with masked sensitivity values. However, information loss or incorrectly specified imputation models can weaken or invalidate the inferences obtained from the MI-datasets. We propose a new masking framework...