-
Authors: Fan, Jianqing; Guo, Yongyi; Wang, Kaizheng
Affiliations: Princeton University; Columbia University
Abstract: When the data are stored in a distributed manner, direct applications of traditional statistical inference procedures are often prohibitive due to communication costs and privacy concerns. This article develops and investigates two communication-efficient accurate statistical estimators (CEASE), implemented through iterative algorithms for distributed optimization. In each iteration, node machines carry out computation in parallel and communicate with the central processor, which then broadcas...
-
Authors: Chen, Hao; Xia, Yin
Affiliations: University of California System; University of California Davis; Fudan University
Abstract: Many statistical methodologies for high-dimensional data assume the population is normal. Although a few multivariate normality tests have been proposed, to the best of our knowledge, none of them can properly control the Type I error when the dimension is larger than the number of observations. In this work, we propose a novel nonparametric test that uses the nearest neighbor information. The proposed method guarantees the asymptotic Type I error control under the high-dimensional setting. Si...
-
Authors: Diaz, Ivan; Williams, Nicholas; Hoffman, Katherine L.; Schenck, Edward J.
Affiliations: Cornell University; Weill Cornell Medicine; Cornell University; Weill Cornell Medicine
Abstract: Most causal inference methods consider counterfactual variables under interventions that set the exposure to a fixed value. With continuous or multi-valued treatments or exposures, such counterfactuals may be of little practical interest because no feasible intervention can be implemented that would bring them about. Longitudinal modified treatment policies (LMTPs) are a recently developed nonparametric alternative that yields effects of immediate practical relevance with an interpretation in t...
-
Authors: McFowland, Edward, III; Shalizi, Cosma Rohilla
Affiliations: University of Minnesota System; University of Minnesota Twin Cities; Carnegie Mellon University; The Santa Fe Institute
Abstract: Social influence cannot be identified from purely observational data on social networks, because such influence is generically confounded with latent homophily, that is, with a node's network partners being informative about the node's attributes and therefore its behavior. If the network grows according to either a latent community (stochastic block) model, or a continuous latent space model, then latent homophilous attributes can be consistently estimated from the global pattern of social ti...
-
Authors: Zhou, Jie; Sun, Will Wei; Zhang, Jingfei; Li, Lexin
Affiliations: University of Miami; Purdue University System; Purdue University; University of California System; University of California Berkeley
Abstract: In modern data science, dynamic tensor data prevail in numerous applications. An important task is to characterize the relationship between dynamic tensor datasets and external covariates. However, the tensor data are often only partially observed, rendering many existing methods inapplicable. In this article, we develop a regression model with a partially observed dynamic tensor as the response and external covariates as the predictor. We introduce the low-rankness, sparsity, and fusion struc...
-
Authors: Chandra, Noirrit Kiran; Sarkar, Abhra; de Groot, John F.; Yuan, Ying; Mueller, Peter
Affiliations: University of Texas System; University of Texas Dallas; University of Texas System; University of Texas Austin; University of California System; University of California San Francisco; University of Texas System; UTMD Anderson Cancer Center; University of Texas System; University of Texas Austin
Abstract: The availability of electronic health records (EHR) has opened opportunities to supplement increasingly expensive and difficult to carry out randomized controlled trials (RCT) with evidence from readily available real-world data. In this paper, we use EHR data to construct synthetic control arms for treatment-only single arm trials. We propose a novel nonparametric Bayesian common atoms mixture model that allows us to find equivalent population strata in the EHR and the treatment arm and then ...
-
Authors: Guo, Xinzhou; Wei, Waverly; Liu, Molei; Cai, Tianxi; Wu, Chong; Wang, Jingshen
Affiliations: Hong Kong University of Science & Technology; University of California System; University of California Berkeley; Harvard University; Harvard T.H. Chan School of Public Health; University of Texas System; UTMD Anderson Cancer Center
Abstract: There have been increasing concerns that the use of statins, one of the most commonly prescribed drugs for treating coronary artery disease, is potentially associated with an increased risk of new-onset type II diabetes (T2D). Nevertheless, to date, there is no robust evidence as to whether and what kinds of populations are indeed vulnerable to developing T2D after taking statins. In this case study, leveraging the biobank and electronic health record data in the Partner Health Syst...
-
Authors: Dubey, Paromita; Muller, Hans-Georg
Affiliations: University of Southern California; University of California System; University of California Davis
-
Authors: Henderson, Nicholas C.; Varadhan, Ravi; Louis, Thomas A.
Affiliations: University of Michigan System; University of Michigan; Johns Hopkins University; Johns Hopkins Medicine; Johns Hopkins University; Johns Hopkins Bloomberg School of Public Health
Abstract: Shrinkage estimates of small domain parameters typically use a combination of a noisy direct estimate that only uses data from a specific small domain and a more stable regression estimate. When the regression model is misspecified, estimation performance for the noisier domains can suffer due to substantial shrinkage toward a poorly estimated regression surface. In this article, we introduce a new class of robust, empirically driven regression weights that target estimation of the small domai...
-
Authors: Stensrud, Mats J.; Robins, James M.; Sarvet, Aaron; Tchetgen Tchetgen, Eric J.; Young, Jessica G.
Affiliations: Swiss Federal Institutes of Technology Domain; Ecole Polytechnique Federale de Lausanne; Harvard University; Harvard T.H. Chan School of Public Health; Harvard University; Harvard T.H. Chan School of Public Health; University of Pennsylvania; Harvard University; Harvard Medical School; Harvard Pilgrim Health Care
Abstract: Researchers are often interested in treatment effects on outcomes that are only defined conditional on posttreatment events. For example, in a study of the effect of different cancer treatments on quality of life at end of follow-up, the quality of life of individuals who die during the study is undefined. In these settings, naive contrasts of outcomes conditional on posttreatment events are not average causal effects, even in randomized experiments. Therefore, the effect in the principal stra...