-
Authors: Gibbs, Isaac; Cherian, John J.; Candes, Emmanuel J.
Affiliations: Stanford University; Stanford University
Abstract: We consider the problem of constructing distribution-free prediction sets with finite-sample conditional guarantees. Prior work has shown that it is impossible to provide exact conditional coverage universally in finite samples. Thus, most popular methods only guarantee marginal coverage over the covariates or are restricted to a limited set of conditional targets, e.g. coverage over a finite set of prespecified subgroups. This paper bridges this gap by defining a spectrum of problems that int...
-
Authors: Jiang, Binyan; Lv, Jing; Li, Jialiang; Cheng, Ming-Yen
Affiliations: Hong Kong Polytechnic University; Southwest University - China; National University of Singapore; Hong Kong Baptist University
Abstract: Model averaging is an attractive ensemble technique for constructing fast and accurate predictions. Despite being widely practiced in cross-sectional data analysis, its application to longitudinal data has been rather limited so far. We consider model averaging for longitudinal response when the number of covariates is ultrahigh. To this end, we propose a novel two-stage procedure in which variable screening is first conducted and then followed by model averaging. In both stages, a robust rank-b...
-
Authors: Whitehouse, Michael; Whiteley, Nick; Rimella, Lorenzo
-
Authors: Bruns-Smith, David; Dukes, Oliver; Feller, Avi; Ogburn, Elizabeth L.
Affiliations: Stanford University; Ghent University; University of California System; University of California Berkeley; Johns Hopkins University; Johns Hopkins Bloomberg School of Public Health
Abstract: We provide a novel characterization of augmented balancing weights, also known as automatic debiased machine learning. These popular doubly robust estimators combine outcome modelling with balancing weights, that is, weights that achieve covariate balance directly instead of estimating and inverting the propensity score. When the outcome and weighting models are both linear in some (possibly infinite) basis, we show that the augmented estimator is equivalent to a single linear model with coefficients th...
-
Authors: Chen, Yuexin; Zhu, Lixing; Xu, Wangli
Affiliations: Renmin University of China; Renmin University of China; Beijing Normal University
Abstract: This article proposes a calibrated empirical likelihood test for ultra-high dimensional means that incorporates multiple projections. Under weak moment conditions on the distributions of data, we analyse all possible asymptotic distributions of the proposed test statistic in different scenarios. To determine the critical value and enhance test power, we employ the random symmetrization method based on the group of sign flips and use multiple selected projections. The test can still maintain th...
-
Authors: Javanmard, Adel; Shao, Simeng; Bien, Jacob
Affiliations: University of Southern California; Amazon.com
Abstract: Large datasets make it possible to build predictive models that can capture heterogeneous relationships between the response variable and features. The mixture of high-dimensional linear experts model posits that observations come from a mixture of high-dimensional linear regression models, where the mixture weights are themselves feature-dependent. In this article, we show how to construct valid prediction sets for an ℓ1-penalized mixture of experts model in the high-dimensional setting. ...
-
Authors: Jones, Jeremiah; Ertefaie, Ashkan; Strawderman, Robert L.
Affiliations: University of Rochester; Eli Lilly
Abstract: Researchers are often interested in learning not only the effect of treatments on outcomes, but also the mechanisms that transmit these effects. A mediator is a variable that is affected by treatment and subsequently affects the outcome. Existing methods for penalized mediation analyses may lead to ignoring important mediators and either assume that finite-dimensional linear models are sufficient to remove confounding bias, or perform no confounding control at all. In practice, these assumptions m...
-
Authors: Li, Sai; Ye, Ting
Affiliations: Renmin University of China; University of Washington; University of Washington Seattle
Abstract: Mendelian randomization (MR) is a powerful method that uses genetic variants as instrumental variables to infer the causal effect of a modifiable exposure on an outcome. We study inference for bi-directional causal relationships and causal directions with possibly pleiotropic genetic variants. We show that assumptions for common MR methods are often impossible or too stringent given the potential bi-directional relationships. We propose a new focusing framework for testing bi-directional causa...
-
Authors: Zhang, Chenlin; Zhou, Ling; Guo, Bin; Lin, Huazhen
-
Authors: Xu, Zhiwei; Gan, Ziming; Zhou, Doudou; Shen, Shuting; Lu, Junwei; Cai, Tianxi
Affiliations: University of Michigan System; University of Michigan; University of Chicago; National University of Singapore; Harvard University; Harvard T.H. Chan School of Public Health; Harvard University; Harvard Medical School
Abstract: The effective analysis of high-dimensional Electronic Health Record (EHR) data, with substantial potential for healthcare research, presents notable methodological challenges. Employing predictive modeling guided by a knowledge graph (KG), which enables efficient feature selection, can enhance both statistical efficiency and interpretability. While various methods have emerged for constructing KGs, existing techniques often lack statistical certainty concerning the presence of links between en...