-
Authors: Ye, Ting; Chen, Kan; Small, Dylan
Affiliations: University of Washington; University of Washington Seattle; Harvard University; University of Pennsylvania
Abstract: Does having firearms in the home increase suicide risk? To test this hypothesis, a matched case-control study can be performed, in which suicide case subjects are compared to living controls who are similar in observed covariates in terms of their retrospective exposure to firearms at home. In this application, cases can be defined using a broad case definition (suicide) or a narrow case definition (suicide occurred at home). The broad case definition offers a larger number of cases, but the n...
-
Authors: Zhou, Wenzhuo; Qu, Annie; Cooper, Keiland W.; Fortin, Norbert; Shahbaba, Babak
Affiliations: University of California System; University of California Irvine; University of California System; University of California Irvine
Abstract: Graph Neural Networks (GNNs) have achieved promising performance in a variety of graph-focused tasks. Despite their success, however, existing GNNs suffer from two significant limitations: a lack of interpretability in their results due to their black-box nature, and an inability to learn representations of varying orders. To tackle these issues, we propose a novel Model-agnostic Graph Neural Network (MaGNet) framework, which is able to effectively integrate information of various orders, extr...
-
Authors: Duan, Yunshan; Guo, Shuai; Wang, Wenyi; Mueller, Peter
Affiliations: University of Texas System; University of Texas Austin; University of Texas System; UTMD Anderson Cancer Center
Abstract: Comparison of transcriptomic data across different conditions is of interest in many biomedical studies. In this article, we consider comparative immune cell profiling for early-onset (EO) versus late-onset (LO) colorectal cancer (CRC). EOCRC, diagnosed between ages 18-45, is a rising public health concern that needs to be urgently addressed. However, its etiology remains poorly understood. We work toward filling this gap by identifying homogeneous T cell sub-populations that show significantl...
-
Authors: Tian, Ye; Feng, Yang
Affiliations: Columbia University; New York University
Abstract: Most existing classification methods aim to minimize the overall misclassification error rate. However, in applications such as loan default prediction, different types of errors can have varying consequences. To address this asymmetry issue, two popular paradigms have been developed: the Neyman-Pearson (NP) paradigm and the cost-sensitive (CS) paradigm. Previous studies on the NP paradigm have primarily focused on the binary case, while the multi-class NP problem poses a greater challenge due...
-
Authors: Gao, Zhaoxing; Tsay, Ruey S.
Affiliations: University of Electronic Science & Technology of China; Zhejiang University; University of Chicago
Abstract: This paper proposes a novel dynamic forecasting method using a new supervised Principal Component Analysis (PCA) when a large number of predictors are available. The new supervised PCA provides an effective way to bridge the gap between predictors and the target variable of interest by scaling and combining the predictors and their lagged values, resulting in an effective dynamic forecasting. Unlike the traditional diffusion-index approach, which does not learn the relationships between the pr...
-
Authors: Castillo-Mateo, Jorge; Gelfand, Alan E.; Gracia-Tabuenca, Zeus; Asin, Jesus; Cebrian, Ana C.
Affiliations: University of Zaragoza; Duke University
Abstract: Record-breaking temperature events are now very frequently in the news, viewed as evidence of climate change. With this as motivation, we undertake the first substantial spatial modeling investigation of temperature record-breaking across years for any given day within the year. We work with a dataset consisting of over 60 years (1960-2021) of daily maximum temperatures across peninsular Spain. Formal statistical analysis of record-breaking events is an area that has received attention primari...
-
Authors: Bonas, Matthew; Richter, David H.; Castruccio, Stefano
Affiliations: University of Notre Dame; University of Notre Dame
Abstract: When a fluid flows over a solid surface, it creates a thin boundary layer where the flow velocity is influenced by the surface through viscosity, and can transition from laminar to turbulent at sufficiently high speeds. Understanding and forecasting the fluid dynamics under these conditions is one of the most challenging scientific problems in fluid dynamics. It is therefore of high interest to formulate models able to capture the nonlinear spatio-temporal velocity structure as well as produce...
-
Authors: del Barrio, Eustasio; Sanz, Alberto Gonzalez; Hallin, Marc
Affiliations: Universidad de Valladolid; Columbia University; Universite Libre de Bruxelles; Universite Libre de Bruxelles; Czech Academy of Sciences; Institute of Information Theory & Automation of the Czech Academy of Sciences
Abstract: Building on recent measure-transportation-based concepts of multivariate quantiles, we are considering the problem of nonparametric multiple-output quantile regression. Our approach defines nested conditional center-outward quantile regression contours and regions with given conditional probability content, the graphs of which constitute nested center-outward quantile regression tubes with given unconditional probability content; these (conditional and unconditional) probability contents do no...
-
Authors: Ohnishi, Yuki; Karmakar, Bikram; Kar, Wreetabrata
Affiliations: Yale University; State University System of Florida; University of Florida; State University of New York (SUNY) System; University at Buffalo, SUNY
Abstract: Organizations are increasingly relying on digital communications, such as targeted e-mails and mobile notifications, to engage with their audiences. Despite the evident advantages like cost-effectiveness and customization, assessing the effectiveness of such communications from observational data poses various statistical challenges. An immediate challenge is to adjust for targeting rules used in these communications. When digital communications involve a sequence of e-mails or notifications, ...
-
Authors: Pensia, Ankit; Jog, Varun; Loh, Po-Ling
Affiliations: University of California System; University of California Berkeley; University of Cambridge
Abstract: We study the problem of linear regression where both covariates and responses are potentially (i) heavy-tailed and (ii) adversarially contaminated. Several computationally efficient estimators have been proposed for the simpler setting where the covariates are sub-Gaussian and uncontaminated; however, these estimators may fail when the covariates are either heavy-tailed or contain outliers. In this work, we show how to modify the Huber regression, least trimmed squares, and least absolute devi...