-
Authors: Liu, Dungang; Lin, Zewei; Zhang, Heping
Affiliations: University System of Ohio; University of Cincinnati; Texas State University System; Texas State University San Marcos; Yale University
Abstract: Model diagnostics is an indispensable component in regression analysis, yet it has not been well addressed in generalized linear models (GLMs). When outcome data are discrete, classical Pearson and deviance residuals have limited utility in generating diagnostic insights. This article establishes a novel diagnostic framework for GLMs and their extensions. Unlike the convention of using a point statistic as a residual, we propose to use a function as a vehicle to retain residual information. In...
-
Authors: Chong, Carsten H.; Todorov, Viktor
Affiliations: Hong Kong University of Science & Technology; Northwestern University
Abstract: We develop a nonparametric test for deciding whether volatility of an asset follows a standard semimartingale process, with paths of finite quadratic variation, or a rough process with paths of infinite quadratic variation. The test uses the fact that volatility is rough if and only if volatility increments are negatively autocorrelated at high frequencies. It is based on the sample autocovariance of increments of spot volatility estimates computed from high-frequency asset return data. By sho...
-
Authors: Ma, Huijuan; Zhao, Wei; Hanfelt, John; Peng, Limin
Affiliations: East China Normal University; Shandong University; Emory University
Abstract: Chronic disease studies often collect data on biological and clinical markers at follow-up visits to monitor disease progression. Viewing such longitudinal measurements as governed by latent continuous trajectories, we develop a new dynamic regression framework to investigate the heterogeneity pattern of certain features of the latent individual trajectory that may carry substantive information on disease risk or status. Employing the strategy of multi-level modeling, we formulate the latent indi...
-
Authors: Jing, Kaili; Khalili, Abbas; Xu, Chen
Affiliations: Xi'an Jiaotong University; McGill University; Peng Cheng Laboratory
Abstract: Finite mixture of regression models are ubiquitous for analyzing complex data. They aim to detect heterogeneity in the effects of a set of features on a response over a finite number of latent classes. When the number of features is large, fitting mixture regressions directly can be computationally infeasible and often yields results of poor interpretive value. One practical strategy is to screen out most irrelevant features before an in-depth analysis. In this article, we propose a novel method...
-
Authors: Du, Mingyue; Lou, Yichen; Sun, Jianguo
Affiliations: Jilin University; Chinese University of Hong Kong; University of Missouri System; University of Missouri Columbia
Abstract: Motivated by a breast cancer study, we consider regression analysis of interval-censored failure time data in the presence of a random change point. Although a great deal of literature on interval-censored data has been established, there does not seem to exist an established method that can allow for the existence of random change points. Such data can occur in, for example, clinical trials where the risk of a disease may dramatically change when some biological indexes of the human body exce...
-
Authors: Sun, Ryan; McCaw, Zachary R.; Lin, Xihong
Affiliations: University of Texas System; UTMD Anderson Cancer Center; Harvard University; Harvard T.H. Chan School of Public Health
Abstract: Causal mediation, pleiotropy, and replication analyses are three highly popular genetic study designs. Although these analyses address different scientific questions, the underlying statistical inference problems all involve large-scale testing of composite null hypotheses. The goal is to determine whether all null hypotheses (as opposed to at least one) in a set of individual tests should simultaneously be rejected. Recently, various methods have been proposed for each of these situations, incl...
-
Authors: Agterberg, Joshua; Zhang, Anru R.
Affiliations: University of Illinois System; University of Illinois Urbana-Champaign; Duke University
Abstract: Higher-order multiway data is ubiquitous in machine learning and statistics and often exhibits community-like structures, where each component (node) along each different mode has a community membership associated with it. In this article we propose the (sub-Gaussian) tensor mixed-membership blockmodel, a generalization of the tensor blockmodel positing that memberships need not be discrete, but instead are convex combinations of latent communities. We establish the identifiability of our model...
-
Authors: Liu, Weidong; Mao, Xiaojun; Zhang, Xiaofei; Zhang, Xin
Affiliations: Shanghai Jiao Tong University; Zhongnan University of Economics & Law; Iowa State University
Abstract: Federated learning (FL) is an emerging topic due to its advantage in collaborative learning with distributed data. Due to the heterogeneity in the local data-generating mechanism, it is important to consider personalization when developing federated learning methods. In this work, we propose a personalized federated learning (PFL) method to address the robust regression problem. Specifically, we aim to learn the regression weight by solving a Huber loss with the sparse fused penalty. Additiona...
-
Authors: Cao, Jian; Katzfuss, Matthias
Affiliations: University of Houston System; University of Houston; University of Wisconsin System; University of Wisconsin Madison
Abstract: Multivariate normal (MVN) probabilities arise in myriad applications, but they are analytically intractable and need to be evaluated via Monte Carlo-based numerical integration. For the state-of-the-art minimax exponential tilting (MET) method, we show that the complexity of each of its components can be greatly reduced through an integrand parameterization that uses the sparse inverse Cholesky factor produced by the Vecchia approximation, whose approximation error is often negligible relative...
-
Authors: He, Chenxuan; Chen, Canyi; Zhu, Liping
Affiliations: Renmin University of China; University of Michigan System; University of Michigan
Abstract: Black-box learners have demonstrated remarkable success across various fields due to their high predictive accuracy. However, the complexity of their learning procedures poses significant challenges in evaluating whether a given learner has achieved optimal performance on datasets with unknown data-generating mechanisms. We propose a general goodness-of-fit test for assessing different learning procedures involving high-dimensional predictors, encompassing methods from classical linear regress...