-
Authors: Drechsler, Joerg
Affiliations: University System of Maryland; University of Maryland College Park
Abstract: Government agencies typically need to take potential risks of disclosure into account whenever they publish statistics based on their data or give external researchers access to collected data. In this context, the promise of formal privacy guarantees offered by concepts such as differential privacy seems to be the panacea enabling the agencies to quantify and control the privacy loss incurred by any data release exactly. Nevertheless, despite the excitement in academia and industry, most agen...
-
Authors: Wang, Wenbo; Qiao, Xingye
Affiliations: State University of New York (SUNY) System; Binghamton University, SUNY
Abstract: This article concerns cautious classification models that are allowed to predict a set of class labels or decline to make a prediction when the uncertainty in the prediction is high. This set-valued classification approach is equivalent to the task of acceptance region learning, which aims to identify subsets of the input space, each of which is guaranteed to cover observations in a class with at least a predetermined probability. We propose to directly learn the acceptance regions through risk mi...
-
Authors: Lee, Sze Ming; Sit, Tony; Xu, Gongjun
Affiliations: Chinese University of Hong Kong; University of Michigan System; University of Michigan
Abstract: Censored quantile regression (CQR) has received growing attention in survival analysis because of its flexibility in modeling heterogeneous effects of covariates. Advances have been made in developing various inferential procedures under different assumptions and settings. Under the conditional independence assumption, many existing CQR methods can be characterized either by stochastic integral-based estimating equations (see, e.g., Peng and Huang) or by locally weighted approaches to adjust fo...
-
Authors: Gao, Zhaoxing; Tsay, Ruey S.
Affiliations: Zhejiang University; University of Chicago
Abstract: This article proposes a hierarchical approximate-factor approach to analyzing high-dimensional, large-scale heterogeneous time series data using distributed computing. The new method employs a multiple-fold dimension reduction procedure using Principal Component Analysis (PCA) and shows great promise for modeling large-scale data that cannot be stored or analyzed by a single machine. Each computer at the basic level performs a PCA to extract common factors among the time series assigned to i...
-
Authors: Pan, Yinghao; Laber, Eric B.; Smith, Maureen A.; Zhao, Ying-Qi
Affiliations: University of North Carolina; University of North Carolina Charlotte; North Carolina State University; University of Wisconsin System; University of Wisconsin Madison; Fred Hutchinson Cancer Center
Abstract: Uncontrolled glycated hemoglobin (HbA1c) levels are associated with adverse events among complex diabetic patients. These adverse events present serious health risks to affected patients and are associated with significant financial costs. Thus, a high-quality predictive model that could identify high-risk patients so as to inform preventative treatment has the potential to improve patient outcomes while reducing healthcare costs. Because the biomarker information needed to predict risk is cos...
-
Authors: Zhou, Le; Zou, Hui
Affiliations: University of Minnesota System; University of Minnesota Twin Cities
Abstract: There is a vast amount of work on high-dimensional regression. The common starting point for the existing theoretical work is to assume the data generating model is a homoscedastic linear regression model with some sparsity structure. In reality, the homoscedasticity assumption is often violated, and hence understanding the heteroscedasticity of the data is of critical importance. In this article, we systematically study the estimation of a high-dimensional heteroscedastic regression model. In p...
-
Authors: Park, Jaewoo
Affiliations: Yonsei University
-
Authors: Chan, Kwun Chuen Gary
Affiliations: University of Washington; University of Washington Seattle
-
Authors: Guggisberg, Michael
Abstract: This article presents a Bayesian approach to multiple-output quantile regression. The prior can be elicited as ex-ante knowledge of the distance of the tau-Tukey depth contour to the Tukey median, the first prior of its kind. The parametric model is proven to be consistent and a procedure to obtain confidence intervals is proposed. A proposal for nonparametric multiple-output regression is also presented. These results add to the literature of misspecified Bayesian modeling, consistency, and p...
-
Authors: Zhang, Yichi; Shen, Weining; Kong, Dehan
Affiliations: North Carolina State University; University of California System; University of California Irvine; University of Toronto
Abstract: Covariance estimation for matrix-valued data has received increasing interest in applications. Unlike previous works that rely heavily on the matrix normal distribution assumption and the requirement of fixed matrix size, we propose a class of distribution-free regularized covariance estimation methods for high-dimensional matrix data under a separability condition and a bandable covariance structure. Under these conditions, the original covariance matrix is decomposed into a Kronecker product ...