-
Authors: She, Yiyuan; Li, Shijie; Wu, Dapeng
Affiliations: State University System of Florida; Florida State University
Abstract: Recently, the robustification of principal component analysis (PCA) has attracted lots of attention from statisticians, engineers, and computer scientists. In this work, we study the type of outliers that are not necessarily apparent in the original observation space but can seriously affect the principal sub-space estimation. Based on a mathematical formulation of such transformed outliers, a novel robust orthogonal complement principal component analysis (ROC-PCA) is proposed. The framework ...
-
Authors: Tibshirani, Ryan J.; Taylor, Jonathan; Lockhart, Richard; Tibshirani, Robert
Affiliations: Carnegie Mellon University; Carnegie Mellon University
-
Authors: Wei, Ying; Song, Xiaoyu; Liu, Mengling; Ionita-Laza, Iuliana; Reibman, Joan
Affiliations: Columbia University; Columbia University; New York University; New York University
Abstract: Case-control design is widely used in epidemiology and other fields to identify factors associated with a disease. Data collected from existing case-control studies can also provide a cost-effective way to investigate the association of risk factors with secondary outcomes. When the secondary outcome is a continuous random variable, most of the existing methods focus on the statistical inference on the mean of the secondary outcome. In this article, we propose a quantile-based approach to faci...
-
Authors: Cai, Tianxi; Tian, Lu
Affiliations: Harvard University; Stanford University
-
Authors: Luedtke, Alexander R.; van der Laan, Mark J.
Affiliations: Fred Hutchinson Cancer Center; University of California System; University of California Berkeley
-
Authors: Murray, Jared S.; Reiter, Jerome P.
Affiliations: Carnegie Mellon University; Duke University
Abstract: We present a nonparametric Bayesian joint model for multivariate continuous and categorical variables, with the intention of developing a flexible engine for multiple imputation of missing values. The model fuses Dirichlet process mixtures of multinomial distributions for categorical variables with Dirichlet process mixtures of multivariate normal distributions for continuous variables. We incorporate dependence between the continuous and categorical variables by (1) modeling the means of the ...
-
Authors: Sun, Will Wei; Qiao, Xingye; Cheng, Guang
Affiliations: Yahoo! Inc; University of Miami; State University of New York (SUNY) System; Binghamton University, SUNY; Purdue University System; Purdue University
Abstract: The stability of statistical analysis is an important indicator for reproducibility, which is one main principle of the scientific method. It entails that similar statistical conclusions can be reached based on independent samples from the same underlying population. In this article, we introduce a general measure of classification instability (CIS) to quantify the sampling variability of the prediction made by a classification method. Interestingly, the asymptotic CIS of any weighted nearest ...
-
Authors: Xu, Yanxun; Muller, Peter; Wahed, Abdus S.; Thall, Peter
Affiliations: University of Texas System; University of Texas Austin; University of Texas System; University of Texas Austin; Pennsylvania Commonwealth System of Higher Education (PCSHE); University of Pittsburgh; University of Texas System; UTMD Anderson Cancer Center
-
Authors: Yu, Guan; Liu, Yufeng
Affiliations: University of North Carolina; University of North Carolina Chapel Hill; University of North Carolina; University of North Carolina Chapel Hill
Abstract: With the abundance of high-dimensional data in various disciplines, sparse regularized techniques are very popular these days. In this article, we make use of the structure information among predictors to improve sparse regression models. Typically, such structure information can be modeled by the connectivity of an undirected graph using all predictors as nodes of the graph. Most existing methods use this undirected graph edge-by-edge to encourage the regression coefficients of corresponding ...
-
Authors: Jiang, Ci-Ren; Aston, John A. D.; Wang, Jane-Ling
Affiliations: University of Cambridge; University of California System; University of California Davis
Abstract: Positron emission tomography (PET) is an imaging technique which can be used to investigate chemical changes in human biological processes such as cancer development or neurochemical reactions. Most dynamic PET scans are currently analyzed based on the assumption that linear first-order kinetics can be used to adequately describe the system under observation. However, there has recently been strong evidence that this is not the case. To provide an analysis of PET data which is free from this c...