-
Authors: Mukhopadhyay, Minerva; Dunson, David B.
Affiliations: Indian Institute of Technology System (IIT System); Indian Institute of Technology (IIT) - Kanpur; Duke University
Abstract: We consider the problem of computationally efficient prediction with high dimensional and highly correlated predictors when accurate variable selection is effectively impossible. Direct application of penalization or Bayesian methods implemented with Markov chain Monte Carlo can be computationally daunting and unstable. A common solution is first stage dimension reduction through screening or projecting the design matrix to a lower dimensional hyper-plane. Screening is highly sensitive to thre...
-
Authors: Su, Ryan; Lin, Xihong
Affiliations: Harvard University; Harvard T.H. Chan School of Public Health; University of Texas System; UTMD Anderson Cancer Center
Abstract: Studying the effects of groups of single nucleotide polymorphisms (SNPs), as in a gene, genetic pathway, or network, can provide novel insight into complex diseases such as breast cancer, uncovering new genetic associations and augmenting the information that can be gleaned from studying SNPs individually. Common challenges in set-based genetic association testing include weak effect sizes, correlation between SNPs in a SNP-set, and scarcity of signals, with individual SNP effects often rangin...
-
Authors: Henderson, Robin; Makarenko, Irina; Bushby, Paul; Fletcher, Andrew; Shukurov, Anvar
Affiliations: Newcastle University - UK
Abstract: We use topological methods to investigate the small-scale variation and local spatial characteristics of the interstellar medium (ISM) in three regions of the southern sky. We demonstrate that there are circumstances where topological methods can identify differences in distributions when conventional marginal or correlation analyses may not. We propose a nonparametric method for comparing two fields based on the counts of topological features and the geometry of the associated persistence dia...
-
Authors: Dette, Holger; Goesmann, Josua
Affiliations: Ruhr University Bochum
Abstract: In this article, we propose a new approach for sequential monitoring of a general class of parameters of a d-dimensional time series, which can be estimated by approximately linear functionals of the empirical distribution function. We consider a closed-end method, which is motivated by the likelihood ratio test principle and compare the new method with two alternative procedures. We also incorporate self-normalization such that estimation of the long-run variance is not necessary. We prove th...
-
Authors: Yang, Hojin; Baladandayuthapani, Veerabhadran; Rao, Arvind U. K.; Morris, Jeffrey S.
Affiliations: University of Texas System; UTMD Anderson Cancer Center
Abstract: Radiomics involves the study of tumor images to identify quantitative markers explaining cancer heterogeneity. The predominant approach is to extract hundreds to thousands of image features, including histogram features comprised of summaries of the marginal distribution of pixel intensities, which leads to multiple testing problems and can miss out on insights not contained in the selected features. In this paper, we present methods to model the entire marginal distribution of pixel intensiti...
-
Authors: Ma, Xinwei; Wang, Jingshen
Affiliations: University of California System; University of California San Diego; University of California Berkeley
Abstract: Inverse probability weighting (IPW) is widely used in empirical work in economics and other disciplines. As Gaussian approximations perform poorly in the presence of small denominators, trimming is routinely employed as a regularization strategy. However, ad hoc trimming of the observations renders usual inference procedures invalid for the target estimand, even in large samples. In this article, we first show that the IPW estimator can have different (Gaussian or non-Gaussian) asymptotic dist...
-
Authors: Efron, Bradley
Affiliations: Stanford University
Abstract: The scientific needs and computational limitations of the twentieth century fashioned classical statistical methodology. Both the needs and limitations have changed in the twenty-first, and so has the methodology. Large-scale prediction algorithms (neural nets, deep learning, boosting, support vector machines, random forests) have achieved star status in the popular press. They are recognizable as heirs to the regression tradition, but ones carried out at enormous scale and on titanic datasets. ...
-
Authors: Kafadar, Karen
Affiliations: University of Virginia
Abstract: What does statistics have to offer science and society, in this age of massive data, machine learning algorithms, and multiple online sources of tools for data analysis? I recall a few situations where statistics made a real difference and reinforced the impact of our discipline on society. Sometimes the difference lay in the insightful analysis and inference enabled by ground-breaking methods in our field like hypothesis testing, likelihood ratios, Bayesian models, jackknife, and bootstrap. B...
-
Authors: Wager, Stefan
Affiliations: Stanford University
-
Authors: Wang, Qing
Affiliations: Wellesley College