-
Authors: Chernoff, Herman; Lo, Shaw-Hwa; Zheng, Tian
Affiliations: Harvard University; Columbia University
Abstract: A trend in all scientific disciplines, based on advances in technology, is the increasing availability of high dimensional data in which important information is buried. A current urgent challenge to statisticians is to develop effective methods of finding the useful information from the vast amounts of messy and noisy data available, most of which are noninformative. This paper presents a general computer intensive approach, based on a method pioneered by Lo and Zheng for detecting which, of...
-
Authors: Loh, Wei-Yin
Affiliations: University of Wisconsin System; University of Wisconsin Madison
Abstract: Besides serving as prediction models, classification trees are useful for finding important predictor variables and identifying interesting subgroups in the data. These functions can be compromised by weak split selection algorithms that have variable selection biases or that fail to search beyond local main effects at each node of the tree. The resulting models may include many irrelevant variables or select too few of the important ones. Either eventuality can lead to erroneous conclusions. ...
-
Authors: Scott, James G.
Affiliations: University of Texas System; University of Texas Austin
Abstract: This paper describes a framework for flexible multiple hypothesis testing of autoregressive time series. The modeling approach is Bayesian, though a blend of frequentist and Bayesian reasoning is used to evaluate procedures. Nonparametric characterizations of both the null and alternative hypotheses will be shown to be the key robustification step necessary to ensure reasonable Type-I error performance. The methodology is applied to part of a large database containing up to 50 years of corpora...
-
Authors: Szekely, Gabor J.; Rizzo, Maria L.
Affiliations: University System of Ohio; Bowling Green State University; Hungarian Academy of Sciences; HUN-REN; HUN-REN Alfred Renyi Institute of Mathematics
-
Authors: Gretton, Arthur; Fukumizu, Kenji; Sriperumbudur, Bharath K.
Affiliations: Carnegie Mellon University; Max Planck Society; Research Organization of Information & Systems (ROIS); Institute of Statistical Mathematics (ISM) - Japan; University of California System; University of California San Diego
-
Authors: Yuan, Ming; Joseph, V. Roshan; Zou, Hui
Affiliations: University System of Georgia; Georgia Institute of Technology; University of Minnesota System; University of Minnesota Twin Cities
Abstract: In linear regression problems with related predictors, it is desirable to do variable selection and estimation by maintaining the hierarchical or structural relationships among predictors. In this paper we propose non-negative garrote methods that can naturally incorporate such relationships defined through effect heredity principles or marginality principles. We show that the methods are very easy to compute and enjoy nice theoretical properties. We also show that the methods can be easily ex...
-
Authors: Szekely, Gabor J.; Rizzo, Maria L.
Affiliations: University System of Ohio; Bowling Green State University; Hungarian Academy of Sciences; HUN-REN; HUN-REN Alfred Renyi Institute of Mathematics
Abstract: Distance correlation is a new class of multivariate dependence coefficients applicable to random vectors of arbitrary and not necessarily equal dimension. Distance covariance and distance correlation are analogous to product-moment covariance and correlation, but generalize and extend these classical bivariate measures of dependence. Distance correlation characterizes independence: it is zero if and only if the random vectors are independent. The notion of covariance with respect to a stochast...
-
Authors: Baggerly, Keith A.; Coombes, Kevin R.
Affiliations: University of Texas System; UTMD Anderson Cancer Center
Abstract: High-throughput biological assays such as microarrays let us ask very detailed questions about how diseases operate, and promise to let us personalize therapy. Data processing, however, is often not described well enough to allow for exact reproduction of the results, leading to exercises in forensic bioinformatics where aspects of raw data and reported results are used to infer what methods must have been employed. Unfortunately, poor documentation can shift from an inconvenience to an active...
-
Authors: Martinez, Josue G.; Huang, Jianhua Z.; Burghardt, Robert C.; Barhoumi, Rola; Carroll, Raymond J.
Affiliations: Texas A&M University System; Texas A&M University College Station
Abstract: We compare calcium ion signaling (Ca2+) between two exposures; the data are presented as movies, or, more prosaically, time series of images. This paper describes novel uses of singular value decompositions (SVD) and weighted versions of them (WSVD) to extract the signals from such movies, in a way that is semi-automatic and tuned closely to the actual data and their many complexities. These complexities include the following. First, the images themselves are of no interest: all interest focuses...
-
Authors: van den Broek, Jan; Nishiura, Hiroshi
Affiliations: Utrecht University
Abstract: This study proposes a nonhomogeneous birth-death model which captures the dynamics of a directly transmitted infectious disease. Our model accounts for an important aspect of observed epidemic data in which only symptomatic infecteds are observed. The nonhomogeneous birth-death process depends on survival distributions of reproduction and removal, which jointly yield an estimate of the effective reproduction number R(t) as a function of epidemic time. We employ the Burr distribution family for...